
The hardest part of scraping is bypassing Cloudflare, captchas, fingerprinting, etc.


The hardest part is not telling anyone how you're bypassing it!


I can talk about this bypass because they've fixed it: a site I was scraping rolled their own custom captcha that was just multiple choice. But they didn't have a nonce, so I would just attempt all the choices, and one of them would let me in.
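For anyone wondering what the missing nonce would have looked like, here is a minimal sketch of the server side, not that site's actual code. It assumes an in-memory store, and `ChallengeStore`, `issue`, and `verify` are hypothetical names made up for illustration. Each challenge gets a random single-use token, and the token is discarded on the first attempt whether the answer is right or wrong, so trying every choice against the same challenge no longer works.

```python
import secrets


class ChallengeStore:
    """Hypothetical in-memory store of outstanding captcha challenges."""

    def __init__(self):
        # Maps nonce -> correct answer for challenges not yet attempted.
        self._pending = {}

    def issue(self, correct_answer):
        """Register a new challenge and return its single-use nonce."""
        nonce = secrets.token_urlsafe(16)
        self._pending[nonce] = correct_answer
        return nonce

    def verify(self, nonce, answer):
        """Check one attempt; the nonce is consumed whether it succeeds or not."""
        # pop() removes the challenge on the first attempt, so a second
        # guess with the same nonce finds nothing and is rejected.
        expected = self._pending.pop(nonce, None)
        if expected is None:
            return False
        # Constant-time comparison on bytes.
        return secrets.compare_digest(expected.encode(), answer.encode())
```

With this in place, a wrong guess burns the challenge and the client has to request a fresh one, which is also a natural point to rate-limit.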


The captcha put you on notice that your scraping wasn't authorized. Depending on the details and circumstances, bypassing it and scraping anyways may have been a crime.


Definitely. What are your thoughts on the Cloudflare agent identity?



