
I have a script for concurrent web scraping: https://github.com/mateuszbuda/webscraping-benchmark It takes a file of URLs and scrapes their content. For more demanding websites, it can use a web scraping API that handles rotating proxies. I add some logic to process the output as needed.
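
For anyone who just wants the general shape of this, here is a minimal sketch of the idea, not the code from the linked repo. It assumes a text file with one URL per line and uses only the Python standard library; the names `read_urls`, `fetch`, and `scrape_all` are made up for illustration. The fetcher is passed in as a parameter, so a call to a proxy-rotating scraping API could presumably be swapped in for the harder sites.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from urllib.request import Request, urlopen


def read_urls(path):
    """Read one URL per line, skipping blank lines (assumed file layout)."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def fetch(url, timeout=10):
    """Plain stdlib fetcher (hypothetical default); returns the body as text."""
    req = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def scrape_all(urls, fetcher=fetch, max_workers=8):
    """Fetch URLs concurrently in a thread pool.

    Returns {url: (body, error)}; exactly one of body/error is None,
    so a single failing site does not abort the whole run.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its URL so results can be labeled.
        futures = {pool.submit(fetcher, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = (future.result(), None)
            except Exception as exc:
                results[url] = (None, exc)
    return results
```

Usage would be something like `results = scrape_all(read_urls("urls.txt"))`, followed by whatever output processing you need on each `(body, error)` pair. Threads are a reasonable fit here because the work is I/O-bound; `max_workers` is the knob to turn down if a site starts rate-limiting you.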

