Ongoing Web Scraping from 75 Plus Websites
Project Title: Ongoing Web Scraping from 75 Plus Websites
I am a private individual at this point, and the project is for my own purposes; it is not a commercial endeavour at this stage.
I am interested in exploring the possibility of outsourcing my needs to a company such as yours. Below is an overview of my requirements.
1. Scrape and verify data from 75+ websites that together produce, on average, 400,000+ web pages.
2. Extract data using CSS selectors and, where the data sought cannot be isolated that way, regular expressions (regex).
3. All spiders will run daily, and all results are required in CSV format.
4. All spiders are currently written in Python using Scrapy. I would prefer to continue using my own spiders if possible, for two reasons:
a. To keep costs down.
b. I use several other tools to generate my spiders in Python.
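To illustrate requirements 2 and 3, here is a minimal sketch of the "CSS first, regex as a fallback" extraction and the CSV output, using only the Python standard library. The field names, the `PRICE_RE` pattern, and both helper functions are hypothetical and for illustration only; they are not taken from my existing spiders. In a Scrapy spider, the `css_value` argument would typically be whatever `response.css("...").get()` returned.

```python
import csv
import io
import re

# Hypothetical pattern: a figure that follows the label "Price:" in free text.
PRICE_RE = re.compile(r"Price:\s*([\d,]+(?:\.\d+)?)")


def extract_price(css_value, raw_html):
    """Prefer the CSS-selected value; fall back to a regex over the raw HTML."""
    if css_value and css_value.strip():
        return css_value.strip()
    match = PRICE_RE.search(raw_html)
    return match.group(1) if match else None


def rows_to_csv(rows, fieldnames):
    """Serialise a list of dict rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Example page where no CSS selector isolates the value (assumed HTML).
    html = "<p>Special offer. Price: 1,234.50 while stocks last</p>"
    row = {"url": "https://example.com/item/1", "price": extract_price(None, html)}
    print(rows_to_csv([row], ["url", "price"]))
```

Keeping the fallback in one small function means each spider can apply the same rule per field, and the daily run only has to collect the resulting rows into one CSV file per spider.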
I currently have no requirement for IP pools. This may be required in the future for one or two of the spiders, but most spiders operate on websites that I have permission to scrape.
The number of websites will grow over time, and some will also become redundant.
For similar work requirements, feel free to email us at firstname.lastname@example.org.