I own two book websites, both built with WordPress, on which I publish 30-50 books every day.
I find a lot of the books on one particular site which lists new books every day.
So, it would save me time if I had an automation which scraped this one site for new books each day, then submitted them to my two WordPress sites.
Project in summary:
I want a script/scraper which can visit this site, collect a 'list' of all the books listed each day, then visit Amazon for each book, and scrape each book's details, including the book cover image, and submit them to my sites as individual posts.
(I will not post the name of the website I want to scrape but I can send it to you if you are interested.)
Project Details:
It is very important to understand these elements of this scraper project:
#1 The site I want to scrape does not have an RSS feed but updates daily.
= therefore, the scraper/script must visit the site and keep a record of ALL books it has already scraped, so that it ONLY scrapes new books (today's books) each day.
#2 The site is only a 'listing' of books, with links to book pages on Amazon, so the scraper must go to the site, collect links to Amazon, then visit those links and scrape each Amazon page.
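To illustrate points #1 and #2, here is a minimal Python sketch of the "collect Amazon links, then keep only the new ones" step. It is an outline under assumptions, not a finished scraper: the function names, the `seen_books` table and the SQLite store are all hypothetical choices, and it assumes the listing page links to Amazon with ordinary `<a href>` tags on an `amazon.*` domain. Fetching the page and scraping the Amazon pages themselves are left out.

```python
import sqlite3
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def extract_amazon_links(html, base_url):
    """Return the unique Amazon links on the listing page, in page order."""
    parser = LinkCollector()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        url = urljoin(base_url, href)  # resolve relative links
        host = urlparse(url).hostname or ""
        # Assumption: any host with an "amazon" label is an Amazon store.
        if "amazon" in host.split(".") and url not in links:
            links.append(url)
    return links


def open_store(path):
    """Open (or create) the record of books that were already scraped."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS seen_books (url TEXT PRIMARY KEY)")
    return conn


def keep_only_new(conn, urls):
    """Record every URL and return only those not seen on an earlier run."""
    new_urls = []
    for url in urls:
        cur = conn.execute(
            "INSERT OR IGNORE INTO seen_books (url) VALUES (?)", (url,)
        )
        if cur.rowcount == 1:  # the row was inserted, so the book is new
            new_urls.append(url)
    conn.commit()
    return new_urls
```

In this sketch the full link is the "already scraped" key. If the listing site adds tracking parameters that change from day to day, the same book could look new twice, so a sturdier key may be needed in practice.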
IMPORTANT:
Many people have read this job description and focussed on the 'scraping' aspect of it, but when I discussed it with them, it turned out they had no plan for how to submit the scraped data to my sites.
I want the scraper hosted on my server.
It must run every day, and also have a 'manual' trigger (so I can run it myself, if necessary).
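One possible way to get both a daily run and a manual trigger is a single entry point which the server's scheduler calls once a day and which can also be started by hand. The sketch below assumes Python and cron are available on the server; the file name `run_daily.py`, the path, the time and the `--dry-run` option are placeholders, and `run_pipeline` is a hypothetical stand-in for the real scrape-and-submit steps.

```python
import argparse
import logging

# Hypothetical crontab entry for the daily run (path and time are placeholders):
#   15 6 * * * /usr/bin/python3 /path/to/scraper/run_daily.py
# A manual run is the same command typed by hand, optionally with --dry-run.


def run_pipeline(dry_run=False):
    """Stand-in for the real work: fetch listing, keep new books, scrape, publish."""
    logging.info("Run started (dry_run=%s)", dry_run)
    # The real steps would go here; with dry_run=True nothing would be published.
    logging.info("Run finished")
    return 0  # exit code: 0 means success


def build_parser():
    parser = argparse.ArgumentParser(description="Daily book scraper")
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="scrape and report, but do not publish anything",
    )
    return parser


def main(argv=None):
    """Shared entry point for the scheduled run and the manual run."""
    args = build_parser().parse_args(argv)
    logging.basicConfig(level=logging.INFO)
    return run_pipeline(dry_run=args.dry_run)
```

The script file would end by calling `main()` and passing its return value to `sys.exit`. The manual trigger could just as well be a small web page which calls the same function; this sketch only shows the command-line route.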
The scraped data can be saved to a database, then submitted to my sites, or data can be sent directly to the WordPress database.
Either way, you must have an understanding of how to take the scraped data and turn it into published posts on my TWO WordPress sites, or else there is no point applying to do this work.
I don't mind whether this is done in PHP or Python (or any other platform) but it must be robust and able to perform the functions described.
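As one example of the 'submit' half, the sketch below creates a post for a book through the WordPress REST API (`/wp-json/wp/v2/posts`) on each site in a list. It assumes the REST API is enabled on both sites and that each site has a user with an application password (a feature built into WordPress 5.6 and later). The dictionary keys (`url`, `user`, `app_password`, `title`, `description`) and the function names are made up for illustration. Writing directly to the WordPress database, or reusing my existing manual-submit code, are other routes and are not shown.

```python
import base64
import json
import urllib.request


def build_post_request(site_url, username, app_password, book, status="draft"):
    """Build (but do not send) the request which creates one post for one book."""
    payload = {
        "title": book["title"],
        "content": book["description"],
        "status": status,  # "publish" would make the post live immediately
    }
    credentials = f"{username}:{app_password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        url=site_url.rstrip("/") + "/wp-json/wp/v2/posts",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Basic " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit_book(sites, book, send=urllib.request.urlopen):
    """Create the same post on every configured site; return the new post IDs."""
    post_ids = []
    for site in sites:
        request = build_post_request(
            site["url"], site["user"], site["app_password"], book
        )
        with send(request) as response:
            # WordPress answers with the created post as JSON, including its ID.
            post_ids.append(json.load(response)["id"])
    return post_ids
```

Posts are created as drafts here so they can be checked before going live. The book cover would need a separate upload to the media endpoint (`/wp-json/wp/v2/media`), with the returned ID then set as the post's `featured_media`; that step is left out to keep the sketch short.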
Additional info:
I already have 2 scrapers on my site's server account. One of them visits a site like this each day and collects links; the other visits Amazon each day. They are in a subdirectory of my WordPress site, so it might make sense to do something similar with this script/scraper.
My existing scrapers have a 'feature' which allows me to manually submit books, from the scrape results, to my sites. So that code already exists and could be used for this. I cannot tell you if this is the right or best way to do this; I simply mention it because it is there.
I can provide some additional precise details to the person who creates this. I have worked with a few developers on this kind of scraper / submitter, so I already know most of the details / issues which relate to it. This should be a fairly simple job for someone who knows what they are doing.
Future Projects:
There are another 3-5 sites I want to scrape / submit from in the same way, so there may be additional similar projects for the right worker.
I also want to build a 'custom dashboard' to sort/filter/view all scraped results. This would come after I have the scraper / submitter tools running satisfactorily.
Please note, these are book sites and they are not high-earners. I do not have large budgets for any of my work, so I am looking for low cost solutions. But I write very good reviews for good workers!
Please write to me with an accurate assessment of how much time this would take you, when you can start and how much you will charge. I will provide as many details as possible, so the job doesn't hold any surprises.
I've added a low price because I don't want highly inflated proposals.
Thank you.
Budget: $60
Posted On: February 01, 2021 22:15 UTC
Category: Data Extraction
Skills: Web Scraper