I would like to scrape all the real estate data from the Leboncoin.fr mobile app / website.
This app/website uses the API https://api.leboncoin.fr to return results.
I previously wrote a working scraper in Ruby that imitated the API requests from the mobile app, but it stopped working a few weeks ago.
1. Milestone 1: 100$ (30 mins). Download the Leboncoin mobile app (https://www.leboncoin.fr/), monitor the HTTPS network activity it generates when fetching results via the API (using Charles Proxy or an equivalent tool), and save the HTTPS requests and responses to a file so I can see and reproduce the sequence (via curl, for instance). The goal is to see the whole sequence from the start, when the user searches for all properties for sale in Paris from 50K to 500K (category "real estate selling", city "Paris", price between 50K and 500K), through to the results.
The deliverable is a Charles Proxy session in which I can see the full sequence between the mobile app and the Leboncoin API service.
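To illustrate the "reproduce via curl" part of Milestone 1, here is a minimal sketch, assuming the captured session is also exported as a HAR (HTTP Archive) file. It turns each recorded request into one curl command. The function name `har_to_curl`, the sample host and the sample path are hypothetical placeholders, not real Leboncoin endpoints.

```python
import json
import shlex

# Headers that curl sets by itself; replaying the recorded value could be wrong.
SKIPPED_HEADERS = {"content-length"}


def har_to_curl(har: dict) -> list[str]:
    """Turn each request recorded in a HAR document into one curl command."""
    commands = []
    for entry in har["log"]["entries"]:
        request = entry["request"]
        parts = ["curl", "-X", request["method"]]
        for header in request.get("headers", []):
            name = header["name"]
            # HTTP/2 pseudo-headers (":method", ":path", ...) are not real headers.
            if name.startswith(":") or name.lower() in SKIPPED_HEADERS:
                continue
            parts += ["-H", f"{name}: {header['value']}"]
        body = (request.get("postData") or {}).get("text")
        if body:
            parts += ["--data-raw", body]
        parts.append(request["url"])
        # shlex.quote makes every argument safe to paste into a POSIX shell.
        commands.append(" ".join(shlex.quote(part) for part in parts))
    return commands


if __name__ == "__main__":
    # Hypothetical sample entry standing in for a real exported session.
    sample = {
        "log": {
            "entries": [
                {
                    "request": {
                        "method": "POST",
                        "url": "https://api.example.test/search",
                        "headers": [{"name": "Accept", "value": "application/json"}],
                        "postData": {"text": json.dumps({"city": "Paris"})},
                    }
                }
            ]
        }
    }
    for command in har_to_curl(sample):
        print(command)
```

With a real export, the sample dictionary would be replaced by `json.load(open("session.har"))`, where `session.har` is an assumed file name. The commands come out in recorded order, so the sequence can be read and replayed step by step.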
2. Milestone 2: 900$. Find a robust way to bypass the protection. The script should run on Ruby or Node.js. Tips: previously I used IP rotation and a bit of JA3 fingerprinting (using the npm cycletls module) but quickly got captchas. I checked the website leboncoin.fr: it is protected by DataDome (i.e. it makes a preliminary request to https://dd.leboncoin.fr/js to get a valid DataDome cookie for the subsequent requests to api.leboncoin.fr). The mobile app may work the same way.
There is a discussion about this at https://github.com/RSS-Bridge/rss-bridge/issues/1820
For Milestone 2, you can search GitHub for previous work and attempts, e.g. "leboncoin datadome" (https://github.com/RSS-Bridge/rss-bridge/issues/1820) or "datadome jsdata". I don't care whether you imitate the requests of the mobile app or of the website, as long as the result works.
If the only reliable way is solving the captcha (via third-party workers), I am OK with you testing such a solution as a last resort (https://github.com/MoterHaker/bypass-captcha-examples/blob/main/geo.captcha-delivery.com.js). If this solution works reliably, let me know.
Deliverables for Milestone 2:
- A wrapper script that can take an API request and return the results in JSON.
- The script(s).
- A short video/text explaining how you did it.