Building an APIService is an effective pattern that is widely used in mobile development; here I bring it to data work.
- Crawl data easily
- Collect almost any data you want, even from Tinder! (Check robots.txt, the license page, etc. to make sure you are not violating any rules.)
- Wide range of uses
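Since the list above mentions checking robots.txt, here is a minimal sketch of how that check could be automated with Python's standard `urllib.robotparser`. The helper name `is_allowed` and the sample rules are hypothetical, not taken from this repo; a real run would read the target site's own robots.txt.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text permits fetching `url`."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules, used only to illustrate the check.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

print(is_allowed(SAMPLE_ROBOTS, "https://example.com/api/items"))
print(is_allowed(SAMPLE_ROBOTS, "https://example.com/private/data"))
```

robots.txt only covers crawler rules; the site's terms and license page still need to be read by hand.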
The demonstration is from this repo (Crawl data section).
Here I demonstrate how to capture the Shopee API; you can use the same technique for any other site:
- APIService
- ItemModel
- GetData
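To show how these three parts might fit together, here is a rough sketch. Only the names `APIService`, `ItemModel`, `GetData`, `self.headers`, `all_items`, `per_page` and `max_page` come from this README; every method, field, URL and the response shape (`{"items": [...]}`) are assumptions for illustration, and the real notebooks may be organised differently.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Dict, List
from urllib.parse import urlencode
from urllib.request import Request, urlopen


@dataclass
class ItemModel:
    """One crawled item; the fields are hypothetical."""
    item_id: int
    name: str

    @classmethod
    def from_json(cls, raw: Dict[str, Any]) -> "ItemModel":
        return cls(item_id=int(raw["id"]), name=str(raw["name"]))


def http_get_json(url: str, headers: Dict[str, str]) -> Dict[str, Any]:
    """Default fetcher: a plain GET that decodes a JSON body."""
    request = Request(url, headers=headers)
    with urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


class APIService:
    """Builds request URLs and sends them with self.headers."""

    def __init__(self, base_url: str,
                 fetch: Callable[[str, Dict[str, str]], Dict[str, Any]] = http_get_json):
        self.base_url = base_url
        self.headers = {"User-Agent": "Mozilla/5.0"}  # replace with your own
        self.fetch = fetch  # injectable, so the class can be tried offline

    def get(self, path: str, **params: Any) -> Dict[str, Any]:
        url = f"{self.base_url}{path}?{urlencode(params)}"
        return self.fetch(url, self.headers)


class GetData:
    """Walks the pages and collects ItemModel objects in all_items."""

    def __init__(self, service: APIService):
        self.service = service
        self.all_items: List[ItemModel] = []

    def crawl(self, path: str, per_page: int, max_page: int) -> List[ItemModel]:
        for page in range(1, max_page + 1):
            payload = self.service.get(path, page=page, per_page=per_page)
            self.all_items.extend(ItemModel.from_json(raw) for raw in payload["items"])
        return self.all_items


# Offline demo with a fake fetcher instead of a real site.
def fake_fetch(url: str, headers: Dict[str, str]) -> Dict[str, Any]:
    return {"items": [{"id": 1, "name": "demo item"}]}


crawler = GetData(APIService("https://example.com", fetch=fake_fetch))
items = crawler.crawl("/api/items", per_page=3, max_page=2)
print(len(items))
```

Keeping the HTTP details in `APIService` means a new site mostly needs a new `ItemModel` and a different base URL and path.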
Update the headers (`self.headers`) in the file `APIService.ipynb` to match yours.
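One way to get headers that match yours is to copy the request headers from the browser's network tab and turn them into a dictionary. The sketch below assumes the headers were copied as plain `Name: value` lines; the function name `parse_raw_headers` and the sample values are hypothetical.

```python
from typing import Dict


def parse_raw_headers(raw: str) -> Dict[str, str]:
    """Convert 'Name: value' lines copied from the browser into a dict."""
    headers: Dict[str, str] = {}
    for line in raw.strip().splitlines():
        line = line.strip()
        # Skip blank lines and HTTP/2 pseudo-headers such as ':authority'.
        if not line or line.startswith(":") or ":" not in line:
            continue
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return headers


# Hypothetical sample; paste your own copied headers here.
RAW = """
User-Agent: Mozilla/5.0
Referer: https://example.com/search
Accept: application/json
"""

my_headers = parse_raw_headers(RAW)
print(my_headers)
```

The resulting dictionary could then be assigned to `self.headers`.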
📢 Up-to-date: 07/10/2022
Go to the Shopee folder.
Go to the Artstation folder.
💡 TODO: This code is from November 2021. As of October 2022, for some reason it seems to crawl all the available data even with `per_page = 3, max_page = 1`, so:
- Check the API call.
- If you decide to use this code, check the `all_items` attribute to make sure it has no duplicate values, and limit its size, because calling the `crawlImage()` method will take time.
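A possible way to do the duplicate check and the limiting before downloading images is sketched below. The helper `dedupe_and_limit` and the `"id"` key are assumptions; use whatever field uniquely identifies an item in `all_items`.

```python
from typing import Any, Callable, Hashable, Iterable, List


def dedupe_and_limit(items: Iterable[Any],
                     key: Callable[[Any], Hashable],
                     limit: int) -> List[Any]:
    """Keep the first occurrence of each key, stopping after `limit` items."""
    if limit <= 0:
        return []
    seen = set()
    unique: List[Any] = []
    for item in items:
        identifier = key(item)
        if identifier in seen:
            continue  # duplicate, skip it
        seen.add(identifier)
        unique.append(item)
        if len(unique) >= limit:
            break
    return unique


# Hypothetical items standing in for all_items.
all_items = [{"id": 1}, {"id": 2}, {"id": 1}, {"id": 3}, {"id": 2}, {"id": 4}]
trimmed = dedupe_and_limit(all_items, key=lambda item: item["id"], limit=3)
print(trimmed)
```

Running this before the image download keeps the slow step to a known number of items.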
Should I publish it 🤔
Go to the Pexels folder.
Not done yet! For now it works on a JSON file you have already downloaded: use the function in `GetData`. The `APIService` still needs to be changed.
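Working from an already downloaded JSON file could look roughly like the sketch below. The function name `load_items`, the top-level `"photos"` key and the sample content are assumptions, not the repo's actual code; adjust the key to whatever your saved response contains.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List, Union


def load_items(json_path: Union[str, Path], list_key: str = "photos") -> List[Dict[str, Any]]:
    """Read a saved API response and return the list stored under `list_key`.

    Assumes the file holds a JSON object (not a bare list) at the top level.
    """
    with Path(json_path).open(encoding="utf-8") as handle:
        payload = json.load(handle)
    return list(payload.get(list_key, []))


# Demo: write a small hypothetical response to a temp file, then load it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = Path(tmp_dir) / "sample.json"
    sample_path.write_text(json.dumps({"photos": [{"id": 1}, {"id": 2}]}), encoding="utf-8")
    loaded = load_items(sample_path)

print(len(loaded))
```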