- This scraper is part of our SIH project, for which we needed job and internship listings to train our models. It scrapes all the internship and job postings available on internshala.com.
- To avoid rate limiting and detection triggers on the server, it uses exponential cooldown times after a failed request, a fixed number of retries per posting, and random-interval pauses between requests.
- With these tactics I successfully scraped all of the approximately 17,000 records on the website.
- One area for improvement is scraping speed.
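
The retry tactics described above (exponential cooldown on failure, a fixed number of retries per posting, and random pauses between requests) could look roughly like the sketch below. This is not the project's actual code: the names `fetch_with_backoff` and `polite_pause`, the default delays, and the retry count are all assumptions chosen for illustration, and `fetch` stands in for whatever function downloads one posting.

```python
import random
import time
from typing import Callable, Optional


def fetch_with_backoff(
    fetch: Callable[[], str],
    max_retries: int = 5,       # assumed value: fixed retry budget per posting
    base_delay: float = 1.0,    # assumed value: first cooldown in seconds
    max_delay: float = 60.0,    # assumed value: upper bound on a single cooldown
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call fetch() until it succeeds or max_retries attempts are used up.

    Returns the fetched content, or None if every attempt failed.
    """
    for attempt in range(max_retries):
        try:
            return fetch()
        except Exception:
            # Broad catch for brevity; a real scraper would likely catch
            # only network and HTTP errors.
            if attempt == max_retries - 1:
                break  # retry budget exhausted, give up on this posting
            # Cooldown doubles after each failure: base, 2*base, 4*base, ...
            sleep(min(base_delay * 2 ** attempt, max_delay))
    return None


def polite_pause(
    min_s: float = 1.0,
    max_s: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Sleep for a random interval between requests and return its length."""
    delay = random.uniform(min_s, max_s)
    sleep(delay)
    return delay
```

In a scraping loop, `polite_pause()` would be called after each posting, and a posting for which `fetch_with_backoff` returns `None` would be skipped or logged. Passing `sleep` as a parameter keeps the waiting logic testable without real delays.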
sparky0520/Internshala-Scraper