This Python script is a web crawler that extracts and processes real estate listings. It uses Playwright for web scraping, aiosqlite for SQLite database access, and Rich for console logging.
Make sure to install the required dependencies using the following command:
pip install playwright aiosqlite playwright_stealth argparse rich
- --sqlite: Flag to save data to an SQLite database.
- --json: Flag to save data to a JSON file.
- --st: Starting page number for crawling.
To start crawling from page 1 and save data to SQLite, run the following command:
python aqar.py --sqlite
To start crawling from a specific page (e.g., page 5) and save data to a JSON file, run:
python aqar.py --json --st 5
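The options above could be wired up with argparse roughly as follows. This is only a sketch: the function name `build_parser`, the help strings, and the default of 1 for `--st` (inferred from the "start crawling from page 1" example) are assumptions, not the actual contents of aqar.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the three documented options (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Crawl real estate listings and save them."
    )
    # Boolean flags: present means True, absent means False.
    parser.add_argument("--sqlite", action="store_true",
                        help="Save data to an SQLite database.")
    parser.add_argument("--json", action="store_true",
                        help="Save data to a JSON file.")
    # Assumed default: crawling starts at page 1 when --st is omitted.
    parser.add_argument("--st", type=int, default=1,
                        help="Starting page number for crawling.")
    return parser
```

With this sketch, `build_parser().parse_args(["--json", "--st", "5"])` yields a namespace with `json=True`, `sqlite=False`, and `st=5`.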
The script uses the Rich library for logging, providing a visually appealing and informative console output.
The SQLite database schema includes the following fields in the "Aqarat" table:
- adid: Ad ID (Primary Key)
- title: Ad title
- description: Ad description
- author_name: Author's name
- price: Ad price
- filters: Filters applied to the property
- generic_values: Values corresponding to the filters
- cat: Property category
- author_url: URL of the author's profile
- city: City of the property
- citydir: City district or part
- dist: District of the property
- imgs: URLs of images associated with the ad
- map_url: URL of the property on the map
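As an illustration of this schema, the table could be created and filled as in the sketch below. It uses the standard-library `sqlite3` module so it runs without extra dependencies; the script itself uses aiosqlite, which accepts the same SQL. The column types (all TEXT), the `INSERT OR REPLACE` behavior, and the helper names `create_table` and `save_ad` are assumptions, since the README lists only the field names.

```python
import sqlite3

# Column names taken from the README; order matches the documented list.
COLUMNS = [
    "adid", "title", "description", "author_name", "price", "filters",
    "generic_values", "cat", "author_url", "city", "citydir", "dist",
    "imgs", "map_url",
]


def create_table(conn: sqlite3.Connection) -> None:
    """Create the Aqarat table if it does not exist (types are assumed)."""
    other_cols = ", ".join(f"{name} TEXT" for name in COLUMNS[1:])
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS Aqarat (adid TEXT PRIMARY KEY, {other_cols})"
    )
    conn.commit()


def save_ad(conn: sqlite3.Connection, ad: dict) -> None:
    """Insert one ad; missing fields become NULL, a repeated adid is overwritten."""
    placeholders = ", ".join("?" for _ in COLUMNS)
    conn.execute(
        f"INSERT OR REPLACE INTO Aqarat ({', '.join(COLUMNS)}) "
        f"VALUES ({placeholders})",
        [ad.get(name) for name in COLUMNS],
    )
    conn.commit()
```

List-like fields such as imgs or filters would need to be serialized (for example as JSON strings) before being passed to `save_ad`; how the script actually stores them is not stated here.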
Mohammed Alraddadi
- LinkedIn: https://www.linkedin.com/in/raddadi/
- Email: [email protected]