patkle/scrapy-google-flights

scrapy-google-flights

This is a small project that demonstrates how you could scrape flight data from Google Flights using Scrapy and Pyppeteer. Currently, only one-way flights are supported.

requirements

To run this spider, you will need to install pyppeteer, scrapy-poet and spidermon.

starting the spider

After cloning the repository and installing the requirements, you should be able to start the spider with the command scrapy crawl google_flights. Run scrapy crawl --help to see more options for running the spider.
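
If you would rather start the crawl from a Python script, a minimal sketch could look like the one below. It only wraps the command shown above; the helper names build_crawl_command and run_spider and the output file name flights.json are assumptions made for illustration and are not part of this project. The -O option is Scrapy's "write output to this file, overwriting it" flag in recent Scrapy versions.

```python
import subprocess


def build_crawl_command(spider="google_flights", output=None):
    """Build the argument list for `scrapy crawl` (hypothetical helper)."""
    command = ["scrapy", "crawl", spider]
    if output is not None:
        # -O overwrites the output file; Scrapy infers the feed format
        # from the file extension (e.g. .json, .csv).
        command += ["-O", output]
    return command


def run_spider(output=None):
    """Run the spider in a subprocess and raise if Scrapy exits with an error.

    Call this from the project root, i.e. the folder that contains scrapy.cfg.
    """
    return subprocess.run(build_crawl_command(output=output), check=True)


# Example call (not executed here):
# run_spider(output="flights.json")
```

Running Scrapy in a subprocess keeps the sketch independent of the project's internals; it behaves exactly like typing the command in a shell.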

changing the flight searches

The searches performed by the spider are defined in the file searches.json. This file can be found in the folder scrapy_google_flights/resources/. The format looks like this:

{
    "origin": "airport from which you start",
    "destination": "airport at which you arrive",
    "days_to_depart": days until start of flight
}

origin and destination are specified as IATA airport codes. You can use the search engine for IATA airport codes provided here. days_to_depart is an integer that defines how many days from today the flight departs. For example, if you wanted to travel from "Berlin Brandenburg Airport" (BER) to "Barcelona–El Prat Airport" (BCN) in 30 days, your JSON would look like this:

{
    "origin": "BER",
    "destination": "BCN",
    "days_to_depart": 30
}
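
To make the meaning of days_to_depart concrete, here is a small sketch that checks a search entry and turns it into a departure date. It is not taken from the spider's code: the functions validate_search and departure_date are hypothetical, and the checks only reflect the description above (three-letter IATA codes, a non-negative integer number of days).

```python
import json
from datetime import date, timedelta


def validate_search(search):
    """Raise ValueError if the entry does not match the format described above."""
    for key in ("origin", "destination"):
        code = search[key]
        # IATA airport codes are three uppercase letters, e.g. "BER".
        if not (isinstance(code, str) and len(code) == 3
                and code.isascii() and code.isalpha() and code.isupper()):
            raise ValueError(f"{key} must be a three-letter IATA code, got {code!r}")
    days = search["days_to_depart"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(days, bool) or not isinstance(days, int) or days < 0:
        raise ValueError(f"days_to_depart must be a non-negative integer, got {days!r}")
    return search


def departure_date(search, today=None):
    """Return the calendar date that lies days_to_depart days after today."""
    today = today or date.today()
    return today + timedelta(days=search["days_to_depart"])


search = json.loads('{"origin": "BER", "destination": "BCN", "days_to_depart": 30}')
validate_search(search)
print(departure_date(search, today=date(2024, 1, 1)))  # prints 2024-01-31
```

Because the date is relative to the day the spider runs, the same searches.json produces a different departure date on every run.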

enable Spidermon notifications via Telegram

You can get notifications via Telegram when the spider starts and finishes. To enable these notifications, you need to create a Telegram bot and obtain its API access token. A decent guide on how to do that can be found here. After creating your bot, add it in Telegram and send it a message. Then open the following URL, replacing <BOT_TOKEN_HERE> with your token, to find your chat id: https://api.telegram.org/bot<BOT_TOKEN_HERE>/getUpdates.
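
If you prefer not to read the chat id out of the raw response by hand, the sketch below pulls it from the getUpdates result. It assumes the usual shape of a Telegram Bot API response (an object with a result list whose entries carry message.chat.id); the function names extract_chat_ids and fetch_chat_ids are made up for this example. If the list comes back empty, send your bot a message first and try again.

```python
import json
from urllib.request import urlopen


def extract_chat_ids(updates):
    """Collect the distinct chat ids found in a parsed getUpdates response."""
    chat_ids = []
    for update in updates.get("result", []):
        # Not every update is a plain message, so fall back to empty dicts.
        message = update.get("message") or {}
        chat = message.get("chat") or {}
        if "id" in chat and chat["id"] not in chat_ids:
            chat_ids.append(chat["id"])
    return chat_ids


def fetch_chat_ids(token):
    """Call getUpdates for the given bot token and return the chat ids."""
    url = f"https://api.telegram.org/bot{token}/getUpdates"
    with urlopen(url, timeout=10) as response:
        return extract_chat_ids(json.load(response))


# Example call (needs network access and a real token, so it is not run here):
# print(fetch_chat_ids("<BOT_TOKEN_HERE>"))
```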

You can then set the following values in settings.py:

SPIDERMON_TELEGRAM_SENDER_TOKEN = 'your api access token'
SPIDERMON_TELEGRAM_RECIPIENTS = ['chat id']

For more options see this guide: How do I configure a Telegram bot for Spidermon?
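
Since the bot token is a secret, you may not want to commit it to the repository. One possible approach, sketched below, is to read the two values from environment variables and assign them in settings.py. The variable names TELEGRAM_BOT_TOKEN and TELEGRAM_CHAT_IDS and the helper telegram_settings are assumptions for this example; only the two SPIDERMON_TELEGRAM_* setting names come from the section above.

```python
import os


def telegram_settings(environ=os.environ):
    """Build the two Spidermon Telegram settings from environment variables."""
    token = environ.get("TELEGRAM_BOT_TOKEN", "")
    # Several chat ids can be given as a comma-separated list.
    raw_recipients = environ.get("TELEGRAM_CHAT_IDS", "")
    recipients = [part.strip() for part in raw_recipients.split(",") if part.strip()]
    return {
        "SPIDERMON_TELEGRAM_SENDER_TOKEN": token,
        "SPIDERMON_TELEGRAM_RECIPIENTS": recipients,
    }


# In settings.py you could then write, for example:
# _telegram = telegram_settings()
# SPIDERMON_TELEGRAM_SENDER_TOKEN = _telegram["SPIDERMON_TELEGRAM_SENDER_TOKEN"]
# SPIDERMON_TELEGRAM_RECIPIENTS = _telegram["SPIDERMON_TELEGRAM_RECIPIENTS"]
```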
