This project builds a data pipeline that crawls watchlist data from the Tehran Stock Exchange (TSE), transforms it, and loads it into a data lake for further analysis. The pipeline automates the collection of stock data, gives it a consistent structure, and makes it readily available for analytical purposes.
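The crawl, transform, and load stages described above can be pictured with a minimal sketch. It is not the project's actual code: the function names, the made-up sample rows, and the date-partitioned JSON Lines layout are all assumptions chosen for illustration, and the crawler is a stub that does not contact TSE.

```python
import json
import tempfile
from pathlib import Path


def crawl(date: str) -> list:
    """Stand-in for the crawler: returns made-up rows instead of calling TSE."""
    return [
        {"symbol": "AAA", "close": "1,250", "date": date},
        {"symbol": "BBB", "close": "980", "date": date},
    ]


def transform(rows: list) -> list:
    """Normalize types: strip thousands separators and cast prices to int."""
    return [{**row, "close": int(row["close"].replace(",", ""))} for row in rows]


def load(rows: list, lake_root: Path, date: str) -> Path:
    """Write one JSON Lines file per date under a date-partitioned folder."""
    target = lake_root / f"date={date}" / "watchlist.jsonl"
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("w", encoding="utf-8") as fh:
        for row in rows:
            fh.write(json.dumps(row, ensure_ascii=False) + "\n")
    return target


def run_pipeline(date: str, lake_root: Path) -> Path:
    """Chain the three stages for a single date."""
    return load(transform(crawl(date)), lake_root, date)


if __name__ == "__main__":
    # A temporary directory plays the role of the data lake (an assumption).
    out = run_pipeline("1402-07-01", Path(tempfile.mkdtemp()))
    print(out.read_text(encoding="utf-8"))
```

Keeping each stage as a separate function means the crawler can be swapped or mocked without touching the transform and load steps.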
Before running the project, ensure you have the following installed:
- Python (version 3.x.x)
- Pip (package manager)
- Clone the repository:
git clone https://github.com/sinanazem/tehran-stock-exchange.git
cd tehran-stock-exchange
- Install dependencies:
pip install -r requirements.txt
From the repository root, add the project directory to PYTHONPATH:
export PYTHONPATH=${PWD}
Run the pipeline for a date range (the example dates appear to use the Jalali calendar):
python src/run.py --startdate 1402-07-01 --enddate 1402-07-10 --delete "True"
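The contents of src/run.py are not shown here, so the following is only a sketch of how flags like these might be parsed with `argparse`. The helper names `date_arg`, `bool_arg`, `build_parser`, and `parse_args` are hypothetical, and what `--delete` actually does in the project is not documented in this README.

```python
import argparse
import re

DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def date_arg(value: str) -> str:
    """Accept only YYYY-MM-DD strings; no calendar conversion is attempted."""
    if not DATE_RE.fullmatch(value):
        raise argparse.ArgumentTypeError(f"expected YYYY-MM-DD, got {value!r}")
    return value


def bool_arg(value: str) -> bool:
    """Map "True"/"False" style strings to a real boolean."""
    lowered = value.strip().lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected True or False, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the three flags in the command above."""
    parser = argparse.ArgumentParser(description="Hypothetical CLI sketch")
    parser.add_argument("--startdate", type=date_arg, required=True)
    parser.add_argument("--enddate", type=date_arg, required=True)
    parser.add_argument("--delete", type=bool_arg, default=False)
    return parser


def parse_args(argv: list) -> argparse.Namespace:
    args = build_parser().parse_args(argv)
    # Zero-padded YYYY-MM-DD strings sort chronologically, so comparing
    # them as plain strings is enough to check the range order.
    if args.startdate > args.enddate:
        raise ValueError("startdate must not be after enddate")
    return args


if __name__ == "__main__":
    args = parse_args(
        ["--startdate", "1402-07-01", "--enddate", "1402-07-10", "--delete", "True"]
    )
    print(args.startdate, args.enddate, args.delete)
```

A custom converter is used for `--delete` because `type=bool` would treat any non-empty string, including "False", as true.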
Customize the project configuration in the config.yaml file. Adjust parameters such as API keys, data storage locations, and any other relevant settings.
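As a rough illustration of how such settings might be validated once config.yaml has been parsed, here is a minimal sketch. The key names `api_key` and `storage_path`, the `TSE_API_KEY` environment variable, and the `Settings` class are all hypothetical; the real file may use a different schema. A plain dict stands in for the parsed YAML so the example needs no third-party packages.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    api_key: str
    storage_path: str


def build_settings(
    raw: Mapping[str, str], env: Optional[Mapping[str, str]] = None
) -> Settings:
    """Validate a parsed config mapping; the key names here are hypothetical."""
    env = os.environ if env is None else env
    # Prefer an environment variable so the key need not be committed to the repo.
    api_key = env.get("TSE_API_KEY") or raw.get("api_key")
    if not api_key:
        raise ValueError("api_key is missing from both the environment and the config")
    storage_path = raw.get("storage_path")
    if not storage_path:
        raise ValueError("storage_path is missing from the config")
    return Settings(api_key=api_key, storage_path=storage_path)


if __name__ == "__main__":
    # Stand-in for the mapping a YAML parser would return for config.yaml.
    raw = {"api_key": "placeholder-key", "storage_path": "data/lake"}
    print(build_settings(raw, env={}))
```

Failing early on a missing key gives a clearer error than a crash partway through a crawl.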
Contributions are welcome! If you have ideas for improvements or find any issues, please open an issue or submit a pull request.