AIFootballPredictions is a machine learning-based system designed to predict whether a football match will have over 2.5 goals. Leveraging historical data from top European leagues (Serie A, EPL, Bundesliga, La Liga, Ligue 1), it utilizes advanced feature engineering and model training techniques to deliver accurate predictions, making it a valuable tool for sports analytics enthusiasts.
🎯 AI Football Predictions: Will There Be Over 2.5 Goals? 🎯
Check out the latest predictions for the upcoming football matches! We've analyzed the data and here are our thoughts:
PREDICTIONS DONE: 2024-10-04
Premier League:
- ⚽ Crystal Palace 🆚 Liverpool: Over 2.5 Goals! 🔥 (57.64% chance)
- ⚽ Arsenal 🆚 Southampton: Over 2.5 Goals! 🔥 (82.33% chance)
- ⚽ Brentford 🆚 Wolves: Over 2.5 Goals! 🔥 (51.9% chance)
- ⚽ Leicester 🆚 Bournemouth: Over 2.5 Goals! 🔥 (90.72% chance)
- ⚽ Man City 🆚 Fulham: Over 2.5 Goals! 🔥 (67.08% chance)
- ⚽ West Ham 🆚 Ipswich: Over 2.5 Goals! 🔥 (60.52% chance)
- ⚽ Everton 🆚 Newcastle: Under 2.5 Goals (76.06% chance)
- ⚽ Aston Villa 🆚 Man United: Over 2.5 Goals! 🔥 (64.2% chance)
- ⚽ Chelsea 🆚 Nott'm Forest: Under 2.5 Goals (85.56% chance)
- ⚽ Brighton 🆚 Tottenham: Over 2.5 Goals! 🔥 (51.9% chance)
Serie A:
- ⚽ Napoli 🆚 Como: Over 2.5 Goals! 🔥 (80.44% chance)
- ⚽ Verona 🆚 Venezia: Over 2.5 Goals! 🔥 (73.87% chance)
- ⚽ Udinese 🆚 Lecce: Under 2.5 Goals (83.62% chance)
- ⚽ Atalanta 🆚 Genoa: Under 2.5 Goals (86.18% chance)
- ⚽ Inter 🆚 Torino: Over 2.5 Goals! 🔥 (91.0% chance)
- ⚽ Juventus 🆚 Cagliari: Under 2.5 Goals (65.56% chance)
- ⚽ Bologna 🆚 Parma: Over 2.5 Goals! 🔥 (89.08% chance)
- ⚽ Lazio 🆚 Empoli: Under 2.5 Goals (86.03% chance)
- ⚽ Monza 🆚 Roma: Under 2.5 Goals (79.91% chance)
- ⚽ Fiorentina 🆚 Milan: Over 2.5 Goals! 🔥 (63.0% chance)
Bundesliga:
- ⚽ Augsburg 🆚 M'gladbach: Over 2.5 Goals! 🔥 (64.12% chance)
- ⚽ Leverkusen 🆚 Holstein Kiel: Over 2.5 Goals! 🔥 (98.81% chance)
- ⚽ Werder Bremen 🆚 Freiburg: Over 2.5 Goals! 🔥 (61.81% chance)
- ⚽ Union Berlin 🆚 Dortmund: Over 2.5 Goals! 🔥 (50.95% chance)
- ⚽ Bochum 🆚 Wolfsburg: Under 2.5 Goals (87.88% chance)
- ⚽ St Pauli 🆚 Mainz: Under 2.5 Goals (72.43% chance)
- ⚽ Heidenheim 🆚 RB Leipzig: Over 2.5 Goals! 🔥 (59.4% chance)
- ⚽ Ein Frankfurt 🆚 Bayern Munich: Over 2.5 Goals! 🔥 (77.41% chance)
- ⚽ Stuttgart 🆚 Hoffenheim: Over 2.5 Goals! 🔥 (82.85% chance)
La Liga:
- ⚽ Leganes 🆚 Valencia: Over 2.5 Goals! 🔥 (63.07% chance)
- ⚽ Espanol 🆚 Mallorca: Under 2.5 Goals (68.72% chance)
- ⚽ Getafe 🆚 Osasuna: Under 2.5 Goals (94.63% chance)
- ⚽ Valladolid 🆚 Vallecano: Under 2.5 Goals (68.59% chance)
- ⚽ Las Palmas 🆚 Celta: Under 2.5 Goals (67.8% chance)
- ⚽ Real Madrid 🆚 Villarreal: Over 2.5 Goals! 🔥 (79.17% chance)
- ⚽ Girona 🆚 Ath Bilbao: Over 2.5 Goals! 🔥 (91.36% chance)
- ⚽ Alaves 🆚 Barcelona: Over 2.5 Goals! 🔥 (51.68% chance)
- ⚽ Sevilla 🆚 Betis: Over 2.5 Goals! 🔥 (67.28% chance)
- ⚽ Sociedad 🆚 Ath Madrid: Under 2.5 Goals (59.6% chance)
Ligue 1:
- ⚽ Marseille 🆚 Angers: Over 2.5 Goals! 🔥 (90.5% chance)
- ⚽ St Etienne 🆚 Auxerre: Under 2.5 Goals (56.77% chance)
- ⚽ Lille 🆚 Toulouse: Over 2.5 Goals! 🔥 (65.51% chance)
- ⚽ Rennes 🆚 Monaco: Over 2.5 Goals! 🔥 (84.85% chance)
- ⚽ Lyon 🆚 Nantes: Over 2.5 Goals! 🔥 (63.52% chance)
- ⚽ Brest 🆚 Le Havre: Over 2.5 Goals! 🔥 (73.1% chance)
- ⚽ Strasbourg 🆚 Lens: Under 2.5 Goals (79.27% chance)
- ⚽ Reims 🆚 Montpellier: Over 2.5 Goals! 🔥 (91.67% chance)
- ⚽ Nice 🆚 Paris SG: Over 2.5 Goals! 🔥 (72.06% chance)
- Project Overview
- Directory Structure
- Setup and Installation
- Data Acquisition
- Data Preprocessing
- Model Training
- Upcoming Matches Acquisition
- Making Predictions
- Supported Leagues
- Contributing
- License
- Support
- Disclaimer
AIFootballPredictions aims to create a predictive model to forecast whether a football match will exceed 2.5 goals. The project is divided into four main stages:
- Data Acquisition: Download and merge historical football match data from multiple European leagues.
- Data Preprocessing: Process the raw data to engineer features, handle missing values, and select the most relevant features.
- Model Training: Train several machine learning models, perform hyperparameter tuning, and combine the best models into a voting classifier to make predictions.
- Making Predictions: Use the trained models to predict outcomes for upcoming matches and generate a formatted message for sharing.
The project is organized into the following directories:
└─── `AIFootballPredictions`
    ├─── `conda`: all the conda environments
    ├─── `data`: the folder for the data
    │       ├─── `processed`
    │       └─── `raw`
    ├─── `models`: the folder with the saved and trained models
    ├─── `notebooks`: all the notebooks, if any
    └─── `scripts`: all the python scripts
            ├─── `data_acquisition.py`
            ├─── `data_preprocessing.py`
            ├─── `train_models.py`
            ├─── `acquire_next_matches.py`
            └─── `make_predictions.py`
- `data_acquisition.py`: Downloads and merges football match data from specified leagues and seasons.
- `data_preprocessing.py`: Preprocesses the raw data, performs feature engineering, and selects the most relevant features.
- `train_models.py`: Trains machine learning models, performs hyperparameter tuning, and saves the best models.
- `acquire_next_matches.py`: Acquires the next football matches data, updates team names using a mapping file, and saves the results to a JSON file.
- `make_predictions.py`: Uses the trained models to predict outcomes for upcoming matches and formats the results into a readable txt message.
Note: to avoid path errors, run all the scripts from the root folder of the repository.
To set up the environment for this project, follow these steps:
- Clone the repository:
    git clone https://github.com/yourusername/AIFootballPredictions.git
    cd AIFootballPredictions
- Create and activate the conda environment:
    conda env create -f conda/aifootball_predictions.yaml
    conda activate aifootball_predictions
To download and merge football match data, run the `data_acquisition.py` script:
python scripts/data_acquisition.py --leagues E0 I1 SP1 F1 D1 --seasons 2425 2324 2223 --raw_data_output_dir data/raw
This script downloads match data from football-data.co.uk for the specified leagues and seasons, merges them, and saves the results to the specified output directory.
To avoid errors, pass only the league codes listed in the Supported Leagues section.
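As a rough illustration of how the `--leagues` and `--seasons` arguments could translate into downloads, here is a minimal sketch. It assumes the per-season CSV URL layout that football-data.co.uk publishes (`mmz4281/<season>/<league>.csv`); the names `BASE_URL`, `SUPPORTED_LEAGUES`, and `build_csv_urls` are hypothetical and are not taken from `data_acquisition.py`.

```python
from itertools import product

# Assumed URL layout for football-data.co.uk season files (not read from the script).
BASE_URL = "https://www.football-data.co.uk/mmz4281/{season}/{league}.csv"

# League codes listed in the Supported Leagues section.
SUPPORTED_LEAGUES = {
    "E0": "Premier League",
    "I1": "Serie A",
    "F1": "Ligue 1",
    "SP1": "La Liga",
    "D1": "Bundesliga",
}


def build_csv_urls(leagues, seasons):
    """Return {(league, season): url} for every requested combination."""
    unknown = [code for code in leagues if code not in SUPPORTED_LEAGUES]
    if unknown:
        raise ValueError(f"Unsupported league code(s): {', '.join(unknown)}")
    return {
        (league, season): BASE_URL.format(season=season, league=league)
        for league, season in product(leagues, seasons)
    }


if __name__ == "__main__":
    for key, url in build_csv_urls(["E0", "I1"], ["2425", "2324"]).items():
        print(key, url)
```

Each URL could then be loaded with `pandas.read_csv` and the per-season frames combined with `pandas.concat`, though the actual script may merge the files differently.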
Once the raw data is downloaded, preprocess it by running the `data_preprocessing.py` script:
python scripts/data_preprocessing.py --raw_data_input_dir data/raw --processed_data_output_dir data/processed --num_features 20 --clustering_threshold 0.5
This script processes each CSV file in the input folder, performs feature engineering, selects relevant features while addressing feature correlation, handles missing values, and saves the processed data.
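To make the prediction target concrete, the sketch below derives an over/under 2.5 label from full-time goals. It assumes the football-data.co.uk column names `FTHG` and `FTAG` (full-time home and away goals); the `Over2.5` column name, the `add_over25_target` helper, and the sample rows are made up for illustration and may not match what `data_preprocessing.py` does.

```python
import csv
import io

GOAL_LINE = 2.5  # a match counts as "over" when total goals exceed this line


def add_over25_target(rows):
    """Return copies of the rows with an added Over2.5 column (1 = over, 0 = under)."""
    labelled = []
    for row in rows:
        total_goals = int(row["FTHG"]) + int(row["FTAG"])
        labelled.append({**row, "Over2.5": int(total_goals > GOAL_LINE)})
    return labelled


# Illustrative rows only, not real match results.
sample_csv = """HomeTeam,AwayTeam,FTHG,FTAG
Team A,Team B,3,1
Team C,Team D,0,0
Team E,Team F,1,1
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
labelled = add_over25_target(rows)
for row in labelled:
    print(row["HomeTeam"], row["AwayTeam"], row["Over2.5"])
```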
To train machine learning models and create a voting classifier, use the `train_models.py` script:
python scripts/train_models.py --processed_data_input_dir data/processed --trained_models_output_dir models --metric_choice accuracy --n_splits 10 --voting soft
This script processes each CSV file individually, trains several machine learning models, performs hyperparameter tuning, combines the best models into a voting classifier, and saves the trained voting classifier for each league.
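The `--voting soft` option refers to how the individual models are combined. The following plain-Python sketch contrasts soft and hard voting for a binary over/under decision; it illustrates the general idea only and is not the project's training code. The function names and the sample probabilities are hypothetical.

```python
from collections import Counter


def soft_vote(probabilities):
    """Average each model's P(over 2.5) and pick 'over' if the mean exceeds 0.5."""
    mean_prob = sum(probabilities) / len(probabilities)
    return int(mean_prob > 0.5), mean_prob


def hard_vote(probabilities):
    """Let each model cast a 0/1 vote and return the majority label."""
    votes = [int(p > 0.5) for p in probabilities]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical P(over 2.5) from three models for one match.
model_probs = [0.9, 0.45, 0.4]

soft_label, soft_prob = soft_vote(model_probs)
hard_label = hard_vote(model_probs)
print(f"soft: label={soft_label}, mean probability={soft_prob:.3f}")
print(f"hard: label={hard_label}")
```

In this example one confident model outweighs two lukewarm ones under soft voting, while hard voting follows the majority. Soft voting also yields a probability, which is what allows a percentage to be reported next to each prediction. In scikit-learn, the same choice is exposed through the `voting` parameter of `VotingClassifier`.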
To acquire the next football matches data and update the team names, run the `acquire_next_matches.py` script:
python scripts/acquire_next_matches.py --get_teams_names_dir data/processed --next_matches_output_file data/next_matches.json
This script will:
- Fetch the next matches data from the football-data.org API.
- Read the unique team names from the processed data files.
- Update the team names in the next matches data using the mapping file.
  - This step is necessary because the team names returned by the football-data.org API differ from those in the football-data.co.uk data, which was used to train the ML models.
- Save the updated next matches to a JSON file.
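The renaming step can be sketched as a dictionary lookup. Everything below is an assumption made for illustration: the `home_team`/`away_team` keys, the API-side names in `TEAM_NAME_MAP`, and the `normalise_matches` helper are hypothetical and do not reflect the real mapping file or the JSON layout used by `acquire_next_matches.py`.

```python
# Hypothetical mapping: name as returned by the API -> name used in the training data.
TEAM_NAME_MAP = {
    "Manchester City FC": "Man City",
    "Nottingham Forest FC": "Nott'm Forest",
    "Borussia Mönchengladbach": "M'gladbach",
}


def normalise_matches(matches, name_map, known_teams):
    """Rename both teams of each match and collect names missing from the training data."""
    updated, unmatched = [], set()
    for match in matches:
        renamed = dict(match)  # copy so the input is left untouched
        for side in ("home_team", "away_team"):
            name = name_map.get(match[side], match[side])
            renamed[side] = name
            if name not in known_teams:
                unmatched.add(name)
        updated.append(renamed)
    return updated, unmatched


if __name__ == "__main__":
    matches = [{"home_team": "Manchester City FC", "away_team": "Fulham FC"}]
    known = {"Man City", "Fulham"}
    print(normalise_matches(matches, TEAM_NAME_MAP, known))
```

Reporting the unmatched names is a design choice that makes gaps in a hand-maintained mapping visible before prediction time.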
To run the `acquire_next_matches.py` script, you first need to set the `API_FOOTBALL_DATA` environment variable, which holds the API key used to fetch the next matches. Set it up as follows:
- Register for an API key:
  - Go to the Football-Data.org website and register to get your personal API key.
- Create a `~/.env` file:
  - This file is read by `load_dotenv` (from the python-dotenv library) to set the `API_FOOTBALL_DATA` environment variable.
  - To create the file, open your terminal and run the command:
    vim ~/.env
  - Vim creates the `~/.env` file when you save it, if it doesn't already exist.
- Insert the API key:
  - After running the `vim` command, press the `i` key (for "insert mode").
  - Type the following line, replacing `your_personal_key` with your actual API key:
    API_FOOTBALL_DATA=your_personal_key
- Save and exit:
  - Press the `Esc` key to exit insert mode.
  - Then type `:wq!` and press `Enter` to save the changes and exit the editor.
- Verify the variable:
  - To check that the variable has been saved, run the following command from the terminal:
    cat ~/.env
  - You should see the `API_FOOTBALL_DATA` variable listed with your API key.
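For reference, a script could pick up the key roughly as follows. This is a sketch under assumptions: the `get_auth_headers` helper is hypothetical, the `X-Auth-Token` header is the one documented by football-data.org for authenticated requests, and the fallback for a missing python-dotenv package is included only to keep the example self-contained.

```python
import os
from pathlib import Path

try:
    from dotenv import load_dotenv  # provided by the python-dotenv package
except ImportError:  # fall back to variables already present in the environment
    load_dotenv = None


def get_auth_headers(env_file=Path.home() / ".env"):
    """Build request headers from the API_FOOTBALL_DATA variable."""
    if load_dotenv is not None:
        # Variables that are already set in the environment are not overridden.
        load_dotenv(dotenv_path=env_file)
    api_key = os.getenv("API_FOOTBALL_DATA")
    if not api_key:
        raise RuntimeError("API_FOOTBALL_DATA is not set; add it to ~/.env")
    return {"X-Auth-Token": api_key}
```

Failing early with a clear message is preferable to sending an unauthenticated request and having to interpret the API's error response.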
To predict the outcomes for upcoming matches and generate a formatted message for sharing, run the `make_predictions.py` script:
python scripts/make_predictions.py --models_dir models --data_dir data/processed --output_file final_predictions.txt --json_competitions data/next_matches.json
This script will:
- Load the pre-trained models and the processed data.
- Make predictions for upcoming matches based on the next matches data.
- Format the predictions into a readable `.txt` message and save it to the specified output file.
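The formatting step can be sketched as below, mirroring the layout of the prediction listing at the top of this README. The `format_prediction` helper and its 0.5 decision threshold are assumptions, not code from `make_predictions.py`. The sketch reports the probability of the predicted class and uses `round(..., 2)`, which is consistent with values such as `51.9%` appearing without a trailing zero.

```python
def format_prediction(home, away, prob_over):
    """Format one match line from the model's probability of over 2.5 goals."""
    if prob_over > 0.5:
        label, prob = "Over 2.5 Goals! 🔥", prob_over
    else:
        # Report the probability of the predicted class, i.e. "under".
        label, prob = "Under 2.5 Goals", 1 - prob_over
    return f"- ⚽ {home} 🆚 {away}: {label} ({round(prob * 100, 2)}% chance)"


if __name__ == "__main__":
    print(format_prediction("Arsenal", "Southampton", 0.8233))
    print(format_prediction("Everton", "Newcastle", 0.2394))
```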
Because the team name mapping is currently maintained by hand, predictions are supported only for the following leagues:
- Premier League: E0
- Serie A: I1
- Ligue 1: F1
- La Liga (Primera Division): SP1
- Bundesliga: D1
For this reason, be careful when running the data acquisition step: pass only the league codes listed above.
If you want to contribute to this project, please fork the repository and submit a pull request. For major changes, please open an issue first to discuss what you would like to change.
This project is licensed under the BSD-3-Clause license; see the `LICENSE` file for details.
If you find this project helpful and would like to support its development, you can buy me a coffee! Your support is greatly appreciated! ☕
This project is intended for educational and informational purposes only. While the AIFootballPredictions system aims to provide accurate predictions for football matches, it is important to understand that predictions are inherently uncertain and should not be used as the sole basis for any decision-making, including betting or financial investments.
The predictions generated by this system can be used as an additional tool during the decision-making process. However, they should be considered alongside other factors and sources of information.
The authors of this project do not guarantee the accuracy, reliability, or completeness of any information provided. Use the predictions at your own risk, and always consider the unpredictability of sports events.
By using this software, you agree that the authors and contributors are not responsible or liable for any losses or damages of any kind incurred as a result of using the software or relying on the predictions made by the system.