# AviMate
AviMate is a Pilot Cockpit Assistance Gen AI tool: an application that aids pilots by summarizing flight manuals and Air Traffic Control (ATC) communications and providing relevant flight information in real time. It aims to enhance situational awareness, reduce workload, and improve safety during flight operations.
𝐴𝓋𝒾𝑀𝒶𝓉𝑒 is inspired by the idea of building an in-flight assistant akin to Iron Man's 𝐽𝒶𝓇𝓋𝒾𝓈, and aims to revolutionize how pilots interact with flight data and manage cockpit operations.
To see a demo of the project, click below ⬇:
AviMate is a RAG (Retrieval-Augmented Generation) application designed to assist captains throughout the flight.
The main use cases include:
- Flight Performance Monitoring and Feedback: AviMate tracks flight manual procedures and performance metrics such as speed, altitude, fuel consumption, and engine performance.
- Communication Support with Air Traffic Control (ATC): AviMate assists pilots in maintaining clear and efficient communication with ATC.
- Emergency Procedure Assistance: In the event of an in-flight emergency, AviMate provides step-by-step guidance tailored to the specific situation. Whether it's an engine failure, navigation system malfunction, or medical emergency, AviMate offers actionable instructions to help pilots manage and mitigate the crisis effectively.
- Conversational Interaction: natural-language access to information, so pilots don't have to memorize procedures or look things up manually.
The dataset used in this project contains information about flight_manuals, flight_performance, atc_communications, weather_predictions, and more; you can find the data inside the data folder.
The dataset was generated using ChatGPT and contains 300 flight manual records. It serves as the foundation for the Pilot Assistant RAG Application.
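For illustration, a single generated record presumably carries the fields that are later boosted during retrieval evaluation; the exact schema below is an assumption based on those field names.

```python
# A hypothetical flight-manual record; the field names follow the boost
# configuration used later in retrieval evaluation.
record = {
    "text": "In case of an engine fire, retard the affected thrust lever ...",
    "scenario": "engine fire during climb",
    "manual_section": "Emergency Procedures",
    "instructions": "1. Thrust lever: IDLE. 2. Fuel control switch: CUTOFF. ...",
}
```

The project uses: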
- Python 3.11
- Docker and Docker Compose for containerization
- ElasticSearch for full-text search
- Flask as the API interface (see Background for more information on Flask)
- Grafana for monitoring and PostgreSQL as the backend for it
- OpenAI as the LLM
- Streamlit as the web interface
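To make the flow concrete, here is a minimal sketch of the RAG loop under a few assumptions: an Elasticsearch index named `flight-manuals`, the record fields listed above, and gpt-4o-mini as the model. It illustrates the idea rather than reproducing the project's exact code.

```python
from elasticsearch import Elasticsearch
from openai import OpenAI

es = Elasticsearch("http://localhost:9200")
client = OpenAI()  # reads OPENAI_API_KEY from the environment

def search(query, index="flight-manuals", k=5):
    # Full-text search over the record fields; the boosts are tuned later.
    response = es.search(
        index=index,
        query={
            "multi_match": {
                "query": query,
                "fields": ["text", "scenario", "manual_section", "instructions"],
            }
        },
        size=k,
    )
    return [hit["_source"] for hit in response["hits"]["hits"]]

def rag(question):
    context = "\n\n".join(doc["text"] for doc in search(question))
    prompt = (
        "You are a pilot cockpit assistant. Answer using only the context.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```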
Since we use OpenAI, you need to provide the API key:
- Install `direnv`. If you use Ubuntu, run `sudo apt install direnv` and then `direnv hook bash >> ~/.bashrc`.
- Copy `.envrc_template` into `.envrc` and insert your key there. For OpenAI, it's recommended to create a new project and use a separate key.
- Run `direnv allow` to load the key into your environment.
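As a quick sanity check (not part of the project code), you can confirm the key is visible from Python:

```python
import os

# direnv exports the variables from .envrc into the shell environment;
# the OpenAI client picks up OPENAI_API_KEY automatically.
key = os.environ.get("OPENAI_API_KEY")
assert key, "OPENAI_API_KEY is not set -- did you run `direnv allow`?"
print("Key loaded, starts with:", key[:7])
```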
Create a virtual environment with Python 3.11 and install the dependencies:

```bash
pipenv --python 3.11
pipenv install ipython openai scikit-learn pandas flask streamlit
pipenv shell
pipenv install elasticsearch openpyxl --dev
pipenv install spacy dateparser --dev
pipenv run python -m spacy download en_core_web_sm
pipenv install sentence-transformers --dev
pipenv install openai-whisper --dev
pipenv install SpeechRecognition --dev
pipenv install streamlit-audiorecorder --dev
```
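The audio packages (openai-whisper, SpeechRecognition, streamlit-audiorecorder) support the voice side of the assistant. As a small illustration, transcribing a recorded ATC clip with openai-whisper looks roughly like this (the file name is hypothetical):

```python
import whisper

# Load a small Whisper model; larger models trade speed for accuracy.
model = whisper.load_model("base")

# Transcribe a recorded ATC audio clip (hypothetical file).
result = model.transcribe("atc_clip.wav")
print(result["text"])
```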
Run Elasticsearch in Docker:

```bash
docker run -it \
    --rm \
    --name elasticsearch \
    -p 9200:9200 \
    -p 9300:9300 \
    -e "discovery.type=single-node" \
    -e "xpack.security.enabled=false" \
    docker.elastic.co/elasticsearch/elasticsearch:8.4.3
```

Or start everything with Docker Compose:

```bash
docker-compose up
```
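Once Elasticsearch is up, the flight-manual records must be indexed before the RAG flow can retrieve them. A minimal sketch, assuming the index name `flight-manuals`, the record fields shown earlier, and a hypothetical data file name:

```python
import json
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

index_name = "flight-manuals"  # assumed index name
es.indices.create(
    index=index_name,
    mappings={
        "properties": {
            "text": {"type": "text"},
            "scenario": {"type": "text"},
            "manual_section": {"type": "text"},
            "instructions": {"type": "text"},
        }
    },
)

# Hypothetical file name -- check the data folder for the actual one.
with open("data/flight_manuals.json") as f:
    records = json.load(f)

for record in records:
    es.index(index=index_name, document=record)
```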
### Running the ollama Model

#### Pulling the model using the service name from docker-compose.yaml

```bash
docker-compose exec ollama bash
ollama pull phi3
```
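Once the model is pulled, Ollama can be queried through its OpenAI-compatible endpoint. A minimal sketch, assuming the default port 11434 is mapped in docker-compose.yaml:

```python
from openai import OpenAI

# Ollama exposes an OpenAI-compatible API; the api_key value is ignored
# but required by the client.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

response = client.chat.completions.create(
    model="phi3",
    messages=[{"role": "user", "content": "Summarize the go-around procedure."}],
)
print(response.choices[0].message.content)
```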
For experiments, we use Jupyter notebooks. They are in the `notebooks` folder.

To start Jupyter, run:

```bash
cd notebooks
pipenv run jupyter notebook
```
We have the following notebooks:

- `rag-test.ipynb`: the RAG flow and evaluation of the system.
- `evaluation-data-generation.ipynb`: generating the ground truth dataset for retrieval evaluation.
- `evaluate-vector-retrieval.ipynb`: Elasticsearch result evaluation.
- `rag-evaluation.ipynb`: evaluating the LLM answers with the LLM-as-a-Judge approach.
The basic approach, using Elasticsearch without any boosting, gave the following metrics:

- Hit Rate: 0.550
- MRR: 0.259

The improved version, with tuned boosting (`{'text': 1.22, 'scenario': 2.81, 'manual_section': 1.95, 'instructions': 2.61}`), gave:

- Hit Rate: 0.550
- MRR: 0.259
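For reference: Hit Rate is the fraction of queries whose ground-truth document appears in the top-k results, and MRR (Mean Reciprocal Rank) averages 1/rank of the first relevant result. A minimal sketch of how these metrics are typically computed:

```python
def hit_rate(relevance_total):
    # relevance_total: one list of booleans per query, True where the
    # retrieved document matches the ground-truth document.
    return sum(any(line) for line in relevance_total) / len(relevance_total)

def mrr(relevance_total):
    total = 0.0
    for line in relevance_total:
        for rank, is_relevant in enumerate(line):
            if is_relevant:
                total += 1.0 / (rank + 1)
                break
    return total / len(relevance_total)

# Example: two queries, top-5 results each.
results = [
    [False, True, False, False, False],   # hit at rank 2 -> RR = 0.5
    [False, False, False, False, False],  # miss -> RR = 0
]
print(hit_rate(results), mrr(results))  # 0.5 0.25
```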
We use two evaluation methods: offline evaluation and online evaluation.

- Offline Evaluation: for each question from the ground truth data, answers were generated and classified as "NON_RELEVANT", "PARTLY_RELEVANT", or "RELEVANT" using gpt-4o-mini; the data is stored in `flight-manuals-evaluations-qa.csv`. Classification was carried out with the LLM-as-a-Judge approach (a sketch follows this list). Out of 150 answers, 149 were classified as relevant to the given question.
- Online Evaluation: +1 and -1 buttons were added for relevant/irrelevant responses. User feedback and conversation data are stored in the Postgres database.
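A minimal sketch of the LLM-as-a-Judge step, assuming gpt-4o-mini as the judge; the exact prompt wording used in the project is an assumption:

```python
from openai import OpenAI

client = OpenAI()

JUDGE_PROMPT = """You evaluate a RAG system.
Given a question and a generated answer, classify the answer as
"NON_RELEVANT", "PARTLY_RELEVANT", or "RELEVANT".

Question: {question}
Answer: {answer}

Reply with only the label."""

def judge(question, answer):
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": JUDGE_PROMPT.format(question=question, answer=answer),
        }],
    )
    return response.choices[0].message.content.strip()
```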
We use Streamlit to create the web interface for our application:

```bash
streamlit run app.py
```
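For illustration, a stripped-down version of what app.py might contain, including the +1/-1 feedback buttons used for online evaluation; the `rag` import and `save_feedback` stub are hypothetical stand-ins for the flow sketched earlier and the real Postgres write:

```python
import streamlit as st

from rag import rag  # hypothetical module with the RAG flow sketched earlier

def save_feedback(question, answer, score):
    # Stub: the real app presumably writes feedback to Postgres.
    pass

st.title("AviMate - Pilot Cockpit Assistant")

question = st.text_input("Ask a question about flight procedures:")

if question:
    answer = rag(question)
    st.write(answer)

    col1, col2 = st.columns(2)
    if col1.button("+1"):
        save_feedback(question, answer, +1)
    if col2.button("-1"):
        save_feedback(question, answer, -1)
```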
Add Grafana to `docker-compose.yaml`.
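A minimal sketch of the Grafana service entry (image tag and dependencies are assumptions; the project's actual compose file may differ):

```yaml
grafana:
  image: grafana/grafana:latest
  ports:
    - "3000:3000"
  depends_on:
    - postgres
```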
Then start it:

```bash
docker-compose up -d grafana
```
Go to http://localhost:3000. Enter the default credentials (admin / admin) and follow the prompt to change the password.
Click "Add data source" and select PostgreSQL:

- Host: `postgres:5432` (this is the service name defined in your Docker Compose)
- Database:
Use the URL below to access the public dashboard: http://localhost:3000/public-dashboards/c600f179721049098fac390df75346d0
Run the application using:

```bash
docker-compose up
```