Update READMEs (iusztinpaul#42)
* docs: Add contributors

* docs: Update README

* docs: Add contributors

* docs: Add contributors

* docs: Update building blocks summary

* docs: Finish main README

* docs: Refine main README

* feat: Add diagrams

* docs: Update LICENSE

* docs: Update diagrams with the right logos

* docs: Adapt training pipeline README

* docs: Refine Training README

* docs: Refine Streaming Pipeline README

* docs: Refine financial bot README

* docs: Add Gradio UI image

* docs: Update Gradio UI image
iusztinpaul authored Nov 21, 2023
1 parent 060f8a1 commit dd75844
Showing 11 changed files with 235 additions and 127 deletions.
2 changes: 1 addition & 1 deletion LICENSE.md
@@ -1,6 +1,6 @@
MIT License

Copyright (c) 2023 Paul Iusztin & Pau Labarta Bajo
Copyright (c) 2023 Paul Emil Iusztin & Pau Labarta Bajo

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
146 changes: 107 additions & 39 deletions README.md
@@ -1,7 +1,7 @@
<div align="center">
<h2>Hands-on LLMOps</h2>
<h2>Hands-on LLMs Course</h2>
<h1>Train and Deploy a Real-Time Financial Advisor</h1>
<i>by <a href="https://github.com/iusztinpaul">Paul Iusztin</a> and <a href="https://github.com/Paulescu">Pau Labarta Bajo</a></i>
<i>by <a href="https://github.com/iusztinpaul">Paul Iusztin</a>, <a href="https://github.com/Paulescu">Pau Labarta Bajo</a> and <a href="https://github.com/Joywalker">Alexandru Razvant</a></i>
</div>

## Table of Contents
@@ -10,24 +10,50 @@
- [2. Setup External Services](#2-setup-external-services)
- [3. Install & Usage](#3-install--usage)
- [4. Video lectures](#4-video-lectures)
- [5. License](#5-license)
- [6. Contributors & Teachers](#6-contributors--teachers)

------


## 1. Building Blocks

### Training pipeline
- [x] Fine-tune Falcon 7B using our own [Q&A generated dataset](/modules/q_and_a_dataset_generator/) containing investing questions and answers based on Alpaca News.
- It seems that 1 GPU is enough if we use [Lit-Parrot](https://lightning.ai/pages/blog/falcon-a-guide-to-finetune-and-inference/)
### 1.1. Training pipeline 🖋️

### Real-time data pipeline
- [x] Build a real-time feature pipeline that ingests data from Alpaca, computes embeddings, and stores them in a serverless Vector DB.
Training pipeline that:
- loads a proprietary Q&A dataset
- fine-tunes an open-source LLM using QLoRA
- logs the training experiments on [Comet ML's](https://www.comet.com?utm_source=thepauls&utm_medium=partner&utm_content=github) experiment tracker & the inference results on [Comet ML's](https://www.comet.com?utm_source=thepauls&utm_medium=partner&utm_content=github) LLMOps dashboard
- stores the best model on [Comet ML's](https://www.comet.com/site/products/llmops/?utm_source=thepauls&utm_medium=partner&utm_content=github) model registry

### Inference pipeline
- [ ] REST API for inference that:
  1. receives a question (e.g., "Is it a good time to invest in renewable energy?"),
  2. finds the most relevant documents in the Vector DB (aka the context),
  3. sends a prompt with the question and context to our fine-tuned Falcon and returns the response.
The **training pipeline** is **deployed** using [Beam](https://docs.beam.cloud/getting-started/quickstart?utm_source=thepauls&utm_medium=partner&utm_content=github) as a serverless GPU infrastructure.

-> Found under the `modules/training_pipeline` directory.
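
To make the QLoRA step more concrete, here is a minimal sketch of how 4-bit loading and LoRA adapters typically fit together with Hugging Face `transformers` and `peft`; the model name and hyperparameters are illustrative assumptions, not the course's exact configuration:

```python
import torch
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Assumption: any open-source causal LM works here; Falcon 7B is just an example.
model_name = "tiiuae/falcon-7b"

# Load the frozen base model in 4-bit (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=bnb_config)
model = prepare_model_for_kbit_training(model)

# Attach small trainable LoRA adapters on top of the quantized weights.
lora_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a tiny fraction of the weights is trained
```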

### 1.2. Streaming real-time pipeline 🚰

Real-time feature pipeline that:
- ingests financial news from [Alpaca](https://alpaca.markets/docs/api-references/market-data-api/news-data/)
- cleans & transforms the news documents into embeddings in real-time using [Bytewax](https://github.com/bytewax/bytewax?utm_source=thepauls&utm_medium=partner&utm_content=github)
- stores the embeddings into the [Qdrant Vector DB](https://qdrant.tech/?utm_source=thepauls&utm_medium=partner&utm_content=github)

The **streaming pipeline** is **automatically deployed** on an AWS EC2 machine using a CI/CD pipeline built in GitHub Actions.

-> Found under the `modules/streaming_pipeline` directory.
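
The heart of the streaming pipeline is the per-document transformation that Bytewax executes step by step. Below is a hedged sketch of that logic only (the Bytewax dataflow wiring is omitted because its API is version-specific); the collection name, embedding model, and payload fields are illustrative assumptions:

```python
from qdrant_client import QdrantClient
from qdrant_client.models import PointStruct
from sentence_transformers import SentenceTransformer

client = QdrantClient(url="https://<your-cluster>.qdrant.io", api_key="<QDRANT_API_KEY>")
encoder = SentenceTransformer("all-MiniLM-L6-v2")

def ingest_news_document(doc_id: int, headline: str, body: str) -> None:
    """Clean a news item, embed it, and upsert it into the vector DB."""
    text = f"{headline}\n{body}".strip()  # stand-in for the real cleaning logic
    vector = encoder.encode(text).tolist()
    client.upsert(
        collection_name="alpaca_financial_news",
        points=[PointStruct(id=doc_id, vector=vector, payload={"text": text})],
    )
```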

### 1.3. Inference pipeline 🤖

Inference pipeline that uses [LangChain](https://github.com/langchain-ai/langchain) to create a chain that:
* downloads the fine-tuned model from [Comet's](https://www.comet.com?utm_source=thepauls&utm_medium=partner&utm_content=github) model registry
* takes user questions as input
* queries the [Qdrant Vector DB](https://qdrant.tech/?utm_source=thepauls&utm_medium=partner&utm_content=github) and enhances the prompt with related financial news
* calls the fine-tuned LLM for financial advice using the initial query, the context from the vector DB, and the chat history
* persists the chat history into memory
* logs the prompt & answer into [Comet ML's](https://www.comet.com/site/products/llmops/?utm_source=thepauls&utm_medium=partner&utm_content=github) LLMOps monitoring feature

The **inference pipeline** is **deployed** using [Beam](https://docs.beam.cloud/deployment/rest-api?utm_source=thepauls&utm_medium=partner&utm_content=github) as a serverless GPU infrastructure, as a RESTful API. Also, it is wrapped under a UI for demo purposes, implemented in [Gradio](https://www.gradio.app/).

-> Found under the `modules/financial_bot` directory.
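
To illustrate the retrieval step, here is a hedged sketch of how the prompt could be enhanced with related news from Qdrant; the collection name, embedding model, and prompt template are illustrative assumptions, not the module's exact code:

```python
from qdrant_client import QdrantClient
from sentence_transformers import SentenceTransformer

client = QdrantClient(url="https://<your-cluster>.qdrant.io", api_key="<QDRANT_API_KEY>")
encoder = SentenceTransformer("all-MiniLM-L6-v2")

def build_prompt(question: str, chat_history: str) -> str:
    # Find the news documents most similar to the user's question.
    hits = client.search(
        collection_name="alpaca_financial_news",
        query_vector=encoder.encode(question).tolist(),
        limit=3,
    )
    context = "\n".join(hit.payload["text"] for hit in hits)
    return (
        f"Chat history:\n{chat_history}\n\n"
        f"Related financial news:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```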

<br/>

@@ -36,37 +36,62 @@

## 2. Setup External Services

Before diving into the modules, you have to set up a couple of additional tools for the course.
Before diving into the modules, you have to set up a couple of additional external tools for the course.

**NOTE:** You can set them up as you go for every module, as we will point out what you need in each one.

### 2.1. Alpaca
`financial news data source`

Follow this [document](https://alpaca.markets/docs/market-data/getting-started/), showing you how to create a FREE account, generate the API Keys, and put them somewhere safe.
Follow this [document](https://alpaca.markets/docs/market-data/getting-started/), which shows you how to create a FREE account and generate the API Keys you will need within this course.

**Note:** 1x Alpaca data connection is FREE.
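
Once you have the keys, a quick way to check them is to pull a few news items directly; the endpoint and parameters below match Alpaca's news API at the time of writing, but double-check their docs before relying on this sketch:

```python
import requests

headers = {
    "APCA-API-KEY-ID": "<your-api-key>",
    "APCA-API-SECRET-KEY": "<your-api-secret>",
}
response = requests.get(
    "https://data.alpaca.markets/v1beta1/news",
    headers=headers,
    params={"symbols": "AAPL", "limit": 5},
)
response.raise_for_status()
for article in response.json()["news"]:
    print(article["headline"])
```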

### 2.2. Qdrant
`vector DB`
`serverless vector DB`

Go to [Qdrant](https://qdrant.tech/?utm_source=thepauls&utm_medium=partner&utm_content=github), create a FREE account, and follow [this document](https://qdrant.tech/documentation/cloud/authentication/?utm_source=thepauls&utm_medium=partner&utm_content=github) on how to generate the API Keys.
Go to [Qdrant](https://qdrant.tech/?utm_source=thepauls&utm_medium=partner&utm_content=github) and create a FREE account.

Afterward, follow [this document](https://qdrant.tech/documentation/cloud/authentication/?utm_source=thepauls&utm_medium=partner&utm_content=github), which shows you how to generate the API Keys you will need within this course.

**Note:** We will use only Qdrant's freemium plan.
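
To verify the keys work, a one-line connectivity check with the official Python client is enough; this sketch assumes you exported the `QDRANT_URL` and `QDRANT_API_KEY` environment variables:

```python
import os

from qdrant_client import QdrantClient

client = QdrantClient(url=os.environ["QDRANT_URL"], api_key=os.environ["QDRANT_API_KEY"])
print(client.get_collections())  # returns an empty list of collections on a fresh cluster
```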

### 2.3. Comet ML
`ML platform`
`serverless ML platform`

Go to [Comet ML](https://www.comet.com/signup?utm_source=thepauls&utm_medium=partner&utm_content=github) and create a FREE account.

Go to [Comet ML](https://www.comet.com/signup?utm_source=thepauls&utm_medium=partner&utm_content=github), create a FREE account, a project, and an API KEY. We will show you in every module how to add these credentials.
Afterward, [follow this guide](https://www.comet.com/docs/v2/guides/getting-started/quickstart/) to generate an API KEY and create a new project, both of which you will need within the course.

**Note:** We will use only Comet ML's freemium plan.
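
A quick sanity check for your Comet ML credentials looks like the sketch below; the project, workspace, and metric names are placeholders:

```python
from comet_ml import Experiment

experiment = Experiment(
    api_key="<COMET_API_KEY>",
    project_name="<your-project>",
    workspace="<your-workspace>",
)
experiment.log_metric("train/loss", 0.42)  # dummy metric, just to see it appear in the UI
experiment.end()
```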

### 2.4. Beam
`cloud compute`
`serverless GPU compute | training & inference pipelines`

Go to [Beam](https://www.beam.cloud?utm_source=thepauls&utm_medium=partner&utm_content=github) and create a FREE account.

Go to [Beam](https://www.beam.cloud?utm_source=thepauls&utm_medium=partner&utm_content=github) and follow their quick setup/get started tutorial. You must create a FREE account, install their CLI and configure your credentials on your local machine.
Afterward, you must follow their [installation guide](https://docs.beam.cloud/getting-started/installation?utm_source=thepauls&utm_medium=partner&utm_content=github) to install their CLI & configure it with your Beam credentials.

- [Introduction guide](https://docs.beam.cloud/getting-started/introduction?utm_source=thepauls&utm_medium=partner&utm_content=github)
- [Installation guide](https://docs.beam.cloud/getting-started/installation?utm_source=thepauls&utm_medium=partner&utm_content=github)
To read more about Beam, here is an [introduction guide](https://docs.beam.cloud/getting-started/introduction?utm_source=thepauls&utm_medium=partner&utm_content=github).

**Note:** You have ~10 free compute hours. Afterward, you pay only for what you use. If you have an Nvidia GPU with more than 8 GB of VRAM and don't want to deploy the training & inference pipelines, using Beam is optional.

#### Troubleshooting

When using Poetry, we had issues locating the Beam CLI when using it inside the Poetry virtual environment. To fix this, after installing Beam, create a symlink that points to Poetry's binaries, as follows:
When using Poetry, we had issues locating the Beam CLI inside a Poetry virtual environment. To fix this, after installing Beam, we create a symlink that points to Poetry's binaries, as follows:
```shell
export COURSE_MODULE_PATH=<your-course-module-path> # e.g., modules/training_pipeline
cd $COURSE_MODULE_PATH
@@ -77,11 +115,13 @@ When using Poetry, we had issues locating the Beam CLI when using it inside the


### 2.5. AWS
`cloud compute`
`cloud compute | feature pipeline`

Go to [AWS](https://aws.amazon.com/console/), create an account, and generate a pair of credentials.

After, download and install their [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) and [configure it](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html) with your credentials.
Afterward, download and install their [AWS CLI v2.11.22](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) and [configure it](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html) with your credentials.

**Note:** You will pay only for what you use. You will deploy only a `t2.small` EC2 VM, which costs only `~$0.023` / hour. If you don't want to deploy the feature pipeline, using AWS is optional.
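
To confirm the CLI credentials are picked up correctly, you can also query them from Python with `boto3` (an assumption of this sketch — the course itself drives AWS through the CLI and CI/CD):

```python
import boto3

# Prints the account and IAM identity your configured credentials resolve to.
print(boto3.client("sts").get_caller_identity())
```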


## 3. Install & Usage
@@ -93,19 +93,33 @@ Thus, check out the README for every module individually to see how to install &
3. [streaming_pipeline](/modules/streaming_pipeline/)
4. [inference_pipeline](/modules/financial_bot/)


### 3.1 Run Notebooks Server
If you want to run a notebook server inside a virtual environment, follow the next steps.

First, expose the virtual environment as a notebook kernel:
```shell
python -m ipykernel install --user --name hands-on-llms --display-name "hands-on-llms"
```
Now run the notebook server:
```shell
jupyter notebook notebooks/ --ip 0.0.0.0 --port 8888
```

## 4. Video lectures

### 4.0 Intro to the course
@@ -134,4 +161,45 @@
<p>Click here to watch the video 🎬</p>
<img src="media/youtube_thumbnails/02_fine_tuning_pipeline_hands_on.png" alt="Hands-on Fine Tuning an LLM" style="width:75%;">
</a>
</div>
</div>

## 5. License

This course is an open-source project released under the MIT license. Thus, as long as you distribute our LICENSE and acknowledge our work, you can safely clone or fork this project and use it as a source of inspiration for whatever you want (e.g., university projects, college degree projects, etc.).

## 6. Contributors & Teachers

<table>
<tr>
<td><img src="https://github.com/Paulescu.png" width="100" style="border-radius:50%;"/></td>
<td>
<strong>Pau Labarta Bajo | Senior ML & MLOps Engineer </strong><br />
<i>Main teacher. The guy from the video lessons.</i><br /><br />
<a href="https://www.linkedin.com/in/pau-labarta-bajo-4432074b/">LinkedIn</a><br />
<a href="https://twitter.com/paulabartabajo_">Twitter/X</a><br />
<a href="https://www.youtube.com/@realworldml">Youtube</a><br />
<a href="https://www.realworldml.xyz/subscribe">Real-World ML Newsletter</a><br />
<a href="https://www.realworldml.xyz/subscribe">Real-World ML Site</a>
</td>
</tr>
<tr>
<td><img src="https://github.com/Joywalker.png" width="100" style="border-radius:50%;"/></td>
<td>
<strong>Alexandru Razvant | Senior ML Engineer </strong><br />
<i>Second chef. The engineer behind the scenes.</i><br /><br />
<a href="https://www.linkedin.com/in/arazvant/">LinkedIn</a><br />
<a href="https://www.neuraleaps.com/">Neura Leaps</a>
</td>
</tr>
<tr>
<td><img src="https://github.com/iusztinpaul.png" width="100" style="border-radius:50%;"/></td>
<td>
<strong>Paul Iusztin | Senior ML & MLOps Engineer </strong><br />
<i>Main chef. The guy who randomly pops into the video lessons.</i><br /><br />
<a href="https://www.linkedin.com/in/pauliusztin/">LinkedIn</a><br />
<a href="https://twitter.com/iusztinpaul">Twitter/X</a><br />
<a href="https://pauliusztin.substack.com/">Decoding ML Newsletter</a><br />
<a href="https://www.pauliusztin.me/">Personal Site | ML & MLOps Hub</a>
</td>
</tr>
</table>
Binary file added media/feature_pipeline_architecture.png
Binary file added media/financial_bot_gradio_ui.png
Binary file added media/github_actions_cd.png
Binary file added media/github_actions_secrets.png
Binary file added media/inference_pipeline_architecture.png
Binary file added media/training_pipeline_architecture.png
55 changes: 28 additions & 27 deletions modules/financial_bot/README.md
@@ -4,17 +4,18 @@ Inference pipeline that uses [LangChain](https://github.com/langchain-ai/langcha
* downloads the fine-tuned model from [Comet's](https://www.comet.com?utm_source=thepauls&utm_medium=partner&utm_content=github) model registry
* takes user questions as input
* queries the [Qdrant Vector DB](https://qdrant.tech/?utm_source=thepauls&utm_medium=partner&utm_content=github) and enhances the prompt with related financial news
* calls the fine-tuned LLM for the final answer
* calls the fine-tuned LLM for financial advice using the initial query, the context from the vector DB, and the chat history
* persists the chat history into memory
* logs the prompt & answer into [Comet ML's](https://www.comet.com/site/products/llmops/?utm_source=thepauls&utm_medium=partner&utm_content=github) LLMOps monitoring feature

The **inference pipeline** is **deployed** using [Beam](https://docs.beam.cloud/deployment/rest-api?utm_source=thepauls&utm_medium=partner&utm_content=github) as a serverless GPU infrastructure, as a RESTful API. Also, it is wrapped under a UI for demo purposes, implemented in [Gradio](https://www.gradio.app/).
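
To make the chain structure tangible, here is a hedged sketch using LangChain's memory with a fake LLM so it runs anywhere; the prompt template and variable names are illustrative assumptions — the real module wires in the fine-tuned model and the Qdrant retrieval step instead:

```python
from langchain.chains import LLMChain
from langchain.llms.fake import FakeListLLM
from langchain.memory import ConversationBufferMemory
from langchain.prompts import PromptTemplate

prompt = PromptTemplate(
    input_variables=["chat_history", "context", "question"],
    template=(
        "Chat history:\n{chat_history}\n\n"
        "Related financial news:\n{context}\n\n"
        "Question: {question}\nAnswer:"
    ),
)
# The memory persists the conversation between calls (the "chat history" step).
memory = ConversationBufferMemory(memory_key="chat_history", input_key="question")
llm = FakeListLLM(responses=["Diversify before you speculate."])  # stand-in for the fine-tuned LLM
chain = LLMChain(llm=llm, prompt=prompt, memory=memory)

print(chain.run(context="<news retrieved from Qdrant>", question="Should I buy bonds?"))
```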

## Table of Contents

- [1. Motivation](#1-motivation)
- [2. Install](#2-install)
- [2.1. Dependencies](#21-dependencies)
- [2.2. Qdrant](#21-qdrant)
- [2.3. Beam](#21-beam)
- [2.2. Qdrant & Beam](#22-qdrant--beam)
- [3. Usage](#3-usage)
- [3.1. Local](#31-local)
- [3.2. Deploy to Beam as a RESTful API](#32-deploy-to-beam)
@@ -34,6 +35,8 @@ Thus, using [LangChain](https://github.com/langchain-ai/langchain), we will crea

Also, the final step is to put the financial assistant to good use and deploy it as a serverless RESTful API using [Beam](https://www.beam.cloud?utm_source=thepauls&utm_medium=partner&utm_content=github).

![architecture](../../media/inference_pipeline_architecture.png)

# 2. Install

## 2.1. Dependencies
@@ -43,12 +46,12 @@ Main dependencies you have to install yourself:
* Poetry 1.5.1
* GNU Make 4.3

Install dependencies:
Installing all the other dependencies is as easy as running:
```shell
make install
```

For developing run:
When developing run:
```shell
make install_dev
```
@@ -57,74 +60,72 @@ Prepare credentials:
```shell
cp .env.example .env
```
--> and complete the `.env` file with your credentials.

## 2.2. Qdrant

You must create a FREE account in Qdrant and generate the `QDRANT_API_KEY` and `QDRANT_URL` environment variables. After, be sure to add them to your `.env` file.

-> [Check out this document to see how.](https://qdrant.tech/documentation/cloud/authentication/?utm_source=thepauls&utm_medium=partner&utm_content=github)

--> and complete the `.env` file with your [external services credentials](https://github.com/iusztinpaul/hands-on-llms/tree/main#2-setup-external-services).

## 2.3. Beam
`optional step in case you want to use Beam`
## 2.2. Qdrant & Beam

Create and configure a free Beam account to deploy it as a serverless RESTful API and show it to your friends. You will pay only for what you use.

-> [Create a Beam account & configure it.](https://www.beam.cloud?utm_source=thepauls&utm_medium=partner&utm_content=github)
Check out the [Setup External Services](https://github.com/iusztinpaul/hands-on-llms/tree/main#2-setup-external-services) section to see how to create API keys for them.


# 3. Usage

## 3.1. Local

Run bot locally:
Run the bot locally with a predefined question:
```shell
make run
```

Run bot locally in dev mode:
For debugging & testing, run the bot locally with a predefined question, while mocking the LLM:
```shell
make run_dev
```

## 3.2. Deploy to Beam as a RESTful API
## 3.2. Beam | RESTful API
`deploy the financial bot as a RESTful API to Beam [optional]`

**First**, you must set up Beam, as explained in the [Setup External Services](https://github.com/iusztinpaul/hands-on-llms/tree/main#2-setup-external-services) section.

Deploy the bot under a RESTful API using Beam:
```shell
make deploy_beam
```

Deploy the bot under a RESTful API using Beam in dev mode:
For debugging & testing, deploy the bot under a RESTful API using Beam while mocking the LLM:
```shell
make deploy_beam_dev
```

To test the deployment, make a request to the bot calling the RESTful API, as follows:
To test the deployment, make a request to the bot calling the RESTful API as follows (the first request will take a while as the LLM needs to load):
```shell
export BEAM_DEPLOYMENT_ID=<BEAM_DEPLOYMENT_ID> # e.g., <xxxxx> from https://<xxxxx>.apps.beam.cloud
export BEAM_AUTH_TOKEN=<BEAM_AUTH_TOKEN> # e.g., <xxxxx> from Authorization: Basic <xxxxx>

make call_restful_api DEPLOYMENT_ID=${BEAM_DEPLOYMENT_ID} TOKEN=${BEAM_AUTH_TOKEN}
```

**Note:** To find out `BEAM_DEPLOYMENT_ID` and `BEAM_AUTH_TOKEN` navigate to your `financial_bot` or `financial_bot_dev` Beam app.
**Note:** To find out `BEAM_DEPLOYMENT_ID` and `BEAM_AUTH_TOKEN`, navigate to your `financial_bot` or `financial_bot_dev` [Beam app](https://www.beam.cloud/dashboard/apps?utm_source=thepauls&utm_medium=partner&utm_content=github).

**IMPORTANT:** After you finish testing your project, don't forget to stop your Beam deployment.
**IMPORTANT NOTE 1:** After you finish testing your project, don't forget to stop your Beam deployment.
**IMPORTANT NOTE 2:** The financial bot will work only on CUDA-enabled Nvidia GPUs with ~8 GB VRAM. If you don't have one and wish to run the code, you must deploy it to [Beam](https://www.beam.cloud?utm_source=thepauls&utm_medium=partner&utm_content=github).
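
For reference, here is a hypothetical Python equivalent of the `make call_restful_api` target; the request body field name is an assumption — check the module's code for the exact schema:

```python
import os

import requests

deployment_id = os.environ["BEAM_DEPLOYMENT_ID"]
auth_token = os.environ["BEAM_AUTH_TOKEN"]

response = requests.post(
    f"https://{deployment_id}.apps.beam.cloud",
    headers={"Authorization": f"Basic {auth_token}"},
    json={"question": "Is it a good time to invest in renewable energy?"},
)
print(response.json())
```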

## 3.3. Gradio UI

To test out & play with the financial bot, you can run it locally under a Gradio UI.

Start the Gradio UI:
```shell
make run_ui
```

Start the Gradio UI in dev mode:
Start the Gradio UI in debug mode while mocking the LLM:
```shell
make run_ui_dev
```

**NOTE:** Running the commands from above will host the UI on your computer. To run them, **you need an Nvidia GPU with enough resources** (e.g., to run the inference using Falcon 7B, you need ~8 GB VRAM). If you don't have that available, you can deploy it to `Gradio Spaces` on HuggingFace. It is pretty straightforward to do so. [Here are some docs to get you started](https://huggingface.co/docs/hub/spaces-sdks-gradio).
![Financial Bot Gradio UI](../../media/financial_bot_gradio_ui.png)

**NOTE:** Running the commands from above will host the UI on your computer. To run them, **you need a CUDA-enabled Nvidia GPU with enough resources** (e.g., to run the inference using Falcon 7B, you need ~8 GB VRAM). If you don't have that available, you can deploy it to `Gradio Spaces` on HuggingFace. It is straightforward to do so. [Here are some docs to get you started](https://huggingface.co/docs/hub/spaces-sdks-gradio).
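
If you are curious what the UI layer amounts to, a minimal Gradio chat app is only a few lines; `answer` below is a hypothetical stand-in for the module's actual inference entry point:

```python
import gradio as gr

def answer(message: str, history: list) -> str:
    # Call the financial bot here and return its reply as a string.
    return f"(placeholder) You asked: {message}"

gr.ChatInterface(answer).launch()
```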

## 3.4. Linting & Formatting
