Update READMEs (iusztinpaul#42)
* docs: Add contributors

* docs: Update README

* docs: Add contributors

* docs: Add contributors

* docs: Update building blocks summary

* docs: Finish main README

* docs: Refine main README

* feat: Add diagrams

* docs: Update LICENSE

* docs: Update diagrams with the right logos

* docs: Adapt training pipeline README

* docs: Refine Training README

* docs: Refine Streaming Pipeline README

* docs: Refine financial bot README

* docs: Add Gradio UI image

* docs: Update Gradio UI image
iusztinpaul authored Nov 21, 2023
1 parent 060f8a1 commit dd75844
Showing 11 changed files with 235 additions and 127 deletions.
2 changes: 1 addition & 1 deletion LICENSE.md
@@ -1,6 +1,6 @@
MIT License

Copyright (c) 2023 Paul Iusztin & Pau Labarta Bajo
Copyright (c) 2023 Paul Emil Iusztin & Pau Labarta Bajo

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
146 changes: 107 additions & 39 deletions README.md
@@ -1,7 +1,7 @@
<div align="center">
<h2>Hands-on LLMOps</h2>
<h2>Hands-on LLMs Course</h2>
<h1>Train and Deploy a Real-Time Financial Advisor</h1>
<i>by <a href="https://github.com/iusztinpaul">Paul Iusztin</a> and <a href="https://github.com/Paulescu">Pau Labarta Bajo</a></i>
<i>by <a href="https://github.com/iusztinpaul">Paul Iusztin</a>, <a href="https://github.com/Paulescu">Pau Labarta Bajo</a> and <a href="https://github.com/Joywalker">Alexandru Razvant</a></i>
</div>

## Table of Contents
@@ -10,24 +10,50 @@
- [2. Setup External Services](#2-setup-external-services)
- [3. Install & Usage](#3-install--usage)
- [4. Video lectures](#4-video-lectures)
- [5. License](#5-license)
- [6. Contributors & Teachers](#6-contributors--teachers)

------


## 1. Building Blocks

### Training pipeline
- [x] Fine-tune Falcon 7B using our own [Q&A generated dataset](/modules/q_and_a_dataset_generator/) containing investing questions and answers based on Alpaca News.
- It seems that 1 GPU is enough if we use [Lit-Parrot](https://lightning.ai/pages/blog/falcon-a-guide-to-finetune-and-inference/)
### 1.1. Training pipeline 🖋️

### Real-time data pipeline
- [x] Build a real-time feature pipeline that ingests data from Alpaca, computes embeddings, and stores them in a serverless Vector DB.
Training pipeline that:
- loads a proprietary Q&A dataset
- fine-tunes an open-source LLM using QLoRA
- logs the training experiments on [Comet ML's](https://www.comet.com?utm_source=thepauls&utm_medium=partner&utm_content=github) experiment tracker & the inference results on [Comet ML's](https://www.comet.com?utm_source=thepauls&utm_medium=partner&utm_content=github) LLMOps dashboard
- stores the best model on [Comet ML's](https://www.comet.com/site/products/llmops/?utm_source=thepauls&utm_medium=partner&utm_content=github) model registry

### Inference pipeline
- [ ] REST API for inference that:
  1. receives a question (e.g., "Is it a good time to invest in renewable energy?"),
  2. finds the most relevant documents in the Vector DB (aka the context),
  3. sends a prompt with the question and context to our fine-tuned Falcon and returns the response.
The **training pipeline** is **deployed** using [Beam](https://docs.beam.cloud/getting-started/quickstart?utm_source=thepauls&utm_medium=partner&utm_content=github) as a serverless GPU infrastructure.

-> Found under the `modules/training_pipeline` directory.
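
To make the QLoRA step more concrete, here is a minimal sketch of how 4-bit loading and LoRA adapters typically fit together with Hugging Face `transformers` and `peft`; the model name and hyperparameters are illustrative assumptions, not the course's exact configuration:

```python
import torch
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Assumption: any open-source causal LM works here; Falcon 7B is just an example.
model_name = "tiiuae/falcon-7b"

# Load the frozen base model in 4-bit (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=bnb_config)
model = prepare_model_for_kbit_training(model)

# Attach small trainable LoRA adapters on top of the quantized weights.
lora_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a tiny fraction of the weights is trained
```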

### 1.2. Streaming real-time pipeline 🚰

Real-time feature pipeline that:
- ingests financial news from [Alpaca](https://alpaca.markets/docs/api-references/market-data-api/news-data/)
- cleans & transforms the news documents into embeddings in real-time using [Bytewax](https://github.com/bytewax/bytewax?utm_source=thepauls&utm_medium=partner&utm_content=github)
- stores the embeddings into the [Qdrant Vector DB](https://qdrant.tech/?utm_source=thepauls&utm_medium=partner&utm_content=github)

The **streaming pipeline** is **automatically deployed** on an AWS EC2 machine using a CI/CD pipeline built in GitHub Actions.

-> Found under the `modules/streaming_pipeline` directory.
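
The heart of the streaming pipeline is the per-document transformation that Bytewax executes step by step. Below is a hedged sketch of that logic only (the Bytewax dataflow wiring is omitted because its API is version-specific); the collection name, embedding model, and payload fields are illustrative assumptions:

```python
from qdrant_client import QdrantClient
from qdrant_client.models import PointStruct
from sentence_transformers import SentenceTransformer

client = QdrantClient(url="https://<your-cluster>.qdrant.io", api_key="<QDRANT_API_KEY>")
encoder = SentenceTransformer("all-MiniLM-L6-v2")

def ingest_news_document(doc_id: int, headline: str, body: str) -> None:
    """Clean a news item, embed it, and upsert it into the vector DB."""
    text = f"{headline}\n{body}".strip()  # stand-in for the real cleaning logic
    vector = encoder.encode(text).tolist()
    client.upsert(
        collection_name="alpaca_financial_news",
        points=[PointStruct(id=doc_id, vector=vector, payload={"text": text})],
    )
```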

### 1.3. Inference pipeline 🤖

Inference pipeline that uses [LangChain](https://github.com/langchain-ai/langchain) to create a chain that:
* downloads the fine-tuned model from [Comet's](https://www.comet.com?utm_source=thepauls&utm_medium=partner&utm_content=github) model registry
* takes user questions as input
* queries the [Qdrant Vector DB](https://qdrant.tech/?utm_source=thepauls&utm_medium=partner&utm_content=github) and enhances the prompt with related financial news
* calls the fine-tuned LLM for financial advice using the initial query, the context from the vector DB, and the chat history
* persists the chat history into memory
* logs the prompt & answer into [Comet ML's](https://www.comet.com/site/products/llmops/?utm_source=thepauls&utm_medium=partner&utm_content=github) LLMOps monitoring feature

The **inference pipeline** is **deployed** using [Beam](https://docs.beam.cloud/deployment/rest-api?utm_source=thepauls&utm_medium=partner&utm_content=github) as a serverless GPU infrastructure, as a RESTful API. Also, it is wrapped under a UI for demo purposes, implemented in [Gradio](https://www.gradio.app/).

-> Found under the `modules/financial_bot` directory.
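
To illustrate the retrieval step, here is a hedged sketch of how the prompt could be enhanced with related news from Qdrant; the collection name, embedding model, and prompt template are illustrative assumptions, not the module's exact code:

```python
from qdrant_client import QdrantClient
from sentence_transformers import SentenceTransformer

client = QdrantClient(url="https://<your-cluster>.qdrant.io", api_key="<QDRANT_API_KEY>")
encoder = SentenceTransformer("all-MiniLM-L6-v2")

def build_prompt(question: str, chat_history: str) -> str:
    # Find the news documents most similar to the user's question.
    hits = client.search(
        collection_name="alpaca_financial_news",
        query_vector=encoder.encode(question).tolist(),
        limit=3,
    )
    context = "\n".join(hit.payload["text"] for hit in hits)
    return (
        f"Chat history:\n{chat_history}\n\n"
        f"Related financial news:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```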

<br/>

@@ -36,37 +36,62 @@

## 2. Setup External Services

Before diving into the modules, you have to set up a couple of additional tools for the course.
Before diving into the modules, you have to set up a couple of additional external tools for the course.

**NOTE:** You can set them up as you go for every module, as we will point out what you need in each one.

### 2.1. Alpaca
`financial news data source`

Follow this [document](https://alpaca.markets/docs/market-data/getting-started/), showing you how to create a FREE account, generate the API Keys, and put them somewhere safe.
Follow this [document](https://alpaca.markets/docs/market-data/getting-started/), which shows you how to create a FREE account and generate the API Keys you will need within this course.

**Note:** 1x Alpaca data connection is FREE.
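
Once you have the keys, a quick way to check them is to pull a few news items directly; the endpoint and parameters below match Alpaca's news API at the time of writing, but double-check their docs before relying on this sketch:

```python
import requests

headers = {
    "APCA-API-KEY-ID": "<your-api-key>",
    "APCA-API-SECRET-KEY": "<your-api-secret>",
}
response = requests.get(
    "https://data.alpaca.markets/v1beta1/news",
    headers=headers,
    params={"symbols": "AAPL", "limit": 5},
)
response.raise_for_status()
for article in response.json()["news"]:
    print(article["headline"])
```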

### 2.2. Qdrant
`vector DB`
`serverless vector DB`

Go to [Qdrant](https://qdrant.tech/?utm_source=thepauls&utm_medium=partner&utm_content=github), create a FREE account, and follow [this document](https://qdrant.tech/documentation/cloud/authentication/?utm_source=thepauls&utm_medium=partner&utm_content=github) on how to generate the API Keys.
Go to [Qdrant](https://qdrant.tech/?utm_source=thepauls&utm_medium=partner&utm_content=github) and create a FREE account.

Afterward, follow [this document](https://qdrant.tech/documentation/cloud/authentication/?utm_source=thepauls&utm_medium=partner&utm_content=github), which shows you how to generate the API Keys you will need within this course.

**Note:** We will use only Qdrant's freemium plan.
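
To verify the keys work, a one-line connectivity check with the official Python client is enough; this sketch assumes you exported the `QDRANT_URL` and `QDRANT_API_KEY` environment variables:

```python
import os

from qdrant_client import QdrantClient

client = QdrantClient(url=os.environ["QDRANT_URL"], api_key=os.environ["QDRANT_API_KEY"])
print(client.get_collections())  # returns an empty list of collections on a fresh cluster
```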

### 2.3. Comet ML
`ML platform`
`serverless ML platform`

Go to [Comet ML](https://www.comet.com/signup?utm_source=thepauls&utm_medium=partner&utm_content=github) and create a FREE account.

Go to [Comet ML](https://www.comet.com/signup?utm_source=thepauls&utm_medium=partner&utm_content=github), create a FREE account, a project, and an API KEY. We will show you in every module how to add these credentials.
Afterward, [follow this guide](https://www.comet.com/docs/v2/guides/getting-started/quickstart/) to generate an API KEY and create a new project, both of which you will need within the course.

**Note:** We will use only Comet ML's freemium plan.
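
A quick sanity check for your Comet ML credentials looks like the sketch below; the project, workspace, and metric names are placeholders:

```python
from comet_ml import Experiment

experiment = Experiment(
    api_key="<COMET_API_KEY>",
    project_name="<your-project>",
    workspace="<your-workspace>",
)
experiment.log_metric("train/loss", 0.42)  # dummy metric, just to see it appear in the UI
experiment.end()
```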

### 2.4. Beam
`cloud compute`
`serverless GPU compute | training & inference pipelines`

Go to [Beam](https://www.beam.cloud?utm_source=thepauls&utm_medium=partner&utm_content=github) and create a FREE account.

Go to [Beam](https://www.beam.cloud?utm_source=thepauls&utm_medium=partner&utm_content=github) and follow their quick setup/get started tutorial. You must create a FREE account, install their CLI and configure your credentials on your local machine.
Afterward, you must follow their [installation guide](https://docs.beam.cloud/getting-started/installation?utm_source=thepauls&utm_medium=partner&utm_content=github) to install their CLI & configure it with your Beam credentials.

- [Introduction guide](https://docs.beam.cloud/getting-started/introduction?utm_source=thepauls&utm_medium=partner&utm_content=github)
- [Installation guide](https://docs.beam.cloud/getting-started/installation?utm_source=thepauls&utm_medium=partner&utm_content=github)
To read more about Beam, here is an [introduction guide](https://docs.beam.cloud/getting-started/introduction?utm_source=thepauls&utm_medium=partner&utm_content=github).

**Note:** You have ~10 free compute hours. Afterward, you pay only for what you use. If you have an Nvidia GPU with more than 8 GB of VRAM and don't want to deploy the training & inference pipelines, using Beam is optional.

#### Troubleshooting

When using Poetry, we had issues locating the Beam CLI when using it inside the Poetry virtual environment. To fix this, after installing Beam, create a symlink that points to Poetry's binaries, as follows:
When using Poetry, we had issues locating the Beam CLI inside a Poetry virtual environment. To fix this, after installing Beam, we create a symlink that points to Poetry's binaries, as follows:
```shell
export COURSE_MODULE_PATH=<your-course-module-path> # e.g., modules/training_pipeline
cd $COURSE_MODULE_PATH
@@ -77,11 +115,13 @@ When using Poetry, we had issues locating the Beam CLI when using it inside the


### 2.5. AWS
`cloud compute`
`cloud compute | feature pipeline`

Go to [AWS](https://aws.amazon.com/console/), create an account, and generate a pair of credentials.

After, download and install their [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) and [configure it](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html) with your credentials.
Afterward, download and install their [AWS CLI v2.11.22](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) and [configure it](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html) with your credentials.

**Note:** You will pay only for what you use. You will deploy only a `t2.small` EC2 VM, which costs only `~$0.023` / hour. If you don't want to deploy the feature pipeline, using AWS is optional.
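
To confirm the CLI credentials are picked up correctly, you can also query them from Python with `boto3` (an assumption of this sketch — the course itself drives AWS through the CLI and CI/CD):

```python
import boto3

# Prints the account and IAM identity your configured credentials resolve to.
print(boto3.client("sts").get_caller_identity())
```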


## 3. Install & Usage
@@ -93,19 +93,33 @@ Thus, check out the README for every module individually to see how to install &
3. [streaming_pipeline](/modules/streaming_pipeline/)
4. [inference_pipeline](/modules/financial_bot/)


### 3.1 Run Notebooks Server
If you want to run a notebook server inside a virtual environment, follow the next steps.

First, expose the virtual environment as a notebook kernel:
```shell
python -m ipykernel install --user --name hands-on-llms --display-name "hands-on-llms"
```
Now run the notebook server:
```shell
jupyter notebook notebooks/ --ip 0.0.0.0 --port 8888
```

## 4. Video lectures

### 4.0 Intro to the course
@@ -134,4 +161,45 @@
<p>Click here to watch the video 🎬</p>
<img src="media/youtube_thumbnails/02_fine_tuning_pipeline_hands_on.png" alt="Hands-on Fine Tuning an LLM" style="width:75%;">
</a>
</div>
</div>

## 5. License

This course is an open-source project released under the MIT license. Thus, as long as you distribute our LICENSE and acknowledge our work, you can safely clone or fork this project and use it as a source of inspiration for whatever you want (e.g., university projects, college degree projects, etc.).

## 6. Contributors & Teachers

<table>
<tr>
<td><img src="https://github.com/Paulescu.png" width="100" style="border-radius:50%;"/></td>
<td>
<strong>Pau Labarta Bajo | Senior ML & MLOps Engineer </strong><br />
<i>Main teacher. The guy from the video lessons.</i><br /><br />
<a href="https://www.linkedin.com/in/pau-labarta-bajo-4432074b/">LinkedIn</a><br />
<a href="https://twitter.com/paulabartabajo_">Twitter/X</a><br />
<a href="https://www.youtube.com/@realworldml">Youtube</a><br />
<a href="https://www.realworldml.xyz/subscribe">Real-World ML Newsletter</a><br />
<a href="https://www.realworldml.xyz/subscribe">Real-World ML Site</a>
</td>
</tr>
<tr>
<td><img src="https://github.com/Joywalker.png" width="100" style="border-radius:50%;"/></td>
<td>
<strong>Alexandru Razvant | Senior ML Engineer </strong><br />
<i>Second chef. The engineer behind the scenes.</i><br /><br />
<a href="https://www.linkedin.com/in/arazvant/">LinkedIn</a><br />
<a href="https://www.neuraleaps.com/">Neura Leaps</a>
</td>
</tr>
<tr>
<td><img src="https://github.com/iusztinpaul.png" width="100" style="border-radius:50%;"/></td>
<td>
<strong>Paul Iusztin | Senior ML & MLOps Engineer </strong><br />
<i>Main chef. The guy who randomly pops into the video lessons.</i><br /><br />
<a href="https://www.linkedin.com/in/pauliusztin/">LinkedIn</a><br />
<a href="https://twitter.com/iusztinpaul">Twitter/X</a><br />
<a href="https://pauliusztin.substack.com/">Decoding ML Newsletter</a><br />
<a href="https://www.pauliusztin.me/">Personal Site | ML & MLOps Hub</a>
</td>
</tr>
</table>
Binary file added media/feature_pipeline_architecture.png
Binary file added media/financial_bot_gradio_ui.png
Binary file added media/github_actions_cd.png
Binary file added media/github_actions_secrets.png
Binary file added media/inference_pipeline_architecture.png
Binary file added media/training_pipeline_architecture.png
55 changes: 28 additions & 27 deletions modules/financial_bot/README.md
@@ -4,17 +4,18 @@ Inference pipeline that uses [LangChain](https://github.com/langchain-ai/langcha
* downloads the fine-tuned model from [Comet's](https://www.comet.com?utm_source=thepauls&utm_medium=partner&utm_content=github) model registry
* takes user questions as input
* queries the [Qdrant Vector DB](https://qdrant.tech/?utm_source=thepauls&utm_medium=partner&utm_content=github) and enhances the prompt with related financial news
* calls the fine-tuned LLM for the final answer
* calls the fine-tuned LLM for financial advice using the initial query, the context from the vector DB, and the chat history
* persists the chat history into memory
* logs the prompt & answer into [Comet ML's](https://www.comet.com/site/products/llmops/?utm_source=thepauls&utm_medium=partner&utm_content=github) LLMOps monitoring feature

The **inference pipeline** is **deployed** using [Beam](https://docs.beam.cloud/deployment/rest-api?utm_source=thepauls&utm_medium=partner&utm_content=github) as a serverless GPU infrastructure, as a RESTful API. Also, it is wrapped under a UI for demo purposes, implemented in [Gradio](https://www.gradio.app/).
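
To make the chain structure tangible, here is a hedged sketch using LangChain's memory with a fake LLM so it runs anywhere; the prompt template and variable names are illustrative assumptions — the real module wires in the fine-tuned model and the Qdrant retrieval step instead:

```python
from langchain.chains import LLMChain
from langchain.llms.fake import FakeListLLM
from langchain.memory import ConversationBufferMemory
from langchain.prompts import PromptTemplate

prompt = PromptTemplate(
    input_variables=["chat_history", "context", "question"],
    template=(
        "Chat history:\n{chat_history}\n\n"
        "Related financial news:\n{context}\n\n"
        "Question: {question}\nAnswer:"
    ),
)
# The memory persists the conversation between calls (the "chat history" step).
memory = ConversationBufferMemory(memory_key="chat_history", input_key="question")
llm = FakeListLLM(responses=["Diversify before you speculate."])  # stand-in for the fine-tuned LLM
chain = LLMChain(llm=llm, prompt=prompt, memory=memory)

print(chain.run(context="<news retrieved from Qdrant>", question="Should I buy bonds?"))
```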

## Table of Contents

- [1. Motivation](#1-motivation)
- [2. Install](#2-install)
- [2.1. Dependencies](#21-dependencies)
- [2.2. Qdrant](#21-qdrant)
- [2.3. Beam](#21-beam)
- [2.2. Qdrant & Beam](#22-qdrant--beam)
- [3. Usage](#3-usage)
- [3.1. Local](#31-local)
- [3.2. Deploy to Beam as a RESTful API](#32-deploy-to-beam)
@@ -34,6 +35,8 @@ Thus, using [LangChain](https://github.com/langchain-ai/langchain), we will crea

Also, the final step is to put the financial assistant to good use and deploy it as a serverless RESTful API using [Beam](https://www.beam.cloud?utm_source=thepauls&utm_medium=partner&utm_content=github).

![architecture](../../media/inference_pipeline_architecture.png)

# 2. Install

## 2.1. Dependencies
@@ -43,12 +46,12 @@ Main dependencies you have to install yourself:
* Poetry 1.5.1
* GNU Make 4.3

Install dependencies:
Installing all the other dependencies is as easy as running:
```shell
make install
```

For developing run:
When developing run:
```shell
make install_dev
```
@@ -57,74 +60,72 @@ Prepare credentials:
```shell
cp .env.example .env
```
--> and complete the `.env` file with your credentials.

## 2.2. Qdrant

You must create a FREE account in Qdrant and generate the `QDRANT_API_KEY` and `QDRANT_URL` environment variables. After, be sure to add them to your `.env` file.

-> [Check out this document to see how.](https://qdrant.tech/documentation/cloud/authentication/?utm_source=thepauls&utm_medium=partner&utm_content=github)

--> and complete the `.env` file with your [external services credentials](https://github.com/iusztinpaul/hands-on-llms/tree/main#2-setup-external-services).

## 2.3. Beam
`optional step in case you want to use Beam`
## 2.2. Qdrant & Beam

Create and configure a free Beam account to deploy it as a serverless RESTful API and show it to your friends. You will pay only for what you use.

-> [Create a Beam account & configure it.](https://www.beam.cloud?utm_source=thepauls&utm_medium=partner&utm_content=github)
Check out the [Setup External Services](https://github.com/iusztinpaul/hands-on-llms/tree/main#2-setup-external-services) section to see how to create API keys for them.


# 3. Usage

## 3.1. Local

Run bot locally:
Run the bot locally with a predefined question:
```shell
make run
```

Run bot locally in dev mode:
For debugging & testing, run the bot locally with a predefined question, while mocking the LLM:
```shell
make run_dev
```

## 3.2. Deploy to Beam as a RESTful API
## 3.2. Beam | RESTful API
`deploy the financial bot as a RESTful API to Beam [optional]`

**First**, you must set up Beam, as explained in the [Setup External Services](https://github.com/iusztinpaul/hands-on-llms/tree/main#2-setup-external-services) section.

Deploy the bot under a RESTful API using Beam:
```shell
make deploy_beam
```

Deploy the bot under a RESTful API using Beam in dev mode:
For debugging & testing, deploy the bot under a RESTful API using Beam while mocking the LLM:
```shell
make deploy_beam_dev
```

To test the deployment, make a request to the bot calling the RESTful API, as follows:
To test the deployment, make a request to the bot calling the RESTful API as follows (the first request will take a while as the LLM needs to load):
```shell
export BEAM_DEPLOYMENT_ID=<BEAM_DEPLOYMENT_ID> # e.g., <xxxxx> from https://<xxxxx>.apps.beam.cloud
export BEAM_AUTH_TOKEN=<BEAM_AUTH_TOKEN> # e.g., <xxxxx> from Authorization: Basic <xxxxx>

make call_restful_api DEPLOYMENT_ID=${BEAM_DEPLOYMENT_ID} TOKEN=${BEAM_AUTH_TOKEN}
```

**Note:** To find out `BEAM_DEPLOYMENT_ID` and `BEAM_AUTH_TOKEN` navigate to your `financial_bot` or `financial_bot_dev` Beam app.
**Note:** To find out `BEAM_DEPLOYMENT_ID` and `BEAM_AUTH_TOKEN`, navigate to your `financial_bot` or `financial_bot_dev` [Beam app](https://www.beam.cloud/dashboard/apps?utm_source=thepauls&utm_medium=partner&utm_content=github).

**IMPORTANT:** After you finish testing your project, don't forget to stop your Beam deployment.
**IMPORTANT NOTE 1:** After you finish testing your project, don't forget to stop your Beam deployment.
**IMPORTANT NOTE 2:** The financial bot will work only on CUDA-enabled Nvidia GPUs with ~8 GB VRAM. If you don't have one and wish to run the code, you must deploy it to [Beam](https://www.beam.cloud?utm_source=thepauls&utm_medium=partner&utm_content=github).
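
For reference, here is a hypothetical Python equivalent of the `make call_restful_api` target; the request body field name is an assumption — check the module's code for the exact schema:

```python
import os

import requests

deployment_id = os.environ["BEAM_DEPLOYMENT_ID"]
auth_token = os.environ["BEAM_AUTH_TOKEN"]

response = requests.post(
    f"https://{deployment_id}.apps.beam.cloud",
    headers={"Authorization": f"Basic {auth_token}"},
    json={"question": "Is it a good time to invest in renewable energy?"},
)
print(response.json())
```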

## 3.3. Gradio UI

To test out & play with the financial bot, you can run it locally under a Gradio UI.

Start the Gradio UI:
```shell
make run_ui
```

Start the Gradio UI in dev mode:
Start the Gradio UI in debug mode while mocking the LLM:
```shell
make run_ui_dev
```

**NOTE:** Running the commands from above will host the UI on your computer. To run them, **you need an Nvidia GPU with enough resources** (e.g., to run the inference using Falcon 7B, you need ~8 GB VRAM). If you don't have that available, you can deploy it to `Gradio Spaces` on HuggingFace. It is pretty straightforward to do so. [Here are some docs to get you started](https://huggingface.co/docs/hub/spaces-sdks-gradio).
![Financial Bot Gradio UI](../../media/financial_bot_gradio_ui.png)

**NOTE:** Running the commands from above will host the UI on your computer. To run them, **you need a CUDA-enabled Nvidia GPU with enough resources** (e.g., to run the inference using Falcon 7B, you need ~8 GB VRAM). If you don't have that available, you can deploy it to `Gradio Spaces` on HuggingFace. It is straightforward to do so. [Here are some docs to get you started](https://huggingface.co/docs/hub/spaces-sdks-gradio).
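
If you are curious what the UI layer amounts to, a minimal Gradio chat app is only a few lines; `answer` below is a hypothetical stand-in for the module's actual inference entry point:

```python
import gradio as gr

def answer(message: str, history: list) -> str:
    # Call the financial bot here and return its reply as a string.
    return f"(placeholder) You asked: {message}"

gr.ChatInterface(answer).launch()
```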

## 3.4. Linting & Formatting
