This is a LlamaIndex project using FastAPI bootstrapped with create-llama
.
First, setup the environment with poetry:
Note: This step is not needed if you are using the dev-container.
poetry install
poetry shell
Then check the parameters that have been pre-configured in the .env
file in this directory. (E.g. you might need to configure an OPENAI_API_KEY
if you're using OpenAI as model provider).
If you are using any tools or data sources, you can update their config files in the config
folder.
Second, generate the embeddings of the documents in the ./data
directory:
poetry run generate
Third, run the app:
poetry run dev
Open http://localhost:8000 with your browser to start the app.
The example provides two different API endpoints:
/api/chat
- a streaming chat endpoint/api/chat/request
- a non-streaming chat endpoint
You can test the streaming endpoint with the following curl request:
curl --location 'localhost:8000/api/chat' \
--header 'Content-Type: application/json' \
--data '{ "messages": [{ "role": "user", "content": "Hello" }] }'
And for the non-streaming endpoint run:
curl --location 'localhost:8000/api/chat/request' \
--header 'Content-Type: application/json' \
--data '{ "messages": [{ "role": "user", "content": "Hello" }] }'
You can start editing the API endpoints by modifying app/api/routers/chat.py
. The endpoints auto-update as you save the file. You can delete the endpoint you're not using.
To start the app optimized for production, run:
poetry run prod
For production deployments, check the DEPLOY.md file.
To learn more about LlamaIndex, take a look at the following resources:
- LlamaIndex Documentation - learn about LlamaIndex.
You can check out the LlamaIndex GitHub repository - your feedback and contributions are welcome!