REST API Server for Llama 2
Before running this server, you have to download the Llama 2 model first.
See https://dev.to/choonho/llama-2-in-apple-silicon-macbook-13-54h
Move the ggml model file to models/7B/ggml-model-q4_0.bin (the default MODEL_PATH).
pip3 install llama-cpp-python langchain
pip3 install fastapi uvicorn
python3 server.py
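
Before starting the server, it can help to confirm that the model file is where the server expects it. The following is a minimal sketch, not part of server.py: it assumes MODEL_PATH may be overridden through an environment variable of the same name (an assumption, not confirmed by the source), and the helper names resolve_model_path and check_model_file are hypothetical.

```python
import os
from pathlib import Path

# Default taken from the note above. Reading an override from a MODEL_PATH
# environment variable is an assumption, not confirmed behaviour of server.py.
DEFAULT_MODEL_PATH = "models/7B/ggml-model-q4_0.bin"


def resolve_model_path(env=None):
    """Return the model path from env (or os.environ), falling back to the default."""
    env = os.environ if env is None else env
    return Path(env.get("MODEL_PATH", DEFAULT_MODEL_PATH))


def check_model_file(path):
    """Return True if the model file exists; otherwise print a hint and return False."""
    if path.is_file():
        return True
    print(f"Model file not found: {path}")
    print("Download the model and move it there, or set MODEL_PATH.")
    return False


if __name__ == "__main__":
    model_path = resolve_model_path()
    if check_model_file(model_path):
        print(f"Model found at {model_path}; ready to run: python3 server.py")
```

Saving this as a separate script (for example a hypothetical check_model.py) and running it from the same directory as server.py reports whether the model file is in place.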