A chatbot server.
pip3 install -e .
# Install the main branch of huggingface/transformers
pip3 install git+https://github.com/huggingface/transformers
# Launch a controller
python3 -m chatserver.serve.controller
# Launch a model worker
python3 -m chatserver.serve.model_worker --model facebook/opt-350m
# Send a test message
python3 -m chatserver.serve.test_message
# Launch a Gradio web server.
python3 -m chatserver.serve.gradio_web_server
# You can now open your browser and chat with a model.
python3 -m chatserver.serve.cli --model facebook/opt-350m
- Install SkyPilot and set up your cloud credentials locally, following the SkyPilot installation instructions.
# This version of SkyPilot is required because it includes the fix for the `--env` flag.
pip install git+https://github.com/skypilot-org/skypilot.git
- Train the model
sky launch -c vicuna -s scripts/train-vicuna.yaml --env WANDB_API_KEY
# Launch it on a managed spot instance
sky spot launch -n vicuna scripts/train-vicuna.yaml --env WANDB_API_KEY
# Train a 7B model
sky launch -c vicuna -s scripts/train-vicuna.yaml --env WANDB_API_KEY --env MODEL_SIZE=7
Launch the training job with the following command (it runs on a single node with 4 A100-80GB GPUs):
# WANDB_API_KEY is required for logging. The key is taken from your local environment.
sky launch -c alpaca -s scripts/train-alpaca.yaml --env WANDB_API_KEY
# Train the 13B model
sky launch -c alpaca -s scripts/train-alpaca.yaml --env WANDB_API_KEY --env MODEL_SIZE=13
# You can use a managed spot instance.
sky spot launch -n alpaca scripts/train-alpaca.yaml --env WANDB_API_KEY
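The training commands above differ only in the cluster or job name, the task YAML, and the `--env` values, so they can be assembled programmatically. The following is a minimal sketch, not part of this repository: `build_sky_launch` is a hypothetical helper, and it uses only the flags that already appear in the commands above (`-c`, `-s`, `-n`, `--env`).

```python
import shlex


def build_sky_launch(name, task_yaml, env=None, spot=False):
    """Assemble the argv list for `sky launch` or `sky spot launch`.

    Hypothetical helper for illustration. It only mirrors the command
    shapes shown in this README and does not call SkyPilot itself.
    """
    env = env or {}
    if spot:
        # Managed spot form used above: sky spot launch -n <name> <yaml>
        argv = ["sky", "spot", "launch", "-n", name, task_yaml]
    else:
        # On-demand form used above: sky launch -c <name> -s <yaml>
        argv = ["sky", "launch", "-c", name, "-s", task_yaml]
    for var, value in env.items():
        # None means "--env VAR", which takes the value from the local
        # environment; anything else becomes "--env VAR=value".
        argv += ["--env", var if value is None else f"{var}={value}"]
    return argv


if __name__ == "__main__":
    cmd = build_sky_launch(
        "alpaca",
        "scripts/train-alpaca.yaml",
        env={"WANDB_API_KEY": None, "MODEL_SIZE": 13},
    )
    # Print the command for review; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Returning a list rather than a string keeps the arguments safe to hand to `subprocess.run` without shell quoting issues.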
- We assume SkyPilot is installed and the model checkpoint is stored on some cloud storage (e.g., GCS).
- Launch the controller server (defaults to a cheap CPU VM):
sky launch -c controller scripts/serving/controller.yaml
- Find the IP address of the controller server on the cloud console. Make sure the ports are open (default port 21001 for controller, 21002 for model workers).
- Launch a model worker (defaults to A100):
sky launch -c model-worker scripts/serving/model_worker.yaml --env CONTROLLER_IP=<controller-ip>
You can use spot instances instead to cut the cost by 3x. SkyPilot automatically recovers the spot instance if it is preempted:
sky spot launch scripts/serving/model_worker.yaml --env CONTROLLER_IP=<controller-ip>
- Click the link generated from step 2 and chat with AI :)
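The serving steps above ask you to make sure the controller and model worker ports are open. One way to verify this from your own machine is a plain TCP connection test. This is a minimal sketch using only the Python standard library. The helper name `port_is_open` and the command-line arguments are assumptions made for illustration, and the ports are the defaults stated above (21001 for the controller, 21002 for model workers).

```python
import socket
import sys


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Usage (hypothetical script name):
    #   python3 check_ports.py <controller-ip> <worker-ip>
    controller_ip, worker_ip = sys.argv[1], sys.argv[2]
    for label, host, port in [
        ("controller", controller_ip, 21001),
        ("model worker", worker_ip, 21002),
    ]:
        state = "open" if port_is_open(host, port) else "unreachable"
        print(f"{label} {host}:{port} is {state}")
```

A result of "unreachable" means either that the firewall blocks the port or that the server process is not listening yet, so run the check after the corresponding `sky launch` has finished.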