Deploy Ollama to Kubernetes

Prerequisites

  * A running Kubernetes cluster
  * The kubectl CLI, configured to talk to that cluster

Steps

  1. Create the Ollama namespace, deployment, and service

    kubectl apply -f cpu.yaml
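
The cpu.yaml file itself is not reproduced here, but based on the commands in this guide (an ollama namespace, and a service exposing port 80 that is later forwarded to 11434), a minimal sketch of such a manifest might look like this; the label names and image tag are assumptions:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: ollama
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama
  namespace: ollama
spec:
  replicas: 1
  selector:
    matchLabels:
      name: ollama
  template:
    metadata:
      labels:
        name: ollama
    spec:
      containers:
      - name: ollama
        image: ollama/ollama:latest
        ports:
        - containerPort: 11434   # Ollama's default API port
---
apiVersion: v1
kind: Service
metadata:
  name: ollama
  namespace: ollama
spec:
  selector:
    name: ollama
  ports:
  - port: 80            # service port used by the port-forward below
    targetPort: 11434   # container port Ollama listens on
```

The service maps port 80 to the container's 11434, which is why the port-forward command in the Test section uses 11434:80.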

(Optional) Hardware Acceleration

Hardware acceleration in Kubernetes requires NVIDIA's k8s-device-plugin, which is deployed as a DaemonSet. See the k8s-device-plugin project documentation for details.
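
The plugin is typically installed by applying its released DaemonSet manifest; a sketch of the command, where the version shown is an assumption and should be checked against the project's current releases:

```shell
# Deploy NVIDIA's k8s-device-plugin as a DaemonSet.
# v0.14.1 is an assumed version; substitute the latest release.
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.14.1/nvidia-device-plugin.yml
```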

Once configured, create a GPU enabled Ollama deployment.

kubectl apply -f gpu.yaml
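
A gpu.yaml would typically differ from the CPU manifest mainly in the container spec, which requests a GPU from the device plugin. A minimal sketch of that fragment (field values are assumptions):

```yaml
containers:
- name: ollama
  image: ollama/ollama:latest
  ports:
  - containerPort: 11434
  resources:
    limits:
      nvidia.com/gpu: 1   # request one GPU via the NVIDIA device plugin
```

The nvidia.com/gpu resource limit is what causes the scheduler to place the pod on a GPU node managed by the device plugin.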

Test

  1. Port forward the Ollama service to connect and use it locally

    kubectl -n ollama port-forward service/ollama 11434:80
  2. Pull and run a model, for example, orca-mini:3b

    ollama run orca-mini:3b
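
With the port-forward from step 1 still running, the deployment can also be exercised directly over HTTP using Ollama's generate endpoint:

```shell
# Assumes the port-forward from step 1 is active in another terminal.
curl http://localhost:11434/api/generate -d '{
  "model": "orca-mini:3b",
  "prompt": "Why is the sky blue?"
}'
```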