Deploy Ollama to Kubernetes

Prerequisites

Ollama, installed locally for the CLI used in the Test step: https://ollama.com/download
A Kubernetes cluster. This example uses Google Kubernetes Engine.

Steps

Create the Ollama namespace, deployment, and service
```
kubectl apply -f cpu.yaml
```
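
Before moving on, it can help to confirm that the resources came up. The helper below is a minimal sketch, not part of the original manifests: the function name `check_ollama_rollout` is hypothetical, the namespace `ollama` matches the port-forward command in the Test section, and the deployment name `ollama` is an assumption, so check cpu.yaml for the actual name.

```shell
# Hypothetical helper: wait for the deployment to finish rolling out,
# then list the pods and services in the namespace.
# $1 = namespace (default "ollama"), $2 = deployment name (assumed "ollama").
check_ollama_rollout() {
  kubectl -n "${1:-ollama}" rollout status "deployment/${2:-ollama}" --timeout=120s &&
    kubectl -n "${1:-ollama}" get pods,svc
}
```

Call `check_ollama_rollout` with no arguments to use the assumed names, or pass your own namespace and deployment name.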

(Optional) Hardware Acceleration

Hardware acceleration in Kubernetes requires NVIDIA's k8s-device-plugin, which runs in the cluster as a DaemonSet. See https://github.com/NVIDIA/k8s-device-plugin for installation details.

Once the plugin is configured, create a GPU-enabled Ollama deployment.

```
kubectl apply -f gpu.yaml
```
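
If the GPU deployment stays pending, one thing worth checking is whether any node actually advertises a GPU. The sketch below assumes the device plugin registers the resource under the name `nvidia.com/gpu`, which is the name NVIDIA's plugin uses; the function name `list_gpu_nodes` is hypothetical.

```shell
# Hypothetical helper: print each node with the number of GPUs it can allocate.
# Nodes where the device plugin is not running show <none> in the GPU column.
list_gpu_nodes() {
  kubectl get nodes \
    "-o=custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu"
}
```

A node that reports a number in the GPU column is able to schedule the GPU-enabled pod.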

Test

Port-forward the Ollama service so you can connect to it and use it locally:
```
kubectl -n ollama port-forward service/ollama 11434:80
```
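
The port-forward command keeps running in the foreground, so the remaining steps need a second terminal. As a quick sanity check from that terminal, the sketch below queries the Ollama HTTP API through the forwarded port. The function name `check_ollama_api` is hypothetical; `/api/tags` is the Ollama endpoint that lists the models available on the server.

```shell
# Hypothetical helper: confirm the forwarded port answers.
# $1 = base URL (default http://localhost:11434, the port forwarded above).
# -f makes curl return a non-zero status on HTTP errors.
check_ollama_api() {
  curl -fsS "${1:-http://localhost:11434}/api/tags"
}
```

On a fresh deployment the list of models is expected to be empty until a model has been pulled.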
Pull and run a model, for example orca-mini:3b. The ollama CLI connects to localhost:11434 by default, which is the port forwarded above:
```
ollama run orca-mini:3b
```
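
The same deployment can also be exercised without the CLI, through the Ollama HTTP API. This is a minimal sketch under a few assumptions: the port-forward is still running, the model has already been pulled, and the function name `ask_ollama` is hypothetical. The prompt is inserted into the JSON body verbatim, so it must not contain double quotes or backslashes.

```shell
# Hypothetical helper: send one prompt to /api/generate and print the JSON reply.
# $1 = prompt (no double quotes or backslashes), $2 = model (default orca-mini:3b).
# "stream": false asks for a single JSON object instead of a stream of chunks.
ask_ollama() {
  curl -fsS "http://localhost:11434/api/generate" \
    -d "{\"model\": \"${2:-orca-mini:3b}\", \"prompt\": \"$1\", \"stream\": false}"
}
```

For example, `ask_ollama "Why is the sky blue?"` sends the prompt to the default model.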
