Skip to content

antonpp/kubernetes-engine-samples

 
 

Repository files navigation

Google Kubernetes Engine (GKE) Samples

This repository contains sample applications used in Google Kubernetes Engine (GKE) tutorials.

See the following resources to get started:

Or click on the following button to open the interactive quickstart demo:

Open in Cloud Shell

Contributing

Licensing

MME Ray demo

git clone https://github.com/antonpp/kubernetes-engine-samples.git
cd kubernetes-engine-samples/ai-ml/gke-ray/rayserve/llm
export TUTORIAL_HOME=`pwd`

gcloud config set project <edit>
export PROJECT_ID=$(gcloud config get project)
export COMPUTE_REGION=europe-west1
export CLUSTER_VERSION=1.30.5
export HF_TOKEN=<edit>

kubectl create secret generic hf-secret \
  --from-literal=hf_api_token=${HF_TOKEN} \
  --dry-run=client -o yaml | kubectl apply -f -

gcloud container clusters create std-rayserve-cluster \
    --addons=RayOperator \
    --enable-ray-cluster-logging --enable-ray-cluster-monitoring \
    --cluster-version=${CLUSTER_VERSION} \
    --machine-type=g2-standard-24 \
    --location=${COMPUTE_REGION} \
    --node-locations=${COMPUTE_REGION}-c \
    --num-nodes=1 \
    --accelerator type=nvidia-l4,count=2,gpu-driver-version=latest

kubectl apply -f mme-ray-service.yaml

kubectl port-forward svc/gemma-2b-it-serve-svc 8000:8000

curl -X POST http://localhost:8000/ -H "Content-Type: application/json" --header "serve_multiplexed_model_id: bar" -d '{"prompt": "What are the top 5 most popular programming languages? Please be brief.", "max_tokens": 200}'

About

Sample applications for Google Kubernetes Engine (GKE)

Resources

License

Code of conduct

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • HCL 51.3%
  • Go 15.5%
  • Python 13.9%
  • Jupyter Notebook 4.5%
  • Dockerfile 2.7%
  • Shell 2.4%
  • Other 9.7%