Configuring Ollama

Ollama allows you to run LLMs locally on your computer or in a Kubernetes cluster. Configuring Ollama in kagent follows the same pattern as for other providers.

Here's an example of how to run Ollama in a Kubernetes cluster:

  1. Create a namespace for Ollama deployment and service:
kubectl create ns ollama
  2. Create the deployment and service:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama
  namespace: ollama
spec:
  selector:
    matchLabels:
      name: ollama
  template:
    metadata:
      labels:
        name: ollama
    spec:
      containers:
      - name: ollama
        image: ollama/ollama:latest
        ports:
        - name: http
          containerPort: 11434
          protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  name: ollama
  namespace: ollama
spec:
  type: ClusterIP
  selector:
    name: ollama
  ports:
  - port: 80
    name: http
    targetPort: http
    protocol: TCP

You can run kubectl get pod -n ollama and wait until the pod is in the Running state.
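Alternatively, you can block until the Deployment has finished rolling out (a minimal sketch using a standard kubectl command):

# wait for the ollama Deployment to become ready
kubectl rollout status deployment/ollama -n ollama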

Once the pod has started, you can port-forward to the Ollama service and use ollama run [model-name] to download and run the model, as shown in the example below. You can download the Ollama binary here.
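For example, assuming the Service and namespace created above and a local Ollama CLI install, the port-forward and model download might look like this (llama3 is used here as a placeholder model name):

# forward local port 11434 to the ollama Service (service port 80 -> container port 11434)
kubectl port-forward svc/ollama -n ollama 11434:80

# in a second terminal, point the local Ollama CLI at the forwarded port
# and download/run the model
OLLAMA_HOST=127.0.0.1:11434 ollama run llama3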

As kagent relies on calling tools, make sure you're using a model that supports function calling.

Assuming you've downloaded the llama3 model, you can then use the following ModelConfig to configure the model:

apiVersion: kagent.dev/v1alpha1
kind: ModelConfig
metadata:
  name: llama3-model-config
  namespace: kagent
spec:
  apiKeySecretKey: OPENAI_API_KEY
  apiKeySecretRef: kagent-openai
  model: llama3
  provider: Ollama
  ollama:
    host: http://ollama.ollama.svc.cluster.local
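The ModelConfig above references a secret named kagent-openai with the key OPENAI_API_KEY. Ollama itself does not require an API key, so if that secret does not already exist in your cluster, a placeholder value should be enough (a minimal sketch; the secret name and key just need to match the ModelConfig):

# create a placeholder secret matching the apiKeySecretRef/apiKeySecretKey fields
kubectl create secret generic kagent-openai -n kagent \
  --from-literal=OPENAI_API_KEY=not-needed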