Configuring Ollama

Ollama allows you to run LLMs locally on your computer or in a Kubernetes cluster. Configuring Ollama in kagent follows the same pattern as for other providers.

Here's an example of how to run Ollama in a Kubernetes cluster:

  1. Create a namespace for Ollama deployment and service:
kubectl create ns ollama
  2. Create the deployment and service:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama
  namespace: ollama
spec:
  selector:
    matchLabels:
      name: ollama
  template:
    metadata:
      labels:
        name: ollama
    spec:
      containers:
      - name: ollama
        image: ollama/ollama:latest
        ports:
        - name: http
          containerPort: 11434
          protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  name: ollama
  namespace: ollama
spec:
  type: ClusterIP
  selector:
    name: ollama
  ports:
  - port: 80
    name: http
    targetPort: http
    protocol: TCP

You can run kubectl get pod -n ollama and wait until the pod is in the Running state.
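Alternatively, you can block until the Deployment has finished rolling out (a minimal sketch using a standard kubectl command):

# wait for the ollama Deployment to become ready
kubectl rollout status deployment/ollama -n ollama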

Once the pod has started, you can port-forward to the Ollama service and use ollama run [model-name] to download and run the model, as shown in the example below. You can download the Ollama binary here.
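For example, assuming the Service and namespace created above and a local Ollama CLI install, the port-forward and model download might look like this (llama3 is used here as a placeholder model name):

# forward local port 11434 to the ollama Service (service port 80 -> container port 11434)
kubectl port-forward svc/ollama -n ollama 11434:80

# in a second terminal, point the local Ollama CLI at the forwarded port
# and download/run the model
OLLAMA_HOST=127.0.0.1:11434 ollama run llama3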

As kagent relies on calling tools, make sure you're using a model that supports function calling.

Assuming you've downloaded the llama3 model, you can then use the following ModelConfig to configure the model:

apiVersion: kagent.dev/v1alpha1
kind: ModelConfig
metadata:
  name: llama3-model-config
  namespace: kagent
spec:
  apiKeySecretKey: OPENAI_API_KEY
  apiKeySecretRef: kagent-openai
  model: llama3
  provider: Ollama
  ollama:
    host: http://ollama.ollama.svc.cluster.local
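The ModelConfig above references a secret named kagent-openai with the key OPENAI_API_KEY. Ollama itself does not require an API key, so if that secret does not already exist in your cluster, a placeholder value should be enough (a minimal sketch; the secret name and key just need to match the ModelConfig):

# create a placeholder secret matching the apiKeySecretRef/apiKeySecretKey fields
kubectl create secret generic kagent-openai -n kagent \
  --from-literal=OPENAI_API_KEY=not-needed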