Agent Substrate#

In this guide, you install Agent Substrate and kagent on a local kind cluster, then deploy a declarative SandboxAgent that runs inside a gVisor actor. All components come from published OCI charts — no source builds or repo clones required.

By the end, you will have:

Agent Substrate v0.0.6 running in the ate-system namespace.
kagent v0.9.9 or later installed with the substrate integration enabled. Earlier releases do not include the controller wiring that lets a SandboxAgent target substrate.
A SandboxAgent running on substrate, reachable from the kagent UI.

For background on what substrate is and how it differs from a per-pod agent runtime, see the Agent Substrate concept page. This guide does not cover the AgentHarness path on substrate.

Before you begin#

You need:

kind, kubectl, and helm on your PATH.
A running Docker daemon (Docker Desktop or equivalent).
An OpenAI API key exported in your shell.

export OPENAI_API_KEY="sk-..."
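
The prerequisites above can be checked in one pass before you create anything. The following is a minimal sketch, not part of the official install flow: the function names missing_tools and preflight are made up for this example, and it assumes the docker CLI is on your PATH alongside the daemon.

```shell
# Print the name of every given command that is not on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
  return 0
}

# Check tools, the Docker daemon, and the API key; non-zero on the first failure.
# (Hypothetical helper for this guide, not something kagent ships.)
preflight() {
  missing="$(missing_tools kind kubectl helm docker)"
  if [ -n "$missing" ]; then
    echo "missing from PATH:" $missing >&2
    return 1
  fi
  # `docker info` exits non-zero when the daemon is unreachable.
  if ! docker info >/dev/null 2>&1; then
    echo "Docker daemon is not reachable" >&2
    return 1
  fi
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY is empty, export it first" >&2
    return 1
  fi
  echo "preflight ok"
}
```

Run preflight in the same shell you will use for the Helm commands, so the exported key is the one Helm sees.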

Step 1: Create a kind cluster#

kind create cluster --name kagent-substrate

The substrate v0.0.6 chart defaults to JWT auth backed by Kubernetes ServiceAccount tokens, so a vanilla kind cluster works — no feature gates or custom kind config are required.
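
If you re-run this guide, kind create cluster fails when a cluster with the same name already exists. A small guard, sketched below, makes the step repeatable. The helper names cluster_listed and ensure_cluster are illustrative; the only kind commands used are kind get clusters and kind create cluster.

```shell
# Succeed when NAME ($1) appears as a whole line in the cluster list on stdin.
cluster_listed() {
  grep -Fxq -- "$1"
}

# Create the kind cluster only if it is not already present.
ensure_cluster() {
  if kind get clusters 2>/dev/null | cluster_listed "$1"; then
    echo "cluster $1 already exists, skipping create"
  else
    kind create cluster --name "$1"
  fi
}

# Usage (not executed here):
#   ensure_cluster kagent-substrate
```

The whole-line match (-x) matters: without it, a cluster named kagent-substrate-2 would be mistaken for kagent-substrate.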

Step 2: Install Agent Substrate#

Install the CRDs first, then the substrate control plane and data plane.

helm upgrade --install substrate-crds \
  oci://ghcr.io/kagent-dev/substrate/helm/substrate-crds \
  --version 0.0.6 \
  --namespace ate-system --create-namespace --wait

helm upgrade --install substrate \
  oci://ghcr.io/kagent-dev/substrate/helm/substrate \
  --version 0.0.6 \
  --namespace ate-system --wait --timeout 10m

Verify the substrate pods are running.

kubectl get pods -n ate-system

You should see ate-api-server, ate-controller, atelet-*, atenet-router, valkey-cluster-{0..5}, and rustfs all Running, plus a few Completed init jobs.
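
Scanning the pod list by eye gets tedious with this many components. As a rough sketch, the filter below reads kubectl get pods --no-headers output and prints only the pods whose STATUS is neither Running nor Completed. The function name unhealthy_pods is an invention for this example, and it assumes the default kubectl column layout (NAME, READY, STATUS, ...).

```shell
# Read `kubectl get pods --no-headers` output on stdin and print "NAME STATUS"
# for every pod whose STATUS column (field 3) is neither Running nor Completed.
unhealthy_pods() {
  awk 'NF >= 3 && $3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Usage (not executed here):
#   kubectl get pods -n ate-system --no-headers | unhealthy_pods
```

Empty output means every pod is Running or Completed. Note that Running is not the same as Ready, so a pod that is still failing its readiness probe passes this filter.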

Step 3: Install kagent with substrate enabled#

Confirm the OpenAI key is set in this shell before you run Helm. If the key is empty, Helm completes without error but creates no kagent-openai Secret, and the default agent pods end up in CreateContainerConfigError.

[[ -n "${OPENAI_API_KEY:-}" ]] && echo "key is set (len=${#OPENAI_API_KEY})" || echo "OPENAI_API_KEY is empty — export it first"

Install the CRDs, then kagent with the substrate flags.

helm upgrade --install kagent-crds \
  oci://ghcr.io/kagent-dev/kagent/helm/kagent-crds \
  --version 0.9.9 \
  --namespace kagent --create-namespace --wait

helm upgrade --install kagent \
  oci://ghcr.io/kagent-dev/kagent/helm/kagent \
  --version 0.9.9 \
  --namespace kagent --timeout 10m --wait \
  --set providers.openAI.apiKey="${OPENAI_API_KEY}" \
  --set providers.default=openAI \
  --set controller.substrate.enabled=true \
  --set controller.substrate.ateApiEndpoint=dns:///api.ate-system.svc:443 \
  --set controller.substrate.ateApiInsecure=true \
  --set substrateWorkerPool.create=true \
  --set substrateWorkerPool.replicas=1 \
  --set substrateWorkerPool.ateomImage=ghcr.io/kagent-dev/substrate/ateom-gvisor:v0.0.6

The controller.substrate.* and substrateWorkerPool.* flags turn on the substrate integration. The rest is a standard kagent install.
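
If you prefer a values file to a long list of --set flags, the same settings can be written as YAML. The sketch below assumes standard Helm behavior, where each dotted --set path maps to a nested key; the file name substrate-values.yaml and the function name write_substrate_values are made up for this example. The API key stays on the command line so that it is not written to disk.

```shell
# Write the substrate-related chart values from Step 3 to the path given as $1.
write_substrate_values() {
  cat > "$1" <<'EOF'
providers:
  default: openAI
controller:
  substrate:
    enabled: true
    ateApiEndpoint: "dns:///api.ate-system.svc:443"
    ateApiInsecure: true
substrateWorkerPool:
  create: true
  replicas: 1
  ateomImage: ghcr.io/kagent-dev/substrate/ateom-gvisor:v0.0.6
EOF
}

# Usage (not executed here):
#   write_substrate_values substrate-values.yaml
#   helm upgrade --install kagent \
#     oci://ghcr.io/kagent-dev/kagent/helm/kagent \
#     --version 0.9.9 --namespace kagent --timeout 10m --wait \
#     -f substrate-values.yaml \
#     --set providers.openAI.apiKey="${OPENAI_API_KEY}"
```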

On a cold start the controller restarts a couple of times while it waits for postgres. If that pushes Helm past its --timeout 10m, wait for the controller manually and then continue.

kubectl wait deploy/kagent-controller -n kagent --for=condition=Available --timeout=10m

Sanity-check that the key landed and the default agents are healthy.

kubectl get secret kagent-openai -n kagent         # should exist with 1 data entry
kubectl get pods -n kagent | grep -v Running       # only header + Completed jobs expected

If you see CreateContainerConfigError on the default agent pods, the Secret was not created. Re-run the kagent Helm command with --reuse-values --set providers.openAI.apiKey="$(cat ~/path/to/key)" to patch it in; the Deployments then roll to new pods automatically.

Tune the WorkerPool size#

substrateWorkerPool.replicas=1 is the chart default. One worker is enough for a declarative-only walkthrough: session actors release their slot the moment they snapshot back to object storage, so a single worker can serve many sequential sessions. Increase the replica count when:

You add a long-lived AgentHarness. The ahr-<...> actor pins a slot for the lifetime of the CR, so you need at least 1 + (number of harnesses) replicas.
You want simultaneous, overlapping declarative sessions.
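
The sizing rule above is simple arithmetic. The sketch below computes a lower bound on the replica count. The harness term comes straight from the rule stated above (one pinned slot per harness, plus one for sessions). The second argument rests on an assumption this guide does not confirm: that each overlapping declarative session holds one slot while it runs. The function name min_workers is illustrative.

```shell
# Print a lower bound on WorkerPool replicas.
#   $1 = number of long-lived AgentHarness resources (each pins one slot)
#   $2 = declarative sessions you expect to overlap (default 1; ASSUMPTION:
#        one slot per concurrently running session)
min_workers() {
  harnesses="$1"
  sessions="${2:-1}"
  echo $(( $harnesses + $sessions ))
}

# Usage:
#   min_workers 0      # declarative-only walkthrough
#   min_workers 2      # two harnesses, sequential sessions
#   min_workers 2 4    # two harnesses, four overlapping sessions
```

Treat the result as a starting point and watch for queued sessions before settling on a final size.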

You can change the size three ways, depending on how permanent you want the change.

# 1) Quick, ephemeral — scale the live CR. Reverts on the next helm upgrade.
kubectl scale workerpool kagent-default -n kagent --replicas=3

# 2) Persist it in the Helm release. Survives upgrades.
helm upgrade kagent oci://ghcr.io/kagent-dev/kagent/helm/kagent \
  --version 0.9.9 --namespace kagent --reuse-values \
  --set substrateWorkerPool.replicas=3

# 3) Fresh install — change the value on the Step 3 install command above.

Step 4: Open the kagent UI#

kubectl port-forward -n kagent svc/kagent-ui 8001:8080

Open http://localhost:8001. Skip the first-run wizard if it appears.
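
The port-forward takes a moment to start accepting connections, so a script that opens the UI right away can hit a refused connection. One way to handle this is a generic retry helper, sketched below. The name retry is made up for this example, and the curl probe in the usage comment assumes the UI answers a plain GET on its root path with a success status.

```shell
# Run a command up to TRIES ($1) times, sleeping DELAY ($2) seconds between
# attempts. Returns 0 on the first success, 1 if every attempt fails.
retry() {
  tries="$1"
  delay="$2"
  shift 2
  attempt=1
  while [ "$attempt" -le "$tries" ]; do
    if "$@"; then
      return 0
    fi
    attempt=$((attempt + 1))
    if [ "$attempt" -le "$tries" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Usage (not executed here):
#   kubectl port-forward -n kagent svc/kagent-ui 8001:8080 >/dev/null 2>&1 &
#   retry 30 1 curl -fsS -o /dev/null http://localhost:8001 && echo "UI is up"
```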

Step 5: Create a declarative agent on substrate#

A SandboxAgent with platform: substrate and a substrate.workerPoolRef runs as a substrate actor instead of a plain Deployment. Pick one of the two paths below.

Option A: Via the UI#

1. Go to Create → Agent and choose Declarative as the type.
2. Set the basics:
   - Name: hello-substrate
   - Namespace: kagent
   - Model config: default-model-config
   - Runtime: Go (required; the Python ADK is not supported on substrate today)
   - System message: You are a friendly assistant living inside an Agent Substrate sandbox. When asked who you are, say "I am hello-substrate, a Go ADK declarative agent running inside a gVisor actor."
3. In the Sandbox / Platform section (the label depends on the UI version), set Platform to substrate and select the worker pool kagent-default.
4. Save.

Option B: Via kubectl#

kubectl apply -f - <<EOF
apiVersion: kagent.dev/v1alpha2
kind: SandboxAgent
metadata:
  name: hello-substrate
  namespace: kagent
spec:
  type: Declarative
  description: Tiny declarative agent running inside a substrate actor
  declarative:
    runtime: go
    modelConfig: default-model-config
    systemMessage: |
      You are a friendly assistant living inside an Agent Substrate sandbox.
      When asked who you are, say "I am hello-substrate, a Go ADK declarative
      agent running inside a gVisor actor."
  platform: substrate
  substrate:
    workerPoolRef:
      name: kagent-default
EOF

Wait for the agent to become ready. The first-time golden snapshot takes about 60–90 seconds.

kubectl wait sandboxagent/hello-substrate -n kagent --for=condition=Ready --timeout=5m
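
If the wait times out, the status conditions on the resource usually show where it stalled. The sketch below prints each condition as TYPE=STATUS and checks for Ready=True. It relies only on the standard .status.conditions list that kubectl wait --for=condition=Ready reads; the helper names agent_conditions and ready_in_conditions are illustrative, and which other condition types the controller sets is not covered by this guide.

```shell
# Print every status condition of a SandboxAgent as TYPE=STATUS, one per line.
#   $1 = agent name, $2 = namespace
agent_conditions() {
  kubectl get sandboxagent "$1" -n "$2" \
    -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}'
}

# Succeed when the TYPE=STATUS lines on stdin include exactly "Ready=True".
ready_in_conditions() {
  grep -Fxq 'Ready=True'
}

# Usage (not executed here):
#   agent_conditions hello-substrate kagent
#   agent_conditions hello-substrate kagent | ready_in_conditions && echo ready
```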

Step 6: Chat with the agent#

In the UI at http://localhost:8001, pick kagent/hello-substrate from the Agents list and send:

What are you, and where are you running? Answer in one sentence.

Expected reply:

I am hello-substrate, a Go ADK declarative agent running inside a gVisor actor.

Behind the scenes, a per-session gVisor actor was restored from the golden snapshot, ran the LLM call, and snapshotted itself back to object storage. Open View → Substrate to see the actor in the inventory — between requests it sits Suspended.

Cleanup#

kind delete cluster --name kagent-substrate

Next steps#

Agent Substrate concept page — runtime architecture and how snapshots, actors, and worker pools fit together.
AgentHarness — provision long-running OpenClaw, NemoClaw, or Hermes sandboxes. Set runtime: substrate to run a harness on Agent Substrate too.
Agent Sandbox — the upstream kubernetes-sigs/agent-sandbox backend for SandboxAgent, a separate path from substrate.
