Agent Substrate#

In this guide, you install Agent Substrate and kagent on a local kind cluster, then deploy a declarative SandboxAgent that runs inside a gVisor actor. All components come from published OCI charts — no source builds or repo clones required.

By the end, you will have:

  • Agent Substrate v0.0.6 running in the ate-system namespace.
  • kagent v0.9.9 or later, installed with the substrate integration enabled. Earlier releases do not include the controller wiring that lets a SandboxAgent target substrate.
  • A SandboxAgent running on substrate, reachable from the kagent UI.

For background on what substrate is and how it differs from a per-pod agent runtime, see the Agent Substrate concept page. This guide does not cover the AgentHarness path on substrate.

Before you begin#

You need:

  • kind, kubectl, and helm on your PATH.
  • A running Docker daemon (Docker Desktop or equivalent).
  • An OpenAI API key exported in your shell.
export OPENAI_API_KEY="sk-..."
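
Before creating the cluster, it can help to confirm that the required CLIs are actually resolvable from this shell. The following is a minimal sketch, not part of the official install flow. The helper name missing_tools is hypothetical, and it relies only on the POSIX command -v builtin.

```shell
# Print the name of every tool in the argument list that is not on PATH.
# missing_tools is an illustrative helper name, not a kagent or substrate command.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}
```

Running missing_tools kind kubectl helm docker prints nothing when every tool is found and prints one name per line for anything missing. It does not check that the Docker daemon is running, only that the CLI exists.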

Step 1: Create a kind cluster#

kind create cluster --name kagent-substrate

The substrate v0.0.6 chart defaults to JWT auth backed by Kubernetes ServiceAccount tokens, so a vanilla kind cluster works — no feature gates or custom kind config are required.

Step 2: Install Agent Substrate#

Install the CRDs first, then the substrate control plane and data plane.

helm upgrade --install substrate-crds \
oci://ghcr.io/kagent-dev/substrate/helm/substrate-crds \
--version 0.0.6 \
--namespace ate-system --create-namespace --wait
helm upgrade --install substrate \
oci://ghcr.io/kagent-dev/substrate/helm/substrate \
--version 0.0.6 \
--namespace ate-system --wait --timeout 10m

Verify the substrate pods are running.

kubectl get pods -n ate-system

You should see ate-api-server, ate-controller, atelet-*, atenet-router, valkey-cluster-{0..5}, and rustfs all Running, plus a few Completed init jobs.
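
If you would rather not scan the pod list by eye, the check above can be sketched as a small filter. This assumes the default kubectl get pods column layout (NAME, READY, STATUS, ...), and the helper name pods_not_ready is hypothetical.

```shell
# Read `kubectl get pods --no-headers` output on stdin and print the name and
# status of every pod whose STATUS column is neither Running nor Completed.
# pods_not_ready is an illustrative helper name.
pods_not_ready() {
  awk '$3 != "Running" && $3 != "Completed" { print $1, $3 }'
}
```

Use it as kubectl get pods -n ate-system --no-headers | pods_not_ready. Empty output means every pod has settled into Running or Completed. Note that Running is not the same as Ready, so a pod that is still failing readiness probes would not be reported by this filter.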

Step 3: Install kagent with substrate enabled#

Confirm the OpenAI key is set in this shell before you run Helm. If the key is empty, the install still completes without any warning, but no kagent-openai Secret is created and the default agent pods land in CreateContainerConfigError.

[[ -n "${OPENAI_API_KEY:-}" ]] && echo "key is set (len=${#OPENAI_API_KEY})" || echo "OPENAI_API_KEY is empty — export it first"

Install the CRDs, then kagent with the substrate flags.

helm upgrade --install kagent-crds \
oci://ghcr.io/kagent-dev/kagent/helm/kagent-crds \
--version 0.9.9 \
--namespace kagent --create-namespace --wait
helm upgrade --install kagent \
oci://ghcr.io/kagent-dev/kagent/helm/kagent \
--version 0.9.9 \
--namespace kagent --timeout 10m --wait \
--set providers.openAI.apiKey="${OPENAI_API_KEY}" \
--set providers.default=openAI \
--set controller.substrate.enabled=true \
--set controller.substrate.ateApiEndpoint=dns:///api.ate-system.svc:443 \
--set controller.substrate.ateApiInsecure=true \
--set substrateWorkerPool.create=true \
--set substrateWorkerPool.replicas=1 \
--set substrateWorkerPool.ateomImage=ghcr.io/kagent-dev/substrate/ateom-gvisor:v0.0.6

The controller.substrate.* and substrateWorkerPool.* flags turn on the substrate integration. The rest is a standard kagent install.

On a cold start, the controller restarts a couple of times while it waits on postgres, so Helm can hit its --timeout 10m before everything is up. If that happens, wait for the controller manually and continue.

kubectl wait deploy/kagent-controller -n kagent --for=condition=Available --timeout=10m

Sanity-check that the key landed and the default agents are healthy.

kubectl get secret kagent-openai -n kagent # should exist with 1 data entry
kubectl get pods -n kagent | grep -v Running # only header + Completed jobs expected

If you see CreateContainerConfigError on the default agent pods, the secret was not created. Re-run the kagent Helm command with --reuse-values --set providers.openAI.apiKey="$(cat ~/path/to/key)" to patch it in. The deployments roll to new pods automatically.

Tune the WorkerPool size#

substrateWorkerPool.replicas=1 is the chart default. One worker is enough for a declarative-only walkthrough: session actors release their slot the moment they snapshot back to object storage, so a single worker can serve many sequential sessions. Increase the replica count when:

  • You add a long-lived AgentHarness. The ahr-<...> actor pins a slot for the lifetime of the CR, so you need at least 1 + (number of harnesses) replicas.
  • You want simultaneous, overlapping declarative sessions.
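
The harness rule above is simple arithmetic, sketched here as a helper. The function name min_worker_replicas is hypothetical. It models only the "1 + (number of harnesses)" floor stated above. Overlapping declarative sessions need additional replicas on top of that, which this sketch does not attempt to estimate.

```shell
# Minimum WorkerPool replicas: one shared slot for sequential declarative
# sessions plus one pinned slot per long-lived AgentHarness.
# min_worker_replicas is an illustrative helper name.
min_worker_replicas() {
  harnesses="${1:-0}"
  echo $((1 + ${harnesses}))
}
```

For example, min_worker_replicas 2 gives the floor for a cluster with two harnesses, and calling it with no argument gives the declarative-only floor.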

You can change the size three ways, depending on how permanent you want the change.

# 1) Quick, ephemeral — scale the live CR. Reverts on the next helm upgrade.
kubectl scale workerpool kagent-default -n kagent --replicas=3
# 2) Persist it in the Helm release. Survives upgrades.
helm upgrade kagent oci://ghcr.io/kagent-dev/kagent/helm/kagent \
--version 0.9.9 --namespace kagent --reuse-values \
--set substrateWorkerPool.replicas=3
# 3) Fresh install — change the value on the Step 3 install command above.

Step 4: Open the kagent UI#

kubectl port-forward -n kagent svc/kagent-ui 8001:8080

Open http://localhost:8001. Skip the first-run wizard if it appears.

Step 5: Create a declarative agent on substrate#

A SandboxAgent with platform: substrate and a substrate.workerPoolRef runs as a substrate actor instead of a plain Deployment. Pick one of the two paths below.

Option A: Via the UI#

  1. Create Agent → choose Declarative as the type.
  2. Set the basics:
    • Name: hello-substrate
    • Namespace: kagent
    • Model config: default-model-config
    • Runtime: Go (required — the Python ADK is not supported on substrate today)
    • System message: You are a friendly assistant living inside an Agent Substrate sandbox. When asked who you are, say "I am hello-substrate, a Go ADK declarative agent running inside a gVisor actor."
  3. In the Sandbox / Platform section (the label depends on UI version), set Platform to substrate and select the worker pool kagent-default.
  4. Save.

Option B: Via kubectl#

kubectl apply -f - <<EOF
apiVersion: kagent.dev/v1alpha2
kind: SandboxAgent
metadata:
  name: hello-substrate
  namespace: kagent
spec:
  type: Declarative
  description: Tiny declarative agent running inside a substrate actor
  declarative:
    runtime: go
    modelConfig: default-model-config
    systemMessage: |
      You are a friendly assistant living inside an Agent Substrate sandbox.
      When asked who you are, say "I am hello-substrate, a Go ADK declarative
      agent running inside a gVisor actor."
  platform: substrate
  substrate:
    workerPoolRef:
      name: kagent-default
EOF

Wait for the agent to become ready. The first-time golden snapshot takes about 60–90 seconds.

kubectl wait sandboxagent/hello-substrate -n kagent --for=condition=Ready --timeout=5m
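
If the wait times out, reading the Ready condition directly usually says more than the timeout message does. The sketch below assumes the SandboxAgent exposes a conventional status.conditions list with status, reason, and message fields. The wait command above implies a Ready condition exists, but the exact fields are an assumption. The helper name agent_ready_field is hypothetical.

```shell
# Print one field (status, reason, or message) of the Ready condition.
# Assumes the SandboxAgent exposes a standard status.conditions list.
# agent_ready_field is an illustrative helper name.
agent_ready_field() {
  name="$1"; namespace="$2"; field="${3:-status}"
  kubectl get sandboxagent "$name" -n "$namespace" \
    -o jsonpath="{.status.conditions[?(@.type==\"Ready\")].${field}}"
}
```

For example, agent_ready_field hello-substrate kagent message prints the condition message, if one is set. For the full picture, including events, kubectl describe sandboxagent/hello-substrate -n kagent is the broader fallback.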

Step 6: Chat with the agent#

In the UI at http://localhost:8001, pick kagent/hello-substrate from the Agents list and send:

What are you, and where are you running? Answer in one sentence.

Expected reply:

I am hello-substrate, a Go ADK declarative agent running inside a gVisor actor.

Behind the scenes, a per-session gVisor actor was restored from the golden snapshot, ran the LLM call, and snapshotted itself back to object storage. Open View → Substrate to see the actor in the inventory — between requests it sits Suspended.

Cleanup#

kind delete cluster --name kagent-substrate

Next steps#

  • Agent Substrate concept page — runtime architecture and how snapshots, actors, and worker pools fit together.
  • AgentHarness — provision long-running OpenClaw, NemoClaw, or Hermes sandboxes. Set runtime: substrate to run a harness on Agent Substrate too.
  • Agent Sandbox — the upstream kubernetes-sigs/agent-sandbox backend for SandboxAgent, a separate path from substrate.