Production Deployment
This guide covers the hardened, production setup. For a laptop, use the dev overlay.
Prerequisites
- Kubernetes 1.29+ with a CNI that enforces NetworkPolicy (Cilium, Calico, …).
- agent-sandbox v0.4.5+ installed.
- A dedicated sandbox node pool labeled vibed.dev/sandbox-node: "true", running containerd + containerd-shim-kata-v2, with KVM available (bare metal, *.metal on AWS, or nested-virt images on GCP).
- A Kata RuntimeClass: kata-fc (Firecracker, needs KVM) or kata-qemu.
- S3 or MinIO for source tarballs.
- A DNS-01 capable DNS provider for Caddy's wildcard cert on *.<your-domain>.
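For orientation, a Kata RuntimeClass of the kind listed above might look like the following minimal sketch. The handler value is an assumption: it has to match whatever handler name your containerd configuration registers for the Kata shim on the sandbox nodes. The scheduling block is optional and simply reuses the node label from the prerequisites.

```yaml
# Minimal RuntimeClass sketch; adjust to your Kata installation.
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kata-fc
# Assumed handler name; must match the runtime handler configured in containerd.
handler: kata-fc
scheduling:
  # Pods that use this RuntimeClass are constrained to the sandbox node pool.
  nodeSelector:
    vibed.dev/sandbox-node: "true"
```

Running kubectl get runtimeclass afterwards should list the class before you install the chart.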
1. Container images
CI builds and pushes all component images to GHCR: vibed, vibed-controller, vibed-router, and the five template images. Pin a released tag (never latest) in your values.
2. Production values
image: { repository: ghcr.io/vibed-project/vibed, tag: "v0.4.1", pullPolicy: IfNotPresent }
controller: { image: { tag: "v0.4.1" }, domain: apps.example.com }
router: { image: { tag: "v0.4.1" } }

# Sandbox isolation
runtime:
  defaultClass: kata-fc
  nodeSelector: { vibed.dev/sandbox-node: "true" }
  sandboxNetworkPolicy: Unmanaged   # vibeD owns the policy (see below)
networkPolicy:
  enabled: true                     # default-deny + control-plane->sandbox + DNS/S3 egress

# Source store: MUST be s3 in production
config:
  storage:
    tarball:
      backend: s3
      s3: { bucket: vibed-sources, region: us-east-1, presignTTL: "15m" }
  server: { logFormat: json }

# Wildcard TLS
caddy:
  tls:
    dns01: { enabled: true, provider: cloudflare, tokenSecret: vibed-cloudflare }

auth:
  enabled: true
  mode: oidc                        # or apikey
  oidc: { issuer: "https://idp.example.com/...", audience: vibed, adminRole: vibed-admin }
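The values above point caddy.tls.dns01.tokenSecret at a Secret named vibed-cloudflare, which has to exist before Caddy can solve the DNS-01 challenge. A minimal sketch of such a Secret follows. The namespace (vibed-system) and the key name (token) are assumptions, not something this guide specifies; check the chart's values reference for the key it actually reads.

```yaml
# Hypothetical Secret holding the DNS provider token.
apiVersion: v1
kind: Secret
metadata:
  name: vibed-cloudflare
  namespace: vibed-system          # assumed release namespace
type: Opaque
stringData:
  # Assumed key name; replace the placeholder with a real API token.
  token: "<cloudflare-api-token>"
```

Prefer creating this from a secret manager or a sealed/encrypted source rather than committing the token to version control.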
3. The network model (important)
agent-sandbox gives sandbox pods no cluster-internal egress and no cluster DNS once a NetworkPolicy is enforced — this is where enterprise data-egress controls live. Two consequences:
- Source must come from s3. A pre-signed S3/MinIO URL is reachable over the sandbox's allowed public egress; the in-cluster served backend is not, so served is dev-only.
- vibeD must own the NetworkPolicy. agent-sandbox's Managed mode allows ingress only from an app: sandbox-router pod it doesn't ship, and blocks all cluster egress, which breaks the controller's probe and Caddy's proxy. Set runtime.sandboxNetworkPolicy: Unmanaged and networkPolicy.enabled: true; vibeD then ships a policy that allows exactly: control-plane → sandbox :8080/:9000, DNS, and public egress (for the S3 pull), with cluster-internal CIDRs denied.
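To make the shape of that policy concrete, here is an illustrative NetworkPolicy with the same three allowances. It is a sketch, not the manifest the chart renders: the policy name, the sandbox pod label, the namespaces, the DNS pod label, and the private CIDR ranges are all assumptions that you would replace with your cluster's values.

```yaml
# Illustrative only; the vibeD chart ships its own policy.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: sandbox-isolation-sketch        # hypothetical name
  namespace: vibed-apps                 # assumed sandbox namespace
spec:
  podSelector:
    matchLabels:
      example.com/role: sandbox         # hypothetical sandbox pod label
  policyTypes: [Ingress, Egress]
  ingress:
    # Control plane -> sandbox on :8080 and :9000.
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: vibed-system
      ports:
        - { protocol: TCP, port: 8080 }
        - { protocol: TCP, port: 9000 }
  egress:
    # DNS lookups (assumes the resolver runs as kube-dns in kube-system).
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - { protocol: UDP, port: 53 }
        - { protocol: TCP, port: 53 }
    # Public egress for the S3 pull, with cluster-internal ranges excluded.
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 10.0.0.0/8              # assumed internal ranges; use your
              - 172.16.0.0/12           # cluster's pod and service CIDRs
              - 192.168.0.0/16
```

Because the pod is selected by a policy with both policy types listed, anything not matched by these rules is denied, which is what gives the default-deny behavior.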
4. Install
helm install vibed deploy/helm/vibed/ -n vibed-system --create-namespace -f values-production.yaml
kubectl rollout status deploy/vibed -n vibed-system
kubectl get sandboxwarmpool -n vibed-apps
5. Expose Caddy
Front the vibed-caddy Service with a LoadBalancer or Ingress, and point your wildcard DNS record *.apps.example.com at it. Caddy obtains the wildcard cert via DNS-01 and routes <id>.apps.example.com to each app's per-app Service.
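If you go the LoadBalancer route, one option is a separate Service that selects the Caddy pods and passes ports 80 and 443 straight through, so Caddy keeps terminating TLS itself. The sketch below assumes a Service name, a pod label, and container ports that this guide does not specify; inspect the vibed-caddy Service and its pods for the real selector and ports before using it.

```yaml
# Hypothetical LoadBalancer in front of Caddy; verify labels and ports first.
apiVersion: v1
kind: Service
metadata:
  name: vibed-caddy-lb                  # hypothetical name
  namespace: vibed-system
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: vibed-caddy # assumed pod label
  ports:
    - { name: http, port: 80, targetPort: 80 }     # assumed container port
    - { name: https, port: 443, targetPort: 443 }  # assumed container port
```

Once the load balancer has an external address, the wildcard DNS record should resolve to that address.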
Upgrading
helm upgrade vibed deploy/helm/vibed/ -n vibed-system -f values-production.yaml
Helm installs CRDs from crds/ only on first install and does not update them on upgrade. If the release changes the VibedApp schema, apply the updated CRD by hand; otherwise new status fields are silently dropped:
kubectl apply -f deploy/helm/vibed/crds/vibed.dev_vibedapps.yaml
Software supply chain (SBOMs & attestations)
Every tagged release ships a Software Bill of Materials so you can audit what you're deploying:
- In-registry attestations: each image carries an SBOM attestation generated by BuildKit at push time. Inspect it without downloading the image:
  docker buildx imagetools inspect ghcr.io/vibed-project/vibed:<version> --format '{{ json .SBOM }}'
- Downloadable SPDX files: the GitHub Release attaches an SPDX SBOM per image (<image>.spdx.json) plus a source SBOM (vibed-source.spdx.json), suitable for ingesting into a vulnerability scanner or compliance pipeline.
Both are produced automatically by the release workflow (.github/workflows/release.yaml); no action is needed beyond tagging a release.
Governance (multi-tenant installs)
For shared/enterprise installs, enable the governance controls (each has a dedicated page):
- Egress control — per-app outbound allow-lists enforced by a forward proxy.
- Deploy quotas — per-owner / per-department caps on concurrent apps.
- Audit trail — who deployed/deleted/rolled back what (persistent on the SQLite store).
Security checklist
- auth.enabled: true before exposing the API
- Image tags pinned to a release (not latest)
- runtime.sandboxNetworkPolicy: Unmanaged + networkPolicy.enabled: true
- storage.tarball.backend: s3 (never served)
- Sandbox node pool isolated + Kata RuntimeClass in place
- Caddy wildcard DNS-01 TLS enabled with the provider token in a Secret
- controller.domain is a real domain (not localhost)
- CRD re-applied after any schema-changing upgrade
- (Multi-tenant) egressControl.enabled + quotas.enabled reviewed; audit trail on a persistent store
- Release SBOMs reviewed / ingested into your scanner