Sandboxing

Sandboxing is the most effective defense against attacks that escalate from prompt injection to remote code execution (RCE). When an agent can execute code or interact with external systems, it must do so within controlled boundaries.

Why Sandboxing?

Without sandboxing:

User input → Prompt injection → Agent calls code_exec → rm -rf / → 💀

With sandboxing:

User input → Prompt injection → Agent calls code_exec → Sandboxed → No effect → ✅

The fundamental challenge: LLMs are probabilistic. They can be tricked into performing unintended actions. Sandboxing ensures that even when the model is compromised, the blast radius is contained.

Built-in Sandbox

The SDK provides a built-in sandbox for code execution:

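// Assumes Tool is exported by the same package; z is the zod schema builder
import { Sandbox, Tool } from 'assistme-agent-sdk'
import { z } from 'zod'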

const sandbox = Sandbox.create({
  type: 'container', // or 'microvm', 'wasm'
  limits: {
    memory: '512MB',
    cpu: '1 core',
    timeout: 30_000, // 30 seconds
    network: false,  // No network access
    filesystem: {
      read: ['/workspace'],  // Can read from workspace
      write: ['/workspace/output'], // Can write to output dir
    },
  },
})

const codeExec = Tool.create({
  name: 'code_exec',
  description: 'Execute Python, JavaScript, or Bash code in a sandboxed environment',
  parameters: z.object({
    code: z.string(),
    language: z.enum(['python', 'javascript', 'bash']),
  }),
  execute: async ({ code, language }) => {
    return await sandbox.run(code, { language })
  },
})
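
String limits such as '512MB' and '1 core' have to become numbers before any runtime can enforce them. This page does not say how the SDK parses them, so the following is only a minimal sketch of one way to do it. `parseMemory` and `parseCpu` are hypothetical names, and the sketch assumes that MB and GB are binary multiples.

```typescript
// Hypothetical helpers, not part of assistme-agent-sdk.
// Assumption: KB / MB / GB are binary multiples (1 MB = 1024 * 1024 bytes).
const UNIT_BYTES: Record<string, number> = {
  KB: 1024,
  MB: 1024 * 1024,
  GB: 1024 * 1024 * 1024,
}

// Parse a memory limit such as '512MB' into bytes; reject anything else.
function parseMemory(limit: string): number {
  const match = /^(\d+)\s*(KB|MB|GB)$/i.exec(limit.trim())
  const unit = match ? UNIT_BYTES[(match[2] ?? '').toUpperCase()] : undefined
  if (!match || unit === undefined) {
    throw new Error(`Invalid memory limit: ${limit}`)
  }
  return Number(match[1]) * unit
}

// Parse a CPU limit such as '1 core' or '2 cores' into a core count.
function parseCpu(limit: string): number {
  const match = /^(\d+(?:\.\d+)?)\s*cores?$/i.exec(limit.trim())
  if (!match) {
    throw new Error(`Invalid CPU limit: ${limit}`)
  }
  return Number(match[1])
}
```

Rejecting malformed values up front is a deliberate choice here: a limit that silently parses to nothing would leave the sandbox unbounded.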

Sandbox Types

Container Sandbox

Uses Docker/OCI containers for isolation. Good balance of security and compatibility.

const sandbox = Sandbox.create({
  type: 'container',
  image: 'python:3.12-slim', // Base image
  limits: {
    memory: '1GB',
    cpu: '2 cores',
    timeout: 60_000,
    network: false,
    filesystem: {
      read: ['/data'],
      write: ['/output'],
    },
  },
  // Pre-install packages
  setup: async (container) => {
    await container.exec('pip install numpy pandas matplotlib')
  },
})
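
To make the container limits concrete, the sketch below maps a configuration like the one above onto standard `docker run` flags. How the SDK actually launches containers is not documented here; `ContainerLimits` and `dockerRunArgs` are hypothetical names, and the hardening flags (`--cap-drop`, `--security-opt`) are additions for illustration rather than something the configuration above specifies.

```typescript
// Illustrative only: the SDK's real container invocation may differ.
interface ContainerLimits {
  memoryMiB: number
  cpus: number
  network: boolean
  readPaths: string[]
  writePaths: string[]
}

// Build the argument list for `docker run` from a set of limits.
function dockerRunArgs(image: string, limits: ContainerLimits): string[] {
  const args = [
    'run', '--rm',
    '--memory', `${limits.memoryMiB}m`,
    '--cpus', String(limits.cpus),
    '--read-only',                          // root filesystem is read-only
    '--cap-drop', 'ALL',                    // drop all Linux capabilities
    '--security-opt', 'no-new-privileges',
  ]
  if (!limits.network) args.push('--network', 'none')
  // Bind-mount readable paths read-only and writable paths read-write.
  for (const p of limits.readPaths) args.push('--volume', `${p}:${p}:ro`)
  for (const p of limits.writePaths) args.push('--volume', `${p}:${p}:rw`)
  args.push(image)
  return args
}

const args = dockerRunArgs('python:3.12-slim', {
  memoryMiB: 1024,
  cpus: 2,
  network: false,
  readPaths: ['/data'],
  writePaths: ['/output'],
})
console.log(['docker', ...args].join(' '))
```

The timeout has no `docker run` equivalent, so in a setup like this the host would need to stop the container itself once the deadline passes.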

MicroVM Sandbox

Uses Firecracker-style microVMs for the strongest isolation. Recommended for production.

const sandbox = Sandbox.create({
  type: 'microvm',
  kernel: 'default',
  rootfs: 'python-3.12',
  limits: {
    memory: '512MB',
    vcpus: 1,
    timeout: 30_000,
    network: false,
  },
})
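
If the microVM backend were Firecracker itself, the memory and vCPU limits would correspond to the body of its machine-config request. That is an assumption: this page only says "Firecracker-style". The field names `vcpu_count` and `mem_size_mib` come from Firecracker's API, while `MicroVmLimits` and `machineConfig` are hypothetical names for this sketch.

```typescript
// Assumption: the backend accepts Firecracker's machine-config body.
interface MicroVmLimits {
  memoryMiB: number
  vcpus: number
}

interface MachineConfig {
  vcpu_count: number
  mem_size_mib: number
}

// Validate the limits and translate them into a machine-config body.
function machineConfig(limits: MicroVmLimits): MachineConfig {
  if (!Number.isInteger(limits.vcpus) || limits.vcpus < 1) {
    throw new Error('vcpus must be a positive integer')
  }
  if (!Number.isInteger(limits.memoryMiB) || limits.memoryMiB < 1) {
    throw new Error('memoryMiB must be a positive integer')
  }
  return { vcpu_count: limits.vcpus, mem_size_mib: limits.memoryMiB }
}

const config = machineConfig({ memoryMiB: 512, vcpus: 1 })
console.log(JSON.stringify(config))
```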

WASM Sandbox

Uses WebAssembly for lightweight, fast isolation. Best for simple computations.

const sandbox = Sandbox.create({
  type: 'wasm',
  runtime: 'wasmtime',
  limits: {
    memory: '256MB',
    timeout: 10_000,
    filesystem: false,
    network: false,
  },
})
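
WebAssembly linear memory is allocated in 64 KiB pages, so a byte limit ends up as a maximum page count. The sketch below shows that conversion. `maxPagesFor` is a hypothetical helper, and treating '256MB' as 256 MiB is an assumption.

```typescript
// One WebAssembly page is 64 KiB.
const WASM_PAGE_BYTES = 64 * 1024

// Convert a memory limit in MiB into a maximum number of WebAssembly pages.
// Rounds down so the cap never exceeds the requested limit.
function maxPagesFor(memoryMiB: number): number {
  if (!Number.isFinite(memoryMiB) || memoryMiB <= 0) {
    throw new Error('memoryMiB must be a positive number')
  }
  return Math.floor((memoryMiB * 1024 * 1024) / WASM_PAGE_BYTES)
}

const pages = maxPagesFor(256)
console.log(pages) // 4096
```

In the JavaScript API, a value like this would go into the `maximum` field of a `WebAssembly.Memory` descriptor. Other runtimes have their own way of capping memory growth.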

Network Policies

Control network access granularly:

const sandbox = Sandbox.create({
  type: 'container',
  limits: {
    network: {
      // Allow specific domains
      allow: ['api.github.com', 'pypi.org'],
      // Block everything else
      defaultPolicy: 'deny',
      // Limit bandwidth
      maxBandwidth: '10MB/s',
      // No DNS resolution for unlisted domains
      dns: 'restricted',
    },
  },
})
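
The allow-list semantics can be sketched as a small decision function. This is an illustration of default-deny matching, not the SDK's implementation: `isHostAllowed` is a hypothetical name, and the sketch assumes exact hostname matches with no wildcards or implicit subdomains.

```typescript
interface NetworkPolicy {
  allow: string[]
  defaultPolicy: 'allow' | 'deny'
}

// Decide whether an outbound connection to `hostname` is permitted.
// Matching is exact and case-insensitive; a trailing dot is ignored.
function isHostAllowed(policy: NetworkPolicy, hostname: string): boolean {
  const host = hostname.trim().toLowerCase().replace(/\.$/, '')
  const listed = policy.allow.some((entry) => entry.toLowerCase() === host)
  return listed || policy.defaultPolicy === 'allow'
}

const policy: NetworkPolicy = {
  allow: ['api.github.com', 'pypi.org'],
  defaultPolicy: 'deny',
}

console.log(isHostAllowed(policy, 'api.github.com'))
console.log(isHostAllowed(policy, 'api.github.com.evil.example'))
```

Exact matching is the conservative choice: suffix or substring matching would let a hostname such as `api.github.com.evil.example` slip through.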

Per-Agent Permissions

Different agents get different sandbox configurations:

const readOnlyAgent = new Agent({
  name: 'reader',
  tools: [
    Tool.create({
      name: 'code_exec',
      execute: async ({ code }) => {
        return await Sandbox.create({
          type: 'container',
          limits: {
            filesystem: { read: ['/data'], write: [] }, // No writes
            network: false,
            timeout: 10_000,
          },
        }).run(code)
      },
    }),
  ],
})

const fullAccessAgent = new Agent({
  name: 'developer',
  tools: [
    Tool.create({
      name: 'code_exec',
      requiresApproval: true, // Require human approval
      execute: async ({ code }) => {
        return await Sandbox.create({
          type: 'container',
          limits: {
            filesystem: { read: ['/workspace'], write: ['/workspace'] },
            network: { allow: ['api.github.com'], defaultPolicy: 'deny' },
            timeout: 60_000,
          },
        }).run(code)
      },
    }),
  ],
})
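
The `read` and `write` lists above imply a path check somewhere in the stack. The sketch below shows the logic such a check needs, including the two classic mistakes to avoid: `..` traversal and prefix lookalikes such as `/database` against `/data`. `FsPolicy` and `canAccess` are hypothetical names, and the sketch ignores symlinks, which real enforcement has to handle at the mount or kernel level.

```typescript
interface FsPolicy {
  read: string[]
  write: string[]
}

// Collapse '.', '..' and duplicate slashes in an absolute POSIX path.
function normalize(path: string): string {
  const parts: string[] = []
  for (const segment of path.split('/')) {
    if (segment === '' || segment === '.') continue
    if (segment === '..') parts.pop()
    else parts.push(segment)
  }
  return '/' + parts.join('/')
}

// True if `target` is `root` itself or lies underneath it.
function isWithin(root: string, target: string): boolean {
  const r = normalize(root)
  const t = normalize(target)
  return t === r || t.startsWith(r === '/' ? '/' : r + '/')
}

// Check an absolute path against the policy for the given access mode.
function canAccess(policy: FsPolicy, mode: 'read' | 'write', target: string): boolean {
  if (!target.startsWith('/')) return false // relative paths are rejected
  return policy[mode].some((root) => isWithin(root, target))
}

const readerPolicy: FsPolicy = { read: ['/data'], write: [] }
console.log(canAccess(readerPolicy, 'read', '/data/report.csv'))
console.log(canAccess(readerPolicy, 'write', '/data/report.csv'))
```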

Hermetic Execution

For maximum security, use hermetic (fully isolated) execution:

const sandbox = Sandbox.create({
  type: 'microvm',
  hermetic: true, // Full isolation
  limits: {
    memory: '512MB',
    vcpus: 1,
    timeout: 30_000,
    network: false,
    filesystem: false, // No host filesystem access
  },
  // Data is passed in and out through the API, not the filesystem
})

const result = await sandbox.run(code, {
  language: 'python',
  stdin: inputData,    // Pass data in via stdin
  // stdout is the result
})

Hermetic execution means:

No network connections
No host filesystem access
No API calls from inside the sandbox
Full cleanup of data after execution
Each execution starts from a clean state
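
Because a hermetic sandbox exchanges data only over stdin and stdout, the host needs a framing convention for both directions. The sketch below assumes JSON on both sides and treats the sandbox's output as untrusted. `encodeInput` and `decodeOutput` are hypothetical helpers, and the size cap is an illustrative default.

```typescript
type HermeticResult =
  | { ok: true; value: unknown }
  | { ok: false; error: string }

// Serialize input data as a single JSON line for the sandbox's stdin.
function encodeInput(data: unknown): string {
  return JSON.stringify(data ?? null) + '\n'
}

// Parse the sandbox's stdout. The output is untrusted, so it is size-capped
// and parse failures are returned as errors instead of being thrown.
function decodeOutput(stdout: string, maxChars = 1_000_000): HermeticResult {
  if (stdout.length > maxChars) {
    return { ok: false, error: 'output too large' }
  }
  try {
    return { ok: true, value: JSON.parse(stdout) as unknown }
  } catch {
    return { ok: false, error: 'output is not valid JSON' }
  }
}

console.log(encodeInput({ rows: [1, 2] }))
console.log(decodeOutput('{"sum":3}'))
```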

Secrets Management

Never embed secrets in the code you send to the sandbox, and keep them out of the agent's context:

// Bad: the secret is interpolated into the code string, so it ends up in logs and traces
const leaked = await sandbox.run(`
  import requests
  r = requests.get('https://api.example.com', headers={'Authorization': '${SECRET}'})
`)

// Good: secret injected securely, not visible in logs or traces
const result = await sandbox.run(code, {
  env: {
    API_KEY: { secret: 'api-key-ref', expose: 'env' },
    // The secret is injected into the sandbox environment
    // but never appears in agent context, logs, or traces
  },
})
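
Injected secrets can still leak if sandboxed code prints them. As a second line of defense, the host can mask known secret values in sandbox output before it returns to the agent. The sketch below is one hypothetical way to do that; `redactSecrets` is not an SDK function, and it only catches verbatim occurrences, not encoded ones such as base64.

```typescript
// Replace every verbatim occurrence of each secret with a mask.
function redactSecrets(text: string, secrets: string[], mask = '[REDACTED]'): string {
  // Longest first, so a secret that contains a shorter one is masked whole.
  const ordered = secrets
    .filter((s) => s.length > 0)
    .sort((a, b) => b.length - a.length)
  let result = text
  for (const secret of ordered) {
    // split/join avoids having to escape regex metacharacters in the secret.
    result = result.split(secret).join(mask)
  }
  return result
}

const stdout = 'token=abc123 retry with abc123'
console.log(redactSecrets(stdout, ['abc123']))
// prints "token=[REDACTED] retry with [REDACTED]"
```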

Monitoring Sandbox Usage

const sandbox = Sandbox.create({
  type: 'container',
  limits: { memory: '512MB', timeout: 30_000 },
  monitoring: {
    onResourceLimit: async (event) => {
      console.warn(`Sandbox hit ${event.resource} limit: ${event.detail}`)
    },
    onNetworkAttempt: async (event) => {
      if (!event.allowed) {
        await securityLog.write(`Blocked network attempt: ${event.destination}`)
      }
    },
  },
})
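
A handler like `onResourceLimit` becomes more useful when it can tell a one-off limit hit from a burst of them. The sketch below counts hits per key inside a sliding time window and reports when a threshold is reached. `LimitHitTracker` is a hypothetical helper, and the key (for example an agent name) is an assumption, since the event shape shown above does not include one.

```typescript
// Counts limit hits per key inside a sliding time window.
class LimitHitTracker {
  private readonly hits = new Map<string, number[]>()
  private readonly threshold: number
  private readonly windowMs: number

  constructor(threshold: number, windowMs: number) {
    this.threshold = threshold
    this.windowMs = windowMs
  }

  // Record a hit at time `now` (in ms). Returns true once the number of
  // hits inside the window reaches the threshold.
  record(key: string, now: number): boolean {
    const recent = (this.hits.get(key) ?? []).filter((t) => now - t < this.windowMs)
    recent.push(now)
    this.hits.set(key, recent)
    return recent.length >= this.threshold
  }
}

// Alert on three hits within one minute.
const tracker = new LimitHitTracker(3, 60_000)
const first = tracker.record('reader', 0)
const second = tracker.record('reader', 1_000)
const third = tracker.record('reader', 2_000)
console.log(first, second, third) // false false true
```

Passing the timestamp in explicitly keeps the tracker deterministic and easy to test. In a handler, the caller would pass `Date.now()`.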

Best Practices

Default to no-network, no-filesystem — Start with the most restrictive sandbox and only open access as needed.
Use microVMs for production — Container escape is a real risk. MicroVMs provide hardware-level isolation.
Set tight timeouts — Code execution should have short timeouts. Most legitimate code completes in seconds.
Never put secrets in agent context — Secrets should be injected at the sandbox level, invisible to the agent and its logs.
Clean up after each execution — Don't let state leak between sandbox runs. Each execution should start fresh.
Monitor and alert on limits — Track when sandboxes hit resource limits. Repeated limit hits may indicate malicious activity.
Log all sandbox executions — Every code execution should be logged with input, output, and resource usage for audit.
