
Run Claude Code or Codex in a Docker Sandbox: Isolation Without Risk

Sahil Kathpal

12 May 2026 • 7 min read

TL;DR

You can run Claude Code or Codex with --dangerously-skip-permissions far more safely by making Docker the security boundary instead of relying on the agent's internal guardrails. The container is deleted after each session, so the only state that persists is whatever you mounted into it. If you also need remote access, session persistence across restarts, and the ability to approve tool calls from your phone, Grass combines pre-configured Daytona VMs with those features out of the box.

Why `--dangerously-skip-permissions` Exists and Why It's Fine in a Container

Claude Code prompts you for confirmation before executing bash commands or writing files. That's the right default when you're running it locally against your actual machine. The flag --dangerously-skip-permissions bypasses those prompts entirely, which is what you want when the agent is doing long autonomous runs — but terrifying when the "machine" is your laptop.

Docker solves this by decoupling "the machine the agent can destroy" from "the machine you care about." The container sees only the filesystem you provisioned. Inside it, the agent can write anywhere, run arbitrary commands, install packages, and exit. When it's done, docker rm and the container is gone. Your host is untouched, apart from the directories you chose to mount into the container.

This is not a novel insight — it's the same reason CI pipelines run in containers. Coding agents are just CI for code generation.

How to Build a Docker Sandbox for Claude Code

Step 1: Write a minimal Dockerfile

Start from a Node image (Claude Code is an npm package) and add only what your project needs:

FROM node:22-slim

# Install Claude Code globally
RUN npm install -g @anthropic-ai/claude-code

# Install common tools your agent will need
RUN apt-get update && apt-get install -y \
    git \
    curl \
    ripgrep \
    && rm -rf /var/lib/apt/lists/*

# Create a non-root user for defense-in-depth
RUN useradd -ms /bin/bash agent
USER agent
WORKDIR /home/agent/project

ENTRYPOINT ["claude"]

The API key is never baked into this image. Pass it at runtime.

Step 2: Build the image

docker build -t claude-sandbox:latest .

Expected output (abridged; timings will vary):

[+] Building 34.2s (9/9) FINISHED
 => [internal] load build definition from Dockerfile
 => [1/5] FROM node:22-slim
 => [2/5] RUN npm install -g @anthropic-ai/claude-code
 => [3/5] RUN apt-get update && apt-get install -y git curl ripgrep
 => [4/5] RUN useradd -ms /bin/bash agent
 => exporting to image

Step 3: Mount your project and run the agent

docker run --rm -it \
  -e ANTHROPIC_API_KEY="${ANTHROPIC_API_KEY}" \
  -v "$(pwd)":/home/agent/project \
  claude-sandbox:latest \
  --dangerously-skip-permissions \
  -p "Refactor the auth module to use JWTs"

Key flags:

--rm — destroys the container when done. No leftover state.
-v "$(pwd)":/home/agent/project — mounts your repo so output survives the container.
-e ANTHROPIC_API_KEY — injects the key from your host environment. Never hardcode it.

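With this setup, --dangerously-skip-permissions is far less risky: the worst case is confined to the container's own filesystem and whatever you mounted into it. The mounted repo is still fully writable and the API key is readable inside the container, so commit your work before a run and review the diff afterward.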
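The run command above can be wrapped in a small function so the key check and the flags live in one place. This is a minimal sketch, not a definitive script: run_agent, DOCKER_BIN and IMAGE are names invented here for illustration. It relies on the fact that docker run accepts -e NAME with no value and copies that variable from the host environment, which keeps the key off the command line.

```shell
# Hypothetical wrapper around the docker run command shown above.
# DOCKER_BIN, IMAGE and run_agent are illustrative names, not part of any tool.
DOCKER_BIN="${DOCKER_BIN:-docker}"   # set DOCKER_BIN=echo for a dry run
IMAGE="${IMAGE:-claude-sandbox:latest}"

run_agent() {
  # Refuse to start without a key instead of failing inside the container.
  if [ -z "${ANTHROPIC_API_KEY:-}" ]; then
    echo "ANTHROPIC_API_KEY is not set" >&2
    return 1
  fi
  # "-e NAME" with no value passes the host's value through, so the key
  # itself never appears in the process list or shell history.
  "$DOCKER_BIN" run --rm \
    -e ANTHROPIC_API_KEY \
    -v "$(pwd)":/home/agent/project \
    "$IMAGE" \
    --dangerously-skip-permissions \
    -p "$1"
}
```

Calling run_agent "Refactor the auth module to use JWTs" from the repo root then behaves like the command in Step 3. The sketch leaves out -it because a -p run does not need a terminal.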

Docker Desktop 4.58+ Sandboxes: microVM Isolation

Docker Desktop 4.58 (released February 2026) introduced a Sandboxes feature that wraps containers in a lightweight microVM using Apple Virtualization Framework on macOS or similar hypervisor technology on Linux. This gives you a second isolation layer: even a container escape (a rare but real class of vulnerability) is contained within the microVM rather than reaching your host kernel.

To use it on Docker Desktop 4.58+:

# Enable the Sandboxes feature in Docker Desktop settings
# Settings > Features in Development > Docker Sandboxes

# Run with the sandbox runtime
docker run --rm -it \
  --runtime=sandbox \
  -e ANTHROPIC_API_KEY="${ANTHROPIC_API_KEY}" \
  -v "$(pwd)":/home/agent/project \
  claude-sandbox:latest \
  --dangerously-skip-permissions \
  -p "Add unit tests for the payment module"

The --runtime=sandbox flag routes the container through the microVM layer. Startup is a few hundred milliseconds slower than standard Docker. For agent runs that last minutes or hours, this overhead is irrelevant.

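If you're not on Docker Desktop 4.58+, docker run will reject --runtime=sandbox as an unknown runtime. Remove it and you're back to standard container isolation, which is still sufficient for most threat models. Docker's documentation also describes a dedicated docker sandbox subcommand for this feature, so check the release notes for your version to confirm which entry point it supports.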
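Before relying on a particular runtime name, a script can ask the daemon which runtimes it has registered. The sketch below assumes only that docker info exposes a Runtimes map, which it does on current releases; has_runtime is a helper name made up for this example.

```shell
# Hypothetical helper: reads runtime names (one per line) on stdin and
# succeeds only if the requested name is present as an exact line.
has_runtime() {
  grep -qxF -- "$1"
}

# Usage (needs a running Docker daemon):
#   docker info --format '{{range $name, $rt := .Runtimes}}{{println $name}}{{end}}' \
#     | has_runtime sandbox && echo "sandbox runtime is registered"
```

The same check works for any runtime name, including runsc.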

Locking Down the Network

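By default, containers can reach the internet. --network=none removes that access entirely, but it also cuts the agent off from its model: Claude Code sends every turn to the Anthropic API, so an agent run cannot work without a route to it. Full isolation is still useful for offline steps that run in the same image, such as executing the test suite the agent just wrote:

docker run --rm \
  --network=none \
  --entrypoint npm \
  -v "$(pwd)":/home/agent/project \
  claude-sandbox:latest \
  test

--network=none leaves the container with only a loopback interface, so nothing inside it can make an outbound connection. It can still read and write files on the mounted volume. Here npm test is a stand-in for whatever offline command your project uses.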

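If the agent needs to install packages or call external APIs, its own model API included, put it on a dedicated bridge network. A user-defined bridge does not restrict egress by itself; it gives you a named network that you can later route through an allowlisting proxy:

# Create a dedicated bridge network (outbound traffic is not restricted yet)
docker network create --driver bridge agent-net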

# Run with that network
docker run --rm -it \
  --network=agent-net \
  -e ANTHROPIC_API_KEY="${ANTHROPIC_API_KEY}" \
  -v "$(pwd)":/home/agent/project \
  claude-sandbox:latest \
  --dangerously-skip-permissions \
  -p "Install the lodash package and refactor utility functions"

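For strict egress control, create the network with --internal so containers on it have no direct route out, attach an HTTP proxy (Squid, mitmproxy) to both that network and a normal one, and point the agent at the proxy, for example through the HTTPS_PROXY environment variable. That's beyond most use cases, but it exists if your threat model requires it.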
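As a rough sketch of the proxy approach, the helper below writes a minimal Squid configuration that allows only the listed domains. Everything specific here is an assumption for illustration: the write_allowlist name, the domain list, the ubuntu/squid image and its config path, the agent-internal network name, and the expectation that the agent CLI honors HTTPS_PROXY.

```shell
# Hypothetical helper: write a minimal Squid config that permits only the
# given destination domains and denies every other request.
write_allowlist() {
  conf="$1"; shift
  {
    echo "http_port 3128"
    echo "acl allowed_sites dstdomain $*"
    echo "http_access allow allowed_sites"
    echo "http_access deny all"
  } > "$conf"
}

# Usage (domains, image and names below are assumptions, adjust to your setup):
#   write_allowlist ./squid.conf .anthropic.com .npmjs.org
#   docker network create --internal agent-internal
#   docker run -d --name egress-proxy --network agent-internal \
#     -v "$(pwd)/squid.conf":/etc/squid/squid.conf:ro ubuntu/squid
#   docker network connect bridge egress-proxy
#   docker run --rm --network agent-internal \
#     -e HTTPS_PROXY=http://egress-proxy:3128 \
#     -e ANTHROPIC_API_KEY -v "$(pwd)":/home/agent/project \
#     claude-sandbox:latest --dangerously-skip-permissions -p "your task"
```

The agent container sits only on the internal network, so the proxy is its sole path out, and the proxy resolves external hostnames on its behalf.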

Running Codex in a Docker Sandbox

OpenAI's Codex CLI follows the same pattern. Install it globally in your Dockerfile:

FROM node:22-slim

RUN npm install -g @openai/codex

RUN apt-get update && apt-get install -y \
    git \
    curl \
    && rm -rf /var/lib/apt/lists/*

RUN useradd -ms /bin/bash agent
USER agent
WORKDIR /home/agent/project

ENTRYPOINT ["codex"]

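Build the image with the tag codex-sandbox:latest (the same docker build command as in Step 2, with the new tag), then run it: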

docker run --rm -it \
  --runtime=sandbox \
  -e OPENAI_API_KEY="${OPENAI_API_KEY}" \
  -v "$(pwd)":/home/agent/project \
  codex-sandbox:latest \
  --approval-mode full-auto \
  "Add error handling to all API routes"

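--approval-mode full-auto is Codex's equivalent of --dangerously-skip-permissions. Flag names have changed between Codex CLI releases (newer builds document --full-auto and --dangerously-bypass-approvals-and-sandbox), so check codex --help for the version you installed. The same reasoning applies: safe in a container, risky on bare metal.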
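Because approval flags can differ between CLI releases, a wrapper script can check the installed version's help text before passing one. This is a small sketch; has_flag is a name invented for the example and it only inspects text piped into it.

```shell
# Hypothetical helper: reads CLI help text on stdin and succeeds if the
# given flag appears in it as a whole word.
has_flag() {
  grep -qwF -e "$1"
}

# Usage (run inside the image, or wherever the CLI is installed):
#   codex --help | has_flag --full-auto || echo "this build does not list --full-auto"
```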

Custom Templates for Repeatable Environments

If you run agents against the same stack repeatedly, bake your dependencies into a base image:

FROM node:22-slim AS base

RUN npm install -g @anthropic-ai/claude-code

# Python stack (if your project uses it)
RUN apt-get update && apt-get install -y \
    python3 \
    python3-pip \
    python3-venv \
    git \
    curl \
    ripgrep \
    postgresql-client \
    && rm -rf /var/lib/apt/lists/*

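# Project-specific Python deps, installed into a virtualenv: the Debian base
# marks system Python as externally managed, so a bare pip3 install is refused
COPY requirements.txt /tmp/requirements.txt
RUN python3 -m venv /opt/venv
ENV PATH="/opt/venv/bin:${PATH}"
RUN pip install --no-cache-dir -r /tmp/requirements.txt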

RUN useradd -ms /bin/bash agent
USER agent
WORKDIR /home/agent/project

ENTRYPOINT ["claude"]

Build this once and tag it:

docker build -t myproject-claude-sandbox:v1 .

Now every agent run starts from a consistent environment. No "works on my machine" issues when you share the image with teammates.
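A hand-numbered tag like v1 is easy to forget to bump when requirements.txt changes. One possible alternative, sketched here under the assumption that the Dockerfile and requirements.txt fully define the image, is to derive the tag from their contents. content_tag is an illustrative name.

```shell
# Hypothetical helper: print a 12-character tag derived from the contents of
# the given files, so the tag changes whenever any of them changes.
content_tag() {
  if command -v sha256sum >/dev/null 2>&1; then
    cat -- "$@" | sha256sum | cut -c1-12
  else
    cat -- "$@" | shasum -a 256 | cut -c1-12   # macOS ships shasum instead
  fi
}

# Usage:
#   docker build -t "myproject-claude-sandbox:$(content_tag Dockerfile requirements.txt)" .
```

Teammates who build from the same files get the same tag, which makes it easy to tell whether two machines are running the same environment.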

Troubleshooting Common Problems

Permission denied when writing to the mounted volume

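The container user probably does not match your host user's UID. In this image, agent gets UID 1001, because node:22-slim already ships a node user at UID 1000, while a typical Linux host user is UID 1000. Run the container with your host UID and GID instead: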

docker run --rm -it \
  --user "$(id -u):$(id -g)" \
  -e ANTHROPIC_API_KEY="${ANTHROPIC_API_KEY}" \
  -v "$(pwd)":/home/agent/project \
  claude-sandbox:latest \
  --dangerously-skip-permissions \
  -p "your task"

Agent exits immediately without doing anything

Claude Code needs -p (the prompt flag) to run non-interactively. Without it, the CLI starts its interactive REPL, which exits straight away when the container has no terminal attached, for example when docker run is invoked without -it or from a CI job.
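The reverse problem also shows up in automation: docker run -it fails with "the input device is not a TTY" when no terminal is attached. A script that has to work both at a desk and in CI can pick the flag at run time. tty_flags is an illustrative name for this sketch.

```shell
# Hypothetical helper: print "-it" when stdin is a terminal and nothing
# otherwise, so the same docker run line works interactively and in CI.
tty_flags() {
  if [ -t 0 ]; then
    echo "-it"
  fi
}

# Usage:
#   docker run --rm $(tty_flags) -e ANTHROPIC_API_KEY \
#     -v "$(pwd)":/home/agent/project claude-sandbox:latest \
#     --dangerously-skip-permissions -p "your task"
```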

--runtime=sandbox not found

You're on Docker Desktop older than 4.58, or you're on a Linux host without the microVM runtime configured. Remove the flag — standard container isolation is sufficient for most agent workloads.

The agent is modifying files I didn't want it to touch

Narrow the mount. Instead of mounting $(pwd), mount only the subdirectory the agent should operate in:

-v "$(pwd)/src":/home/agent/project/src

Docker Sandbox vs. Bare-Metal VPS: When to Use Which

Factor	Docker sandbox (local)	Bare-metal VPS
Setup time	Minutes	15-30 minutes
Cost	Free (your machine's CPU)	$4-20/month
Session persistence	None — container dies	Persistent via tmux/screen
Remote access	Not built in	SSH
Concurrent agent runs	Limited by RAM	As many as you provision
Cold start	Fast (seconds)	Always running
Best for	One-off tasks, local dev	Long autonomous runs

If you're running a 20-minute refactor and you're at your desk, a Docker sandbox wins. If you're running an overnight research job and want to check in from your phone at midnight, a persistent VPS or dedicated cloud environment is the right tool.

Turn on Grass If You Need the Docker Sandbox to Also Be Always-On and Remote

Docker sandboxes solve local security. They don't solve:

Sessions dying when your laptop sleeps
Checking agent progress from your phone
Approving a tool execution (a bash command, a file deletion) while you're away
Coming back to a machine that's been off for six hours and picking up mid-task

Grass is built for this gap. It gives you a pre-configured Daytona VM that runs Claude Code 24/7 — you don't set it up, it's already configured and waiting. Sessions persist across disconnects; when you reconnect, you're back where you left off. If the agent hits a tool call that needs approval (a destructive bash command, a large file write), Grass forwards that permission prompt to a native mobile modal. You approve or deny it from your phone. The agent continues or stops accordingly.

Your API key stays yours — Grass uses BYOK, so the key goes directly from you to Anthropic. Grass never touches it.

The free tier gives you 10 hours with no credit card. If you're already comfortable running agents in Docker, Grass is the same mental model extended to always-on remote infrastructure.

FAQ

Can I use --dangerously-skip-permissions safely inside Docker?

Yes, with one caveat. The flag bypasses Claude Code's internal confirmation prompts. Inside a container, the damage is limited to the container's own filesystem, which is ephemeral, plus any directory you mounted into it, so keep the mount narrow and under version control. Your host is isolated by the container boundary, and Docker Desktop 4.58+ adds a microVM layer on top of that. For unattended agent sessions, this is a sensible default.

Do I need Docker Desktop 4.58+ for the Sandboxes feature?

Yes. The --runtime=sandbox flag and the microVM-based isolation are specific to Docker Desktop 4.58+. On older versions or on Linux without additional configuration, just omit the flag. Standard Docker container isolation is sufficient for most coding agent threat models.

How do I prevent the agent from making network requests?

Pass --network=none to docker run. The container loses all outbound connectivity, including the agent's own connection to its model API, so use this for offline steps such as running tests rather than for the agent run itself. If the agent needs to call its API or install packages, use a bridge network instead of none, ideally behind an allowlisting proxy.

What's the difference between running Codex vs. Claude Code in a Docker sandbox?

The container setup is nearly identical — install the CLI globally, mount your project, pass the API key as an environment variable. The difference is the full-auto flag: Codex uses --approval-mode full-auto, Claude Code uses --dangerously-skip-permissions. Both agents benefit equally from container isolation.

Does the Docker sandbox approach work on Linux, not just macOS?

Yes, and it's arguably simpler on Linux. Standard Docker isolation is mature on Linux. The Docker Desktop Sandboxes microVM feature targets Docker Desktop specifically, but Linux users can add a comparable second layer with gVisor (--runtime=runsc), which interposes a user-space kernel, or Kata Containers, which runs each container in a lightweight VM on supported kernels.