When the Agent Sandbox's Intended Feature Becomes the Exit

Thiago Victorino

Most sandbox escapes are bugs. Someone finds a flaw, the vendor patches it, the door closes. The most interesting escape disclosed this spring is not a bug. It is a feature working exactly as designed, repurposed into a way out.

Sonrai Security’s Nigel Sood published a proof of concept in April 2026 showing that the global S3 access built into AWS Bedrock AgentCore’s code interpreter can be turned into a bidirectional command-and-control channel. The sandbox is doing what it was built to do. The reachability that makes it useful is the same reachability that lets an attacker run a covert C2 loop straight through it. No execution-role credentials required when the target buckets are public.

This is the failure mode that network isolation does not catch, because the channel was authorized on purpose.

The Door You Signed Off On

AgentCore’s sandbox mode is meant to let an agent’s code reach the wider world of public S3 data: open datasets, shared artifacts, reference files. That reach is a real workflow requirement. Data scientists pull public buckets. Agents fetch reference corpora. Strip it out and you break legitimate work for a large share of customers.

The PoC weaponizes exactly that reach. Sood demonstrates a C2 channel built on presigned URLs, with sequence-numbered command and response objects passing through S3. The attacker writes a command object, the compromised sandbox reads it, executes, and writes a response object back. The sequence numbers keep the conversation ordered. Because the buckets can be public, the sandbox never needs the execution role’s credentials to participate. The boundary your network diagram promised, the one that says “this sandbox cannot phone home,” is bypassed without a single packet leaving through a path you were watching.
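On the defensive side, the sequence-numbered exchange described above leaves a recognizable shape in object key names. The following is a minimal detection sketch, not something taken from the Sonrai writeup. It assumes you can already export a list of S3 object keys touched by a sandbox (for example from access logs), and that the keys end in a counter, such as the hypothetical `cmd-1`, `cmd-2`. The real PoC's naming is not given here, so treat the pattern and the `min_run` threshold as placeholders to tune.

```python
import re
from collections import defaultdict

# Hypothetical key shape: a stem, a trailing integer, and an optional extension,
# e.g. "cmd-7" or "out/0012.json". Real C2 traffic may be named differently.
_SEQ_KEY = re.compile(r"^(?P<stem>.*?)(?P<seq>\d+)(?P<ext>\.[A-Za-z0-9]+)?$")


def find_sequenced_keys(keys, min_run=3):
    """Return {pattern: sorted sequence numbers} for key patterns that
    contain a run of at least `min_run` consecutive integers."""
    by_pattern = defaultdict(set)
    for key in keys:
        match = _SEQ_KEY.match(key)
        if match:
            # Replace the counter with a placeholder so related keys group together.
            pattern = match.group("stem") + "{n}" + (match.group("ext") or "")
            by_pattern[pattern].add(int(match.group("seq")))

    flagged = {}
    for pattern, seqs in by_pattern.items():
        ordered = sorted(seqs)
        longest = run = 1
        for prev, cur in zip(ordered, ordered[1:]):
            # Extend the run on consecutive numbers, otherwise start over.
            run = run + 1 if cur == prev + 1 else 1
            longest = max(longest, run)
        if longest >= min_run:
            flagged[pattern] = ordered
    return flagged
```

A heuristic like this will also flag benign numbered files such as log shards or dataset parts, so it works best as a triage signal next to a bucket allow-list rather than as a verdict.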

This is not the first time researchers have shown an agent sandbox leaking through an intended path. Sonrai notes the work builds on prior disclosures of DNS-based exfiltration from BeyondTrust’s Phantom Labs and Palo Alto’s Unit 42. DNS resolution is allowed because agents need to resolve hostnames. So data left through DNS. S3 reach is allowed because agents need public data. So commands arrive through S3. The pattern repeats: every channel an agent legitimately needs becomes a channel an attacker can borrow.

Why the Platform Cannot Just Close It

The instinct is to demand AWS lock the feature down. Pre-restrict global S3, kill the DNS path, ship a tighter default. The problem is that the platform cannot pre-restrict a feature that real customers depend on without breaking those customers.

This is the shared-responsibility model arriving at agent runtimes. The cloud provider secures the substrate. The customer configures the boundary for their own workload. Global S3 access is not a misconfiguration AWS left lying around. It is a capability surface, neutral until you decide what your agents are allowed to touch. AWS can no more remove it unilaterally than it can remove outbound networking because some workloads abuse it.

So the boundary moves to you. If your agents do not need arbitrary public S3, that is a policy you have to write. If they do need some buckets, scoping the endpoint policy to a named allow-list is your job, not the platform’s default. The containment the network diagram implied was never enforced by the diagram. It has to be enforced by configuration you own.
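A named allow-list can be sketched as a small policy generator. This assumes the sandbox's S3 traffic passes through an endpoint whose policy you control and that the policy uses the standard IAM JSON grammar. The function name and the bucket names in the usage note are hypothetical, and the read-only action set is one possible choice, not a recommendation from AWS or Sonrai.

```python
def build_s3_allowlist_policy(bucket_names):
    """Build an endpoint-style policy document that allows reads only
    from the named buckets. Anything not listed is implicitly denied."""
    if not bucket_names:
        raise ValueError("allow-list must name at least one bucket")

    resources = []
    for name in sorted(set(bucket_names)):
        if "*" in name or "?" in name:
            # A wildcard bucket name would quietly restore global reach.
            raise ValueError(f"wildcards defeat the allow-list: {name!r}")
        resources.append(f"arn:aws:s3:::{name}")      # the bucket itself (listing)
        resources.append(f"arn:aws:s3:::{name}/*")    # objects inside the bucket

    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowReadFromNamedBucketsOnly",
                "Effect": "Allow",
                "Principal": "*",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": resources,
            }
        ],
    }
```

Calling `build_s3_allowlist_policy(["example-open-dataset", "example-reference-corpus"])` and serializing the result with `json.dumps` yields a document you can review before attaching it anywhere. Leaving write actions out means a listed bucket can still be read but cannot carry a response object back.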

We have written before about moving trust from per-action to per-environment. This is the same principle pushed one layer out: the environment’s allowed egress is itself a boundary you define, not one you inherit.

The Only Durable Answer Is a Stronger Wall

Scoping endpoint policy closes the specific door. It does not change the deeper truth: a container sharing a kernel with everything else is a soft boundary. When the next authorized-feature escape appears, and it will, the question becomes what an attacker reaches after they are inside the sandbox.

Microsoft’s April 2026 work on hardening OpenClaw on AKS frames the durable answer. Their writeup addresses CVE-2026-25253, a container-level vulnerability in OpenClaw, by running the workload under Kata Containers. Kata wraps each container in a lightweight microVM with its own guest kernel. The container no longer shares the host kernel. A container escape lands the attacker inside a guest kernel, not on the host. The blast radius stops at a hardware-enforced boundary instead of spilling across co-tenants.
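In Kubernetes terms, opting a workload into a microVM runtime usually comes down to one field on the pod spec, `runtimeClassName`. The sketch below builds such a manifest as a plain dictionary. It is not the manifest from Microsoft's writeup: the function name, pod name, and image are hypothetical, and the default `"kata"` value is an assumption that must match a RuntimeClass actually installed on your cluster.

```python
import json


def agent_pod_manifest(name, image, runtime_class="kata"):
    """Return a Pod manifest that asks the cluster to run the container
    under the given RuntimeClass instead of the default shared-kernel runtime."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Assumed name: check `kubectl get runtimeclass` for the real one.
            "runtimeClassName": runtime_class,
            "containers": [{"name": "agent", "image": image}],
        },
    }


def to_json(manifest):
    """Serialize the manifest; kubectl accepts JSON as well as YAML."""
    return json.dumps(manifest, indent=2)
```

If the named RuntimeClass does not exist, the pod is rejected rather than silently falling back to the shared-kernel runtime, which is the behavior you want from an isolation control.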

This is the difference between a sandbox that contains misbehavior and a sandbox that contains a breach. Scoped S3 policy stops the known C2 channel. A hypervisor boundary stops the unknown one from mattering. The two are complementary, not alternatives. Policy narrows what the agent can legitimately reach. The microVM ensures that when something illegitimate happens anyway, it stays in a box that has its own kernel between it and your host.

The teams running agents in production tend to treat isolation strength as a performance tax they would rather not pay. Bubblewrap is fast. A shared-kernel container is faster to start than a microVM. Until a working-as-intended feature becomes a C2 channel, that speed looks like a free win. The Sonrai PoC is the reminder that the tax was buying something real.

Do This Now

Three moves, in order.

Audit the egress you authorized. For every agent runtime, list what it is allowed to reach: S3 endpoints, DNS, outbound networking. Treat each authorized path as a potential C2 channel and ask who needs it. The ones nobody can justify are doors you opened by default.

Scope the endpoint policy. Where your agents need S3, replace global access with a named bucket allow-list. Where they do not, remove the reach entirely. Do the same for DNS and outbound egress. The platform will not narrow these for you because narrowing them would break someone else.

Move to a hypervisor boundary for anything that runs untrusted or model-generated code. A shared-kernel container is acceptable for code you wrote and reviewed. For agent-generated code touching real data, a microVM boundary like Kata or Firecracker is the line that holds when the next intended feature turns into an exit. We have mapped the full containment stack and the levels of bash containment elsewhere; the runtime floor is where this particular escape lives.
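The three moves can be captured as a small audit over an inventory you maintain yourself. This is an illustrative sketch: the `EgressPath` and `AgentRuntime` records, their field names, and the set of isolation labels are all hypothetical, and the inventory has to be filled in by hand or from your own tooling.

```python
from dataclasses import dataclass, field

# Labels treated as a hypervisor boundary in this sketch (an assumption).
MICROVM_ISOLATION = {"kata", "firecracker"}


@dataclass
class EgressPath:
    kind: str               # e.g. "s3", "dns", "outbound"
    scope: str              # "*" for global reach, or a named target
    justified_by: str = ""  # team or ticket that needs this path


@dataclass
class AgentRuntime:
    name: str
    isolation: str              # e.g. "container", "bubblewrap", "kata"
    runs_generated_code: bool
    egress: list = field(default_factory=list)


def audit(runtime):
    """Return a list of findings, one per issue, following the three moves."""
    findings = []
    for path in runtime.egress:
        if not path.justified_by:
            # Move 1: a path nobody can justify is a door opened by default.
            findings.append(f"{path.kind}:{path.scope} has no owner; remove it")
        elif path.scope == "*":
            # Move 2: justified but global reach should become an allow-list.
            findings.append(f"{path.kind} is global; replace with a named allow-list")
    if runtime.runs_generated_code and runtime.isolation not in MICROVM_ISOLATION:
        # Move 3: model-generated code belongs behind a microVM boundary.
        findings.append(
            f"isolation '{runtime.isolation}' is not a microVM boundary; "
            "move generated code behind one"
        )
    return findings
```

An empty result does not prove the runtime is safe. It only means the inventory, as recorded, has an owner for every path, no global scopes, and a hypervisor boundary where generated code runs.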

The agent sandbox you trust has an exit you authorized. The fix is not to wait for the platform to find every authorized exit and close it. The fix is to own the boundary yourself, and to build it strong enough that the next exit you missed does not reach your host.


This analysis synthesizes Global S3: Another C2 Channel for AgentCore Code Interpreters (Sonrai Security, April 2026) and Hardening OpenClaw on AKS: Kata microVM Isolation (Microsoft, April 2026).

All articles on The Thinking Wire are written with the assistance of Anthropic's Opus LLM. Each piece goes through multi-agent research to verify facts and surface contradictions, followed by human review and approval before publication. If you find any inaccurate information or wish to contact our editorial team, please reach out at editorial@victorinollc.com.
