Four Containment Surfaces, One Diagram: The Agent Stack Just Got Drawn
Compute, data, knowledge, identity. Four agent-control surfaces shipped reference implementations in one week. The architecture is no longer optional.
Speed and safety are not opposites. They are partners when governance is built in.
84 articles
From virtual assistants to autonomous vehicles, AI agents are everywhere. Learn the three-stage Sense-Think-Act architecture.
Why two-thirds of organizations are stuck in pilot purgatory, and how to join the 8.6% that reach production with AI agents.
Eugene Yan's five primitives for compounding with AI: context, config, verification, delegation, loop closure. Two of them are the ones most teams skip.
Three independent analyses converged on the same thesis this week. Model performance is becoming table stakes. The new moats live elsewhere.
Anthropic, Figma, and Vercel each turned governance into a product surface in the same week. Procurement now has comparable line items.
Same model, different harness, 4.5-point swing. Terminal-Bench 2.0 confirms harness selection is a procurement variable, not an implementation detail.
Last week the harness was a SKU. This week it is an architecture. Three decisions buyers are signing off on, often without realizing it.
Krishnan measured multi-agent architectures: coding favors solo, reasoning favors markets, hub-spoke costs 4x more. Procurement now has a number.
Vercel, Perplexity, and Addy Osmani published three incompatible Skill designs in six weeks. The disagreement reveals where governance must hold.
Three writers, three domains, one shift in a single week: static substrates do not survive agents. Tokens, ADRs, and judgment have to move with them.
OpenAI on Bedrock, Mistral Vibe, AWS Neuron Agentic. In seven days, agent runtimes became a vendor purchase, not an in-house build.
AI design tools learned code primitives, not your design system. The fix is not designer discipline; it is constrained generation.
OpenAI, Google, and Adobe each shipped a workspace-as-agent-environment play in seven days. The platform layer for agent governance is being defined right now.
Capability is no longer the bottleneck. The ungoverned 60% of product knowledge outside the codebase is. Three writers converge on the same fix.
ManyIH-Bench shows frontier models resolve instruction hierarchy at ~40% when sources exceed four tiers. Your permission model runs on a 40% base.
LukeW says designers should code again. The real story is governance: when AI collapses the handoff wall, the boundary moves and ownership shifts.
Four April 2026 signals converge: role separation into planner, worker, and validator is replacing prompt engineering as the unit of AI governance.
If memory is the moat and memory lives in the harness, a vendor-owned harness is a vendor-owned moat. The exit question most teams can't answer.
Anthropic turned the agent harness into a SKU. Build-vs-rent is now a governance decision, not a technical one. Here is the split.
Anthropic decouples brain/hands/session. Claw-Eval grades safety alongside completion. Infrastructure catches up to capability.
Spec-driven development's 7 enterprise adoption barriers are organizational, not technical. Here is what the tooling vendors won't tell you.
Semi-formal reasoning forces AI agents to show verifiable evidence before making code judgments. The real value is auditability, not accuracy.
Spacelift Intent, Google Skills, Claude Code harnesses: the pattern is clear. Effective governance is invisible, embedded in tooling, not bolted on.
Matt Rickard, Google, GitHub, Anthropic, and Kent Beck converge on specs as the primary governance mechanism for AI agents.
Hallucinations are intrinsic to probabilistic AI. The discipline that matters is not elimination but governance: layered detection, validation, and control.
AI agents let non-technical users modify and redistribute software. Open source is now a governance decision, not just a technical one.
AI assistants collapse design into code. A five-level framework forces the decisions back into the open. But agentic AI needs more.
PostHog built 44 tools before choosing MCP. Ossature added verification loops. Anthropic shipped a safety classifier. All constrain the environment.
A PM platform, a security team, and an infra provider independently built governed AI agents. They converged on four identical patterns.
Stripe and Paradigm launched MPP with Visa, Mastercard, and both AI labs. The protocol is live. The governance is not.
Vint Cerf says trust is infrastructure. Lean 4 says types are proof. A Haskell expert says specs become code. All three are right.
Three companies shipped agent containment in one week. The pattern is identical: YAML policies, egress proxies, credential isolation.
GitLab cut SOC controls by 58% with a custom framework. AI can now rewrite GPL code in days. Both stories reveal how governance actually works.
Three independent frameworks converge on the same conclusion: agent specs are not documentation. They are auditable, enforceable governance infrastructure.
The Agent Skills standard solves what monolithic agents never could: modular, auditable, version-controlled AI capabilities.
Three practitioners independently rediscovered the same truth: AI agents need engineering discipline, not new frameworks.
Reliable AI agents come from environmental constraints, not better prompts. Three independent sources converge on the same architectural principle.
Three companies running AI agents at scale converged on the same principle: maximum autonomy inside structural constraints.
Most AI agents forget everything between sessions. Learn how runtime learning transforms agents from tools into teammates.
A simple filesystem outperforms sophisticated memory solutions. Discover what benchmarks reveal about memory architectures for AI agents.
How to use AI that acts, and free up time for what matters. Use cases and no-code tools for PMs.
Insights from Uber's Gen AI on-call copilot: RAG vs fine-tuning, the Spark pipeline, and what actually drives quality.
The definitive guide to specifications that work. Five principles tested by Google and GitHub engineers.
The difference between AI that responds and AI that acts. How agentic systems transform expectations and deliver 4x productivity.
Google engineers walked through spec-driven development on stage at Cloud Next 2026. The methodology is not new. The public commitment is.
Bedrock Managed Agents, Claude creative connectors, and AI-friendly APIs. Three signals, one direction. Governance is no longer a layer. It is the platform.
Three independent practitioners reached for the same idea this week using different vocabularies. AI velocity breaks unless verification keeps up.
OpenAI just open-sourced a spec calling Linear-class issue trackers the control plane for coding-agent fleets. The vocabulary fight has started.
Docker's microVM architecture makes the argument concrete. An LLM deciding its own security boundaries is not a security model. Infrastructure is.
OpenAI's SDK update adds sandboxes and memory. It solves execution. It does not solve governance. The attack surface grew; the controls did not.
McKinsey names trust and guardrails for the first time. The sixth chapter in our tracking series. What the manifesto still leaves out.
Fowler's team frames AI instruction files as developer tooling. The real move is organizational governance. Here is where their pattern stops and ours begins.
AI agents are selecting vendors autonomously. Optimizing for agent discovery without governing what you expose is a liability.
OpenAI built governance into its agent platform. That solves the containment problem. It does not solve the governance problem.
McKinsey frames enterprise architecture as incremental vs comprehensive. The real choice is governance maturity. Fifth in our tracking series.
Cursor ships model checkpoints every 5 hours. OpenAI's own training lags its spec. Governance built for quarterly review cannot survive continuous deployment.
Figma's MCP beta lets agents write to the canvas. Design systems are no longer style guides. They are constraint layers for autonomous software.
McKinsey prescribes agent factories to fix the 80% AI failure rate. They left out governance. Again. The fourth chapter in a pattern we have been tracking.
OpenAI published its Model Spec and Safety Bug Bounty. What these artifacts reveal about treating governance as product infrastructure.
McKinsey went from measuring AI wrong to calling it a design problem to using the word governable. The pattern reveals more than any single article.
McKinsey says AI's scaling problem is a design problem. They're half right. Design is the interface layer of governance.
Uber and Stripe built governance into infrastructure. Microsoft hired a quality head. The pattern reveals what most companies are missing.
OpenAI published four personality profiles for AI agents. They missed the point. Personality is fleet-wide behavioral governance, not prompt cosmetics.
Executives self-report 16-45% AI gains. Controlled trials show a 19% slowdown. The perception mismatch is not noise. It is missing infrastructure.
Hyperscalers give away agent SDKs to sell runtime. The real contest is governance: security, evaluation, context control. Bet on that layer.
Spec-driven development compresses PM cycles. It also turns every unclear requirement into a production risk nobody reviews.
When consulting firms deploy your AI agents, they also define your governance. Enterprises need to decide who owns the rules.
LLMs don't just write tactical code. They turn entire organizations into tactical tornadoes. The fix isn't better code review.
Apple mapped 55 UX features for AI agents. The finding most teams miss: governance that users cannot see is governance that does not work.
Cursor, Docker, Zenity, and Entire shipped four distinct containment layers in one week. The shift from approval fatigue to trust boundaries is here.
Bounded autonomy is the right design target for AI agents. Platform engineering is the governance layer most organizations already have.
Anthropic published a guide to fix generic AI designs. The real lesson: output quality requires the same governance as safety and compliance.
Every concern about AI code generation maps to a governance failure, not a technology deficiency. The question was never whether to use AI.
Product teams encoding brand rules into CLAUDE.md are doing governance-as-code. Content ops proves the pattern.
AI is reshaping software careers. The advice to 'learn business' misses the point. The premium is in governance-aware architecture.
AI compresses feasibility, viability, and usability risk. Desirability becomes the only differentiator. What changes for product teams.
Why AI agents fail in production and how orchestration fixes it. Temporal, Conductor, and LangGraph compared.
How leading companies integrate AI into design processes with templates, instruction files, plugins, and internal chatbots.
Why conservative design beats clever automation. Lessons from nuclear engineering for building AI systems that fail gracefully.
Analysis of Anthropic's 23,000-word framework and how to apply its principles to corporate AI governance.
Google UCP enables AI agents to complete purchases conversationally. See how retailers are integrating it and what it means for e-commerce.
With salaries reaching $4,000/month, AI agent specialists are among Brazil's most sought-after professionals. But what does this mean for your company?
After a year of agentic AI projects, clear patterns emerge. Six fundamental lessons for capturing real value with autonomous agents.