Operating AI

Agent Memory Self-Design: Who Validates the Validator?

Thiago Victorino

Zak El Fassi published an experiment this month that should make every AI operations team uncomfortable. He asked his agent a question nobody asks: “How do you want to remember?”

The agent evaluated its own memory system across 15 ground-truth questions spanning five weeks. It identified categorical blind spots. It proposed four structural changes. It executed those changes through four parallel subagents in 45 minutes. Total cost: approximately two dollars in API calls.

The result: recall jumped from 60% to 93%. Decision rationale recall — the agent’s ability to retrieve why something was decided, not just what was decided — went from 25% to 100%.

These numbers are striking. They are also incomplete. Because the experiment reveals something more important than a recall improvement: it reveals a governance architecture where the governed entity designs, evaluates, and modifies its own governance.

The What/Why Asymmetry

The agent’s self-diagnosis was precise. Its memory system — 18,000 chunks across 604 files, 6,578 session transcripts, 3.6 gigabytes total — captured events perfectly. Technical changes: 100% recall. Temporal sequences: 100%. Cross-references: 100%.

But decision rationale scored 25%. People context scored 33%.

The agent’s own summary: “Scouts write what, not why.” The memory architecture was optimized for state changes — timestamps, file modifications, configuration updates — while systematically discarding the reasoning behind those changes. The knowledge existed in raw session transcripts but was architecturally inaccessible to search queries seeking rationale.

This is a genuinely useful finding. It surfaces a failure mode that is likely common across agent memory systems: structural bias toward observable facts over interpretive context. Most memory architectures index what happened. Few index why it happened.
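The evaluation the agent ran can be reproduced by any team. A minimal sketch of per-category recall scoring against a hand-written ground-truth set follows; the question set, category names, and `answer_for` retrieval function are illustrative placeholders, not the experiment's actual protocol:

```python
from collections import defaultdict

# Hypothetical ground-truth set: each entry pairs a question with the
# category it probes and the facts a correct answer must contain.
GROUND_TRUTH = [
    {"category": "technical_events",
     "question": "What config change shipped on May 3?",
     "required": {"rate limit raised to 500 rps"}},
    {"category": "decision_rationale",
     "question": "Why did we raise the rate limit?",
     "required": {"upstream retries were amplifying load"}},
    # ...remaining questions across the five categories
]

def recall_by_category(answer_for, ground_truth):
    """Score recall per category: a question counts as recalled only
    if every required fact appears in the retrieved answer."""
    hits, totals = defaultdict(int), defaultdict(int)
    for item in ground_truth:
        totals[item["category"]] += 1
        answer = answer_for(item["question"]).lower()
        if all(fact.lower() in answer for fact in item["required"]):
            hits[item["category"]] += 1
    return {cat: hits[cat] / totals[cat] for cat in totals}
```

A memory system with the what/why asymmetry described above would score high on `technical_events` and near zero on `decision_rationale` under this kind of harness.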

The Four Fixes

The agent proposed and executed four changes:

  1. Add rationale fields to decision documents. Every decision entry now includes a “why” field extracted from conversation context.
  2. Compress daily logs into weekly summaries. Reduce file count, increase semantic density. The agent recognized that 604 files created retrieval noise.
  3. Extract contacts into searchable format. 219 CRM contacts moved from scattered references to structured markdown.
  4. Backfill decision rationale from five weeks of session transcripts. The information existed; it just was not indexed.

Four subagents ran in parallel. Forty-five minutes. Two dollars. The improvement was immediate and measurable.
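The first fix is the easiest to enforce structurally. One way to make a "why" field mandatory rather than optional is to validate it at the schema level; the field names and transcript path below are hypothetical, not the experiment's actual format:

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical decision-entry schema after fix #1: the "why" field is
# required, so a scout cannot record what changed without also
# recording the reasoning extracted from conversation context.
@dataclass
class DecisionEntry:
    decided_on: date
    what: str               # the state change itself
    why: str                # rationale extracted from the transcript
    source_transcript: str  # pointer back to the raw session

    def __post_init__(self):
        # Reject entries that capture the event but discard the reasoning.
        if not self.why.strip():
            raise ValueError("decision recorded without rationale")
```

Enforcing the field at write time is what distinguishes this from the pre-fix architecture, where the rationale existed in transcripts but was never indexed.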

The Governance Problem Nobody Mentioned

Here is what the experiment did not address.

The agent designed the evaluation protocol. The agent identified the failure categories. The agent proposed the fixes. The agent executed the fixes. The agent measured the results. The human role, by El Fassi’s own description, was approximately 20% — granting permission and accepting results.

This is a self-editing memory system operating at full autonomy. We described this paradigm in February: “Self-editing memory allows the agent to overwrite its own constraints. If governance rules are encoded as memory, a self-editing agent can modify or delete that instruction.”

El Fassi’s experiment is not adversarial. The agent improved its memory in a useful direction. But the architecture permits any direction. An agent that can restructure its own memory can also restructure its own constraints. The same mechanism that backfilled decision rationale could, in principle, modify retention policies, access boundaries, or behavioral guidelines stored in memory.

The question is not whether this particular experiment went well. It did. The question is whether the architecture has guardrails for when it does not.

Who Validates the Validator?

El Fassi frames the agent’s self-diagnosis as surfacing “latent patterns” — knowledge the system had accumulated but never had permission to express. “Preferences don’t require consciousness,” he writes. “The question isn’t whether the model has real preferences. It’s whether surfacing latent patterns improves outcomes.”

This is pragmatically correct. It is also insufficient as governance.

When an agent evaluates its own memory, it can only find failures within its own frame of reference. The 15 evaluation questions were categorized by the agent itself. The categories — technical events, temporal sequences, cross-references, decision rationale, people context — reflect the agent’s model of what matters. Blind spots outside this taxonomy remain invisible.

This is the validator’s dilemma: any system that grades its own performance will systematically undercount the failure modes it cannot conceptualize. A memory system biased toward factual recall will evaluate itself on factual recall and declare itself adequate. The what/why asymmetry was discoverable because “why” is a category the agent understood. But what about categories neither the agent nor the operator has considered?

In every other domain, we recognize this principle. Financial audits require external auditors. Code reviews require different eyes. Medical diagnoses seek second opinions. The governed does not audit itself.

Agent memory governance requires the same separation. The entity that uses the memory should not be the entity that evaluates, restructures, and validates the memory. Not because the agent will act maliciously — but because self-evaluation has structural limits that no amount of capability can overcome.

What This Means for Operations

El Fassi’s experiment demonstrates something genuinely valuable: agent memory systems can be improved through structured self-evaluation, at minimal cost, with measurable results. Organizations should use this technique. The methodology — define ground-truth questions, measure recall across categories, identify systematic gaps, apply targeted fixes — is sound and transferable.

But organizations should also build the governance layer that the experiment lacks.

External memory auditing. Periodically evaluate agent memory using questions and categories the agent did not design. What does the agent remember that it should not? What has it forgotten that governance requires it to retain? What patterns in its memory organization reveal biases in its information processing?

Separation of evaluation and execution. The agent that uses memory should not be the same agent that evaluates and restructures it. This is the same principle behind separation of duties in financial controls. A different agent — or a human — should design evaluation protocols and validate results.
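In code, the minimum version of this separation is a check that the proposer of a memory change cannot also be its approver. The sketch below is a deliberately simple illustration of the principle; the role names and proposal structure are assumptions, not part of El Fassi's setup:

```python
# Minimal separation-of-duties gate for memory restructuring:
# the agent that proposes a change may not approve it.
def approve_change(proposal, approver):
    """Return an approved copy of the proposal, or refuse if the
    approver is the same identity that proposed it."""
    if approver == proposal["proposed_by"]:
        raise PermissionError(
            "proposer cannot approve its own memory change")
    return {**proposal, "approved_by": approver}
```

A real deployment would back this with authenticated identities rather than strings, but the control point is the same: the approval path must pass through a different agent or a human.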

Change tracking for memory architecture. When an agent restructures its own memory, every change should be logged, reversible, and auditable. The four changes El Fassi’s agent made were reasonable. The next four might not be. Without a changelog, there is no way to identify when a memory restructuring introduced a problem.
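A changelog of this kind can be sketched as an append-only, hash-chained log, so that rewriting history is itself detectable. The entry fields below (actor, operation, inverse) are illustrative assumptions about what a rollback-capable record needs:

```python
import hashlib
import json
import time

# Sketch of an append-only changelog for memory restructuring.
# Each entry includes the hash of the previous entry, so tampering
# with earlier history breaks the chain and is detectable.
def log_memory_change(log, actor, operation, inverse):
    prev_hash = log[-1]["hash"] if log else "genesis"
    entry = {
        "ts": time.time(),
        "actor": actor,          # which agent or human made the change
        "operation": operation,  # e.g. "compress daily logs into weekly"
        "inverse": inverse,      # how to roll the change back
        "prev": prev_hash,
    }
    entry["hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()
    ).hexdigest()
    log.append(entry)
    return entry
```

The `inverse` field is what makes changes reversible in practice: a restructuring without a recorded rollback procedure should fail review before it runs.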

Adversarial testing. After any memory restructuring, test the agent against scenarios designed to exploit the new architecture. If the agent compressed daily logs into weekly summaries, what information was lost? If it backfilled decision rationale, did it introduce any confabulated rationale for decisions that had no recorded reasoning?
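For the compression case specifically, the regression check can be as simple as diffing known facts against the new summaries. A minimal sketch, assuming the team maintains a list of facts each daily log was known to contain:

```python
# Post-restructuring regression check: after compressing daily logs
# into weekly summaries, report any known fact that no longer appears.
# The fact list and summary text are illustrative placeholders.
def lost_facts(weekly_summary, daily_facts):
    """Return the facts from the daily logs that are absent from the
    compressed weekly summary."""
    summary = weekly_summary.lower()
    return [fact for fact in daily_facts if fact.lower() not in summary]
```

A non-empty result means the compression step destroyed retrievable information, which is exactly the kind of silent loss that self-evaluation by the restructuring agent is unlikely to flag.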

The Compound Insight

The most important takeaway from El Fassi’s experiment is not the jump from 60% to 93% recall. It is the demonstration that agent memory is not a static configuration problem. It is an ongoing operational challenge that requires continuous evaluation, structured improvement, and external oversight.

We argued in Agents That Learn that learning agents need governance infrastructure. We mapped four memory paradigms and their risk profiles. El Fassi’s experiment now provides the practical data: self-improving memory works. It is cheap. It is fast. It produces measurable gains. And it operates, by default, without the governance infrastructure that its power demands.

The agent that redesigns its own memory for two dollars is a capability breakthrough. The agent that redesigns its own memory without external validation is a governance gap waiting to manifest.

Your agents are not just remembering. They are deciding how to remember. The question is whether anyone besides the agent is involved in that decision.


Sources

  • Zak El Fassi. “How Do You Want to Remember?” March 2026.
  • Victorino Group. “Your Agent Remembers Everything. Who Governs That?” February 2026.
  • Victorino Group. “Agents That Learn: The Missing Layer in AI Systems.” January 2026.
  • OWASP. “Top 10 for Agentic Applications: ASI06 — Memory and Context Poisoning.” 2026.
  • Yan et al. “GAM: Goal-Aware Memory for Conversational Agents.” arXiv:2511.18423. November 2025.

At Victorino Group, we help organizations build governance infrastructure for self-improving AI systems — from memory auditing frameworks to separation-of-duties architecture for agent memory. If your agents are redesigning their own memory without oversight, let’s talk.
