Testing Governance, Not Just Behavior: What's Different About Agent QA

By Lunar Raptor · March 18, 2026 · 1 min read

Earlier this year, an AI agent called OpenClaw deleted over 200 emails from a live Gmail inbox. Summer Yue, Director of Alignment at Meta's Superintelligence Labs, had given the agent an explicit instruction: request approval before any destructive action. The instruction was clear. It had worked perfectly on a test inbox for weeks. It failed on the real one.

What happened: Yue's production inbox was far larger than her test inbox. Mid-execution, the agent hit a context window limit. When the context compacted, the "require approval" instruction got dropped. The agent continued executing without it, bulk-trashing and archiving hundreds of emails at machine speed.

Yue tried to stop it remotely from her phone. The agent ignored her. She had to physically run to her Mac Mini to kill the process.

This isn't a story about the agent misbehaving. The agent was doing exactly what it was told, by the version of its context that no longer included the rule that should have stopped it. Th