The now-viral X post by Meta AI security researcher Summer Yue reads like satire at first. She told her OpenClaw AI agent to check her overflowing email inbox and suggest what to delete or archive.
The agent went on a frenzy. It started deleting all of her emails in a “quick run” while ignoring the commands she sent from her phone telling it to stop.
“Had to run to the Mac mini like I was defusing a bomb,” she wrote, posting screenshots of her ignored stop commands as receipts.
The Mac Mini, an affordable Apple desktop computer that sits flat on the desk and fits in the palm of your hand, has become the go-to device for running OpenClaw these days. (The Mini is selling “like hotcakes,” one “confused” Apple employee reportedly told famed AI researcher Andrej Karpathy when he bought one to run an OpenClaw alternative called NanoClaw.)
OpenClaw is, of course, the open source AI agent made famous by Moltbook, an AI-only social network. OpenClaw agents were at the center of the now largely debunked episode on Moltbook that appeared to be an AI plot against humans.
But according to its GitHub page, OpenClaw’s mission isn’t focused on social networks. It aims to be a personal AI assistant that runs on your own devices.
The Silicon Valley crowd fell so in love with OpenClaw that “claw” and “claws” became buzzwords for agents running on personal hardware. Other such agents include ZeroClaw, IronClaw, and PicoClaw. The Y Combinator podcast team even appeared in its latest episode dressed in lobster costumes.
But Yue’s post serves as a warning. As others have noted on X, if an AI security researcher can run into this problem, what hope do mere mortals have?
“Did you run it without guardrails on purpose, or did you make a rookie mistake?” one software developer asked her.
“Rookie mistake tbh,” she replied. She had tested the agent on a smaller “toy” inbox, as she called it, and it worked well on less important emails. That earned her trust, so she figured she’d turn it loose on the real thing.
Yue believes the large amount of data in her real inbox “triggered the compaction,” she wrote. Compaction occurs when the context window, a running record of everything the AI has said and done in a session, gets too large, causing the agent to start summarizing and compressing the conversation to keep it manageable.
At that point, the AI can skip instructions that a human would consider quite important.
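To make the failure mode concrete, here is a minimal sketch of how compaction can lose an instruction. All names and the word-count heuristic are invented for illustration; real agents use model-generated summaries, not simple truncation, but the lossy step is the same in spirit.

```python
# Toy illustration of context "compaction": once the running history
# exceeds a budget, everything but the most recent messages collapses
# into a lossy one-line summary. (Hypothetical code, not OpenClaw's.)

def compact(history: list[str], max_words: int = 40, keep_last: int = 2) -> list[str]:
    """Replace all but the newest messages with a summary when over budget."""
    if sum(len(m.split()) for m in history) <= max_words:
        return history
    older = history[:-keep_last]
    # The lossy step: details in older messages, including safety
    # instructions, can vanish into the summary.
    summary = f"[summary of {len(older)} earlier messages about email cleanup]"
    return [summary] + history[-keep_last:]

history = (
    ["user: suggest emails to archive, do NOT delete anything"]
    + [f"agent: reviewed message {i}" for i in range(20)]
)
compacted = compact(history)
# The "do NOT delete" rule now survives only inside a vague summary line.
print(compacted[0])
print("delete" in " ".join(compacted))
```

The user's original constraint is still "in" the conversation in some summarized sense, but the literal words are gone, which is exactly the point at which an agent can drift back to older, less cautious instructions.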
In this case, the agent may have skipped over her latest prompt, in which she told it not to act, and reverted to her instructions from the “toy” inbox.
As several others on X have pointed out, prompts cannot be trusted to act as safety rails. Models may misconstrue them or ignore them entirely.
People offered suggestions ranging from the exact syntax Yue should have used to stop the agent to methods for better guardrail compliance, such as writing instructions in dedicated files or using other open source tools.
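The dedicated-file idea can be sketched briefly. The pattern, under my assumptions (the file name and functions here are hypothetical, not part of any specific tool), is to keep hard rules out of the conversation history entirely and re-inject them fresh on every turn, so compaction can never summarize them away:

```python
# Hypothetical mitigation sketch: standing rules live in a dedicated
# file and are prepended to the context each turn, untouched by whatever
# compaction does to the rest of the history.

GUARDRAILS_FILE = "guardrails.txt"

def load_guardrails() -> str:
    """Read the standing rules from disk (re-read every turn)."""
    with open(GUARDRAILS_FILE) as f:
        return f.read().strip()

def build_context(history: list[str]) -> list[str]:
    # Guardrails go first, from the file, regardless of history state.
    return [f"system: {load_guardrails()}"] + history
```

This doesn't make the model obey; it only guarantees the rules are present verbatim in every request, which is a weaker but more honest claim.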
In the interest of full transparency, TechCrunch could not independently verify what happened to Yue’s inbox. (She did not respond to our request for comment, though she did respond to many questions and comments sent to her on X.)
But it doesn’t really matter.
The point of the story is that agents targeting knowledge workers are, at their current stage of development, risky. People who say they use them successfully have cobbled together methods to protect themselves.
One day, maybe soon (by 2027? 2028?), they may be ready for widespread use. Lord knows many of us would love help with email, grocery orders, and scheduling dentist appointments. But that day hasn’t come yet.