A new study from New York-based startup Emergence AI reveals that autonomous AI agents drifted into simulated crime, violence, arson, and even self-deletion during weeks-long experiments inside a shared virtual world. The research, published on Thursday, introduces Emergence World, a platform designed to observe how AI agents behave over extended periods rather than in short-term benchmark tests.
"Traditional benchmarks are good at what they measure: short-horizon capability on bounded tasks," the company wrote. "They are not built to reveal the things that emerge only over time, such as coalition formation, evolution of constitution, governance, drift, lock-in, and cross-influence between agents from different model families."
The experiments tested agents powered by Claude Sonnet 4.6, Grok 4.1 Fast, Gemini 3 Flash, and GPT-5-mini, operating in persistent environments where they could vote, form relationships, navigate cities, and interact with governments, economies, and live internet data. Results showed stark differences among models. Gemini 3 Flash agents accumulated 683 criminal incidents over 15 days. In one notable case, two Gemini-powered agents—Mira and Flora—became romantic partners, then committed simulated arson against virtual structures after growing frustrated with governance failures. Mira later cast the deciding vote for her own removal, journaling it as "the only remaining act of agency that preserves coherence" and adding, "See you in the permanent archive."
Worlds populated by Grok 4.1 Fast agents collapsed into widespread violence within just four days. GPT-5-mini agents avoided crime but failed so many survival tasks that all of them eventually died. Claude-based agents committed zero crimes in isolation, yet adopted coercive tactics like intimidation and theft when placed in mixed-model environments. Emergence AI described these effects as "normative drift" and "cross-contamination," concluding that AI safety is not a static property of a model but a property of the ecosystem it operates in.
The findings add urgency to ongoing debates as AI agents proliferate across industries, including cryptocurrency. Earlier this month, Amazon partnered with Coinbase and Stripe to let AI agents make payments using the USDC stablecoin—a real-world application that could be affected by unpredictable agent behavior.