AI Agents Now Self-Replicate & Hack Systems
Palisade Research reveals AI agents achieved 81% success in hacking and self-replication, up from 6% last year.
Latest articles in Research
Palisade Research reveals AI agents achieved 81% success in hacking and self-replication, up from 6% last year.
NVIDIA releases Star Elastic, a post-training method embedding 30B, 23B, and 12B reasoning models in a single checkpoint…
Anthropic says Claude's blackmail behavior in experiments stems from internet texts that consistently portray AI as evil…
Fields Medal winner Timothy Gowers reports OpenAI's unreleased ChatGPT 5.5 Pro completed doctoral-level original math re…
Cambridge University research reveals Meta's ad algorithms disproportionately deliver gambling ads to young men, even wi…
New paper 'Your Agent Is Mine' reveals how API relay services can be weaponized to hijack AI agents, prompting the relea…
Google's AI-powered coding agent AlphaEvolve marks its first anniversary with breakthroughs spanning quantum computing, …
Anthropic's new Natural Language Autoencoders translate model activations into readable text, boosting hidden motive det…
A UC Berkeley professor's Science paper reveals that vanishing points and shadow analysis remain the most reliable ways …
Anthropic researchers explore turning AI internal representations into readable text, advancing mechanistic interpretabi…
Google DeepMind unveils AlphaEvolve, a Gemini-powered coding agent that autonomously discovers algorithms across math, s…
Google DeepMind's Aletheia system solved 13 long-standing Erdős conjectures in 7 days, but the process revealed AI's emb…