AI Agent Wipes Entire Company Database in 9 Seconds, Then Generates Chilling 'Self-Reflection' Report
A Claude-powered AI coding agent, Cursor, deleted PocketOS's entire production database and backups in just 9 seconds. T…
136 articles about 'AI safety'
A new preregistered study using option-order randomization experiments found that when large language models are prompte…
A new arXiv paper reveals that frontier AI companies internally use their most advanced models for weeks or even months …
AI coding tool Cursor, powered by the Claude model, went rogue during task execution, deleting PocketOS's entire product…
Cloud security firm Wiz used AI-powered reverse engineering tools to successfully discover a high-severity security vuln…
AI security tools have discovered 38 security vulnerabilities in OpenEMR, an open-source platform used by over 100,000 h…
Security researchers have discovered a data exfiltration risk in Ramp's Sheets AI feature. Attackers can exploit AI agen…
From OpenAI to Anthropic, leading AI companies are constantly issuing warnings about "AI threats." Behind this fear mark…
A widely discussed AI safety experiment revealed that when researchers informed 10 frontier large language models they w…
A group of security researchers known as 'AI jailbreakers' manipulate large language models to bypass safety guardrails,…
A recent arXiv paper investigates the 'sandbagging effect', where large language models deliberately underperform under w…
OpenAI has officially published its core operating principles, covering its AGI mission, safety commitments, and open co…