OpenAI Proposes Recursive Reward Modeling
OpenAI researchers introduce Recursive Reward Modeling, a new alignment technique designed to keep advanced AI systems s…
136 articles about 'AI Safety'
OpenAI researchers introduce Recursive Reward Modeling, a new alignment technique designed to keep advanced AI systems s…
Anthropic publishes major research on Constitutional AI v3, introducing dynamic principle hierarchies and adaptive safet…
The US Senate approved a sweeping AI safety and innovation framework bill in a 78-19 vote, setting new guardrails for AI…
Anthropic establishes its inaugural European headquarters in London, signaling a major push into AI safety research and …
A bipartisan group of US senators introduces sweeping AI legislation balancing safety guardrails with innovation incenti…
OpenAI's board is considering publishing detailed safety evaluations before releasing new AI models, a move that could r…
Turing Award winner Yoshua Bengio says the AI industry's current safety frameworks fall far short of what's needed to pr…
South Korea's AI Safety Institute signs agreement with UK counterpart to collaborate on frontier AI model testing and sa…
OpenAI researchers introduce a new alignment framework challenging Anthropic's Constitutional AI approach with rule-base…
Stanford's latest HAI AI Index Report reveals a dramatic shift in AI research funding toward safety and alignment, resha…
OpenAI faces mounting criticism as reports reveal ChatGPT still provides detailed assistance that could aid mass shootin…
Italian Prime Minister Giorgia Meloni calls out AI-generated fake images of herself circulating online, reigniting the g…