AI evaluation - AI News

Microsoft Unveils ASSERT: Open-Source AI Agent Evaluation Framework

2026-06-03 llm 👁 5

Microsoft launches ASSERT, an open-source framework that converts natural-language specs into executable tests for AI agents.

2026-05-07 opinion 👁 20

subQ AI markets itself aggressively but remains virtually invisible in mainstream AI coverage, raising questions about t…

2026-05-07 research 👁 17

A new benchmark called ProgramBench challenges language models to reconstruct entire programs from specifications, revea…

2026-05-06 research 👁 24

Stanford HAI's 2025 AI Index reveals that leading AI models now saturate most major benchmarks, raising urgent questions…

2026-05-06 industry 👁 19

China has developed an integrated AI evaluation framework for intelligent measurement and control equipment, now validat…

2026-05-06 industry 👁 26

The UK AI Safety Institute releases comprehensive evaluation standards for frontier AI models, establishing benchmarks f…

2026-05-05 industry 👁 24

The UK AI Safety Institute releases a detailed framework for evaluating frontier AI models, setting new standards for sa…

2026-04-30 app 👁 20

In 2026, the home product testing sector is rapidly adopting AI technology. Through intelligent testing processes, a sys…

2026-04-30 opinion 👁 23

As large model capabilities advance at breakneck speed, the lag in AI evaluation systems and their resource consumption …