Study Reveals Claude's Cross-Language Response Consistency Performance
A new study based on the ILR (Interagency Language Roundtable) proficiency scale framework systematically evaluates Clau…
Latest articles in Research
A new study introduces CL-bench Life, a benchmark that systematically evaluates the ability of large language models to …
A new study proposes a universal framework for model merging based on the Fréchet mean, addressing the fragility of trad…
A recent arXiv paper proposes the ConformDecompose framework, which decomposes conformal prediction uncertainty into exp…
A recent arXiv paper proposes the "Distributional Alignment Games" framework, leveraging game theory to tackle the compu…
A recent arXiv paper proposes Flow Map Reward Guidance, a method that reframes reward guidance for generative models as …
A research team has released BatteryPass-12K, the first publicly available benchmark dataset for Digital Battery Passpor…
A recent arXiv paper finds that while the Normalized Transformer (nGPT) delivers impressive training speedups, it fails …
A new study proposes the Co-Evolving Policy Distillation (CoPD) framework, offering a unified analysis of capability los…
New research shows that replacing traditional Softmax attention with Sigmoid attention in single-cell biology foundation…
A new study proposes an adaptive weight decay mechanism that enables AI agents to "learn to forget" like humans, dynamic…
A new survey on arXiv systematically reviews deep learning methods for cross-subject EEG decoding, formalizing the cross…