SAGE: Training Smart Any-Horizon Agents for Long Video Reasoning with Reinforcement Learning Paper • 2512.13874 • Published 3 days ago • 14
DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research Paper • 2511.19399 • Published 24 days ago • 58
RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments Paper • 2511.07317 • Published Nov 10 • 13
Text or Pixels? It Takes Half: On the Token Efficiency of Visual Text Inputs in Multimodal LLMs Paper • 2510.18279 • Published Oct 21 • 4
CulturalTeaming: AI-Assisted Interactive Red-Teaming for Challenging LLMs' (Lack of) Multicultural Knowledge Paper • 2404.06664 • Published Apr 10, 2024 • 1
CulturalBench: a Robust, Diverse and Challenging Benchmark on Measuring the (Lack of) Cultural Knowledge of LLMs Paper • 2410.02677 • Published Oct 3, 2024 • 1
Aligning LLMs to Ask Good Questions: A Case Study in Clinical Reasoning Paper • 2502.14860 • Published Feb 20
MediQ: Question-Asking LLMs and a Benchmark for Reliable Interactive Clinical Reasoning Paper • 2406.00922 • Published Jun 3, 2024
PrefPalette: Personalized Preference Modeling with Latent Attributes Paper • 2507.13541 • Published Jul 17 • 8
Medical Hallucinations in Foundation Models and Their Impact on Healthcare Paper • 2503.05777 • Published Feb 26