FaithLens: Detecting and Explaining Faithfulness Hallucination Paper • 2512.20182 • Published 4 days ago • 7
Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds Paper • 2511.08892 • Published Nov 12 • 201
A Goal Without a Plan Is Just a Wish: Efficient and Effective Global Planner Training for Long-Horizon Agent Tasks Paper • 2510.05608 • Published Oct 7 • 3
BIRD-INTERACT: Re-imagining Text-to-SQL Evaluation for Large Language Models via Lens of Dynamic Interactions Paper • 2510.05318 • Published Oct 6 • 21
SWE-SQL: Illuminating LLM Pathways to Solve User SQL Issues in Real-World Applications Paper • 2506.18951 • Published Jun 23 • 21
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning Paper • 2506.01939 • Published Jun 2 • 187
Teaching Large Language Models to Maintain Contextual Faithfulness via Synthetic Tasks and Reinforcement Learning Paper • 2505.16483 • Published May 22 • 10
Selecting Influential Samples for Long Context Alignment via Homologous Models' Guidance and Contextual Awareness Measurement Paper • 2410.15633 • Published Oct 21, 2024 • 7
SpokenWOZ: A Large-Scale Speech-Text Benchmark for Spoken Task-Oriented Dialogue Agents Paper • 2305.13040 • Published May 22, 2023 • 2
FLAME: Factuality-Aware Alignment for Large Language Models Paper • 2405.01525 • Published May 2, 2024 • 27