TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times Paper • 2512.16093 • Published 8 days ago • 50
DreaMontage: Arbitrary Frame-Guided One-Shot Video Generation Paper • 2512.21252 • Published 1 day ago • 24
StoryMem: Multi-shot Long Video Storytelling with Memory Paper • 2512.19539 • Published 3 days ago • 16
LoGoPlanner: Localization Grounded Navigation Policy with Metric-aware Visual Geometry Paper • 2512.19629 • Published 3 days ago • 21
An Anatomy of Vision-Language-Action Models: From Modules to Milestones and Challenges Paper • 2512.11362 • Published 14 days ago • 20
Seed-Prover 1.5: Mastering Undergraduate-Level Theorem Proving via Learning from Experience Paper • 2512.17260 • Published 7 days ago • 47
PhysBrain: Human Egocentric Data as a Bridge from Vision Language Models to Physical Intelligence Paper • 2512.16793 • Published 7 days ago • 71
Exploration v.s. Exploitation: Rethinking RLVR through Clipping, Entropy, and Spurious Reward Paper • 2512.16912 • Published 7 days ago • 10
The World is Your Canvas: Painting Promptable Events with Reference Images, Trajectories, and Text Paper • 2512.16924 • Published 7 days ago • 24
Next-Embedding Prediction Makes Strong Vision Learners Paper • 2512.16922 • Published 7 days ago • 78
LLaDA2.0: Scaling Up Diffusion Language Models to 100B Paper • 2512.15745 • Published 16 days ago • 76
Qwen-Image-Layered: Towards Inherent Editability via Layer Decomposition Paper • 2512.15603 • Published 8 days ago • 55
Nemotron-Cascade: Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models Paper • 2512.13607 • Published 10 days ago • 26
WorldPlay: Towards Long-Term Geometric Consistency for Real-Time Interactive World Modeling Paper • 2512.14614 • Published 9 days ago • 64
DrivePI: Spatial-aware 4D MLLM for Unified Autonomous Driving Understanding, Perception, Prediction and Planning Paper • 2512.12799 • Published 11 days ago • 10