Xiangpeng Yang
PRO
XiangpengYang
17 followers · 24 following
https://xiangpengyang.github.io/
Ayden_Yang_
knightyxp
xiangpeng-yang-a422851b2
AI & ML interests
diffusion models, video generation, video editing
Recent Activity
Replied to their post 11 days ago
Introducing VideoCoF: Unified Video Editing with a Temporal Reasoner (Chain-of-Frames)!

We're excited to introduce VideoCoF, a unified framework for instruction-based video editing that enables temporal reasoning and ~4× video length extrapolation, trained with only 50k video pairs.

What makes VideoCoF different?

Chain-of-Frames reasoning: mimics the human thinking process (Seeing → Reasoning → Editing) to apply edits accurately over time without external masks, ensuring physically plausible results.

Strong length generalization: trained on 33-frame clips, yet supports multi-shot editing and long-video extrapolation (~4×).

Unified fine-grained editing: Object Removal, Addition, Swap, and Local Style Transfer, with instance-level and part-level, spatially aware control.

Fast inference update: on an H100, ~20 s per video with 4-step inference, making high-quality video editing far more practical for real-world use.

Links
Paper: https://arxiv.org/abs/2512.07469
Code: https://github.com/knightyxp/VideoCoF
Demo: https://huggingface.co/spaces/XiangpengYang/VideoCoF
Models: https://huggingface.co/XiangpengYang/VideoCoF
Project Page: https://videocof.github.io/

#VideoEditing #DiffusionModels #GenerativeAI #ComputerVision #AI
Updated a Space 11 days ago: XiangpengYang/VideoCoF
Organizations
XiangpengYang's datasets (2)
XiangpengYang/Video-InContext-Editing • Viewer • Updated Jun 25 • 12 • 141
XiangpengYang/VideoGrain-dataset • Viewer • Updated Mar 15 • 2.83k • 1.31k • 1