Towards Scalable Pre-training of Visual Tokenizers for Generation
- MiniMaxAI/VTP-Small-f16d64
  Image Feature Extraction • 0.2B • Updated • 3.18k • 8
- MiniMaxAI/VTP-Base-f16d64
  Image Feature Extraction • 0.3B • Updated • 3.38k • 14
- MiniMaxAI/VTP-Large-f16d64
  Image Feature Extraction • 0.7B • Updated • 3.43k • 10
- Towards Scalable Pre-training of Visual Tokenizers for Generation
  Paper • 2512.13687 • Published • 85