Article Summary
-
TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models
First Author, Second Author, Third Author
Published: 2025-12-05
Link: https://arxiv.org/pdf/2512.02014.pdf
-
Architecture Decoupling Is Not All You Need For Unified Multimodal Model
A. Research, B. Scientist, C. Innovator
Published: 2025-12-01
Link: https://arxiv.org/pdf/2511.22663.pdf
-
While recognizing actions, LMMs struggle to detect core interaction events
Anonymous Author 1, Anonymous Author 2
Published: 2025-11-28
Link: https://arxiv.org/pdf/2511.20162.pdf
-
DeepEyesV2: Toward Agentic Multimodal Model
J. Doe, A. Smith, B. Lee
Published: 2025-11-15
Link: https://arxiv.org/pdf/2511.05271.pdf
-
QG-CoC: Question-Guided Chain-of-Captions for Large Multimodal Models
Alice Chen, Bob Davis, Carol Evans
Published: 2025-11-09
Link: https://arxiv.org/pdf/2511.03206.pdf
-
Emu3.5: Native Multimodal Models are World Learners
A. Researcher, B. Developer, C. Engineer
Published: 2025-11-06
Link: https://arxiv.org/pdf/2510.26583.pdf