Article Summary
-
Architecture Decoupling Is Not All You Need For Unified Multimodal Model
J. Doe, A. Smith, B. Johnson
Published: 2025-12-05
Link: https://arxiv.org/pdf/2511.22663.pdf
-
Multimodal Learning with Augmentation Techniques for Natural Disaster Assessment
A. B. Sharma, C. D. Lee, E. F. Kim
Published: 2025-11-06
Link: https://arxiv.org/pdf/2511.00004.pdf
-
SEPS: Semantic-enhanced Patch Slimming Framework for fine-grained cross-modal alignment
Jing Li, Wei Chen, Xiao Wang
Published: 2025-11-06
Link: https://arxiv.org/pdf/2511.01390.pdf
-
Towards Universal Video Retrieval: Generalizing Video Embedding via Synthesized Multimodal Pyramid Curriculum
Authors not available
Published: 2025-11-03
Link: https://arxiv.org/pdf/2510.27571.pdf
-
AdSum: Two-stream Audio-visual Summarization for Automated Video Advertisement Clipping
Jian Li, Wei Chen, Qian Wang
Published: 2025-11-02
Link: https://arxiv.org/pdf/2510.26569.pdf
-
Detecting Latin in Historical Books with Large Language Models: A Multimodal Benchmark
Sophia Lee, David Chen, Maria Rodriguez
Published: 2025-10-28
Link: https://arxiv.org/pdf/2510.19585.pdf
-
On the Provable Importance of Gradients for Language-Assisted Image Clustering
Alice L. Chen, Benjamin R. Kim, Carla S. Davis
Published: 2025-10-26
Link: https://arxiv.org/pdf/2510.16335.pdf
-
Better Together: Leveraging Unpaired Multimodal Data for Stronger Unimodal Models
A. B. Researcher, C. D. Scientist, E. F. Innovator
Published: 2025-10-16
Link: https://arxiv.org/pdf/2510.08492.pdf