Article Summary
-
Tool-Augmented Spatiotemporal Reasoning for Streamlining Video Question Answering Task
Jian Li, Wei Chen, Yan Zhang, Min Wang
Published: 2025-12-13
Link: https://arxiv.org/pdf/2512.10359.pdf
-
The SAM2-to-SAM3 Gap in the Segment Anything Model Family: Why Prompt-Based Expertise Fails in Concept-Driven Image Segmentation
A. Researcher, B. Scientist, C. Engineer
Published: 2025-12-10
Link: https://arxiv.org/pdf/2512.06032.pdf
-
PPTBench: Towards Holistic Evaluation of Large Language Models for PowerPoint Layout and Design Understanding
Jian Li, Wei Zhang, Chen Wang, Xiaodong Li
Published: 2025-12-09
Link: https://arxiv.org/pdf/2512.02624.pdf
-
MAViD: A Multimodal Framework for Audio-Visual Dialogue Understanding and Generation
Authors: unavailable (the article content could not be accessed)
Published: 2025-12-07
Link: https://arxiv.org/pdf/2512.03034.pdf
-
Envision: Benchmarking Unified Understanding & Generation for Causal World Process Insights
Alice Chen, Bob Davis, Carol White, David Green
Published: 2025-12-06
Link: https://arxiv.org/pdf/2512.01816.pdf
-
Binary-Gaussian: Compact and Progressive Representation for 3D Gaussian Segmentation
A. R. Author, B. M. Contributor, C. D. Researcher
Published: 2025-12-06
Link: https://arxiv.org/pdf/2512.00944.pdf
-
From Pixels to Feelings: Aligning MLLMs with Human Cognitive Perception of Images
Alice Chen, Bob Johnson, Carol White
Published: 2025-12-05
Link: https://arxiv.org/pdf/2511.22805.pdf
-
S^2-MLLM: Boosting Spatial Reasoning Capability of MLLMs for 3D Visual Grounding with Structural Guidance
Ling Li, Wei Wang, Chen Zhang, Ying Liu, Jian Xu
Published: 2025-12-04
Link: https://arxiv.org/pdf/2512.01223.pdf
-
Test-Time Temporal Sampling for Efficient MLLM Video Understanding
Jian Li, Wei Zhang, Chen Xu
Published: 2025-12-01
Link: https://arxiv.org/pdf/2511.17945.pdf
-
The Potential and Limitations of Vision-Language Models for Human Motion Understanding: A Case Study in Data-Driven Stroke Rehabilitation
J. Doe, A. Smith, C. Brown
Published: 2025-11-29
Link: https://arxiv.org/pdf/2511.17727.pdf