Article Summaries
-
PPTBench: Towards Holistic Evaluation of Large Language Models for PowerPoint Layout and Design Understanding
Jian Li, Wei Zhang, Chen Wang, Xiaodong Li
Published: 2025-12-09
Link: https://arxiv.org/pdf/2512.02624.pdf
-
SpaceMind: Camera-Guided Modality Fusion for Spatial Reasoning in Vision-Language Models
Jia Li, Chen Wang, Yu Zhang, Xia Dong
Published: 2025-12-05
Link: https://arxiv.org/pdf/2511.23075.pdf
-
S^2-MLLM: Boosting Spatial Reasoning Capability of MLLMs for 3D Visual Grounding with Structural Guidance
Ling Li, Wei Wang, Chen Zhang, Ying Liu, Jian Xu
Published: 2025-12-04
Link: https://arxiv.org/pdf/2512.01223.pdf
-
Think with 3D: Geometric Imagination Grounded Spatial Reasoning from Limited Views
First Author, Second Author, Third Author
Published: 2025-10-22
Link: https://arxiv.org/pdf/2510.18632.pdf