Article Summary
-
Binary-Gaussian: Compact and Progressive Representation for 3D Gaussian Segmentation
A. R. Author, B. M. Contributor, C. D. Researcher
Published: 2025-12-06
Link: https://arxiv.org/pdf/2512.00944.pdf
-
S$^2$-MLLM: Boosting Spatial Reasoning Capability of MLLMs for 3D Visual Grounding with Structural Guidance
Ling Li, Wei Wang, Chen Zhang, Ying Liu, Jian Xu
Published: 2025-12-04
Link: https://arxiv.org/pdf/2512.01223.pdf
-
ArtiWorld: LLM-Driven Articulation of 3D Objects in Scenes
Alice Chen, Bob Davis, Carla Evans
Published: 2025-11-24
Link: https://arxiv.org/pdf/2511.12977.pdf
-
LLaVA$^3$: Representing 3D Scenes like a Cubist Painter to Boost 3D Scene Understanding of VLMs
Ava Chen, Leo Kim, Maya Singh
Published: 2025-11-22
Link: https://arxiv.org/pdf/2511.16454.pdf
-
Vision-Language Integration for Zero-Shot Scene Understanding in Real-World Environments
Jian Li, Wei Chen, Sara Khan, David Kim
Published: 2025-11-04
Link: https://arxiv.org/pdf/2510.25070.pdf
-
PhysVLM-AVR: Active Visual Reasoning for Multimodal Large Language Models in Physical Environments
Jian Li, Wei Chen, Sarah Miller, David G. Thompson
Published: 2025-11-03
Link: https://arxiv.org/pdf/2510.21111.pdf
-
PlanarGS: High-Fidelity Indoor 3D Gaussian Splatting Guided by Vision-Language Planar Priors
Jian Li, Wei Chen, Meng Wang, Xin Yu
Published: 2025-10-30
Link: https://arxiv.org/pdf/2510.23930.pdf
-
Structured Interfaces for Automated Reasoning with 3D Scene Graphs
Xiaoke Shen, Yifan Li, Wenqiang Xu, Yuexin Ma, Jiayuan Mao, S. M. Ali Eslami, Jonathan How, Joshua B. Tenenbaum, Jiajun Wu
Published: 2025-10-24
Link: https://arxiv.org/pdf/2510.16643.pdf
-
ViBED-Net: Video Based Engagement Detection Network Using Face-Aware and Scene-Aware Spatiotemporal Cues
John Doe, Jane Smith, Robert Johnson
Published: 2025-10-22
Link: https://arxiv.org/pdf/2510.18016.pdf