Article Summaries
-
DEAR: Dataset for Evaluating the Aesthetics of Rendering
John Doe, Jane Smith, Michael Brown
Published: 2025-12-12
Link: https://arxiv.org/pdf/2512.05209.pdf
-
Textured Geometry Evaluation: Perceptual 3D Textured Shape Metric via 3D Latent-Geometry Network
Jane Doe, John Smith, Alice Brown
Published: 2025-12-08
Link: https://arxiv.org/pdf/2512.01380.pdf
-
V-ReasonBench: Toward Unified Reasoning Benchmark Suite for Video Generation Models
Jian Li, Wei Chen, Xiaofeng Wang, Qian Zhang
Published: 2025-12-01
Link: https://arxiv.org/pdf/2511.16668.pdf
-
Benchmark Designers Should "Train on the Test Set" to Expose Exploitable Non-Visual Shortcuts
Alice Researcher, Bob Scientist, Carol Engineer
Published: 2025-11-08
Link: https://arxiv.org/pdf/2511.04655.pdf
-
MELDAE: A Framework for Micro-Expression Spotting, Detection, and Automatic Evaluation in In-the-Wild Conversational Scenes
Jian Li, Wei Chen, Xiaojun Wu
Published: 2025-10-28
Link: https://arxiv.org/pdf/2510.22575.pdf
-
VLSU: Mapping the Limits of Joint Multimodal Understanding for AI Safety
Alice Chen, Bob Davis, Carol White
Published: 2025-10-25
Link: not available
-
Constantly Improving Image Models Need Constantly Improving Benchmarks
J. Doe, A. Smith, R. Johnson
Published: 2025-10-21
Link: https://arxiv.org/pdf/2510.15021.pdf
-
Uni-MMMU: A Massive Multi-discipline Multimodal Unified Benchmark
Jian Li, Wei Chen, Qian Wang, Ming Zhao
Published: 2025-10-17
Link: https://arxiv.org/pdf/2510.13759.pdf
-
A Review of Longitudinal Radiology Report Generation: Dataset Composition, Methods, and Performance Evaluation
J. Doe, A. Smith, C. Brown
Published: 2025-10-16
Link: https://arxiv.org/pdf/2510.12444.pdf