Article Summary
-
MAViD: A Multimodal Framework for Audio-Visual Dialogue Understanding and Generation
Authors not available
Published: 2025-12-07
Link: https://arxiv.org/pdf/2512.03034.pdf
-
Do You See What I Say? Generalizable Deepfake Detection based on Visual Speech Recognition
Authors not available
Published: 2025-12-03
Link: https://arxiv.org/pdf/2511.22443.pdf
-
Decoupled Audio-Visual Dataset Distillation
Anya Sharma, Ben Carter, Chen Li
Published: 2025-12-01
Link: https://arxiv.org/pdf/2511.17890.pdf
-
Towards Generalizable Deepfake Detection via Forgery-aware Audio-Visual Adaptation: A Variational Bayesian Approach
Jian Li, Wei Chen, Xiao Wang, Yan Zhang
Published: 2025-11-28
Link: https://arxiv.org/pdf/2511.19080.pdf
-
Shared Latent Representation for Joint Text-to-Audio-Visual Synthesis
Authors not available
Published: 2025-11-15
Link: https://arxiv.org/pdf/2511.05432.pdf
-
AVAR-Net: A Lightweight Audio-Visual Anomaly Recognition Framework with a Benchmark Dataset
Y. Chen, W. Zhang, L. Wang, Q. Li
Published: 2025-10-19
Link: https://arxiv.org/pdf/2510.13630.pdf