Article Summary


Vision Language Models Map Logos to Text via Semantic Entanglement in the Visual Projector Jane Doe, John Smith, Alice Wonderland, Bob Johnson
Vision Language Models Logos Semantic Entanglement Visual Projector Multimodal AI Representation Learning
Published: 2025-10-17 Link: https://arxiv.org/pdf/2510.12287.pdf
Adversarial Attacks Leverage Interference Between Features in Superposition: A Deeper Understanding J. Smith, A. Doe, B. Johnson
Adversarial Attacks Feature Superposition Neural Network Robustness Interference Deep Learning
Published: 2025-10-17 Link: https://arxiv.org/pdf/2510.11709.pdf
Eyes Wide Open: Ego Proactive Video-LLM for Streaming Video A. B. Researcher, C. D. Innovator, E. F. Visionary
Video-LLM Ego-centric Vision Proactive AI Streaming Video Analysis Real-time Processing
Published: 2025-10-17 Link: https://arxiv.org/pdf/2510.14560.pdf
Causality ≠ Decodability, and Vice Versa: Lessons from Interpreting Counting ViTs A. N. Author, B. M. Researcher, C. P. Scientist
Causality Decodability Vision Transformers Interpretability Counting
Published: 2025-10-17 Link: https://arxiv.org/pdf/2510.09794.pdf
DIANet: A Phase-Aware Dual-Stream Network for Micro-Expression Recognition via Dynamic Images Jian Li, Wei Chen, Yan Zhang, Xin Wang
Micro-Expression Recognition Dual-Stream Network Dynamic Images Phase-Aware Learning Deep Learning
Published: 2025-10-17 Link: https://arxiv.org/pdf/2510.12219.pdf
VisCoP: Visual Probing for Video Domain Adaptation of Vision Language Models Jian Li, Chen You, Hao Wang, Long Chen
Vision-Language Models Video Domain Adaptation Visual Probing Parameter-Efficient Learning Zero-Shot Learning
Published: 2025-10-17 Link: https://arxiv.org/pdf/2510.13808.pdf
Multimodal Disease Progression Modeling via Spatiotemporal Disentanglement and Multiscale Alignment Alice L. Chen, Benjamin R. Kim, Sophia M. Rodriguez
Disease Progression Multimodal Learning Spatiotemporal Disentanglement Medical Imaging Deep Learning
Published: 2025-10-17 Link: https://arxiv.org/pdf/2510.11112.pdf
TOUCH: Text-guided Controllable Generation of Free-Form Hand-Object Interactions Authors Not Provided
Hand-Object Interaction Text-Guided Generation Controllable Synthesis 3D Generation Human-Computer Interaction
Published: 2025-10-17 Link: https://arxiv.org/pdf/2510.14874.pdf
DIANet: A Phase-Aware Dual-Stream Network for Micro-Expression Recognition via Dynamic Images Jian Li, Wei Chen, Yan Wang
Micro-Expression Recognition Deep Learning Dual-Stream Network Phase-Aware Features Dynamic Images Facial Expression Analysis
Published: 2025-10-17 Link: https://arxiv.org/pdf/2510.12219.pdf
Advancing End-to-End Pixel Space Generative Modeling via Self-supervised Pre-training Alex Chen, Sarah Lee, David Wong
Generative Models Self-Supervised Learning Pixel Space Deep Learning Image Synthesis
Published: 2025-10-16 Link: https://arxiv.org/pdf/2510.12586.pdf
