Kun Wang

AcademiClaw: When Students Set Challenges for AI Agents

arXiv 2026

A Survey of Large Audio Language Models: Generalization, Trustworthiness, and Outlook

arXiv 2026

The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook

arXiv 2026

SkillNet: Create, Evaluate, and Connect AI Skills

arXiv 2026

MLS-Bench: A Holistic and Rigorous Assessment of AI Systems on Building Better AI

arXiv 2026

NTIRE 2026 Challenge on Video Saliency Prediction: Methods and Results

arXiv 2026

Aligning Multimodal LLM with Human Preference: A Survey

arXiv 2025

Multi-agent Architecture Search via Agentic Supernet

arXiv 2025

GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning

arXiv 2025

iAgent: LLM Agent as a Shield between User and Recommender Systems

arXiv 2025

GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning

arXiv 2025

FLUX-Reason-6M & PRISM-Bench: A Million-Scale Text-to-Image Reasoning Dataset and Comprehensive Benchmark

arXiv 2025

OneForecast: A Universal Framework for Global and Regional Weather Forecasting

arXiv 2025

DemonAgent: Dynamically Encrypted Multi-Backdoor Implantation Attack on LLM-based Agent

arXiv 2025

A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment

arXiv 2025

AudioTrust: Benchmarking the Multifaceted Trustworthiness of Audio Large Language Models

arXiv 2025

GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing

arXiv 2025

HealthiVert-GAN: A Novel Framework of Pseudo-Healthy Vertebral Image Synthesis for Interpretable Compression Fracture Grading

arXiv 2025

MasRouter: Learning to Route LLMs for Multi-Agent Systems

arXiv 2025

AlphaEdit: Null-Space Constrained Knowledge Editing for Language Models

arXiv 2024

MotionTTT: 2D Test-Time-Training Motion Estimation for 3D Motion Corrected MRI

arXiv 2024

PUMA: Empowering Unified MLLM with Multi-granular Visual Generation

ICCV 2025

CLEAR: Can Language Models Really Understand Causal Graphs?

arXiv 2024

On the Role of Attention Heads in Large Language Model Safety

arXiv 2024

Mitigating Modality Prior-Induced Hallucinations in Multimodal Large Language Models via Deciphering Attention Causality

arXiv 2024