Shilin Yan
- Papers: 16
Authored papers (16)
Artifact-Bench: Evaluating MLLMs on Detecting and Assessing the Artifacts of AI-Generated Videos
arXiv 2026
Act Wisely: Cultivating Meta-Cognitive Tool Use in Agentic Multimodal Models
arXiv 2026
AdaptMMBench: Benchmarking Adaptive Multimodal Reasoning for Mode Selection and Reasoning Process
arXiv 2026
SwimBird: Eliciting Switchable Reasoning Mode in Hybrid Autoregressive MLLMs
arXiv 2026
MM-CondChain: A Programmatically Verified Benchmark for Visually Grounded Deep Compositional Reasoning
arXiv 2026
MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning
arXiv 2025
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
arXiv 2025
Diffusion Language Models Know the Answer Before Decoding
arXiv 2025
Adaptive Classifier-Free Guidance via Dynamic Low-Confidence Masking
arXiv 2025
CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms
arXiv 2025
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing
arXiv 2025
A Sanity Check for AI-generated Image Detection
arXiv 2024
VISA: Reasoning Video Object Segmentation via Large Language Models
arXiv 2024
Visual Perception by Large Language Model's Weights
arXiv 2024
Personalize Segment Anything Model with One Shot
arXiv 2023
PanoVOS: Bridging Non-panoramic and Panoramic Views with Transformer for Video Segmentation
arXiv 2023
Frequent co-authors
10 from 16 papers