Yifan Zhang
- Papers: 53
Authored papers
OScaR: The Occam's Razor for Extreme KV Cache Quantization in LLMs and Beyond
arXiv 2026
MiroEval: Benchmarking Multimodal Deep Research Agents in Process and Outcome
arXiv 2026
Deep Delta Learning
arXiv 2026
Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation
arXiv 2026
FlashSampling: Fast and Memory-Efficient Exact Sampling
arXiv 2026
PersonaVLM: Long-Term Personalized Multimodal LLMs
arXiv 2026
VTC-Bench: Evaluating Agentic Multimodal Models via Compositional Visual Tool Chaining
arXiv 2026
PEARL: Personalized Streaming Video Understanding Model
arXiv 2026
Interactive Benchmarks
arXiv 2026
Residual Stream Duality in Modern Transformer Architectures
arXiv 2026
How Well Do Models Follow Visual Instructions? VIBE: A Systematic Benchmark for Visual Instruction-Driven Image Editing
arXiv 2026
Tensor Product Attention Is All You Need
arXiv 2025
On the Design of KL-Regularized Policy Gradient Algorithms for LLM Reasoning
arXiv 2025
FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models
arXiv 2025
Monadic Context Engineering
arXiv 2025
Web World Models
arXiv 2025
Group Representational Position Encoding
arXiv 2025
LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling
arXiv 2025
Higher-order Linear Attention
arXiv 2025
Matrix-Game 2.0: An Open-Source, Real-Time, and Streaming Interactive World Model
arXiv 2025
MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling
arXiv 2025
Skywork-R1V4: Toward Agentic Multimodal Intelligence through Interleaved Thinking with Images and DeepResearch
arXiv 2025
Matrix-Game: Interactive World Foundation Model
arXiv 2025
CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization
arXiv 2025
Latent Sketchpad: Sketching Visual Thoughts to Elicit Multimodal Reasoning in MLLMs
arXiv 2025
Gradient-Attention Guided Dual-Masking Synergetic Framework for Robust Text-based Person Retrieval
arXiv 2025
oMeBench: Towards Robust Benchmarking of LLMs in Organic Mechanism Elucidation and Reasoning
arXiv 2025
Exact Coset Sampling for Quantum Lattice Algorithms
arXiv 2025
RealGen: Photorealistic Text-to-Image Generation via Detector-Guided Rewards
arXiv 2025
A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment
arXiv 2025
AudioTrust: Benchmarking the Multifaceted Trustworthiness of Audio Large Language Models
arXiv 2025
DriveGEN: Generalized and Robust 3D Detection in Driving via Controllable Text-to-Image Diffusion Generation
CVPR 2025 1
One-Shot Diffusion Mimicker for Handwritten Text Generation
arXiv 2024
RARE: Retrieval-Augmented Reasoning Enhancement for Large Language Models
arXiv 2024
General Preference Modeling with Preference Representations for Aligning Language Models
arXiv 2024
On the Diagram of Thought
arXiv 2024
We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning?
arXiv 2024
Poison-splat: Computation Cost Attack on 3D Gaussian Splatting
arXiv 2024
Augmenting Math Word Problems via Iterative Question Composing
arXiv 2024
Scaling Image Tokenizers with Grouped Spherical Quantization
arXiv 2024
Training and Evaluating Language Models with Template-based Data Generation
arXiv 2024
Disentangling Writer and Character Styles for Handwriting Generation
CVPR 2023 1
Meta Prompting for AI Systems
arXiv 2023
Towards Stable Test-Time Adaptation in Dynamic Wild World
arXiv 2023
Cumulative Reasoning with Large Language Models
arXiv 2023
Dataset Quantization
ICCV 2023 1
Contrastive Learning Is Spectral Clustering On Similarity Graph
arXiv 2023
Efficient Test-Time Model Adaptation without Forgetting
arXiv 2022
Expanding Small-Scale Datasets with Guided Imagination
Deep Long-Tailed Learning: A Survey
arXiv 2021
Self-Supervised Aggregation of Diverse Experts for Test-Agnostic Long-Tailed Recognition
arXiv 2021
Source-free Domain Adaptation via Avatar Prototype Generation and Adaptation
arXiv 2021
Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regularized Fine-Tuning
NeurIPS 2021 12
Frequent co-authors
10 from 53 papers