Qing Li
- Papers: 37
Authored papers
AutoGUI-v2: A Comprehensive Multi-Modal GUI Functionality Understanding Benchmark
arXiv 2026
ShadowPEFT: Shadow Network for Parameter-Efficient Fine-Tuning
arXiv 2026
The AI Hippocampus: How Far are We From Human Memory?
arXiv 2026
Spurious Rewards Paradox: Mechanistically Understanding How RLVR Activates Memorization Shortcuts in LLMs
arXiv 2026
AdaptMMBench: Benchmarking Adaptive Multimodal Reasoning for Mode Selection and Reasoning Process
arXiv 2026
V-Retrver: Evidence-Driven Agentic Reasoning for Universal Multimodal Retrieval
arXiv 2026
Bridging the Vision-Brain Gap with an Uncertainty-Aware Blur Prior
ConsisLoRA: Enhancing Content and Style Consistency for LoRA-based Style Transfer
arXiv 2025
Emotion-Qwen: Training Hybrid Experts for Unified Emotion and General Vision-Language Understanding
arXiv 2025
MMKE-Bench: A Multimodal Editing Benchmark for Diverse Visual Knowledge
arXiv 2025
KORE: Enhancing Knowledge Injection for Large Multimodal Models via Knowledge-Oriented Augmentations and Constraints
arXiv 2025
Efficient Multi-turn RL for GUI Agents via Decoupled Training and Adaptive Data Curation
arXiv 2025
Exposing Numeracy Gaps: A Benchmark to Evaluate Fundamental Numerical Abilities in Large Language Models
arXiv 2025
Unveiling the Mist over 3D Vision-Language Understanding: Object-centric Evaluation with Chain-of-Analysis
CVPR 2025
When Large Multimodal Models Confront Evolving Knowledge: Challenges and Pathways
arXiv 2025
A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment
arXiv 2025
OneForecast: A Universal Framework for Global and Regional Weather Forecasting
arXiv 2025
Training for Identity, Inference for Controllability: A Unified Approach to Tuning-Free Face Personalization
arXiv 2025
OV-NeRF: Open-vocabulary Neural Radiance Fields with Vision and Language Foundation Models for 3D Semantic Understanding
arXiv 2024
Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models
arXiv 2024
TOMG-Bench: Evaluating LLMs on Text-based Open Molecule Generation
arXiv 2024
Linear-Time Graph Neural Networks for Scalable Recommendations
arXiv 2024
Large Language Models are In-Context Molecule Learners
arXiv 2024
2D Matryoshka Sentence Embeddings
arXiv 2024
PSAvatar: A Point-based Shape Model for Real-Time Head Avatar Animation with 3D Gaussian Splatting
arXiv 2024
A Picture Is Worth a Graph: A Blueprint Debate Paradigm for Multimodal Reasoning
arXiv 2024
FIRM: Flexible Interactive Reflection reMoval
arXiv 2024
UltraEdit: Instruction-based Fine-Grained Image Editing at Scale
arXiv 2024
Bongard-OpenWorld: Few-Shot Reasoning for Free-form Visual Concepts in the Real World
arXiv 2023
An Embodied Generalist Agent in 3D World
arXiv 2023
Cross Initialization for Personalized Text-to-Image Generation
arXiv 2023
Learning Signed Hyper Surfaces for Oriented Point Cloud Normal Estimation
CVPR 2023
Recurrent Attention Networks for Long-text Modeling
arXiv 2023
Generative Diffusion Models on Graphs: Methods and Applications
arXiv 2023
Label Supervised LLaMA Finetuning
arXiv 2023
Perceive, Ground, Reason, and Act: A Benchmark for General-purpose Visual Representation
arXiv 2022
SQA3D: Situated Question Answering in 3D Scenes
arXiv 2022
Frequent co-authors
10 from 37 papers