Jian Li
- Papers: 27
Authored papers (27)
VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model
arXiv 2025
RLVER: Reinforcement Learning with Verifiable Emotion Rewards for Empathetic Agents
arXiv 2025
Dr.V: A Hierarchical Perception-Temporal-Cognition Framework to Diagnose Video Hallucination by Fine-grained Spatial-Temporal Grounding
arXiv 2025
CogniBench: A Legal-inspired Framework and Dataset for Assessing Cognitive Faithfulness of Large Language Models
arXiv 2025
Efficient Multimodal Large Language Models: A Survey
arXiv 2024
MMAD: The First-Ever Comprehensive Benchmark for Multimodal Large Language Models in Industrial Anomaly Detection
arXiv 2024
MooER: LLM-based Speech Recognition and Translation Models from Moore Threads
arXiv 2024
WavChat: A Survey of Spoken Dialogue Models
arXiv 2024
Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection
arXiv 2024
MBA-RAG: a Bandit Approach for Adaptive Retrieval-Augmented Generation through Question Complexity
arXiv 2024
Rethinking The Uniformity Metric in Self-Supervised Learning
arXiv 2024
Self-supervised Preference Optimization: Enhance Your Language Model with Preference Degree Awareness
arXiv 2024
LLaVA-VSD: Large Language-and-Vision Assistant for Visual Spatial Description
arXiv 2024
Are Large Language Models Good Prompt Optimizers?
arXiv 2024
LoRA-GA: Low-Rank Adaptation with Gradient Approximation
arXiv 2024
A Survey on Benchmarks of Multimodal Large Language Models
arXiv 2024
SceneTracker: Long-term Scene Flow Estimation Network
arXiv 2024
Unveiling the Ignorance of MLLMs: Seeing Clearly, Answering Incorrectly
CVPR 2025
Reasoning-Enhanced Object-Centric Learning for Videos
arXiv 2024
SatVision-TOA: A Geospatial Foundation Model for Coarse-Resolution All-Sky Remote Sensing Imagery
arXiv 2024
Spider: Any-to-Many Multimodal LLM
arXiv 2024
LCM-LoRA: A Universal Stable-Diffusion Acceleration Module
arXiv 2023
Aurora: Activating Chinese chat capability for Mixtral-8x7B sparse Mixture-of-Experts through Instruction-Tuning
arXiv 2023
Rethinking Mobile Block for Efficient Attention-based Models
ICCV 2023
Generative Table Pre-training Empowers Models for Tabular Prediction
arXiv 2023
OpenFE: Automated Feature Generation with Expert-level Performance
arXiv 2022
Learning to Break the Loop: Analyzing and Mitigating Repetitions for Neural Text Generation
arXiv 2022
Frequent co-authors
10 from 27 papers