Jian Li

Papers
27


Authored papers

27

VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model

arXiv 2025

2025

RLVER: Reinforcement Learning with Verifiable Emotion Rewards for Empathetic Agents

arXiv 2025

2025

Dr.V: A Hierarchical Perception-Temporal-Cognition Framework to Diagnose Video Hallucination by Fine-grained Spatial-Temporal Grounding

arXiv 2025

2025

CogniBench: A Legal-inspired Framework and Dataset for Assessing Cognitive Faithfulness of Large Language Models

arXiv 2025

2025

Efficient Multimodal Large Language Models: A Survey

arXiv 2024

2024

MMAD: The First-Ever Comprehensive Benchmark for Multimodal Large Language Models in Industrial Anomaly Detection

arXiv 2024

2024

MooER: LLM-based Speech Recognition and Translation Models from Moore Threads

arXiv 2024

2024

WavChat: A Survey of Spoken Dialogue Models

arXiv 2024

2024

Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection

arXiv 2024

2024

MBA-RAG: a Bandit Approach for Adaptive Retrieval-Augmented Generation through Question Complexity

arXiv 2024

2024

Rethinking The Uniformity Metric in Self-Supervised Learning

arXiv 2024

2024

Self-supervised Preference Optimization: Enhance Your Language Model with Preference Degree Awareness

arXiv 2024

2024

LLaVA-VSD: Large Language-and-Vision Assistant for Visual Spatial Description

arXiv 2024

2024

Are Large Language Models Good Prompt Optimizers?

arXiv 2024

2024

LoRA-GA: Low-Rank Adaptation with Gradient Approximation

arXiv 2024

2024

A Survey on Benchmarks of Multimodal Large Language Models

arXiv 2024

2024

SceneTracker: Long-term Scene Flow Estimation Network

arXiv 2024

2024

Unveiling the Ignorance of MLLMs: Seeing Clearly, Answering Incorrectly

CVPR 2025

2024

Reasoning-Enhanced Object-Centric Learning for Videos

arXiv 2024

2024

SatVision-TOA: A Geospatial Foundation Model for Coarse-Resolution All-Sky Remote Sensing Imagery

arXiv 2024

2024

Spider: Any-to-Many Multimodal LLM

arXiv 2024

2024

LCM-LoRA: A Universal Stable-Diffusion Acceleration Module

arXiv 2023

2023

Aurora: Activating Chinese chat capability for Mixtral-8x7B sparse Mixture-of-Experts through Instruction-Tuning

arXiv 2023

2023

Rethinking Mobile Block for Efficient Attention-based Models

ICCV 2023

2023

Generative Table Pre-training Empowers Models for Tabular Prediction

arXiv 2023

2023

OpenFE: Automated Feature Generation with Expert-level Performance

arXiv 2022

2022

Learning to Break the Loop: Analyzing and Mitigating Repetitions for Neural Text Generation

arXiv 2022

2022

Affiliations

No known affiliations.

Frequent co-authors

10

from 27 papers