Ke Li
- Papers
- 39
Authored papers (39)
IceCache: Memory-efficient KV-cache Management for Long-Sequence LLMs
arXiv 2026
Youtu-VL: Unleashing Visual Potential via Unified Vision-Language Supervision
arXiv 2026
Dolphin: A Large-Scale Automatic Speech Recognition Model for Eastern Languages
arXiv 2025
VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model
arXiv 2025
Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models
arXiv 2025
Youtu-Agent: Scaling Agent Productivity with Automated Generation and Hybrid Policy Optimization
arXiv 2025
Training-Free Group Relative Policy Optimization
arXiv 2025
Radiance Fields in XR: A Survey on How Radiance Fields are Envisioned and Addressed for XR Research
arXiv 2025
OmniGenBench: A Modular Platform for Reproducible Genomic Foundation Models Benchmarking
arXiv 2025
VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
arXiv 2025
Solving the Catastrophic Forgetting Problem in Generalized Category Discovery
SmartSnap: Proactive Evidence Seeking for Self-Verifying Agents
arXiv 2025
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models
arXiv 2025
Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy
arXiv 2025
Reality Fusion: Robust Real-time Immersive Mobile Robot Teleoperation with Volumetric Visual Data Fusion
arXiv 2024
Leveraging Open Knowledge for Advancing Task Expertise in Large Language Models
arXiv 2024
CoMT: Chain-of-Medical-Thought Reduces Hallucination in Medical Report Generation
arXiv 2024
Improving Factuality in Large Language Models via Decoding-Time Hallucinatory and Truthful Comparators
arXiv 2024
Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models
arXiv 2024
Sinkhorn Distance Minimization for Knowledge Distillation
arXiv 2024
FlashSloth: Lightning Multimodal Large Language Models via Embedded Visual Compression
arXiv 2024
Bridging Sequence-Structure Alignment in RNA Foundation Models
arXiv 2024
Aligning and Prompting Everything All at Once for Universal Visual Perception
arXiv 2023
InstOptima: Evolutionary Multi-objective Instruction Optimization via Large Language Model-based Instruction Operators
arXiv 2023
MMICT: Boosting Multi-Modal Fine-Tuning with In-Context Examples
arXiv 2023
Woodpecker: Hallucination Correction for Multimodal Large Language Models
arXiv 2023
MonoNeRD: NeRF-like Representations for Monocular 3D Object Detection
ICCV 2023 1
Masked Autoencoders are Efficient Class Incremental Learners
ICCV 2023 1
SketchXAI: A First Look at Explainability for Human Sketches
CVPR 2023 1
PyABSA: A Modularized Framework for Reproducible Aspect-based Sentiment Analysis
arXiv 2022
BootAug: Boosting Text Augmentation via Hybrid Instance Filtering Framework
arXiv 2022
C3KG: A Chinese Commonsense Conversation Knowledge Graph
arXiv 2022
speechocean762: An Open-Source Non-native English Speech Corpus For Pronunciation Assessment
arXiv 2021
LSA: Modeling Aspect Sentiment Coherency via Local Sentiment Aggregation
arXiv 2021
Pose Recognition with Cascade Transformers
CVPR 2021 1
Gotta Go Fast When Generating Data with Score-Based Models
Hyperspectral Image Super-Resolution with Spectral Mixup and Heterogeneous Datasets
arXiv 2021
Variational Attention: Propagating Domain-Specific Knowledge for Multi-Domain Learning in Crowd Counting
ICCV 2021 10
Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion
arXiv 2020
Affiliations
Frequent co-authors
10 from 39 papers