Kai Li
- Papers: 26
Authored papers
A Survey of Large Audio Language Models: Generalization, Trustworthiness, and Outlook
arXiv 2026
A Semantically Consistent Dataset for Data-Efficient Query-Based Universal Sound Separation
arXiv 2026
BEAVER: A Training-Free Hierarchical Prompt Compression Method via Structure-Aware Page Selection
arXiv 2026
MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix
arXiv 2025
SoloSpeech: Enhancing Intelligibility and Quality in Target Speech Extraction through a Cascaded Generative Pipeline
arXiv 2025
SepPrune: Structured Pruning for Efficient Deep Speech Separation
arXiv 2025
Toward Stable Semi-Supervised Remote Sensing Segmentation via Co-Guidance and Co-Fusion
arXiv 2025
Advances in Speech Separation: Techniques, Challenges, and Future Trends
arXiv 2025
Efficient Audio-Visual Speech Separation with Discrete Lip Semantics and Multi-Scale Global-Local Attention
arXiv 2025
Mixture of Global and Local Experts with Diffusion Transformer for Controllable Face Generation
arXiv 2025
AudioTrust: Benchmarking the Multifaceted Trustworthiness of Audio Large Language Models
arXiv 2025
SonicSim: A customizable simulation platform for speech processing in moving sound source scenarios
arXiv 2024
MultiBooth: Towards Generating All Your Concepts in an Image from Text
arXiv 2024
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors
arXiv 2024
Adapting Vision Foundation Models for Robust Cloud Segmentation in Remote Sensing Images
arXiv 2024
Extracting polygonal footprints in off-nadir images with Segment Anything Model
arXiv 2024
DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models
CVPR 2024
Apollo: Band-sequence Modeling for High-Quality Audio Restoration
arXiv 2024
BitDelta: Your Fine-Tune May Only Be Worth One Bit
arXiv 2024
Catastrophic Jailbreak of Open-source LLMs via Exploiting Generation
arXiv 2023
Strategic Preys Make Acute Predators: Enhancing Camouflaged Object Detectors by Generating Camouflaged Objects
arXiv 2023
PMAA: A Progressive Multi-scale Attention Autoencoder Model for High-performance Cloud Removal from Multi-temporal Satellite Imagery
arXiv 2023
Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving
arXiv 2023
Conditional Image-to-Video Generation with Latent Flow Diffusion Models
CVPR 2023
PruMUX: Augmenting Data Multiplexing with Model Compression
arXiv 2023
Image Super-Resolution Using Very Deep Residual Channel Attention Networks
Frequent co-authors
10 from 26 papers