Kai Li

Papers
26

Attribution

Affiliations & profile
Semantic Scholar

Authored papers

26

A Survey of Large Audio Language Models: Generalization, Trustworthiness, and Outlook

arXiv 2026

2026

A Semantically Consistent Dataset for Data-Efficient Query-Based Universal Sound Separation

arXiv 2026

2026

BEAVER: A Training-Free Hierarchical Prompt Compression Method via Structure-Aware Page Selection

arXiv 2026

2026

MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix

arXiv 2025

2025

SoloSpeech: Enhancing Intelligibility and Quality in Target Speech Extraction through a Cascaded Generative Pipeline

arXiv 2025

2025

SepPrune: Structured Pruning for Efficient Deep Speech Separation

arXiv 2025

2025

Toward Stable Semi-Supervised Remote Sensing Segmentation via Co-Guidance and Co-Fusion

arXiv 2025

2025

Advances in Speech Separation: Techniques, Challenges, and Future Trends

arXiv 2025

2025

Efficient Audio-Visual Speech Separation with Discrete Lip Semantics and Multi-Scale Global-Local Attention

arXiv 2025

2025

Mixture of Global and Local Experts with Diffusion Transformer for Controllable Face Generation

arXiv 2025

2025

AudioTrust: Benchmarking the Multifaceted Trustworthiness of Audio Large Language Models

arXiv 2025

2025

SonicSim: A customizable simulation platform for speech processing in moving sound source scenarios

arXiv 2024

2024

MultiBooth: Towards Generating All Your Concepts in an Image from Text

arXiv 2024

2024

SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors

arXiv 2024

2024

Adapting Vision Foundation Models for Robust Cloud Segmentation in Remote Sensing Images

arXiv 2024

2024

Extracting polygonal footprints in off-nadir images with Segment Anything Model

arXiv 2024

2024

DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models

CVPR 2024

2024

Apollo: Band-sequence Modeling for High-Quality Audio Restoration

arXiv 2024

2024

BitDelta: Your Fine-Tune May Only Be Worth One Bit

arXiv 2024

2024

Catastrophic Jailbreak of Open-source LLMs via Exploiting Generation

arXiv 2023

2023

Strategic Preys Make Acute Predators: Enhancing Camouflaged Object Detectors by Generating Camouflaged Objects

arXiv 2023

2023

PMAA: A Progressive Multi-scale Attention Autoencoder Model for High-performance Cloud Removal from Multi-temporal Satellite Imagery

arXiv 2023

2023

Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving

arXiv 2023

2023

Conditional Image-to-Video Generation with Latent Flow Diffusion Models

CVPR 2023

2023

PruMUX: Augmenting Data Multiplexing with Model Compression

arXiv 2023

2023

Image Super-Resolution Using Very Deep Residual Channel Attention Networks

2018

Affiliations

No known affiliations.
