Jiwon Song
- Papers: 7
Authored papers
CompactAttention: Accelerating Chunked Prefill with Block-Union KV Selection
arXiv 2026
Token Sparse Attention: Efficient Long-Context Inference with Interleaved Token Selection
arXiv 2026
RelayGen: Intra-Generation Model Switching for Efficient Reasoning
arXiv 2026
Reasoning Path Compression: Compressing Generation Trajectories for Efficient LLM Reasoning
arXiv 2025
LiteStage: Latency-aware Layer Skipping for Multi-stage Reasoning
arXiv 2025
FastKV: KV Cache Compression for Fast Long-Context Processing with Token-Selective Propagation
arXiv 2025
SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks
arXiv 2024
Affiliations
No known affiliations.
Frequent co-authors
8 from 7 papers