Ngai Wong
- Papers: 17
Authored papers
OScaR: The Occam's Razor for Extreme KV Cache Quantization in LLMs and Beyond
arXiv 2026
Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models
arXiv 2026
Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation
arXiv 2026
XStreamVGGT: Extremely Memory-Efficient Streaming Vision Geometry Grounded Transformer with KV Cache Compression
arXiv 2026
Shadow-FT: Tuning Instruct via Base
arXiv 2025
PhyX: Does Your Model Have the "Wits" for Physical Reasoning?
arXiv 2025
Revisiting Model Interpolation for Efficient Reasoning
arXiv 2025
LongEmotion: Measuring Emotional Intelligence of Large Language Models in Long-Context Interaction
arXiv 2025
Autoregressive Models in Vision: A Survey
arXiv 2024
LLM-Neo: Parameter Efficient Knowledge Distillation for Large Language Models
arXiv 2024
Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies
arXiv 2024
Mixture-of-Subspaces in Low-Rank Adaptation
arXiv 2024
A Survey on the Honesty of Large Language Models
arXiv 2024
LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models
arXiv 2024
Rethinking Kullback-Leibler Divergence in Knowledge Distillation for Large Language Models
arXiv 2024
Nonparametric Teaching of Implicit Neural Representations
arXiv 2024
Weight-Inherited Distillation for Task-Agnostic BERT Compression
arXiv 2023
Frequent co-authors
10 from 17 papers