Attribution
SnapKV: LLM Knows What You are Looking for Before Generation
arXiv 2024
Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models
Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning
arXiv 2023
from 3 papers
Ahmet Üstün
researcher
Arash Ahmadian
Beyza Ermiş
Bharat Venkitesh
Bowen Yang
Deming Chen
Dwarak Talupuru
Edward Grefenstette
Hanchen Ye
Juhan Bae