Zhenglun Kong

Papers: 9

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

Democratizing AI scientists using ToolUniverse

arXiv 2025

Token Reduction Should Go Beyond Efficiency in Generative Models -- From Vision, Language to Multimodality

arXiv 2025

Enabling Flexible Multi-LLM Integration for Scalable Knowledge Aggregation

arXiv 2025

Fast and Memory-Efficient Video Diffusion Using Streamlined Inference

arXiv 2024

Search for Efficient Large Language Models

arXiv 2024

EdgeQAT: Entropy and Distribution Guided Quantization-Aware Training for the Acceleration of Lightweight LLMs on the Edge

arXiv 2024

Rethinking Token Reduction for State Space Models

arXiv 2024

Fully Open Source Moxin-7B Technical Report

arXiv 2024

Agile-Quant: Activation-Guided Quantization for Faster Inference of LLMs on the Edge

arXiv 2023

Affiliations

No known affiliations.

Frequent co-authors

from 9 papers

Pu Zhao

Yanzhi Wang

Xuan Shen

Changdi Yang

Yifan Gong

Zheng Zhan

Lei Lu

Peiyan Dong

Xue Lin

Yushu Wu