Xuezhe Ma
- Papers: 9
Authored papers
Gecko: An Efficient Neural Architecture Inherently Processing Sequences with Arbitrary Lengths
arXiv 2026
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
arXiv 2024
DISTFLASHATTN: Distributed Memory-efficient Attention for Long-context LLMs Training
arXiv 2023
Look-back Decoding for Open-Ended Text Generation
arXiv 2023
Evaluating Large Language Models on Controlled Generation Tasks
arXiv 2023
Mega: Moving Average Equipped Gated Attention
arXiv 2022
Improving Stability of Fine-Tuning Pretrained Language Models via Component-Wise Gradient Norm Clipping
arXiv 2022
Better May Not Be Fairer: A Study on Subgroup Discrepancy in Image Classification
ICCV 2023
Towards a Unified View of Parameter-Efficient Transfer Learning
Frequent co-authors
10 from 9 papers