Chunting Zhou
- Papers: 9
Authored papers
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models
arXiv 2024
Byte Latent Transformer: Patches Scale Better Than Tokens
arXiv 2024
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
arXiv 2024
Instruction-tuned Language Models are Better Knowledge Learners
arXiv 2024
In-context Pretraining: Language Modeling Beyond Document Boundaries
arXiv 2023
Look-back Decoding for Open-Ended Text Generation
arXiv 2023
Mega: Moving Average Equipped Gated Attention
arXiv 2022
Towards a Unified View of Parameter-Efficient Transfer Learning
Detecting Hallucinated Content in Conditional Neural Sequence Generation
Frequent co-authors
10 from 9 papers