Li Dong
- Papers
- 36
Authored papers: 36
LLM-in-Sandbox Elicits General Agentic Intelligence
arXiv 2026
A General Model for Retinal Segmentation and Quantification
arXiv 2026
Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge
arXiv 2026
Breaking Training Bottlenecks: Effective and Stable Reinforcement Learning for Coding Models
arXiv 2026
Sparse-BitNet: 1.58-bit LLMs are Naturally Friendly to Semi-Structured Sparsity
arXiv 2026
SeerAttention-R: Sparse Attention Adaptation for Long Reasoning
arXiv 2025
Data Efficacy for Language Model Training
arXiv 2025
Latent Sketchpad: Sketching Visual Thoughts to Elicit Multimodal Reasoning in MLLMs
arXiv 2025
VibeVoice Technical Report
arXiv 2025
Black-Box On-Policy Distillation of Large Language Models
arXiv 2025
On-Policy RL with Optimal Reward Baseline
arXiv 2025
BitNet Distillation
arXiv 2025
Multimodal Latent Language Modeling with Next-Token Diffusion
arXiv 2024
Differential Transformer
arXiv 2024
You Only Cache Once: Decoder-Decoder Architectures for Language Models
arXiv 2024
Semi-Parametric Retrieval via Binary Bag-of-Tokens Index
arXiv 2024
Semi-Offline Reinforcement Learning for Optimized Text Generation
arXiv 2023
Kosmos-G: Generating Images in Context with Multimodal Large Language Models
arXiv 2023
Kosmos-2: Grounding Multimodal Large Language Models to the World
arXiv 2023
Large Language Model for Science: A Study on P vs. NP
arXiv 2023
Augmenting Language Models with Long-Term Memory
BioCLIP: A Vision Foundation Model for the Tree of Life
CVPR 2024 1
Pre-Training to Learn in Context
arXiv 2023
A Length-Extrapolatable Transformer
arXiv 2022
BEiT v2: Masked Image Modeling with Vector-Quantized Visual Tokenizers
arXiv 2022
Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks
arXiv 2022
StableMoE: Stable Routing Strategy for Mixture of Experts
ACL 2022 5
CROP: Zero-shot Cross-lingual Named Entity Recognition with Multilingual Labeled Sequence Translation
arXiv 2022
Language Models as Inductive Reasoners
arXiv 2022
GanLM: Encoder-Decoder Pre-training with an Auxiliary Discriminator
arXiv 2022
Improving Pretrained Cross-Lingual Language Models via Self-Labeled Word Alignment
ACL 2021 5
Allocating Large Vocabulary Capacity for Cross-lingual Language Model Pre-training
EMNLP 2021 11
Zero-shot Cross-lingual Transfer of Neural Machine Translation with Multilingual Pretrained Encoders
EMNLP 2021 11
VLMo: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts
arXiv 2021
DeltaLM: Encoder-Decoder Pre-training for Language Generation and Translation by Augmenting Pretrained Multilingual Encoders
arXiv 2021
Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks
ECCV 2020 8
Affiliations
Frequent co-authors
10 from 36 papers