Yuandong Tian
- Papers: 29
Authored papers (29)
STEM: Scaling Transformers with Embedding Modules
arXiv 2026
Towards General-Purpose Model-Free Reinforcement Learning
arXiv 2025
SPG: Sandwiched Policy Gradient for Masked Diffusion Language Models
arXiv 2025
SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks
arXiv 2025
Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning
arXiv 2025
Training Large Language Models to Reason in a Continuous Latent Space
arXiv 2024
TravelPlanner: A Benchmark for Real-World Planning with Language Agents
arXiv 2024
Agent-as-a-Judge: Evaluate Agents with Agents
arXiv 2024
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
arXiv 2024
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
arXiv 2024
TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
arXiv 2024
MagicPIG: LSH Sampling for Efficient LLM Generation
arXiv 2024
Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping
arXiv 2024
AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs
arXiv 2024
Composing Global Optimizers to Reasoning Tasks via Algebraic Objects in Neural Nets
arXiv 2024
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients
arXiv 2024
From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients
arXiv 2024
Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces
arXiv 2024
On the Surprising Effectiveness of Attention Transfer for Vision Transformers
arXiv 2024
LoCoCo: Dropping In Convolutions for Long Context Compression
arXiv 2024
Efficient Streaming Language Models with Attention Sinks
arXiv 2023
H$_2$O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
arXiv 2023
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
arXiv 2023
RLCD: Reinforcement Learning from Contrastive Distillation for Language Model Alignment
arXiv 2023
Re3: Generating Longer Stories With Recursive Reprompting and Revision
arXiv 2022
Denoised MDPs: Learning World Models Better Than the World Itself
arXiv 2022
Understanding Self-supervised Learning with Dual Deep Networks
arXiv 2020
Sample-Efficient Neural Architecture Search by Learning Action Space
arXiv 2019
FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search
Frequent co-authors
10 from 29 papers