Yuandong Tian

Papers
29


Authored papers

29

STEM: Scaling Transformers with Embedding Modules

arXiv 2026

2026

Towards General-Purpose Model-Free Reinforcement Learning

arXiv 2025

2025

SPG: Sandwiched Policy Gradient for Masked Diffusion Language Models

arXiv 2025

2025

SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks

arXiv 2025

2025

Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning

arXiv 2025

2025

Training Large Language Models to Reason in a Continuous Latent Space

arXiv 2024

2024

TravelPlanner: A Benchmark for Real-World Planning with Language Agents

arXiv 2024

2024

Agent-as-a-Judge: Evaluate Agents with Agents

arXiv 2024

2024

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

arXiv 2024

2024

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases

arXiv 2024

2024

TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

arXiv 2024

2024

MagicPIG: LSH Sampling for Efficient LLM Generation

arXiv 2024

2024

Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping

arXiv 2024

2024

AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs

arXiv 2024

2024

Composing Global Optimizers to Reasoning Tasks via Algebraic Objects in Neural Nets

arXiv 2024

2024

Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients

arXiv 2024

2024

From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients

arXiv 2024

2024

Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces

arXiv 2024

2024

On the Surprising Effectiveness of Attention Transfer for Vision Transformers

arXiv 2024

2024

LoCoCo: Dropping In Convolutions for Long Context Compression

arXiv 2024

2024

Efficient Streaming Language Models with Attention Sinks

arXiv 2023

2023

H$_2$O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models

arXiv 2023

2023

Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time

arXiv 2023

2023

RLCD: Reinforcement Learning from Contrastive Distillation for Language Model Alignment

arXiv 2023

2023

Re3: Generating Longer Stories With Recursive Reprompting and Revision

arXiv 2022

2022

Denoised MDPs: Learning World Models Better Than the World Itself

arXiv 2022

2022

Understanding Self-supervised Learning with Dual Deep Networks

arXiv 2020

2020

Sample-Efficient Neural Architecture Search by Learning Action Space

arXiv 2019

2019

FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search

2018

Affiliations

No known affiliations.

Frequent co-authors

10

from 29 papers