Min Lin

Papers
48

Authored papers

Revisiting Parameter Server in LLM Post-Training

arXiv 2026

2026

Rethinking the Trust Region in LLM Reinforcement Learning

arXiv 2026

2026

Understanding R1-Zero-Like Training: A Critical Perspective

arXiv 2025

2025

PipeOffload: Improving Scalability of Pipeline Parallelism with Memory Optimization

arXiv 2025

2025

StereoGen: High-quality Stereo Image Generation from a Single Image

ICCV 2025

2025

Optimizing Anytime Reasoning via Budget Relative Policy Optimization

arXiv 2025

2025

FlowReasoner: Reinforcing Query-Level Meta-Agents

arXiv 2025

2025

Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs

arXiv 2025

2025

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

arXiv 2025

2025

Variational Reasoning for Language Models

arXiv 2025

2025

Defeating the Training-Inference Mismatch via FP16

arXiv 2025

2025

Language Models Can Learn from Verbal Feedback Without Scalar Rewards

arXiv 2025

2025

Reinforcing General Reasoning without Verifiers

arXiv 2025

2025

Lifelong Safety Alignment for Language Models

arXiv 2025

2025

GEM: A Gym for Agentic LLMs

arXiv 2025

2025

When Attention Sink Emerges in Language Models: An Empirical View

arXiv 2024

2024

Sample-Efficient Alignment for LLMs

arXiv 2024

2024

Sailor: Open Language Models for South-East Asia

arXiv 2024

2024

Scaling up Masked Diffusion Models on Text

arXiv 2024

2024

Improved Techniques for Optimization-Based Jailbreaking on Large Language Models

arXiv 2024

2024

Pipeline Parallelism with Controllable Memory

arXiv 2024

2024

Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast

arXiv 2024

2024

Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates

arXiv 2024

2024

SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction

arXiv 2024

2024

Bootstrapping Language Models with DPO Implicit Rewards

arXiv 2024

2024

Balancing Pipeline Parallelism with Vocabulary Parallelism

arXiv 2024

2024

Beyond Memorization: The Challenge of Random Memory Access in Language Models

arXiv 2024

2024

Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies

arXiv 2024

2024

RegMix: Data Mixture as Regression for Language Model Pre-training

arXiv 2024

2024

Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs

arXiv 2024

2024

Stochastic Taylor Derivative Estimator: Efficient amortization for arbitrary differential operators

arXiv 2024

2024

LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition

arXiv 2023

2023

Instant3D: Instant Text-to-3D Generation

arXiv 2023

2023

Cleanba: A Reproducible and Efficient Distributed Reinforcement Learning Platform

arXiv 2023

2023

Automatic Functional Differentiation in JAX

arXiv 2023

2023

From Zero to Hero: Examining the Power of Symbolic Tasks in Instruction Tuning

arXiv 2023

2023

Finetuning Text-to-Image Diffusion Models for Fairness

arXiv 2023

2023

Bag of Tricks for Training Data Extraction from Language Models

arXiv 2023

2023

On Calibrating Diffusion Probabilistic Models

2023

Intriguing Properties of Data Attribution on Diffusion Models

arXiv 2023

2023

BAFFLE: A Baseline of Backpropagation-Free Federated Learning

arXiv 2023

2023

NU-MCC: Multiview Compressive Coding with Neighborhood Decoder and Repulsive UDF

2023

Nonparametric Generative Modeling with Conditional Sliced-Wasserstein Flows

arXiv 2023

2023

On Evaluating Adversarial Robustness of Large Vision-Language Models

NeurIPS 2023

2023

A Recipe for Watermarking Diffusion Models

arXiv 2023

2023

Better Diffusion Models Further Improve Adversarial Training

arXiv 2023

2023

EnvPool: A Highly Parallel Reinforcement Learning Environment Execution Engine

arXiv 2022

2022

Robustness and Accuracy Could Be Reconcilable by (Proper) Definition

arXiv 2022

2022

Affiliations

No known affiliations.

Frequent co-authors

10

from 48 papers