Weizhu Chen

SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning

arXiv 2025

COSMOS: A Hybrid Adaptive Optimizer for Memory-Efficient Training of LLMs

arXiv 2025

ThetaEvolve: Test-time Learning on Open Problems

arXiv 2025

LongRoPE2: Near-Lossless LLM Context Window Scaling

arXiv 2025

Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model

arXiv 2025

Gold-Medal-Level Olympiad Geometry Solving with Efficient Heuristic Auxiliary Constructions

arXiv 2025

Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR

arXiv 2025

Decoder-Hybrid-Decoder Architecture for Efficient Reasoning with Long Generation

arXiv 2025

Rho-1: Not All Tokens Are What You Need

arXiv 2024

Relative Preference Optimization: Enhancing LLM Alignment through Contrasting Responses across Identical and Diverse Prompts

arXiv 2024

MTL-LoRA: Low-Rank Adaptation for Multi-Task Learning

arXiv 2024

ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving

arXiv 2023

In-Context Learning Unlocked for Diffusion Models

AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning

arXiv 2023

LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models

arXiv 2023

Learning From Mistakes Makes LLM Better Reasoner

arXiv 2023

Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective

arXiv 2023

AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models

arXiv 2023

RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation

arXiv 2023

CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing

arXiv 2023

Supervised Knowledge Makes Large Language Models Better In-context Learners

arXiv 2023

AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators

arXiv 2023

Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer

arXiv 2022

GENIUS: Sketch-based Language Model Pre-training via Extreme and Selective Masking for Text Generation and Augmentation

arXiv 2022

Less is More: Task-aware Layer-wise Distillation for Language Model Compression

arXiv 2022

OmniTab: Pretraining with Natural and Synthetic Data for Few-shot Table-based Question Answering

Mixing and Shifting: Exploiting Global and Local Dependencies in Vision MLPs

arXiv 2022

Diffusion-GAN: Training GANs with Diffusion

arXiv 2022

LoRA: Low-Rank Adaptation of Large Language Models

TAPEX: Table Pre-training via Learning a Neural SQL Executor

Adversarial Retriever-Ranker for Dense Text Retrieval

DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models
