Tong Zhang
- Papers: 52
Authored papers (52)
Code as Agent Harness
arXiv 2026
Channel-wise Vector Quantization
arXiv 2026
Orchard: An Open-Source Agentic Modeling Framework
arXiv 2026
Recursive Multi-Agent Systems
arXiv 2026
AgentSPEX: An Agent SPecification and EXecution Language
arXiv 2026
GUI-Libra: Training Native GUI Agents to Reason and Act with Action-aware Supervision and Partially Verifiable RL
arXiv 2026
PRL: Process Reward Learning Improves LLMs' Reasoning Ability and Broadens the Reasoning Boundary
arXiv 2026
EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents
arXiv 2025
Thinking vs. Doing: Agents that Reason by Scaling Test-Time Interaction
arXiv 2025
A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce
arXiv 2025
GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents
arXiv 2025
RM-R1: Reward Modeling as Reasoning
arXiv 2025
MMHCL: Multi-Modal Hypergraph Contrastive Learning for Recommendation
arXiv 2025
Monte Carlo Diffusion for Generalizable Learning-Based RANSAC
arXiv 2025
LongCat-Video Technical Report
arXiv 2025
Chain-of-Experts: Unlocking the Communication Power of Mixture-of-Experts Models
arXiv 2025
Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training
arXiv 2025
GAR: Generative Adversarial Reinforcement Learning for Formal Theorem Proving
arXiv 2025
Self-rewarding correction for mathematical reasoning
arXiv 2025
MA-LoT: Multi-Agent Lean-based Long Chain-of-Thought Reasoning enhances Formal Theorem Proving
arXiv 2025
Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL
arXiv 2025
Prioritizing Image-Related Tokens Enhances Vision-Language Pre-Training
arXiv 2025
Adapt-Pruner: Adaptive Structural Pruning for Efficient Small Language Model Training
arXiv 2025
Self-Ensembling Gaussian Splatting for Few-Shot Novel View Synthesis
ICCV 2025
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning
arXiv 2024
Image Textualization: An Automatic Framework for Creating Accurate and Detailed Image Descriptions
arXiv 2024
Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards
arXiv 2024
Coherent and Multi-modality Image Inpainting via Latent Space Optimization
arXiv 2024
Personalized Visual Instruction Tuning
arXiv 2024
Leveraging Locality to Boost Sample Efficiency in Robotic Manipulation
arXiv 2024
MatchDiffusion: Training-free Generation of Match-cuts
ICCV 2025
Scaling Mesh Generation via Compressive Tokenization
CVPR 2025
TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts
arXiv 2024
Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs
arXiv 2024
SINDER: Repairing the Singular Defects of DINOv2
arXiv 2024
Entropy-Regularized Process Reward Model
arXiv 2024
TAGCOS: Task-agnostic Gradient Clustered Coreset Selection for Instruction Tuning Data
arXiv 2024
Active Prompting with Chain-of-Thought for Large Language Models
arXiv 2023
RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models
arXiv 2023
CVTHead: One-shot Controllable Head Avatar with Vertex-feature Transformer
arXiv 2023
R-Tuning: Instructing Large Language Models to Say 'I Don't Know'
arXiv 2023
Mixture-of-Domain-Adapters: Decoupling and Injecting Domain Knowledge to Pre-trained Language Models Memories
arXiv 2023
Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data
arXiv 2023
What is Essential for Unseen Goal Generalization of Offline Goal-conditioned RL?
arXiv 2023
TempSAL -- Uncovering Temporal Information for Deep Saliency Prediction
arXiv 2023
Towards Robust Offline Reinforcement Learning under Diverse Data Corruption
arXiv 2023
Mitigating the Alignment Tax of RLHF
arXiv 2023
Plum: Prompt Learning using Metaheuristic
arXiv 2023
VolRecon: Volume Rendering of Signed Ray Distance Functions for Generalizable Multi-View Reconstruction
CVPR 2023
Involution: Inverting the Inherence of Convolution for Visual Recognition
CVPR 2021
ZEN 2.0: Continue Training and Adaption for N-gram Enhanced Text Encoders
arXiv 2021
Weakly Supervised Disentangled Generative Causal Representation Learning
Affiliations
Frequent co-authors
10 from 52 papers