Fuzheng Zhang
- Papers: 17
Authored papers (17)
Agentic Reinforced Policy Optimization
arXiv 2025
Agentic Entropy-Balanced Policy Optimization
arXiv 2025
RLEP: Reinforcement Learning with Experience Replay for LLM Reasoning
arXiv 2025
ASPO: Asymmetric Importance Sampling Policy Optimization
arXiv 2025
CE-GPPO: Controlling Entropy via Gradient-Preserving Clipping Policy Optimization in Reinforcement Learning
arXiv 2025
Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization
arXiv 2025
Stabilizing Knowledge, Promoting Reasoning: Dual-Token Constraints for RLVR
arXiv 2025
TUNA: Comprehensive Fine-grained Temporal Understanding Evaluation on Dense Dynamic Videos
arXiv 2025
Modality Curation: Building Universal Embeddings for Advanced Multimodal Information Retrieval
arXiv 2025
Mavors: Multi-granularity Video Representation for Multimodal Large Language Model
arXiv 2025
VidCapBench: A Comprehensive Benchmark of Video Captioning for Controllable Text-to-Video Generation
arXiv 2025
Leanabell-Prover: Posttraining Scaling in Formal Reasoning
arXiv 2025
Capybara-OMNI: An Efficient Paradigm for Building Omni-Modal Language Models
arXiv 2025
TEMPLE: Temporal Preference Learning of Video LLMs via Difficulty Scheduling and Pre-SFT Alignment
arXiv 2025
Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint
arXiv 2024
Breaking the Stage Barrier: A Novel Single-Stage Approach to Long Context Extension for Large Language Models
arXiv 2024
ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer
ACL 2021 5
Frequent co-authors
10 from 17 papers