Zhongyuan Peng
- Papers: 11
Authored papers (11)
Retrieval-Infused Reasoning Sandbox: A Benchmark for Decoupling Retrieval and Reasoning Capabilities
arXiv 2026
CoDiQ: Test-Time Scaling for Controllable Difficult Question Generation
arXiv 2026
SCALER: Synthetic Scalable Adaptive Learning Environment for Reasoning
arXiv 2026
FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models
arXiv 2025
CodeCriticBench: A Holistic Code Critique Benchmark for Large Language Models
arXiv 2025
CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization
arXiv 2025
Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?
arXiv 2025
IV-Bench: A Benchmark for Image-Grounded Video Perception and Reasoning in Multimodal LLMs
arXiv 2025
A Comparative Study on Reasoning Patterns of OpenAI's o1 Model
arXiv 2024
MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models
arXiv 2024
RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models
arXiv 2023
Frequent co-authors
10 from 11 papers