Wanjia Zhao
- Papers: 8
Authored papers (8)
Rethinking Memory Mechanisms of Foundation Agents in the Second Half: A Survey
arXiv 2026
CopT: Contrastive On-Policy Thinking with Continuous Spaces for General and Agentic Reasoning
arXiv 2026
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
preprint
OpenThoughts: Data Recipes for Reasoning Models
arXiv 2025
DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning for Subgoal Decomposition
arXiv 2025
SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning
arXiv 2025
DeepSeek-V3 Technical Report
arXiv 2024
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search
arXiv 2024
Frequent co-authors
10 from 8 papers