LLM Trainer RL Env (Community)
AlphaZero-inspired Monte Carlo tree search (MCTS) environment for training LLMs through tree search and policy learning, with teacher-ensemble guidance and dual reward systems.
- Type: RL Env
- License: unknown
- Size: v0.1.50
- Published: Oct 2025
Lift evidence
| Eval | Tools known to lift | Source paper |
|---|---|---|
| GSM8K | LLM Trainer RL Env (Community) | - |