Attribution
MARBLE: Multi-Aspect Reward Balance for Diffusion RL
arXiv 2026
Alleviating Sparse Rewards by Modeling Step-Wise and Long-Term Sampling Effects in Flow-Based GRPO
from 2 papers
Canyu Zhao
Chunhua Shen
Hao Chen
Hao Jiang
Hongwei Zhang
Jiacheng Li
Jiamang Wang
Jinlong Liu
Ju Huang
Mushui Liu