Attribution
Deep Policy Gradient Methods Without Batch Updates, Target Networks, or Replay Buffers
arXiv 2024
Trajectory-Aware Eligibility Traces for Off-Policy Reinforcement Learning
arXiv 2023
Scalable Real-Time Recurrent Learning Using Columnar-Constructive Networks
from 3 papers
A. Rupam Mahmood
Alireza Azimi
Brett Daley
Christopher Amato
Colin Bellinger
Fahim Shariar
Gautham Vasan
Haseeb Shah
Jiamin He
Khurram Javed