0

BlackGoose Rimer: Harnessing RWKV-7 as a Simple yet Superior Replacement for Transformers in Large-Scale Time Series Modeling

A novel approach using RWKV-7 with meta-learning improves the performance of transformer-based time series models, achieving significant reductions in training time and parameter use.

Year
2025
Venue
arXiv 2025
Authors
2
Hosting
Abstract onlyARXIV-DEFAULT

Cite

Notes

Only stored in your browser.

Attribution

Abstract & full text
arxiv.org/abs/2503.06121ARXIV-DEFAULT
TL;DR
Semantic Scholar
Attribution policy →

Abstract

Time series models face significant challenges in scaling to handle large and complex datasets, akin to the scaling achieved by large language models (LLMs). The unique characteristics of time series data and the computational demands of model scaling necessitate innovative approaches. While researchers have explored various architectures such as Transformers, LSTMs, and GRUs to address these challenges, we propose a novel solution using RWKV-7, which incorporates meta-learning into its state update mechanism. By integrating RWKV-7's time mix and channel mix components into the transformer-based time series model Timer, we achieve a substantial performance improvement of approximately 1.13 to 43.3x and a 4.5x reduction in training time with 1/23 parameters, all while utilizing fewer parameters. Our code and model weights are publicly available for further research and development at https://github.com/Alic-Li/BlackGoose_Rimer.

Authors

2