Foundation vs. Specialized Models: Evaluating Catastrophic Forgetting in Continual Time Series Forecasting

While Time Series Foundation Models (TSFMs) excel at zero-shot tasks, their behavior under continual fine-tuning is poorly understood. We present the first systematic study of catastrophic forgetting in TSFMs (TimesFM-2.0, Chronos-2) versus a specialized SamFormer model across synthetic and real-world energy forecasting benchmarks. Our results show that although fine-tuning improves accuracy on the new task, it consistently triggers forgetting of earlier tasks, with larger models exhibiting greater inherent robustness. Notably, forgetting mitigation techniques such as DER level the playing field: they provide disproportionate gains to smaller models, allowing them to match TSFM performance by the end of the continual learning sequence. These findings suggest that in realistic, non-stationary scenarios, the high computational cost of large foundation models may not be justified over smaller models equipped with effective mitigation strategies.