
Functional Clustering of Survival Data via Smoothed Log-Hazard Trajectories: A Risk-Dynamics Perspective


Year: 2026
License: CC-BY-4.0

Full text: arxiv.org/abs/2606.01239 (CC-BY-4.0)

Abstract

This paper investigates clustering in survival data by shifting the analytical focus from cumulative survival probabilities to instantaneous risk, as characterized by the hazard function. We model smoothed log-hazard trajectories as functional objects that capture the temporal evolution of risk and propose a clustering framework based on Functional Principal Component Analysis applied to B-spline smoothed log-hazard trajectories. The number of retained functional principal components is selected before clustering using a 95% cumulative explained-variance rule, and clustering is then performed on the unstandardized FPCA scores. The proposed methodology is evaluated through simulation studies covering progressively complex scenarios, including overlapping and crossing hazard functions, cohort imbalance, heterogeneous risk profiles, and outlier contamination. The framework is further illustrated on two real-world clinical datasets, the German Breast Cancer Study and the Primary Biliary Cirrhosis dataset. Results show that the proposed log-hazard-based functional clustering framework provides an interpretable representation of relative temporal risk dynamics, with competitive internal cohesion and explicit robustness diagnostics when compared with cumulative-survival-based benchmarks.
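The pipeline described in the abstract (FPCA on smoothed log-hazard curves, a 95% cumulative explained-variance rule, then clustering on unstandardized scores) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the log-hazard trajectories have already been B-spline smoothed and evaluated on a common uniform time grid, it uses a discretized FPCA via SVD, and it uses k-means as the clustering step, which the abstract does not specify. The function names `fpca_scores` and `kmeans`, and the synthetic crossing-hazard data, are hypothetical.

```python
import numpy as np

def fpca_scores(curves, grid, var_threshold=0.95):
    """Discretized FPCA for curves (n_subjects x n_gridpoints) on a uniform grid."""
    dt = grid[1] - grid[0]
    centered = curves - curves.mean(axis=0)
    # Right singular vectors of the centered matrix approximate the eigenfunctions.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    # Smallest number of components whose cumulative variance reaches the threshold.
    n_comp = int(np.searchsorted(np.cumsum(explained), var_threshold)) + 1
    n_comp = min(n_comp, len(s))
    eigenfunctions = vt[:n_comp] / np.sqrt(dt)  # approximately unit L2 norm
    # Unstandardized scores: numerical inner product of each curve with each eigenfunction.
    scores = centered @ eigenfunctions.T * dt
    return scores, eigenfunctions, explained[:n_comp]

def kmeans(x, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm (assumed clustering step, not taken from the paper)."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Synthetic example: two groups with crossing log-hazards (illustrative values only).
rng = np.random.default_rng(42)
grid = np.linspace(0.1, 5.0, 50)
n_per_group = 30
# Group A: increasing log-hazard; group B: decreasing log-hazard.
a_int = rng.normal(-2.0, 0.1, n_per_group)
a_slope = rng.normal(0.5, 0.05, n_per_group)
b_int = rng.normal(0.0, 0.1, n_per_group)
b_slope = rng.normal(-0.5, 0.05, n_per_group)
curves = np.vstack([
    a_int[:, None] + a_slope[:, None] * grid,
    b_int[:, None] + b_slope[:, None] * grid,
])

scores, eigenfunctions, explained = fpca_scores(curves, grid)
labels, centers = kmeans(scores, k=2)
print("components retained:", scores.shape[1])
print("cluster sizes:", np.bincount(labels, minlength=2))
```

Because the scores are left unstandardized, components with larger variance carry more weight in the Euclidean distances that drive the clustering, which is consistent with the abstract's choice of unstandardized FPCA scores.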