Early Prediction of Liver Cirrhosis Up to Two Years in Advance: A Machine Learning Study Benchmarking Against the FIB-4 and APRI Scores

Objective: To develop and evaluate machine learning (ML) models for predicting incident liver cirrhosis (LC) one and two years before diagnosis using routinely collected electronic health record (EHR) data, and to benchmark their performance against the FIB-4 and APRI clinical scores. Methods: We conducted a retrospective cohort study using de-identified EHR data from a large academic health system. Separate XGBoost models were developed for the 1- and 2-year prediction horizons, each with model-specific feature selection and Bayesian hyperparameter tuning. The models were then evaluated on held-out test sets and compared with FIB-4 and APRI using accuracy, precision, recall, F1, area under the precision-recall curve (PR AUC), and area under the receiver operating characteristic curve (AUC). Results: The final modeling cohorts included 60,481 patients for the 1-year prediction and 47,322 for the 2-year prediction. At both prediction horizons, the tuned XGBoost models outperformed FIB-4 and APRI. The XGBoost models achieved AUCs of 0.872 and 0.839 for the 1- and 2-year predictions, respectively, compared with 0.756 and 0.723 for FIB-4 and 0.798 and 0.761 for APRI. The gains were larger in PR AUC: 0.657 and 0.562 for XGBoost, compared with 0.456 and 0.373 for FIB-4 and 0.504 and 0.421 for APRI. The performance advantage persisted at the longer 2-year horizon, indicating that risk discrimination is maintained further ahead of diagnosis. Conclusions: Machine learning models leveraging routine EHR data substantially outperform the traditional FIB-4 and APRI scores for early prediction of liver cirrhosis. These models enable earlier and more accurate risk stratification and can be integrated into clinical workflows as automated decision-support tools to support proactive cirrhosis prevention and management.
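
For readers unfamiliar with the two benchmark scores, the following is a minimal sketch of how FIB-4 and APRI are conventionally computed from routine laboratory values. It uses the standard published formulas; the abstract does not state how the scores were derived in this study, so the function names, the argument names, and the default AST upper limit of normal (40 U/L) are illustrative assumptions rather than the authors' implementation.

```python
import math


def fib4(age_years: float, ast_u_l: float, alt_u_l: float, platelets_10e9_l: float) -> float:
    """FIB-4 index: (age x AST) / (platelets x sqrt(ALT)).

    AST and ALT are in U/L; platelets are in 10^9/L.
    """
    if alt_u_l <= 0 or platelets_10e9_l <= 0:
        raise ValueError("ALT and platelet count must be positive")
    return (age_years * ast_u_l) / (platelets_10e9_l * math.sqrt(alt_u_l))


def apri(ast_u_l: float, platelets_10e9_l: float, ast_uln_u_l: float = 40.0) -> float:
    """APRI: (AST / AST upper limit of normal) x 100 / platelets.

    The default upper limit of normal (40 U/L) is an assumption; laboratories
    differ, and the value used in the study is not given in the abstract.
    """
    if platelets_10e9_l <= 0 or ast_uln_u_l <= 0:
        raise ValueError("Platelet count and AST upper limit must be positive")
    return (ast_u_l / ast_uln_u_l) * 100.0 / platelets_10e9_l


# Hypothetical patient: 60 years old, AST 80 U/L, ALT 64 U/L, platelets 120 x 10^9/L.
example_fib4 = fib4(age_years=60, ast_u_l=80, alt_u_l=64, platelets_10e9_l=120)
example_apri = apri(ast_u_l=80, platelets_10e9_l=120)
print(f"FIB-4 = {example_fib4:.2f}, APRI = {example_apri:.2f}")
```

Both scores depend on only three or four inputs, whereas a model trained on routine EHR data can draw on a much wider set of features, which is the comparison the abstract reports.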