Federated Temporal Attention Intelligence for Cyber-Resilient IoMT: Lightweight Digital Twins and PPO-Driven Honeypot Deception

The rapid proliferation of Internet of Medical Things (IoMT) devices introduces critical cybersecurity vulnerabilities in healthcare environments, where resource-constrained medical devices operate under strict latency requirements and stringent data-privacy regulations. To address these challenges, this paper presents the Lightweight Digital Twin and Federated Reinforcement Learning (LDT-FRL) framework, a privacy-preserving defense architecture that integrates four complementary mechanisms: (i) a Temporal Attention Encoder (TAE), built on a GRU backbone with learned temporal self-attention, for flow-level threat classification; (ii) lightweight LSTM-based Digital Twins, trained on normal-class traffic, that generate per-device anomaly scores and gate the TAE classifier through a learned sigmoid coupling; (iii) a Federated Proximal Policy Optimization (PPO) agent that selects among ALLOW, ISOLATE, and HONEYPOT_REDIRECT actions based on a seven-dimensional state; and (iv) an intelligent honeypot layer that converts redirected suspicious traffic into actionable threat intelligence. A federated aggregation strategy weights each client's FedAvg contribution inversely to its exponential-moving-average (EMA)-smoothed validation loss, which stabilizes global model updates under non-IID client distributions. Evaluated on the CICDDoS 2019 and TON-IoT benchmarks, LDT-FRL achieves test accuracies of 99.66% and 99.95%, respectively, with macro-F1 scores of 0.9913 and 0.9995. It converges 81% faster than the DTFL-CD baseline and attains an F1 score of 1.000 on the severely imbalanced man-in-the-middle (MITM) class. Explainability analysis via SHAP, LIME, Grad-CAM, and counterfactual methods confirms that the TAE focuses on semantically meaningful flow features, providing interpretable evidence for each defense decision.
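To make the aggregation strategy concrete, the following is a minimal sketch of inverse-loss-weighted FedAvg with EMA-smoothed per-client validation losses. It is an illustration under stated assumptions, not the implementation evaluated in this paper: the function names, the smoothing factor `beta`, the stabilizing constant `eps`, and the flat-list representation of model parameters are all hypothetical choices made for the example.

```python
from typing import List, Optional


def ema_update(prev: Optional[float], new_loss: float, beta: float = 0.9) -> float:
    """Exponentially smooth a client's validation loss.

    `beta` is an assumed smoothing factor; the first observation
    initializes the running value.
    """
    if prev is None:
        return new_loss
    return beta * prev + (1.0 - beta) * new_loss


def inverse_loss_weights(smoothed_losses: List[float], eps: float = 1e-8) -> List[float]:
    """Turn smoothed losses into aggregation coefficients.

    Each client's weight is proportional to 1 / (loss + eps), then
    normalized to sum to 1, so lower-loss clients contribute more.
    `eps` is an assumed constant that guards against division by zero.
    """
    inverse = [1.0 / (loss + eps) for loss in smoothed_losses]
    total = sum(inverse)
    return [value / total for value in inverse]


def aggregate(client_params: List[List[float]], weights: List[float]) -> List[float]:
    """Weighted average of client parameter vectors (FedAvg-style).

    Parameters are flat lists of floats here purely for illustration.
    """
    n_params = len(client_params[0])
    return [
        sum(w * params[i] for w, params in zip(weights, client_params))
        for i in range(n_params)
    ]


if __name__ == "__main__":
    # Two hypothetical clients: the second reports a higher validation loss.
    smoothed = [ema_update(None, 0.1), ema_update(None, 0.3)]
    weights = inverse_loss_weights(smoothed)
    global_params = aggregate([[0.0, 0.0], [1.0, 1.0]], weights)
    print(weights, global_params)
```

In this sketch, smoothing the losses before inverting them keeps a single noisy validation round from sharply changing a client's weight, which is the stabilizing effect the aggregation strategy aims for under non-IID client data.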