Investigating ECG Diagnosis with Ambiguous Labels using Partial Label Learning

Label ambiguity is an inherent and largely unaddressed challenge in real-world electrocardiogram (ECG) diagnosis, arising from overlapping conditions and diagnostic disagreements. Yet current ECG models are trained under the assumption of clean, unambiguous annotations, which limits both the development and the meaningful evaluation of models under real-world conditions. Although Partial Label Learning (PLL) frameworks are designed to learn from ambiguous labels, their effectiveness in medical time-series domains, particularly ECG, remains underexplored. We present the first systematic study of PLL methods for ECG diagnosis under both real and controlled ambiguity. First, we adapt nine PLL algorithms to multi-label ECG diagnosis under label ambiguity and evaluate them in detail in real clinical settings with multi-annotator diagnostic disagreements. Next, to study the effects of PLL on ECG in more depth under controlled settings, we introduce a diverse set of clinically motivated synthetic label ambiguities. Our experiments demonstrate that PLL methods vary substantially in robustness across ambiguity types and levels. Moreover, we observe that PLL generally outperforms standard supervised training under label ambiguity, highlighting the value of such frameworks. Through extensive analysis, we identify key limitations of current PLL approaches in clinical settings and outline future directions for developing robust, clinically aligned, ambiguity-aware learning frameworks for ECG diagnosis.