An AI Co-Data-Scientist for Prioritizing Candidate Biomarkers from Wearable Sensor Data

Wearable devices generate continuous physiological and behavioral data, but converting these signals into clinically reviewable biomarker hypotheses remains labor-intensive. We introduce CoDaS, an AI co-data-scientist that integrates multi-agent hypothesis generation, deterministic statistical analysis, adversarial validation and literature-grounded interpretation under human oversight. Across three wearable cohorts comprising 9,279 participant-observations, CoDaS prioritized candidate associations for mental-health and metabolic endpoints after internal checks for replication, stability, robustness and leakage. The system identified related circadian-instability signals associated with depression, including sleep-duration variability in DWB (ρ = 0.252, p < 0.001) and sleep-onset variability in GLOBEM (ρ = 0.126, p < 0.001), and derived a wearable cardiovascular-fitness index associated with insulin resistance (steps/resting heart rate; ρ = −0.374, p < 0.001). Adding these features to demographic models produced modest gains (ΔR² = 0.040 for depression, 0.021 for insulin resistance). In a 12-clinician review totaling approximately 25 active hours, clinician validity judgments aligned with CoDaS confidence tiers (ρ = 0.67, p = 0.005), whereas added clinical value and confidence to act were rated lower. CoDaS supports traceable, hypothesis-generating prioritization of wearable candidate biomarkers.
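
To make the reported statistic concrete, the following is a minimal sketch of how a steps/resting-heart-rate index and its rank correlation with an insulin-resistance measure could be computed. It is not the CoDaS implementation. It assumes ρ denotes Spearman's rank correlation, and the function names (`fitness_index`, `spearman_rho`, `average_ranks`), the variable `insulin_resistance_score`, and all numeric values are hypothetical illustrations rather than study data.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def fitness_index(daily_steps, resting_hr):
    """Hypothetical index: daily steps divided by resting heart rate."""
    return [s / hr for s, hr in zip(daily_steps, resting_hr)]


# Toy values for illustration only; these are NOT data from the study.
daily_steps = [3000, 5000, 8000, 10000, 12000]
resting_hr = [80, 75, 68, 62, 58]
insulin_resistance_score = [4.1, 3.5, 2.2, 2.6, 1.4]

index = fitness_index(daily_steps, resting_hr)
rho = spearman_rho(index, insulin_resistance_score)
print(f"Spearman rho = {rho:.3f}")  # about -0.9 for these toy values
```

A negative ρ here means that participants with a higher index tend to have lower insulin-resistance scores, which is the direction of the association reported above. A rank-based statistic is used in this sketch because it is insensitive to the skew typical of step counts.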