QBioFusion-QSAR: Morgan-Anchored Quantum Multiple Kernel Learning for Small-Data Ligand Classification

Year: 2026
Abstract & full text: arxiv.org/abs/2606.21213 (CC-BY-NC-SA-4.0)

Abstract

Small quantitative structure-activity relationship (QSAR) studies are difficult when close molecular analogues have different activity labels. This paper asks whether a quantum kernel can add similarity information to a Morgan/Tanimoto fingerprint model, and which molecules account for the change. QBioFusion-QSAR uses quantum multiple kernel learning (QMKL): a support vector machine combines a Morgan/Tanimoto kernel with a quantum fidelity kernel constructed from fold-local components derived from RDKit and Mordred descriptors and Deep-PK features. Linear and radial basis function descriptor kernels are included as classical controls. On the 54-molecule PsychLight-A benchmark, Morgan/Tanimoto was the strongest single representation. In the primary stratified five-fold evaluation, QMKL increased accuracy from 0.815 to 0.833 and Matthews correlation coefficient (MCC) from 0.613 to 0.645. Matched-regularization auditing attributed the change to N-Me-5-HT and N-Me-tryptamine changing from false-negative to true-positive predictions; activity-cliff subset MCC increased from 0.07 to 0.22. Repeating the five-fold protocol over ten random partitionings showed that learned QMKL did not exceed Morgan/Tanimoto on mean MCC; paired held-out bootstrap intervals for the matched comparison also span zero. These results support QBioFusion-QSAR as an auditable QMKL framework for identifying localized residual quantum-kernel contributions in small-data, activity-cliff-aware ligand classification.
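
The kernel-fusion idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and it rests on several assumptions that are not in the source: fingerprints are represented as sets of on-bit indices, the quantum fidelity kernel is simulated classically for a simple product-state RY angle encoding (where the fidelity has the closed form of a product of cos^2((x_i - y_i)/2) terms), and the mixing weight is fixed by hand rather than learned. The fold-local feature construction and the SVM itself are omitted; the combined Gram matrix is what a precomputed-kernel SVM would take as input. All function names and toy values are hypothetical.

```python
import math


def tanimoto_kernel(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 1.0  # convention chosen here: two empty fingerprints count as identical
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def fidelity_kernel(x, y):
    """State fidelity |<psi(x)|psi(y)>|^2 for an ASSUMED product-state RY encoding.

    Each feature x_i is encoded as cos(x_i/2)|0> + sin(x_i/2)|1>, so the overlap
    of one qubit pair is cos((x_i - y_i)/2) and the fidelity is the product of squares.
    """
    value = 1.0
    for xi, yi in zip(x, y):
        value *= math.cos((xi - yi) / 2.0) ** 2
    return value


def gram(items, kernel):
    """Full Gram matrix of a kernel over a list of items."""
    return [[kernel(u, v) for v in items] for u in items]


def combine(k_a, k_b, weight):
    """Convex combination weight * k_a + (1 - weight) * k_b of two Gram matrices."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [
        [weight * a + (1.0 - weight) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(k_a, k_b)
    ]


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Toy inputs (hypothetical, not taken from the PsychLight-A benchmark).
fingerprints = [{1, 4, 7}, {1, 4, 9}, {2, 3}]
angles = [[0.1, 1.2], [0.2, 1.0], [2.5, 0.3]]

k_morgan = gram(fingerprints, tanimoto_kernel)
k_quantum = gram(angles, fidelity_kernel)
k_mix = combine(k_morgan, k_quantum, weight=0.7)  # fixed weight, not learned
```

A convex combination of two positive semidefinite Gram matrices is itself positive semidefinite, so `k_mix` remains a valid kernel for an SVM. In a cross-validated setting, any scaling or dimensionality reduction applied to the descriptor features before the fidelity kernel would need to be fitted on the training folds only, which matches the "fold-local" wording of the abstract.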