Quantum Occam Learning: Sample-Supported Expressibility for Circuit-Based Quantum Learning

A central principle in quantum machine learning is that an ansatz should be expressive enough to represent the quantum data of interest. Yet expressibility is statistically meaningful only insofar as it can be learned from finitely many copies of an unknown quantum state. In this work, we develop an information-theoretic Occam theory for quantum data generated by finite-size quantum circuits. For the class S_{n,G} of n-qubit pure states preparable with at most G two-qubit gates, a metric-entropy argument gives the realizable sample law \widetilde{Θ}(G/ε^2) in the circuit-limited regime. For an arbitrary source \hat{ρ}, we introduce the best G-gate approximation error d_G(\hat{ρ}) and the approximate circuit complexity C_η(\hat{ρ}). We prove an agnostic quantum Occam theorem: with M copies of \hat{ρ}, one can learn up to the best G-gate approximation error plus a statistical penalty \widetilde{O}(\sqrt{G/M}). We then remove the need to know G in advance through an adaptive model-selection theorem whose oracle inequality selects the circuit complexity justified by the data. Matching lower bounds yield a sample-supported expressibility law: at trace-distance accuracy ε, M samples can support only G_{\rm supported} \simeq Mε^2 gates, up to logarithmic factors and tomography saturation at 2^n. Thus, circuit complexity becomes an adaptive statistical resource rather than a static promise. Our framework turns bounded circuit complexity into a model-selection principle for quantum machine learning.
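
As a rough numerical illustration of the scaling stated above, the following Python sketch evaluates G_{\rm supported} \simeq Mε^2 with a cap at 2^n, together with the \sqrt{G/M} statistical penalty. It assumes all constants equal one and drops every logarithmic factor, so it shows orders of magnitude only; the function names `supported_gates` and `statistical_penalty` are hypothetical and are not taken from this work.

```python
import math


def supported_gates(num_copies: int, epsilon: float, num_qubits: int) -> float:
    """Heuristic gate budget G_supported ~ M * eps^2, capped at 2^n.

    Assumption: unit constants and no logarithmic factors, so the
    result is an order-of-magnitude estimate, not a sharp bound.
    """
    if num_copies <= 0 or num_qubits <= 0:
        raise ValueError("num_copies and num_qubits must be positive")
    if not 0.0 < epsilon <= 1.0:
        raise ValueError("epsilon is a trace distance and must lie in (0, 1]")
    circuit_limited = num_copies * epsilon ** 2   # M * eps^2 scaling
    saturation = 2.0 ** num_qubits                # tomography saturation at 2^n
    return min(circuit_limited, saturation)


def statistical_penalty(num_gates: float, num_copies: int) -> float:
    """Heuristic penalty sqrt(G / M), again with unit constants and no logs."""
    if num_gates < 0 or num_copies <= 0:
        raise ValueError("num_gates must be >= 0 and num_copies must be > 0")
    return math.sqrt(num_gates / num_copies)


if __name__ == "__main__":
    # Circuit-limited regime: the cap 2^n is far above M * eps^2.
    print(supported_gates(num_copies=10**6, epsilon=0.1, num_qubits=20))
    # Saturated regime: few qubits, so the cap 2^n binds.
    print(supported_gates(num_copies=10**6, epsilon=0.1, num_qubits=10))
    # Penalty paid for fitting a 100-gate model from 10^4 copies.
    print(statistical_penalty(num_gates=100, num_copies=10**4))
```

Under these simplifying assumptions, quadrupling the number of copies at fixed ε quadruples the supported gate count until the 2^n cap is reached, while the penalty at fixed G is halved.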