Black-Box Assisted Regression: Phase Transitions and Minimax Optimality

Foundation models are often used as fixed black-box predictors for downstream tasks with limited labeled data, but their predictions may be biased and unsafe to trust blindly. We study this setting through black-box assisted nonparametric regression: a learner observes labeled samples and can query a fixed predictor f_0, while the target f^* lies within an unknown radius δ of f_0 in L_2(P_X). We give a finite-sample minimax characterization showing a phase transition at δ_c(n) \asymp n^{-β/(2β+d)}, with leading risk \min\{δ^2, n^{-2β/(2β+d)}\}. We then analyze a Safe Residual Estimator: it learns a correction around f_0, initializes the residual head at zero so the initial predictor equals f_0, and uses holdout selection to revert to f_0 when the learned correction is not supported by validation data. Here, "safe" means avoiding negative transfer, in which the corrected estimator performs worse than the black-box predictor alone. The estimator matches the leading minimax term up to an additive validation-selection cost. Synthetic regression experiments verify the predicted phase transition, while CIFAR-100 with CLIP and AG News with Qwen3-8B provide practice-facing evidence that the same residual-correction tradeoff is useful beyond the formal squared-loss regression setting.
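As a rough illustration of the residual-plus-holdout idea described above, the following minimal sketch fits a correction to the residuals y - f_0(x) and keeps it only if it lowers squared error on a holdout split. It is not the paper's estimator: the function name `safe_residual_fit`, the `ridge` parameter, and the use of a ridge-regularized linear correction in place of a nonparametric residual head are all assumptions made for brevity. A zero weight vector corresponds to the zero-initialized head, for which the predictor equals f_0.

```python
import numpy as np


def safe_residual_fit(X_train, y_train, X_val, y_val, f0, ridge=1e-3):
    """Hypothetical sketch: residual correction around f0 with holdout revert.

    f0 maps an (n, d) array to an (n,) array of black-box predictions.
    Returns (predictor, used_correction).
    """
    # Fit the correction on residuals, so zero weights reproduce f0 exactly.
    r_train = y_train - f0(X_train)
    A = np.hstack([X_train, np.ones((len(X_train), 1))])  # features + intercept
    gram = A.T @ A + ridge * np.eye(A.shape[1])
    w = np.linalg.solve(gram, A.T @ r_train)

    def corrected(X):
        Xb = np.hstack([X, np.ones((len(X), 1))])
        return f0(X) + Xb @ w

    # Holdout selection: keep the correction only if it strictly helps.
    mse_f0 = np.mean((y_val - f0(X_val)) ** 2)
    mse_corrected = np.mean((y_val - corrected(X_val)) ** 2)
    used_correction = bool(mse_corrected < mse_f0)
    return (corrected if used_correction else f0), used_correction
```

The strict inequality makes f_0 the default on ties, which is one simple way to encode the revert-when-unsupported behavior; the analysis in the paper may use a different selection rule.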