Algorithms for Learning Sparse Additive Models with Interactions in High Dimensions

A function f: \mathbb{R}^d \rightarrow \mathbb{R} is a Sparse Additive Model (SPAM), if it is of the form f(x) = \sum_{l \in S}ϕ_{l}(x_l) where S \subset [d], |S| \ll d. Assuming ϕ's, S to be unknown, there exists extensive work for estimating f from its samples. In this work, we consider a generalized version of SPAMs, that also allows for the presence of a sparse number of second order interaction terms. For some S_1 \subset [d], S_2 \subset {[d] \choose 2}, with |S_1| \ll d, |S_2| \ll d^2, the function f is now assumed to be of the form: \sum_{p \in S_1}ϕ_{p} (x_p) + \sum_{(l,l^{\prime}) \in S_2}ϕ_{(l,l^{\prime})} (x_l,x_{l^{\prime}}). Assuming we have the freedom to query f anywhere in its domain, we derive efficient algorithms that provably recover S_1,S_2 with finite sample bounds. Our analysis covers the noiseless setting where exact samples of f are obtained, and also extends to the noisy setting where the queries are corrupted with noise. For the noisy setting in particular, we consider two noise models namely: i.i.d Gaussian noise and arbitrary but bounded noise. Our main methods for identification of S_2 essentially rely on estimation of sparse Hessian matrices, for which we provide two novel compressed sensing based schemes. Once S_1, S_2 are known, we show how the individual components ϕ_p, ϕ_{(l,l^{\prime})} can be estimated via additional queries of f, with uniform error bounds. Lastly, we provide simulation results on synthetic data that validate our theoretical findings.