We design efficient distance approximation algorithms for several classes of structured high-dimensional distributions. Specifically, we show algorithms for the following problems:

- Given sample access to two Bayesian networks $P_1$ and $P_2$ over known directed acyclic graphs $G_1$ and $G_2$ having n nodes and bounded in-degree, approximate $d_{TV}(P_1,P_2)$ to within additive error ϵ using $poly(n,ϵ)$ samples and time
- Given sample access to two ferromagnetic Ising models $P_1$ and $P_2$ on n variables with bounded width, approximate $d_{TV}(P_1,P_2)$ to within additive error ϵ using $poly(n,ϵ)$ samples and time
- Given sample access to two n-dimensional Gaussians $P_1$ and $P_2$, approximate $d_{TV}(P_1,P_2)$ to within additive error ϵ using poly(n,ϵ) samples and time
- Given access to observations from two causal models $P$ and $Q$ on n variables that are defined over known causal graphs, approximate $d_{TV}(P_a,Q_a)$ to within additive error $ϵ$ using $poly(n,ϵ)$ samples, where Pa and Qa are the interventional distributions obtained by the intervention $do(A=a)$ on $P$ and $Q$ respectively for a particular variable $A$.

Our results are the first efficient distance approximation algorithms for these well-studied problems. They are derived using a simple and general connection to distribution learning algorithms. The distance approximation algorithms imply new efficient algorithms for $tolerant$ testing of closeness of the above-mentioned structured high-dimensional distributions.

Type

Publication

In *Advances in Neural Information Processing Systems (NeurIPS)*