\(\DeclarePairedDelimiterX{\Set}[2]{\{}{\}}{#1 \nonscript\;\delimsize\vert\nonscript\; #2}\) \( \DeclarePairedDelimiter{\set}{\{}{\}}\) \( \DeclarePairedDelimiter{\parens}{\left(}{\right)}\) \(\DeclarePairedDelimiterX{\innerproduct}[1]{\langle}{\rangle}{#1}\) \(\newcommand{\ip}[1]{\innerproduct{#1}}\) \(\newcommand{\bmat}[1]{\left[\hspace{2.0pt}\begin{matrix}#1\end{matrix}\hspace{2.0pt}\right]}\) \(\newcommand{\barray}[1]{\left[\hspace{2.0pt}\begin{matrix}#1\end{matrix}\hspace{2.0pt}\right]}\) \(\newcommand{\mat}[1]{\begin{matrix}#1\end{matrix}}\) \(\newcommand{\pmat}[1]{\begin{pmatrix}#1\end{pmatrix}}\) \(\newcommand{\mathword}[1]{\mathop{\textup{#1}}}\)
Needs:
Optimal Tree Density Approximators
Tree Normals
Normal Conditionals
Needed by:
Maximum Likelihood Tree Normals

Tree Approximators of a Normal

Why

What is the optimal tree approximator of a multivariate normal density?

Result

Let $g: \R ^d \to \R $ be a normal density with mean $\mu \in \R ^d$ and covariance $\Sigma \in \mathbf{S} ^d_{++}$. Let $T$ be an optimal approximator tree rooted at vertex $1$, and let $\pa{i}$ denote the parent of $i$ in $T$ ($i = 2, \dots , d$). Then the normal density $f^*_T: \R ^d \to \R $ with mean $\mu $ and precision matrix $P$ defined by

  • $P_{11} = \Sigma _{11}^{-1} + \sum_{\pa{j} = 1} \Sigma _{j1}^2\Sigma _{11}^{-2}\Sigma _{j\mid 1}^{-1}$
  • for $i = 2, \dots , d$, $P_{ii} = \Sigma _{i\mid\pa{i}}^{-1} + \sum_{\pa{j} = i} \Sigma _{ji}^2\Sigma _{ii}^{-2}\Sigma _{j\mid i}^{-1}$
  • for $i, j = 1, \dots , d$ with $i = \pa{j}$, $P_{ij} = P_{ji} = -\Sigma _{ji}\Sigma _{ii}^{-1}\Sigma _{j \mid i}^{-1}$
  • $P_{ij} = 0$ for every other pair $i \neq j$
is an optimal tree approximator of $g$. Here $\Sigma _{j \mid i} = \Sigma _{jj} - \Sigma _{ji}^2\Sigma _{ii}^{-1}$ is the conditional variance of component $j$ given component $i$.
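
The entries of $P$ can be assembled numerically. The following is a minimal sketch in Python with NumPy and is not part of the original sheet: the function name `tree_precision`, the zero-based indexing, and the convention that the root has parent `-1` are assumptions made for illustration. The tree is supplied by hand here rather than chosen optimally.

```python
import numpy as np

def tree_precision(sigma, parent):
    """Assemble the precision matrix P of the tree normal.

    sigma  : (d, d) covariance matrix.
    parent : parent[i] is the parent of vertex i; the root has parent -1
             (a convention assumed for this sketch).
    """
    sigma = np.asarray(sigma, dtype=float)
    d = sigma.shape[0]
    P = np.zeros((d, d))
    for i in range(d):
        p = parent[i]
        if p < 0:
            # Root term: Sigma_11^{-1}
            P[i, i] += 1.0 / sigma[i, i]
            continue
        # Conditional variance Sigma_{i|pa(i)} and regression coefficient
        cond_var = sigma[i, i] - sigma[i, p] ** 2 / sigma[p, p]
        b = sigma[i, p] / sigma[p, p]
        P[i, i] += 1.0 / cond_var       # own term of vertex i
        P[p, p] += b ** 2 / cond_var    # child's contribution to the parent
        P[i, p] -= b / cond_var         # off-diagonal entry on the tree edge
        P[p, i] -= b / cond_var
    return P

# Hypothetical example: the chain 0 -> 1 -> 2 on a 3 x 3 covariance.
sigma = np.array([[2.0, 0.6, 0.3],
                  [0.6, 1.0, 0.4],
                  [0.3, 0.4, 1.5]])
parent = [-1, 0, 1]
P = tree_precision(sigma, parent)
S = np.linalg.inv(P)  # covariance of the tree normal
```

In this example `P[0, 2]` is zero because vertices 0 and 2 are not joined by a tree edge, and `S` agrees with `sigma` on the diagonal and on the two tree edges, since the tree normal keeps the marginals of $g$ on each edge.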

First, using Proposition 1 of Best Tree Density Approximators, express an optimal tree approximator of $g$ as

\[ (1/c) \exp\left( -\frac{1}{2} \left( \Sigma _{11}^{-1}\bar{x}_1^2 + \sum_{i \neq 1} \left(\bar{x}_i - \Sigma _{i,\pa{i}}\Sigma _{\pa{i},\pa{i}}^{-1}\bar{x}_{\pa{i}}\right)^2\Sigma _{i\mid\pa{i}}^{-1} \right) \right) \]

where $\bar{x}_i = x_i - \mu _i$ and $c = \sqrt{(2\pi )^d\Sigma _{11} \prod_{i \neq 1} \Sigma _{i \mid \pa{i}}}$.

Second, express the quadratic in the exponential as

\[ \Sigma _{11}^{-1}\bar{x}_1^2 + \sum_{i \neq 1} \left[ \Sigma _{i\mid\pa{i}}^{-1} \bar{x}_i^2 - 2 \Sigma _{i,\pa{i}}\Sigma _{\pa{i},\pa{i}}^{-1}\Sigma _{i\mid\pa{i}}^{-1} \bar{x}_i\bar{x}_{\pa{i}} + \Sigma _{i,\pa{i}}^2\Sigma _{\pa{i},\pa{i}}^{-2}\Sigma _{i\mid\pa{i}}^{-1} \bar{x}_{\pa{i}}^2 \right] \]

Collecting the coefficients of each $\bar{x}_i^2$ and of each product $\bar{x}_i\bar{x}_{\pa{i}}$ shows that, with $P$ defined as earlier, the above equals $\bar{x}^{\tp} P \bar{x}$.
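
As an illustration (a hypothetical example, not taken from the original sheet), suppose $d = 3$ and the tree is the chain with $\pa{2} = 1$ and $\pa{3} = 2$. Reading the coefficients off the expanded quadratic gives

\[ P = \begin{pmatrix} \Sigma _{11}^{-1} + \Sigma _{21}^2\Sigma _{11}^{-2}\Sigma _{2\mid 1}^{-1} & -\Sigma _{21}\Sigma _{11}^{-1}\Sigma _{2\mid 1}^{-1} & 0 \\ -\Sigma _{21}\Sigma _{11}^{-1}\Sigma _{2\mid 1}^{-1} & \Sigma _{2\mid 1}^{-1} + \Sigma _{32}^2\Sigma _{22}^{-2}\Sigma _{3\mid 2}^{-1} & -\Sigma _{32}\Sigma _{22}^{-1}\Sigma _{3\mid 2}^{-1} \\ 0 & -\Sigma _{32}\Sigma _{22}^{-1}\Sigma _{3\mid 2}^{-1} & \Sigma _{3\mid 2}^{-1} \end{pmatrix} \]

The zero in position $(1, 3)$ reflects that $1$ and $3$ are not joined by an edge of the tree.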

Third, note that $c$ is $\sqrt{(2\pi )^d\det P^{-1}}$ since $f^*_T$ is a density and so integrates to one.
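
The same determinant can be checked directly. The following sketch uses auxiliary matrices $B$ and $D$, which are introduced here for illustration and are not part of the original notation. Let $B$ be the $d \times d$ matrix with $B_{i,\pa{i}} = \Sigma _{i,\pa{i}}\Sigma _{\pa{i},\pa{i}}^{-1}$ for $i \neq 1$ and zeros elsewhere, and let $D$ be the diagonal matrix with $D_{11} = \Sigma _{11}$ and $D_{ii} = \Sigma _{i\mid\pa{i}}$ for $i \neq 1$. Then

\[ P = (I - B)^{\tp} D^{-1} (I - B), \qquad \det P = \det(I - B)^2 \det D^{-1} = \Sigma _{11}^{-1} \prod_{i \neq 1} \Sigma _{i\mid\pa{i}}^{-1} \]

The last equality holds because relabeling the vertices so that parents precede children makes $B$ strictly lower triangular, so $\det(I - B) = 1$. Hence $\det P^{-1} = \Sigma _{11} \prod_{i \neq 1} \Sigma _{i \mid \pa{i}}$, which agrees with the expression for $c$ given in the first step.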

Notice that $f^*_T$ is a tree normal density.

Empirical normal

In particular, we can approximate the empirical normal density of a dataset by a tree normal density, that is, a normal density that factors according to a tree.
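
A minimal sketch of this use, in Python with NumPy, under stated assumptions: the empirical normal is taken to be the normal density with the sample mean and the $1/n$-normalized sample covariance, the dataset is synthetic, and the tree is fixed by hand rather than selected optimally. The function name `tree_normal_logpdf` and the root convention `parent[i] = -1` are hypothetical. The log-density is evaluated from the tree factorization itself, one conditional term per vertex.

```python
import numpy as np

def tree_normal_logpdf(x, mu, sigma, parent):
    """Log-density of the tree normal at x, summed over the tree factors.

    parent[i] is the parent of vertex i; the root has parent -1
    (a convention assumed for this sketch).
    """
    xbar = np.asarray(x, dtype=float) - mu
    logp = 0.0
    for i, p in enumerate(parent):
        if p < 0:
            # Root factor: marginal normal with variance Sigma_ii
            var, resid = sigma[i, i], xbar[i]
        else:
            # Conditional factor of x_i given x_pa(i)
            var = sigma[i, i] - sigma[i, p] ** 2 / sigma[p, p]
            resid = xbar[i] - sigma[i, p] / sigma[p, p] * xbar[p]
        logp += -0.5 * (np.log(2 * np.pi * var) + resid ** 2 / var)
    return logp

# Synthetic dataset with chain-like dependence (illustrative only).
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 3))
data[:, 1] += 0.8 * data[:, 0]
data[:, 2] += 0.5 * data[:, 1]

mu_hat = data.mean(axis=0)                          # sample mean
sigma_hat = np.cov(data, rowvar=False, bias=True)   # 1/n sample covariance
parent = [-1, 0, 1]                                 # hand-picked chain 0 -> 1 -> 2
value = tree_normal_logpdf(data[0], mu_hat, sigma_hat, parent)
```

Only the variances and the covariances along tree edges of `sigma_hat` enter the computation, so the remaining entries of the empirical covariance never need to be estimated.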

Copyright © 2023 The Bourbaki Authors. All rights reserved. Version 13a6779cc