\(\DeclarePairedDelimiterX{\Set}[2]{\{}{\}}{#1 \nonscript\;\delimsize\vert\nonscript\; #2}\) \( \DeclarePairedDelimiter{\set}{\{}{\}}\) \( \DeclarePairedDelimiter{\parens}{\left(}{\right)}\) \(\DeclarePairedDelimiterX{\innerproduct}[1]{\langle}{\rangle}{#1}\) \(\newcommand{\ip}[1]{\innerproduct{#1}}\) \(\newcommand{\bmat}[1]{\left[\hspace{2.0pt}\begin{matrix}#1\end{matrix}\hspace{2.0pt}\right]}\) \(\newcommand{\barray}[1]{\left[\hspace{2.0pt}\begin{matrix}#1\end{matrix}\hspace{2.0pt}\right]}\) \(\newcommand{\mat}[1]{\begin{matrix}#1\end{matrix}}\) \(\newcommand{\pmat}[1]{\begin{pmatrix}#1\end{pmatrix}}\) \(\newcommand{\mathword}[1]{\mathop{\textup{#1}}}\)
Needs:
Distribution Approximators
Tree Distributions
Relative Entropy
Needed by:
Optimal Tree Distribution Approximators
Tree Density Approximators
Links:
Sheet PDF
Graph PDF

Tree Distribution Approximators

Why

We want to approximate a given distribution with one which factors according to a tree.

Definition

Given $q: A \to [0, 1]$, we want to find a distribution $p$ on $A$ and tree $T$ on $\set{1, \dots , n}$ to

\[ \begin{aligned} \text{minimize} & \quad d_{kl}(q, p) \\ \text{subject to} & \quad p \text{ factors according to } T. \end{aligned} \]

where $d_{kl}$ is the relative entropy as a criterion of approximation. We call such a distribution a tree distribution approximator (or tree approximator) and we call the tree the approximator tree.

Result

Let $A_1, \dots , A_n$ be finite non-empty sets. Define $A = \prod_{i = 1}^{n} A_i$. Let $q: A \to [0, 1]$ a distribution and $T$ a tree on $\set{1, \dots , n}$. The distribution $p^*_T: A \to [0, 1]$ defined by

\[ p^*_T = q_1 \prod_{i \neq 1} q_{i \mid \pa{i}} \]

minimizes the relative entropy with $q$ among all distributions on $A$ which factor according to $T$.

Let $p: A \to [0, 1]$ be a distribution which factors according to $T$. First, express

\[ p = p_1 \prod_{i \neq i} p_{i \mid \pa_i} \]

where $\pa_i$ is the parent of vertex $i$ in $T$ rooted at vertex 1 ($i = 2, \dots , n$).

Second, recall that the relative entropy of $q$ with $p$ is $H(q, p) - H(q)$. Since $H(q)$ does not depend on $p$, $p$ is a minimizer of the relative of $q$ with $p$ if and only if $p$ is a minimizer of $H(q, p)$.

Third, express

\[ \begin{aligned} H(q, p) &= - \sum_{a \in A} q(a) \log p(a) \\ &= - \sum_{a \in A} q(a) (\log p_1(a_1) + \sum_{i \neq 1} \log p_{i \mid \pa_i}(a_i, a_{\pa_i})) \\ &= H(q_1, p_1) + \sum_{i \neq 1} \sum_{a_{\pa_i} \in A_{\pa_i}} q_{\pa_i}(a_{\pa_i})H(q_{i\mid\pa_i}(\cdot , a_{\pa_i}), p_{i\mid\pa_i}(\cdot , a_{\pa_i})) \end{aligned} \]

which separates across $p_1$ an $p_{i \mid \pa_i}(\cdot , a_{\pa_i})$ for $i = 2, \dots , n$ and $a_{pa_i} \in A_{\pa_i}$.

Fourth, recall $H(\cdot , \cdot ) \geq 0$ and is zero on repeated pairs. By this, we mean, for example, $H(p_1, p_1) = 0$. So $p_1 = q_1$ and $p_{i \mid \pa_i} = q_{i \mid \pa_i}$ are solutions.

The foregoing proposition states the form of an optimal approximator given a tree. A natural next question is to select the tree.

Copyright © 2023 The Bourbaki Authors — All rights reserved — Version 13a6779cc About Show the old page view