Rooted Tree Distributions

Why

We consider distributions over products of several finite sets. These distributions have many outcomes, and so tabulating the probabilities associated with each outcome is usually laborious, and most often infeasible. So instead of working with all such distributions, we work with those for which the probabilities can be expressed by tabulating fewer numbers.

Definition

A distribution over a product of $n$ finite sets has $n$ one-variable marginals (one for each component) and $n(n-1)$ two-variable conditionals (one for each component given each other component). We will be interested in those distributions over this product which can be written as a product of these marginals and conditionals.

A distribution over a product of $n$ finite (non-empty) sets factors according to a rooted tree on $\set{1, \dots , n}$ if the probability of each outcome can be expressed as the product of the marginal probability of the component corresponding to the root of the tree and the conditionals corresponding to the edges in the tree.

Notation

Let $A_1, \dots , A_n$ be finite sets and define $A = \prod_{i = 1}^{n}$. Let $p: A \to [0, 1]$ be a distribution. If for every $a \in A$, $$p(a) = p_k(a_k)∏_{(i, j) ∈ E} p_{i |j}(a_i, a_j)$$ where $k \in \set{1, \dots , n}$, $E \subset \set{1, \dots , n}^2$ and $(\set{1, \dots , n}, E)$ is a tree, then $p$ factors according to that same tree rooted at vertex $k$.

Example

A rooted tree.

Let $T = (\set{1, 2, 3, 4, 5, 6}, \set{\set{1, 2}, \set{2, 3}, \set{2, 4}, \set{4, 5}, \set{4, 6}})$ (visualized above). Root $T$ at vertex 1. Then by chain rule $p$ always satisfies

\[ p = p_{6 \mid 5, 4, 3, 2, 1}p_{5 \mid 4, 3, 2, 1}p_{4 \mid 3, 2, 1} p_{3 \mid 2, 1}p_{2 \mid 1}. \]

If $p$ factors according to $T$ rooted at vertex 1 then it satisfies

\[ p = p_{6 \mid 4}p_{5 \mid 4}p_{4 \mid 2}p_{3 \mid 2}p_{2 \mid 1}. \]

So, for instance,

\[ p_{6 \mid 5, 4, 3, 2, 1} = p_{6 \mid 4}. \]

In other words, this particular conditional distribution does not depend on $a_1$, $a_2$, $a_3$ or $a_5$. Similarly for $p_{5 \mid 4}$, $p_{4 \mid 2}$ and $p_{3 \mid 2}$.