We want to estimate a random vector $x: \Omega \to \R ^d$ from a random vector $y: \Omega \to \R ^n$.
Denote by $g: \R ^d \times \R ^n \to \R $ the joint density for $(x, y)$, by $g_x$ and $g_y$ the marginal densities of $x$ and $y$, by $g_{x \mid y}: \R ^d \times \R ^n \to \R $ the conditional density for $x$ given $y$, and by $g_{y \mid x}$ the conditional density for $y$ given $x$. In this setting, $g_{x \mid y}$ is called the posterior density, $g_{x}$ the prior density, $g_{y \mid x}$ the likelihood density, and $g_{y}$ the marginal likelihood density.
As usual (and assuming $g_{y} > 0$), the
posterior is related to the prior, likelihood,
and marginal likelihood by
\[
g_{x \mid y} \equiv \frac{g_x g_{y \mid x}}{g_{y}}.
\]
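This is the usual consequence of the two factorizations of the joint density; writing both conditional densities as functions on $\R ^d \times \R ^n$ (an argument ordering assumed here for concreteness),
\[
g(\xi , \gamma ) = g_y(\gamma )\, g_{x \mid y}(\xi , \gamma )
\qquad \text{and} \qquad
g(\xi , \gamma ) = g_x(\xi )\, g_{y \mid x}(\xi , \gamma )
\]
for all $\xi \in \R ^d$ and $\gamma \in \R ^n$, and dividing the second identity by $g_y(\gamma ) > 0$ and comparing with the first gives the displayed relation pointwise.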
A maximum conditional estimate for $x: \Omega \to \R ^d$ given that $y$ has taken the value $\gamma \in \R ^n$ is a maximizer $\xi \in \R ^d$ of $g_{x \mid y}(\xi , \gamma )$. It is also called the maximum a posteriori estimate, or MAP estimate. The maximum conditional estimate is natural, in part, because it also maximizes the joint density: $g(\xi , \gamma ) = g_y(\gamma ) g_{x \mid y}(\xi , \gamma )$ for all $\xi \in \R ^d$ and $\gamma \in \R ^n$, and the factor $g_y(\gamma )$ does not depend on $\xi $.
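As an illustrative sketch (a standard special case, not part of the development above), suppose $x \sim N(0, \sigma _x^2 I)$ and, conditionally on $x = \xi $, $y \sim N(A \xi , \sigma ^2 I)$ for some fixed $A \in \R ^{n \times d}$ and $\sigma , \sigma _x > 0$. Maximizing $g_{x \mid y}(\xi , \gamma )$ over $\xi $ is then equivalent to minimizing the negative log posterior, which up to a constant independent of $\xi $ equals
\[
\frac{1}{2 \sigma ^2} \lVert A \xi - \gamma \rVert ^2 + \frac{1}{2 \sigma _x^2} \lVert \xi \rVert ^2 ,
\]
so the MAP estimate is the Tikhonov-regularized least-squares solution $\hat{\xi } = \bigl( A^\top A + \tfrac{\sigma ^2}{\sigma _x^2} I \bigr)^{-1} A^\top \gamma $.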