\(\DeclarePairedDelimiterX{\Set}[2]{\{}{\}}{#1 \nonscript\;\delimsize\vert\nonscript\; #2}\) \( \DeclarePairedDelimiter{\set}{\{}{\}}\) \( \DeclarePairedDelimiter{\parens}{\left(}{\right)}\) \(\DeclarePairedDelimiterX{\innerproduct}[1]{\langle}{\rangle}{#1}\) \(\newcommand{\ip}[1]{\innerproduct{#1}}\) \(\newcommand{\bmat}[1]{\left[\hspace{2.0pt}\begin{matrix}#1\end{matrix}\hspace{2.0pt}\right]}\) \(\newcommand{\barray}[1]{\left[\hspace{2.0pt}\begin{matrix}#1\end{matrix}\hspace{2.0pt}\right]}\) \(\newcommand{\mat}[1]{\begin{matrix}#1\end{matrix}}\) \(\newcommand{\pmat}[1]{\begin{pmatrix}#1\end{pmatrix}}\) \(\newcommand{\mathword}[1]{\mathop{\textup{#1}}}\)
Needs:
Norms
Matrix Trace
Matrix Transpose
Covariance
Real Matrix Inverses
Estimators
Needed by:
None.
Links:
Sheet PDF
Graph PDF

Affine MMSE Estimators

Definition

We want to estimate a random variable $y: \Omega \to \R ^n$ from a random variable $x: \Omega \to \R ^m$ using an estimator $\phi : \R ^m \to \R ^n$ which is affine. In other words, $\phi (\xi ) = A\xi + b$ for some $A \in \R ^{n \times m}$ and $b \in \R ^n$. We will use the mean squared error cost.

We want to find $A$ and $b$ to minimize

\[ \E (\norm{Ax + b - y}^2). \]

Express $\E (\norm{Ax + b - y}^2)$ as $\E ((Ax + b - y)^\top (Ax + b - y))$ and take the expectation of each term in the expansion:

\[ \mat{ & + & \tr(A \E (xx^\top ) A^\top ) & + & \E (x)^\top A^\top b & - & \tr(A^\top \E (yx^\top )) \\ & + & b^\top A \E (x) & + & b^\top b & - & b^\top \E (y) \\ & - & \tr(A \E (xy^\top ) ) & - & \E (y)^\top b & + & \E (y^\top y) } \]
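For instance, the first term follows because a scalar equals its own trace, the trace is invariant under cyclic permutations, and trace and expectation commute:

\[ \E ((Ax)^\top (Ax)) = \E (\tr(x^\top A^\top A x)) = \E (\tr(A xx^\top A^\top )) = \tr(A \E (xx^\top ) A^\top ). \]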

The gradients of each term with respect to $b$ are

\[ \mat{ & + & 0 & + & A \E (x) & - & 0 \\ & + & A \E (x) & + & 2 b & - & \E (y) \\ & - & 0 & - & \E (y) & + & 0 } \]

so the gradient with respect to $b$ is $2A\E (x) + 2b - 2\E (y)$.
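The entries follow from the standard identities, for a constant vector $u$,

\[ \nabla _b (u^\top b) = \nabla _b (b^\top u) = u, \qquad \nabla _b (b^\top b) = 2b. \]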
The gradients of each term with respect to $A^\top$ (the transposed gradients with respect to $A$) are

\[ \mat{ && + & \E (xx^\top )A^\top + \E (xx^\top )^\top A^\top & + & \E (x)b^\top & - & \E (yx^\top )^\top \\ & + & \E (x)b^\top & + & 0 & - & 0 \\ & - & \E (xy^\top ) & - & 0 & + & 0 } \]

so the gradient with respect to $A^\top$ is $2\E (xx^\top )A^\top + 2\E (x)b^\top - 2\E (xy^\top )$, using that $\E (xx^\top )$ is symmetric and $\E (yx^\top )^\top = \E (xy^\top )$.
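The entries follow from the standard matrix-gradient identities: writing $B = A^\top$, for constant matrices $\Sigma $, $M$ and constant vectors $u$, $v$,

\[ \nabla _B \tr(B^\top \Sigma B) = (\Sigma + \Sigma ^\top )B, \qquad \nabla _B (u^\top B v) = uv^\top , \qquad \nabla _B \tr(BM) = M^\top , \qquad \nabla _B \tr(B^\top M) = M. \]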
Setting both gradients to zero, we want $A$ and $b$ to satisfy

\[ \begin{aligned} A\E (x) + b - \E (y) &= 0 \\ \E (xx^\top )A^\top + \E (x)b^\top - \E (xy^\top ) &= 0 \end{aligned} \]

From the first equation, $b = \E (y) - A\E (x)$.
Substituting this into the second equation:

\[ \begin{aligned} \E (xx^\top )A^\top + \E (x)(\E (y) - A \E (x))^\top - \E (xy^\top ) = 0.\\ \E (xx^\top )A^\top + \E (x)\E (y)^\top - \E (x)\E (x)^\top A^\top - \E (xy^\top ) = 0. \\ (\E (xx^\top ) - \E (x)\E (x)^\top )A^\top = \E (xy^\top ) - \E (x)\E (y)^\top . \\ \cov(x, x)A^\top = \cov(x, y). \end{aligned} \]
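The last step is the definition of the (cross-)covariance:

\[ \cov(x, y) = \E (xy^\top ) - \E (x)\E (y)^\top . \]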

Assuming $\cov(x, x)$ is invertible, $A^\top = \cov(x, x)^{-1}\cov(x, y)$, so $A = \cov(y, x)\cov(x, x)^{-1}$ is a solution (using that $\cov(x, x)$ is symmetric and $\cov(x, y)^\top = \cov(y, x)$). Then $b = \E (y) - \cov(y, x)\cov(x, x)^{-1}\E (x)$. So to summarize, the estimator $\phi (x) = Ax + b$ is

\[ \cov(y, x)\cov(x, x)^{-1} x + \E (y) - \cov(y, x)\cov(x, x)^{-1}\E (x) \]

or

\[ \E (y) + \cov(y, x) \cov(x, x)^{-1} (x - \E (x)) \]
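For $m = n = 1$ this reduces to the simple regression line $\E (y) + \frac{\cov(x, y)}{\cov(x, x)}(x - \E (x))$. As a numerical sanity check, the closed form can be compared with an ordinary least-squares fit of $y$ on $(x, 1)$, which minimizes the same empirical cost. Below is a minimal NumPy sketch; the sampling model, dimensions, and names are illustrative assumptions, not part of the development above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical joint samples: x in R^2, y in R^3 affine in x plus noise.
n = 100_000
x = rng.normal(size=(n, 2))
C = np.array([[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]])
y = x @ C.T + 5.0 + 0.1 * rng.normal(size=(n, 3))

# Closed form A = cov(y, x) cov(x, x)^{-1}, b = E(y) - A E(x),
# with expectations replaced by sample averages.
mx, my = x.mean(axis=0), y.mean(axis=0)
cov_xx = np.cov(x, rowvar=False)              # sample cov(x, x), 2x2
cov_yx = (y - my).T @ (x - mx) / (n - 1)      # sample cov(y, x), 3x2
A = cov_yx @ np.linalg.inv(cov_xx)
b = my - A @ mx

# Least squares of y on [x, 1] minimizes the same empirical squared
# error, so it must recover the same A and b.
X1 = np.hstack([x, np.ones((n, 1))])
W, *_ = np.linalg.lstsq(X1, y, rcond=None)    # W stacks A^T over b^T
assert np.allclose(A, W[:2].T) and np.allclose(b, W[2])
```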


Copyright © 2023 The Bourbaki Authors — All rights reserved — Version 13a6779cc