\(\DeclarePairedDelimiterX{\Set}[2]{\{}{\}}{#1 \nonscript\;\delimsize\vert\nonscript\; #2}\) \( \DeclarePairedDelimiter{\set}{\{}{\}}\) \( \DeclarePairedDelimiter{\parens}{\left(}{\right)}\) \(\DeclarePairedDelimiterX{\innerproduct}[1]{\langle}{\rangle}{#1}\) \(\newcommand{\ip}[1]{\innerproduct{#1}}\) \(\newcommand{\bmat}[1]{\left[\hspace{2.0pt}\begin{matrix}#1\end{matrix}\hspace{2.0pt}\right]}\) \(\newcommand{\barray}[1]{\left[\hspace{2.0pt}\begin{matrix}#1\end{matrix}\hspace{2.0pt}\right]}\) \(\newcommand{\mat}[1]{\begin{matrix}#1\end{matrix}}\) \(\newcommand{\pmat}[1]{\begin{pmatrix}#1\end{pmatrix}}\) \(\newcommand{\mathword}[1]{\mathop{\textup{#1}}}\)
Needs:
Projections On Affine Sets
Projections On Affine Sets
Matrix Trace
Needed by:
None.
Links:
Sheet PDF
Graph PDF

Minimum Residual Affine Sets

Why

We want to find a low-dimensional affine set into which we can project some high-dimensional data.

Problem

For $a \in \R ^n$ and $U \in \R ^{n \times k}$, the set $W(a, U) = \Set{a + Uz}{z \in \R ^n}$ is an affine set. Denote the projection of $x \in \R ^n$ onto $W(a, U)$ by $\proj_{W(a, U)}(x)$.

\begin{problem} Given $x^{(1)}, \dots , x^{(m)} \in \R ^n$, and a dimension $k$, find $a \in \R ^n$ and $U \in \R ^{n \times k}$ with $U^\top U = I$ to minimize

\[ \textstyle \sum_{i = 1}^{m} \norm{x^{(i)} - \proj_{W(a, U)}(x^{(i)})}^2, \]

the sum of squared distances between $x^{(i)}$ and its projection on $W(a, U)$. \end{problem}

Express $\proj_{W(a, U)}(x)$ as $UU^\top x + (I - UU^\top )a$ (see \sheetref{projections_on_affine_sets}{Projections on Affine Sets}). We want to find $a \in \R ^n$ and $U \in \R ^{n \times k}$ to minimize

\[ \sum_{i = 1}^{n} \norm{x^{(i)} - UU^\top x^{(i)} - (I-UU^\top )a}^2. \]

Fix $U \in \R ^{n \times k}$. Define $A \in \R ^{nm \times n}$, $B \in \R ^{mn \times mn}$, and $\tilde{x} \in \R ^{nm}$ by

\[ \scriptsize A = \bmat{ I - UU^\top \\ \vdots \\ I - UU^\top \\ },\text{ } B = \bmat{ I - UU^\top & \cdots & 0\\ & \ddots & \\ 0 & \cdots & I - UU^\top \\ }, \text{ and } \tilde{x} = \bmat{ x^{(1)} \\ \vdots \\ x^{(m)} }. \]

Then the objective is equivalent to

\[ \norm{Aa - B \tilde{x}}^2 \]

Any minimizer $a^\star$ satisfies the normal equations

\[ A^\top Aa^\star = A^\top B\tilde{x} \]

Since $(I - UU^\top )^\top = I - UU^\top $ and $(I - UU^\top )^2 = I - UU^\top $,

\[ \textstyle A^\top A = \sum_{i = 1}^{m} I - UU^\top = m(I - UU^\top ) \]

and

\[ \textstyle A^\top B = \bmat{I - UU^\top & \cdots & I - UU^\top }. \]

Consequently, we can express $A^\top Aa^\star = A^\top B\tilde{x}$ as

\[ \textstyle m(I - UU^\top )a^\star = \sum_{i = 1}^{m} (I - UU^\top )x^{(i)}. \]

So $a^\star$ is any vector satisfying

\[ \textstyle (I - UU^\top )a^\star = (I- UU^\top )(1/m)\sum_{i = 1}^{m} (I - UU^\top )x^{(i)}. \]

One such point satisfying the above is $\bar{x} = (1/m)\sum_{i = 1}^{m}x^{(i)}$. An expedient choice, as it does not depnd on $U$.

Now we want to find $U \in \R ^{n \times k}$ to minimize

\[ \textstyle \sum_{i = 1}^{m} \norm{(I-UU^\top )(x^{(i)} - \bar{x})}^2. \]

Express the $i$th term of the sum as

\[ \textstyle \begin{aligned} \norm{(I - UU^\top )(x^{(i)} - \bar{x})}^2 & = (x-\bar{x})(I - UU^\top )^\top (I - UU^\top )(x^{(i)} - \bar{x}) \\ & = (x^{(i)} - \bar{x})^\top (I - UU^\top )(x ^{(i)} - \bar{x}) \\ & = \norm{x^{(i)} - \bar{x}}^2 - \norm{U^\top (x^{(i)} - \bar{x})}^2. \end{aligned} \]

The first term is a constant with respect to $U$. Define $\bar{X} \in \R ^{n \times m}$ by

\[ \bar{X} = \bmat{x^{(1)} - \bar{x} & \cdots & x^{(m)} - \bar{x}}. \]

Express the sum of the second terms by

\[ \norm{U^\top \bar{X}}_F = \tr \bar{X}^\top UU^\top \bar{X} = \tr(U^\top \bar{X}\bar{X}^\top U). \]

So we seek $U \in \R ^{n \times k}$ with $U^\top U = I$ to maximize

\[ \tr(U^\top \bar{X}\bar{X}^\top U). \]

Copyright © 2023 The Bourbaki Authors — All rights reserved — Version 13a6779cc About Show the old page view