Needs:
Normal Linear Model Regressors
Feature Maps
Needed by:
None.

Featurized Probabilistic Linear Models

Why

A linear model can only represent functions that are linear in the raw inputs. Embedding the dataset with a feature map lets the same machinery fit nonlinear functions of the inputs, since the model remains linear in the features.

Definition

Let $(x: \Omega \to \R ^d, A \in \R ^{n \times d}, e: \Omega \to \R ^n)$ be a probabilistic linear model over the probability space $(\Omega , \mathcal{A} , \mathsf{P} )$. Let $\phi : \R ^d \to \R ^{d'}$ be a feature map.

We call the sequence $(x, A, e, \phi )$ a featurized probabilistic linear model (also embedded probabilistic linear model). We interpret the model as a random field $h: \Omega \to (\R ^d \to \R )$ which is a linear function of the features:

\[ h_{\omega }(a) = \transpose{\phi (a)}x(\omega ). \]

Denote the data matrix of the embedded feature vectors by $\phi (A)$. In other words, $\phi (A) \in \R ^{n \times d'}$ is the matrix whose $i$th row is the feature vector $\transpose{\phi (a_i)}$ of the $i$th datapoint $a_i$. Then $(x, A, e, \phi )$ corresponds to the probabilistic linear model $(x, \phi (A), e)$.
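As a concrete illustration (not from the source), here is a NumPy sketch of a feature map and of assembling $\phi(A)$ row by row, using a hypothetical polynomial embedding for $d = 1$; the names `phi` and `phi_of_A` are ours:

```python
import numpy as np

def phi(a, degree=3):
    """Hypothetical polynomial feature map R -> R^{degree+1}:
    phi(a) = (1, a, a^2, ..., a^degree)."""
    return np.array([a**k for k in range(degree + 1)])

def phi_of_A(A, degree=3):
    """Apply phi row-wise: the i-th row of phi(A) is phi(a_i)^T."""
    return np.vstack([phi(a, degree) for a in A])

A = np.array([0.0, 0.5, 1.0])   # n = 3 scalar datapoints (d = 1)
Phi = phi_of_A(A)               # phi(A), shape (n, d') = (3, 4)
```

Any other embedding (trigonometric, radial basis, learned features) slots in the same way; only `phi` changes.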

Normal case

In the normal (Gaussian) case, the parameter posterior $g_{x \mid y}(\cdot , \gamma )$ is a normal density with mean

\[ \Sigma _{x}\transpose{\phi (A)}\inv{(\phi (A)\Sigma _{x}\transpose{\phi (A)} + \Sigma _{e})} \gamma \]

and covariance

\[ \inv{(\inv{\Sigma _{x}} + \transpose{\phi (A)}\inv{\Sigma _{e}}\phi (A))}. \]
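These two formulas can be sketched directly in NumPy; the toy values for $\phi(A)$, $\Sigma_x$, $\Sigma_e$, and $\gamma$ below are assumptions for illustration only:

```python
import numpy as np

# Hypothetical toy problem: n = 3 observations, d' = 2 features.
Phi = np.array([[1.0, 0.0], [1.0, 0.5], [1.0, 1.0]])  # phi(A), n x d'
Sigma_x = np.eye(2)                 # prior covariance of x
Sigma_e = 0.1 * np.eye(3)           # noise covariance
gamma = np.array([0.1, 0.6, 1.1])   # observed data

# Posterior mean: Sigma_x phi(A)^T (phi(A) Sigma_x phi(A)^T + Sigma_e)^{-1} gamma
S = Phi @ Sigma_x @ Phi.T + Sigma_e
post_mean = Sigma_x @ Phi.T @ np.linalg.solve(S, gamma)

# Posterior covariance: (Sigma_x^{-1} + phi(A)^T Sigma_e^{-1} phi(A))^{-1}
post_cov = np.linalg.inv(np.linalg.inv(Sigma_x)
                         + Phi.T @ np.linalg.inv(Sigma_e) @ Phi)
```

By the Woodbury identity the covariance equals $\Sigma_x - \Sigma_x \transpose{\phi(A)} \inv{(\phi(A)\Sigma_x\transpose{\phi(A)} + \Sigma_e)} \phi(A)\Sigma_x$, which is preferable when $d' > n$ since it inverts an $n \times n$ matrix instead of a $d' \times d'$ one.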

The predictive density for $a \in \R ^d$ is normal with mean

\[ \transpose{\phi (a)}\Sigma _{x}\transpose{\phi (A)}\inv{(\phi (A)\Sigma _{x}\transpose{\phi (A)} + \Sigma _{e})}\gamma \]

and covariance

\[ \transpose{\phi (a)}\Sigma _{x}\phi (a) - \transpose{\phi (a)}\Sigma _{x}\transpose{\phi (A)}\inv{(\phi (A)\Sigma _{x}\transpose{\phi (A)} + \Sigma _e)}\phi (A)\Sigma _{x}\phi (a). \]

So the featurized linear regressor is the predictor $h: \R ^d \to \R $ defined by

\[ h(a) = \transpose{\phi (a)}\Sigma _{x}\transpose{\phi (A)}\inv{(\phi (A)\Sigma _{x}\transpose{\phi (A)} + \Sigma _{e})}\gamma . \]
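A minimal sketch of this predictor in NumPy, assuming the data $\gamma$ and the covariances are given; the function name `featurized_regressor` is ours, not from the source:

```python
import numpy as np

def featurized_regressor(Phi, Sigma_x, Sigma_e, gamma):
    """Return h(a) = phi(a)^T Sigma_x phi(A)^T
    (phi(A) Sigma_x phi(A)^T + Sigma_e)^{-1} gamma,
    as a function of the feature vector phi(a)."""
    S = Phi @ Sigma_x @ Phi.T + Sigma_e
    # w is the posterior mean of x; h is linear in the features.
    w = Sigma_x @ Phi.T @ np.linalg.solve(S, gamma)
    def h(phi_a):
        return phi_a @ w
    return h
```

Note that the expensive solve happens once, when the regressor is built; each prediction is then a single inner product with the posterior mean, matching the formula above.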

Copyright © 2023 The Bourbaki Authors — All rights reserved — Version 13a6779cc