It is natural to embed a dataset before fitting a linear model: applying a feature map to the inputs lets a linear model capture nonlinear structure in the data.
Let $(x: \Omega \to \R ^d, A \in \R ^{n \times d}, e: \Omega \to \R ^n)$ be a probabilistic linear model over the probability space $(\Omega , \mathcal{A} , \mathbfsf{P} )$. Let $\phi : \R ^d \to \R ^{d'}$ be a feature map.
We call the sequence $(x, A, e, \phi )$ a
featurized probabilistic linear
model (also embedded
probabilistic linear model).
We interpret the model as a random field $h:
\Omega \to (\R ^d \to \R )$ that is a linear
function of the features:
\[
h_{\omega }(a) = \transpose{\phi (a)}x(\omega ).
\]
Denote the data matrix of the embedded feature vectors by $\phi (A)$. In other words, $\phi (A) \in \R ^{n \times d'}$ is a matrix whose rows are feature vectors. Then $(x, A, e, \phi )$ corresponds to the probabilistic linear model $(x, \phi (A), e)$.
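As a concrete illustration (not from the text; the feature map and data below are hypothetical), one can take $d = 1$ with the quadratic embedding $\phi(a) = (1, a, a^2)$, so $d' = 3$, and form $\phi(A)$ by applying $\phi$ row-wise:

```python
import numpy as np

# Hypothetical feature map phi(a) = (1, a, a^2) for scalar inputs: d = 1, d' = 3.
def phi(a):
    return np.array([1.0, a, a**2])

# Toy data matrix A with n = 4 rows of scalar inputs.
A = np.array([[0.0], [1.0], [2.0], [3.0]])

# phi(A): the n x d' matrix whose rows are the embedded feature vectors.
phi_A = np.stack([phi(row[0]) for row in A])

print(phi_A.shape)  # (4, 3)
```

The pair $(\phi(A), \gamma)$ can then be handed to any routine for plain linear-model inference, which is exactly the correspondence $(x, A, e, \phi) \leftrightarrow (x, \phi(A), e)$ stated above.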
In the normal (Gaussian) case, the parameter
posterior $g_{x \mid y}(\cdot , \gamma )$ is a
normal density with mean
\[
\Sigma _{x}\transpose{\phi (A)}\inv{(\phi (A)\Sigma _{x}\transpose{\phi (A)}
+ \Sigma _{e})} \gamma
\]
and covariance
\[
\inv{(\inv{\Sigma _{x}} +
\transpose{\phi (A)}\inv{\Sigma _{e}}\phi (A))}.
\]
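The mean above is written in the "kernel" form, while the covariance is in the "information" form; by the matrix inversion lemma the mean can equally be written in information form, and the two agree. A numerical check (dimensions, data, and variable names here are illustrative, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: n = 5 observations, d' = 3 features.
# phi_A plays the role of phi(A); gamma is the observed response vector.
n, dp = 5, 3
phi_A = rng.normal(size=(n, dp))
gamma = rng.normal(size=n)

Sigma_x = np.eye(dp)        # prior covariance of the parameters x
Sigma_e = 0.1 * np.eye(n)   # covariance of the noise e

# Posterior mean, kernel form:
#   Sigma_x phi(A)^T (phi(A) Sigma_x phi(A)^T + Sigma_e)^{-1} gamma
G = phi_A @ Sigma_x @ phi_A.T + Sigma_e
mean_kernel = Sigma_x @ phi_A.T @ np.linalg.solve(G, gamma)

# Posterior covariance, information form:
#   (Sigma_x^{-1} + phi(A)^T Sigma_e^{-1} phi(A))^{-1}
precision = np.linalg.inv(Sigma_x) + phi_A.T @ np.linalg.inv(Sigma_e) @ phi_A
cov_post = np.linalg.inv(precision)

# Information-form mean; equals the kernel-form mean by the inversion lemma.
mean_info = cov_post @ phi_A.T @ np.linalg.inv(Sigma_e) @ gamma
assert np.allclose(mean_kernel, mean_info)
```

The kernel form inverts an $n \times n$ matrix and the information form a $d' \times d'$ one, so which is cheaper depends on whether $n$ or $d'$ is larger.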
The predictive density for $a \in \R ^d$ is
normal with mean
\[
\transpose{\phi (a)}\Sigma _{x}\transpose{\phi (A)}\inv{(\phi (A)\Sigma _{x}\transpose{\phi (A)}
+ \Sigma _{e})}\gamma
\]
and variance
\[
\transpose{\phi (a)}\Sigma _{x}\phi (a) -
\transpose{\phi (a)}\Sigma _{x}\transpose{\phi (A)}\inv{(\phi (A)\Sigma _{x}\transpose{\phi (A)}
+ \Sigma _e)}\phi (A)\Sigma _{x}\phi (a).
\]
So the featurized linear
regressor is the predictor $h: \R ^d \to
\R $ defined by
\[
h(a) =
\transpose{\phi (a)}\Sigma _{x}\transpose{\phi (A)}\inv{(\phi (A)\Sigma _{x}\transpose{\phi (A)}
+ \Sigma _{e})}\gamma .
\]
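The regressor and the predictive variance can be sketched numerically. The quadratic feature map, toy data, and prior/noise covariances below are illustrative assumptions, not part of the text:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical quadratic feature map: d = 1, d' = 3.
def phi(a):
    return np.array([1.0, a, a**2])

A = np.array([0.0, 1.0, 2.0, 3.0])       # n = 4 scalar inputs
phi_A = np.stack([phi(a) for a in A])    # phi(A), an n x d' matrix
gamma = rng.normal(size=4)               # observed responses

Sigma_x = np.eye(3)                      # prior covariance of x
Sigma_e = 0.1 * np.eye(4)                # noise covariance of e
G = phi_A @ Sigma_x @ phi_A.T + Sigma_e  # phi(A) Sigma_x phi(A)^T + Sigma_e

def h(a):
    """Featurized linear regressor: the predictive mean at a."""
    return phi(a) @ Sigma_x @ phi_A.T @ np.linalg.solve(G, gamma)

def pred_var(a):
    """Predictive variance at a, per the formula above."""
    v = Sigma_x @ phi_A.T @ np.linalg.solve(G, phi_A @ Sigma_x @ phi(a))
    return phi(a) @ Sigma_x @ phi(a) - phi(a) @ v

print(h(1.5), pred_var(1.5))
```

Note that $h$ depends on $a$ only through $\phi(a)$, and on the training inputs only through inner products of feature vectors, which is what makes the kernelized extension of this construction possible.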