If we model some measured value as a random variable with induced distribution $p: V \to \R $, then one interpretation of $p(v)$ for $v \in V$ is the proportion of times, in a large number of trials, that we expect to measure the value $v$.
Given a distribution $p: \Omega \to \R $ and a real-valued outcome variable $x: \Omega \to \R $, the expectation (or mean) of $x$ under $p$ is $\sum_{\omega \in \Omega } p(\omega )x(\omega )$.
We denote the expectation of $x$ under $p$ by $\E _p(x)$. When there is no chance of ambiguity, we write $\E (x)$.
Let $x, y : \Omega \to \R $ be two outcome variables and $p: \Omega \to \R $ a distribution. Let $\alpha , \beta \in \R $. Define $z = \alpha x + \beta y$ by $z(\omega ) = \alpha x(\omega ) + \beta y(\omega )$. Then $\E (z) = \alpha \E (x) + \beta \E (y)$. Many authors refer to this property as the linearity of expectation.
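Linearity of expectation can be verified numerically. The sketch below uses an illustrative three-point sample space and outcome variables of my own choosing (not from the text) and checks that $\E(\alpha x + \beta y) = \alpha \E(x) + \beta \E(y)$.

```python
# Numerical check of linearity of expectation on a small sample space.
# The distribution p, outcome variables x, y, and scalars alpha, beta
# below are illustrative assumptions, not taken from the text.
Omega = [1, 2, 3]
p = {1: 0.2, 2: 0.5, 3: 0.3}     # a distribution: nonnegative, sums to 1
x = {1: -1.0, 2: 0.0, 3: 2.0}    # outcome variable x: Omega -> R
y = {1: 4.0, 2: 1.0, 3: -2.0}    # outcome variable y: Omega -> R
alpha, beta = 2.0, -3.0

def E(f):
    """Expectation of the outcome variable f under p."""
    return sum(p[w] * f[w] for w in Omega)

# z = alpha*x + beta*y, defined pointwise on Omega.
z = {w: alpha * x[w] + beta * y[w] for w in Omega}

# E(z) agrees with alpha*E(x) + beta*E(y) up to floating-point rounding.
assert abs(E(z) - (alpha * E(x) + beta * E(y))) < 1e-12
```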
Suppose $\Omega = \set{1, 2, 3, 4, 5}$ with
$p(1) = 0.1$, $p(2) = 0.15$, $p(3) = 0.1$,
$p(4) = 0.25$, and $p(5) = 0.4$.
Define $x: \Omega \to \R $ by
\[
x(a) = \begin{cases}
-1 & \text{ if } a = 1 \text{ or } a = 2, \\
1 & \text{ if } a = 3 \text{ or } a = 4, \\
2 & \text{ if } a = 5. \\
\end{cases}
\]
Then
\[
\E (x) = -0.1 - 0.15 + 0.1 + 0.25 + 2(0.4) = 0.9.
\]
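The arithmetic above is easy to check by machine. The dictionaries below simply re-encode the example's distribution and outcome variable.

```python
# The example distribution p and outcome variable x from the text,
# encoded as dictionaries keyed by the elements of Omega = {1,...,5}.
p = {1: 0.1, 2: 0.15, 3: 0.1, 4: 0.25, 5: 0.4}
x = {1: -1, 2: -1, 3: 1, 4: 1, 5: 2}

# E(x) = sum over omega of p(omega) * x(omega).
mean = sum(p[a] * x[a] for a in p)
# mean is 0.9 up to floating-point rounding.
```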
Denote by $p_x: V \to \R $ the induced
distribution of $x: \Omega \to V$ (where $V
\subset \R $).
Then $\E (x) = \sum_{v \in V} p_x(v)v$ since
\[
\begin{aligned}
\textstyle \sum_{\omega \in \Omega } p(\omega )x(\omega ) &=
\sum_{v \in V} \sum_{\omega \in x^{-1}(v)} x(\omega )p(\omega )
\\
\textstyle &= \sum_{v \in V} v \sum_{\omega \in x^{-1}(v)}
p(\omega ) \\
\textstyle & = \sum_{v \in V} v \, p_x(v).
\end{aligned}
\]
We interpret the mean as the center of mass of the induced distribution.
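As a numerical check of the identity $\E(x) = \sum_{v \in V} p_x(v)\,v$, the sketch below builds the induced distribution $p_x$ for the example above by grouping the probabilities $p(\omega)$ over the preimages $x^{-1}(v)$, then computes the mean both ways.

```python
from collections import defaultdict

# The example distribution and outcome variable from the text.
p = {1: 0.1, 2: 0.15, 3: 0.1, 4: 0.25, 5: 0.4}
x = {1: -1, 2: -1, 3: 1, 4: 1, 5: 2}

# Induced distribution: p_x(v) = sum of p(omega) over omega in x^{-1}(v).
p_x = defaultdict(float)
for omega, prob in p.items():
    p_x[x[omega]] += prob

# Mean computed directly over Omega, and via the induced distribution on V.
mean_omega = sum(p[w] * x[w] for w in p)
mean_induced = sum(prob * v for v, prob in p_x.items())

# The two computations agree, as the derivation shows.
assert abs(mean_omega - mean_induced) < 1e-12
```

Grouping the sum by the value $v$ is exactly the step performed in the displayed derivation.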