We want to select a normal density that summarizes a dataset well.
Let $D = (x^1, \dots , x^n)$ be a dataset in $\R $. We want to select a density from among normal densities, which require specifying a mean and covariance.
Following the principle of maximum likelihood,
we want to solve
\[
\maximizationn{\mu ,\sigma \in \R }{
\prod_{k = 1}^{n} \normaldensity{x^k}{\mu }{\sigma }
}{
\sigma > 0 \\
}
\]
Let $f$ be the normal density with mean\[ \frac{1}{n} \sum_{k = 1}^{n} x^k \]
and covariance\[ \frac{1}{n} \sum_{k = 1}^{n} \left(x^k - \frac{1}{n} \sum_{j = 1}^{n} x^j\right)^2. \]
Then $f$ is a maximum likelihood normal density.
Since the logarithm is increasing, maximizing the likelihood is equivalent to maximizing the log likelihood $\ell $, which for $\mu \in \R $ and $\sigma ^2 > 0$ is\[ \ell (\mu , \sigma ^2) = \sum_{k = 1}^{n} \left( -\frac{1}{2\sigma ^2}(x^k - \mu )^2 - \frac{1}{2}\log(2\pi \sigma ^2)\right). \]
The partial derivative of the log likelihood with respect to the mean is\[ (\partial_\mu \ell )(\mu , \sigma ^2) = \sum_{k = 1}^{n} \frac{1}{\sigma ^2}(x^k - \mu ) \]
and with respect to the covariance is\[ (\partial_{\sigma ^2} \ell )(\mu , \sigma ^2) = \left(\frac{1}{2(\sigma ^2)^{2}}\sum_{k = 1}^{n}(x^k - \mu )^2\right)- \frac{n}{2\sigma ^2}. \]
We are interested in finding $\mu _0 \in \R $ and $\sigma ^2_0 > 0$ at which $\partial_\mu \ell (\mu _0, \sigma ^2_0) = 0$ and $\partial_{\sigma ^2} \ell (\mu _0, \sigma ^2_0) = 0$. So we have two equations. First, since $\sum_{k = 1}^{n} (x^k - \mu ) = \left(\sum_{k = 1}^{n} x^k\right) - n\mu $, notice that $\partial_\mu \ell $ is zero if and only if its first argument (the mean) is $\frac{1}{n} \sum_{k = 1}^{n} x^k$. Second, multiplying through by $2(\sigma ^2)^2 > 0$, notice that for all $\mu \in \R $ and $\sigma ^2 > 0$, $\partial_{\sigma ^2}\ell $ is zero if and only if\[ \sigma ^2 = \frac{1}{n} \sum_{k = 1}^{n} (x^k - \mu )^2. \]
So the pair\[ \left(\frac{1}{n}\sum_{k = 1}^{n} x^k, \frac{1}{n} \sum_{k = 1}^{n} \left(x^k - \frac{1}{n} \sum_{j = 1}^{n} x^j\right)^2\right) \]
is a stationary point of $\ell $.
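As a small worked example, consider the hypothetical dataset $D = (1, 2, 3)$, chosen here only for illustration, so that $n = 3$. The mean of the candidate density is\[ \frac{1}{3}(1 + 2 + 3) = 2 \]
and its covariance is\[ \frac{1}{3}\left((1 - 2)^2 + (2 - 2)^2 + (3 - 2)^2\right) = \frac{2}{3}. \]
The deviations from the mean sum to zero, $(1 - 2) + (2 - 2) + (3 - 2) = 0$, which is the condition under which the partial derivative with respect to the mean vanishes. Note that the covariance is normalized by $n = 3$ and not by $n - 1 = 2$.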