Probabilistic Models

Why

We have a space $X$ and a family of probability measures $\mathcal{P} $ on this space. Assume $x \in X$ is drawn from a fixed but unknown measure $P \in \mathcal{P} $. Given $x$, how should we guess $P$?

Definition

A probabilistic model (or statistical model, parametric statistical model, statistical experiment) is an indexed family of probability measures over the same measurable space $(X, \mathcal{F} )$. The index set is called the parameter set or set of parameters. The set $X$ is called the sample space. A statistic is any function on the sample space.

Notation

Let $(X, \mathcal{F} )$ denote a measurable space. We usually denote the parameter set by $\Theta $, and denote the family by

\[ \mathcal{P} = \Set{\mathbfsf{P} _\theta : \mathcal{F} \to [0,1]}{\mathbfsf{P} _\theta \text{ a probability measure}, \theta \in \Theta }. \]

Often $\Theta \subset \R ^d$.

Example: coin flips

The usual model for $n$ flips of a coin takes $X = \set{0,1}^n$, the set of binary $n$-tuples. For $\theta \in [0, 1]$, define a distribution $p_\theta $ on $X$ by $p_\theta (x) = \theta ^t(1-\theta )^{n-t}$, where $t = t(x) = x_1 + \cdots + x_n$ is the number of ones in $x$. A probability measure $\mathbfsf{P} _\theta $ is defined on $\powerset{X}$ in the usual way, by summing $p_\theta $ over the outcomes in an event. Thus, the probabilistic model is $\Set{\mathbfsf{P} _\theta }{\theta \in [0,1]}$. Given $x$, we are asked to guess $\theta $.
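
As a concrete check of the coin-flip model, the following Python sketch enumerates the sample space for a small $n$, evaluates $p_\theta$, and builds $\mathbfsf{P} _\theta $ of an event by summation. The function names `p_theta` and `measure`, and the values $n = 3$ and $\theta = 0.3$, are illustrative choices and not part of the text above.

```python
from itertools import product

def p_theta(x, theta):
    """Mass of the binary tuple x: theta^t (1 - theta)^(n - t)."""
    n = len(x)
    t = sum(x)  # t(x) = x_1 + ... + x_n, the number of ones
    return theta**t * (1 - theta)**(n - t)

def measure(event, theta):
    """P_theta(event): sum the mass over the outcomes in the event."""
    return sum(p_theta(x, theta) for x in event)

n, theta = 3, 0.3                    # illustrative values
X = list(product((0, 1), repeat=n))  # sample space {0,1}^n

total = measure(X, theta)            # mass of the whole space
at_least_two = measure([x for x in X if sum(x) >= 2], theta)
```

Here `total` should equal 1 up to floating-point rounding, which confirms that $p_\theta$ is a distribution on $X$, and `at_least_two` is the probability of seeing two or more ones in three flips.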

Decisions

A decision procedure (estimator, statistical procedure) is a measurable function $A: X \to \mathcal{A} $ where $\mathcal{A} $ is a set, whose elements are called actions or decisions. Often $\mathcal{A} = \Theta $, in which case $A(x)$ gives an estimate of $\theta $, which we denote $\hat{\theta }(x)$.
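
For the coin-flip model, one possible decision procedure with $\mathcal{A} = \Theta = [0,1]$ is the fraction of ones in $x$. The sketch below is only an illustration: the text does not single out this estimator, and the name `theta_hat` is a hypothetical choice. Since $X$ is finite and carries the $\sigma$-algebra $\powerset{X}$, every function on $X$ is measurable.

```python
def theta_hat(x):
    """Illustrative estimator: the fraction of ones in the tuple x."""
    return sum(x) / len(x)

x = (1, 0, 1, 1, 0)      # an observed sequence of n = 5 flips
estimate = theta_hat(x)  # an action in [0, 1], here 3/5
```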

Judging decisions

Given a loss function $L: \mathcal{A} \times \Theta \to \Rbar$, the risk of $A$ at $\theta \in \Theta $ is the expected loss when $x$ is drawn from $\mathbfsf{P} _\theta $:

\[ R(A, \theta ) = \E L(A(x), \theta ). \]
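
To make the risk concrete, the sketch below computes $R(A, \theta )$ exactly for the coin-flip model by enumerating $\set{0,1}^n$. It assumes squared-error loss and the fraction-of-ones estimator, neither of which is fixed by the text; the names `risk`, `squared_error`, and `theta_hat` are hypothetical.

```python
from itertools import product

def p_theta(x, theta):
    """Mass of the binary tuple x under the coin-flip model."""
    t = sum(x)
    return theta**t * (1 - theta)**(len(x) - t)

def theta_hat(x):
    """Illustrative estimator: the fraction of ones in x."""
    return sum(x) / len(x)

def squared_error(a, theta):
    """Assumed loss function L(a, theta) = (a - theta)^2."""
    return (a - theta)**2

def risk(A, loss, theta, n):
    """R(A, theta) = E L(A(x), theta), with x drawn from P_theta."""
    return sum(loss(A(x), theta) * p_theta(x, theta)
               for x in product((0, 1), repeat=n))

n, theta = 4, 0.3  # illustrative values
r = risk(theta_hat, squared_error, theta, n)
```

For this particular estimator and loss, the risk is the variance of the sample mean, $\theta (1-\theta )/n$, so `r` should agree with `theta * (1 - theta) / n` up to rounding.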