We consider a simple distribution on sequences.^{1}

A memory chain (Markov chain^{2}, memory model, Markov model) of length $n$ on a set $A$ is a joint distribution $p: A^n \to [0, 1]$ satisfying

\[ p(a) = f(a_1) \prod_{i = 2}^{n} g(a_i, a_{i-1}) \]

for some distribution $f: A \to [0, 1]$ and some function $g: A^2 \to [0, 1]$ for which $g(\cdot , \alpha ): A \to [0, 1]$ is a distribution on $A$ for every $\alpha \in A$.
Any $p$ so defined is a distribution.
The function $g$ is the conditional distribution $p_{i \mid i-1}$ for $i = 2, \dots , n$.
The function $f$ is the first marginal $p_1$.
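
The first and last of these claims can be checked by summing out the coordinates from last to first. Since $g(\cdot , \alpha )$ is a distribution on $A$ for every $\alpha \in A$,

\[ \sum_{a_n \in A} f(a_1) \prod_{i = 2}^{n} g(a_i, a_{i-1}) = f(a_1) \left( \prod_{i = 2}^{n-1} g(a_i, a_{i-1}) \right) \sum_{a_n \in A} g(a_n, a_{n-1}) = f(a_1) \prod_{i = 2}^{n-1} g(a_i, a_{i-1}). \]

Repeating this step for $a_{n-1}, \dots , a_2$ leaves $f(a_1)$, so the first marginal is $p_1 = f$, and summing over $a_1$ gives $\sum_{a \in A^n} p(a) = \sum_{a_1 \in A} f(a_1) = 1$.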

For this reason, we call $g$ the conditional distribution of the Markov chain. We call $f$ the initial distribution.
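
As a numerical illustration of the definition, the following minimal sketch builds a memory chain of length 3 from an initial distribution and a conditional distribution, then checks that the result sums to one and has first marginal $f$. The alphabet `A`, the tables `f` and `g`, and the helper `chain_prob` are hypothetical, chosen only for this example. The table `g` is keyed as `(current, previous)` to match the argument order $g(a_i, a_{i-1})$.

```python
from itertools import product

# Hypothetical two-letter alphabet and tables, chosen only for illustration.
A = ["x", "y"]
f = {"x": 0.5, "y": 0.5}                      # initial distribution f
g = {("x", "x"): 0.75, ("y", "x"): 0.25,      # g[(current, previous)]
     ("x", "y"): 0.5,  ("y", "y"): 0.5}       # g(., alpha) sums to 1 for each alpha


def chain_prob(a, f, g):
    """Return p(a) = f(a_1) * prod_{i >= 2} g(a_i, a_{i-1})."""
    prob = f[a[0]]
    for prev, cur in zip(a, a[1:]):
        prob *= g[(cur, prev)]
    return prob


n = 3
# Tabulate the joint distribution p on A^n.
p = {a: chain_prob(a, f, g) for a in product(A, repeat=n)}

# p sums to one, and its first marginal recovers f.
total = sum(p.values())
first_marginal = {s: sum(v for a, v in p.items() if a[0] == s) for s in A}
```

For instance, `chain_prob(("x", "x", "y"), f, g)` multiplies `f["x"]`, `g[("x", "x")]`, and `g[("y", "x")]`.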

Let $A = \prod_{i = 1}^{n} A_i$ and let $p: A \to [0, 1]$ be a distribution. On one hand, $p$ is said to be memoryless (or to have zero-order memory) if $p = \prod_{i = 1}^{n} p_i$. In that case, $p_{i \mid i-1}(\alpha , \beta ) = p_{i}(\alpha )$ for every $i = 2, \dots , n$. On the other hand, $p$ is said to have first-order memory if $p = p_1 \prod_{i = 2}^n p_{i \mid i-1}$.

Suppose, in addition, that $A_i = A_1$ for all $i = 1, \dots , n$. Then $p$ is said to have homogeneous conditionals if $p_{i \mid i-1} = p_{j \mid j-1}$ for all $i, j = 2, \dots , n$. In this language, a memory chain is a joint distribution with first-order memory and homogeneous conditionals.
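
The two properties are independent of one another, which a small numerical check can make concrete. The sketch below, under the assumption that the joint distribution is stored as a dictionary over all of $A^n$ and that every conditioning value has positive probability, tests first-order memory and homogeneity of the conditionals. All function names and both example distributions are hypothetical. The second example, uniform on binary triples with $a_3 = a_1 \oplus a_2$, has homogeneous conditionals but not first-order memory.

```python
from itertools import product
from math import isclose


def marginal(p, idx):
    """Marginal of the joint table p on the 0-based coordinate positions idx."""
    out = {}
    for a, v in p.items():
        key = tuple(a[i] for i in idx)
        out[key] = out.get(key, 0.0) + v
    return out


def conditional(p, i):
    """Conditional of coordinate i given coordinate i-1 (0-based, i >= 1),
    keyed as (current, previous). Assumes positive conditioning marginals;
    pairs with a zero-probability previous value are omitted."""
    pair = marginal(p, (i - 1, i))
    prev = marginal(p, (i - 1,))
    return {(cur, pr): v / prev[(pr,)]
            for (pr, cur), v in pair.items() if prev[(pr,)] > 0}


def has_first_order_memory(p, n):
    """Check p = p_1 * prod_i p_{i | i-1} on every sequence in the table."""
    first = marginal(p, (0,))
    conds = [conditional(p, i) for i in range(1, n)]
    for a, v in p.items():
        rhs = first[(a[0],)]
        for i, c in enumerate(conds, start=1):
            rhs *= c.get((a[i], a[i - 1]), 0.0)
        if not isclose(v, rhs, abs_tol=1e-12):
            return False
    return True


def has_homogeneous_conditionals(p, n):
    """Check that all conditionals p_{i | i-1} agree with the first one."""
    conds = [conditional(p, i) for i in range(1, n)]
    return all(isclose(c.get(k, 0.0), conds[0].get(k, 0.0), abs_tol=1e-12)
               for c in conds for k in set(c) | set(conds[0]))


A, n = (0, 1), 3
# Memoryless uniform distribution: a memory chain in the sense above.
uniform = {a: 1 / 8 for a in product(A, repeat=n)}
# Uniform on triples with a_3 = a_1 XOR a_2: pairwise uniform, jointly dependent.
xor = {a: (0.25 if a[2] == a[0] ^ a[1] else 0.0) for a in product(A, repeat=n)}

uniform_is_chain = has_first_order_memory(uniform, n) and has_homogeneous_conditionals(uniform, n)
xor_first_order = has_first_order_memory(xor, n)
xor_homogeneous = has_homogeneous_conditionals(xor, n)
```

The exhaustive tabulation over $A^n$ is only practical for small $n$ and small alphabets; it is meant to mirror the definitions, not to scale.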