Needs:
Set Numbers
Lists
Outcome Probabilities
Needed by:
None.

Decision Processes

Why

We want to talk about making a sequence of decisions.

Definition

Let $S$ and $A$ be finite sets. Let $T: S \times A \to (S \to [0, 1])$ be such that for each $s \in S$ and $a \in A$, the function $T_{sa} = T(s, a)$ is a probability distribution over $S$; that is, $\sum_{s' \in S} T_{sa}(s') = 1$. We call the ordered triple $(S, A, T)$ a finite state-action process.
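As a concrete illustration of the definition, the following minimal sketch stores a finite state-action process in plain Python and checks that each $T_{sa}$ is a probability distribution over $S$. The state names, action names, probabilities, and the helper `is_state_action_process` are all hypothetical and chosen only for this example.

```python
from itertools import product

# Hypothetical two-state, two-action example; names and numbers are illustrative.
S = ["s0", "s1"]
A = ["stay", "move"]

# T[(s, a)] plays the role of T_{sa}: it maps each next state to a probability.
T = {
    ("s0", "stay"): {"s0": 0.9, "s1": 0.1},
    ("s0", "move"): {"s0": 0.2, "s1": 0.8},
    ("s1", "stay"): {"s0": 0.0, "s1": 1.0},
    ("s1", "move"): {"s0": 0.7, "s1": 0.3},
}


def is_state_action_process(S, A, T, tol=1e-9):
    """Return True if every T[(s, a)] is a probability distribution over S."""
    for s, a in product(S, A):
        dist = T.get((s, a))
        # T_{sa} must be defined on exactly the states in S.
        if dist is None or set(dist) != set(S):
            return False
        # Every value must lie in [0, 1].
        if any(p < 0 or p > 1 for p in dist.values()):
            return False
        # The values must sum to 1, up to floating-point tolerance.
        if abs(sum(dist.values()) - 1.0) > tol:
            return False
    return True
```

Here a dictionary keyed by $(s, a)$ stands in for $T$; an array indexed by state and action would serve equally well.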

A trajectory in the state set $S$ and action set $A$ is a sequence in $S \times A$.
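To make the notion of a trajectory concrete, here is a minimal sketch that samples the first few terms of one. It assumes a small hypothetical process (the states, actions, and probabilities are made up), and it assumes actions are chosen uniformly at random, which the definition above does not prescribe. The next state is drawn from $T_{sa}$.

```python
import random

# Hypothetical example process; names and numbers are illustrative only.
S = ["s0", "s1"]
A = ["stay", "move"]
T = {
    ("s0", "stay"): {"s0": 0.9, "s1": 0.1},
    ("s0", "move"): {"s0": 0.2, "s1": 0.8},
    ("s1", "stay"): {"s0": 0.0, "s1": 1.0},
    ("s1", "move"): {"s0": 0.7, "s1": 0.3},
}


def sample_trajectory(S, A, T, s_start, length, rng):
    """Sample a finite list of (state, action) pairs starting at s_start."""
    trajectory = []
    s = s_start
    for _ in range(length):
        # Assumption: pick the action uniformly at random.
        a = rng.choice(A)
        trajectory.append((s, a))
        # Draw the next state from the distribution T_{sa}.
        dist = T[(s, a)]
        s = rng.choices(S, weights=[dist[x] for x in S])[0]
    return trajectory


trajectory = sample_trajectory(S, A, T, "s0", 5, random.Random(0))
```

Each element of `trajectory` is a pair in $S \times A$, so the list is a finite sequence in $S \times A$.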

Let $r: S \times A \times S \to \R$ and let $N \in \N$.
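One way to read the signature of $r$ is that it assigns a real number to each transition (state, action, next state). The sketch below evaluates such a function along consecutive terms of a trajectory. The particular `r`, the example trajectory, and the use of $N$ as a number of steps to sum over are assumptions made for illustration and are not stated in the text.

```python
def r(s, a, s_next):
    """Hypothetical r: 1.0 for arriving in "s1", minus 0.1 for the action "move"."""
    arrival = 1.0 if s_next == "s1" else 0.0
    cost = 0.1 if a == "move" else 0.0
    return arrival - cost


def step_values(trajectory, r):
    """Evaluate r on each consecutive (state, action, next state) triple."""
    return [
        r(s, a, s_next)
        for (s, a), (s_next, _) in zip(trajectory, trajectory[1:])
    ]


# A hypothetical finite trajectory in S x A.
trajectory = [("s0", "move"), ("s1", "stay"), ("s1", "move"), ("s0", "stay")]

# Assumption: N is used here as the number of steps to sum over.
N = 2
values = step_values(trajectory, r)
total = sum(values[:N])
```

A trajectory with $k$ terms yields $k - 1$ values, since each value needs the state of the following term.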

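A decision process is a sequence $(S, A, T, r, \gamma, \dots)$ which extends a finite state-action process $(S, A, T)$, with state set $S$ and action set $A$, by the function $r$ and the further terms $\gamma, \dots$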

Other terminology

Decision processes are commonly called Markov decision processes.¹


  1. As usual, we avoid this terminology in accordance with the project's guidelines against using particular names. ↩︎
Copyright © 2023 The Bourbaki Authors — All rights reserved — Version 13a6779cc