Needs:
Classifier Errors
Real Matrices
Needed by:
None.

Confusion Matrices

Why

We can summarize the (label, prediction) pairs produced by a particular classifier on a particular dataset in a matrix, by counting how often each (label, prediction) combination occurs.

Boolean case

Let $A$ be a nonempty set and $B = \{-1, 1\}$. For a dataset $(a^1, b^1), \dots , (a^n, b^n)$ in $A \times B$ and a classifier $G: A \to B$, the confusion matrix $C$ is defined by

\[ C = \begin{bmatrix} \text{\# true negatives} & \text{\# false negatives} \\ \text{\# false positives} & \text{\# true positives} \end{bmatrix} = \begin{bmatrix} C_{\text{tn}} & C_{\text{fn}} \\ C_{\text{fp}} & C_{\text{tp}} \end{bmatrix}. \]

Using this notation, $C_{\text{tn}} + C_{\text{fn}} + C_{\text{fp}} + C_{\text{tp}} = n$. $N_\text{n} := C_{\text{tn}} + C_{\text{fp}}$ is the number of negative examples. $N_\text{p} := C_{\text{fn}} + C_{\text{tp}}$ is the number of positive examples.
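
As a concrete illustration, here is a minimal Python sketch that builds $C$ with the layout above and checks these count identities. The toy dataset and the threshold classifier standing in for $G$ are invented for the example.

import numpy as np

# Toy dataset (invented for illustration): points a^i in R, labels b^i in {-1, +1}.
a = np.array([-2.0, -1.0, -0.5, 0.3, 0.8, 1.5, 2.2, -0.1])
b = np.array([-1, -1, -1, -1, 1, 1, 1, 1])

# A stand-in classifier G: A -> {-1, +1}; here a sign threshold at zero (an assumption).
def G(x):
    return np.where(x > 0.0, 1, -1)

pred = G(a)

# Entries of the confusion matrix: rows are indexed by the prediction,
# columns by the label, with -1 first in each.
C_tn = int(np.sum((pred == -1) & (b == -1)))   # true negatives
C_fn = int(np.sum((pred == -1) & (b == 1)))    # false negatives
C_fp = int(np.sum((pred == 1) & (b == -1)))    # false positives
C_tp = int(np.sum((pred == 1) & (b == 1)))     # true positives
C = np.array([[C_tn, C_fn],
              [C_fp, C_tp]])

n = len(b)
N_n = C_tn + C_fp   # number of negative examples
N_p = C_fn + C_tp   # number of positive examples
assert C.sum() == n == N_n + N_p
print(C)            # [[3 1]
                    #  [1 3]] for this toy data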

The diagonal elements of the confusion matrix give the numbers of correct predictions. The off-diagonal entries give the numbers of incorrect predictions for the two types of errors (see \sheetref{classifier_errors}{Classifier Errors}).

In this notation, the false positive rate is $C_\text{fp}/n$, the false negative rate is $C_{\text{fn}}/n$ and the error rate is the sum of these, $(C_{\text{fn}} + C_{\text{fp}})/n$.

The true positive rate is $C_{\text{tp}} / (C_{\text{fn}} + C_{\text{tp}}) = C_{\text{tp}}/N_\text{p}$. The true negative rate is $C_{\text{tn}} / (C_{\text{tn}} + C_{\text{fp}}) = C_{\text{tn}}/N_\text{n}$. The false alarm rate is $C_{\text{fp}} / (C_{\text{tn}} + C_{\text{fp}}) = C_{\text{fp}}/N_\text{n}$. The precision is $C_{\text{tp}}/(C_{\text{tp}} + C_{\text{fp}})$.
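
Continuing the sketch above, each of these rates can be read off the entries of $C$. The variable names below follow this sheet's conventions; in particular, the false positive rate divides by $n$, while the false alarm rate divides by $N_\text{n}$.

# Rates, using this sheet's definitions and the counts computed above.
false_positive_rate = C_fp / n              # C_fp / n
false_negative_rate = C_fn / n              # C_fn / n
error_rate = (C_fn + C_fp) / n              # sum of the two rates above

true_positive_rate = C_tp / N_p             # C_tp / (C_fn + C_tp)
true_negative_rate = C_tn / N_n             # C_tn / (C_tn + C_fp)
false_alarm_rate = C_fp / N_n               # C_fp / (C_tn + C_fp)
precision = C_tp / (C_tp + C_fp)

print(f"error rate = {error_rate:.3f}, precision = {precision:.3f}")
# error rate = 0.250, precision = 0.750 for the toy data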
