Vectors as Matrices

Why

Vectors can be identified with matrices of width 1.

Canonical identification

We identify $\R ^{n}$ with $\R ^{n \times 1}$ in the obvious way. For this reason, we call $x \in \R ^{n \times 1}$ (meaning $x \in \R ^{n}$) a column vector.

Because we identify $\R ^n$ with $\R ^{n \times 1}$, we write the vector $a = (a_1, a_2, a_3) \in \R ^3$ as

\[ \bmat{a_1 \\ a_2 \\ a_3} \text{ or } \pmat{ a_1 \\ a_2 \\ a_3}. \]

We could just as easily identify $\R ^{n}$ with $\R ^{1 \times n}$. We avoid this convention. However, by analogy with the term “column vector,” we refer to a matrix $y \in \R ^{1 \times n}$ as a row vector.
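
The two shapes can be made concrete in a short sketch. It assumes a matrix is stored as a list of rows; the helper names `as_column`, `as_row`, and `shape` are hypothetical and chosen only for illustration.

```python
# Assumption: a matrix is a non-empty list of equal-length rows.

def as_column(values):
    """Return the n-by-1 matrix (column vector) with the given entries."""
    return [[v] for v in values]

def as_row(values):
    """Return the 1-by-n matrix (row vector) with the given entries."""
    return [list(values)]

def shape(matrix):
    """Return (number of rows, number of columns)."""
    return (len(matrix), len(matrix[0]))

a = (1.0, 2.0, 3.0)
column = as_column(a)   # [[1.0], [2.0], [3.0]]
row = as_row(a)         # [[1.0, 2.0, 3.0]]

print(shape(column))    # (3, 1)
print(shape(row))       # (1, 3)
```

The same three numbers give two different matrices; only the column form is identified with the vector $a$ under the convention above.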

Matrix transpose

We frequently move between $\R ^{n \times 1}$ and $\R ^{1 \times n}$. If $a \in \R ^{n \times 1}$, we denote by $a^\top $ the row vector $b \in \R ^{1 \times n}$ defined by $b_i = a_i$.

More generally, given a matrix $A \in \R ^{m \times n}$, we denote by $A^\top $ the matrix $B \in \R ^{n \times m}$ defined by $B_{ij} = A_{ji}$. Notice that the indices $i$ and $j$ have swapped, and so have the dimensions. We call the matrix $B$ the transpose of $A$, and similarly call $a^\top $ the transpose of the vector $a$. Clearly, $(A^\top )^\top = A$, which includes $(a^\top )^\top = a$.
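
As a minimal sketch of the definition $B_{ij} = A_{ji}$, the function below transposes a matrix stored as a list of rows. The storage format and the name `transpose` are assumptions made for illustration.

```python
def transpose(A):
    """Return B with B[i][j] = A[j][i].

    Assumption: A is a non-empty list of equal-length rows (m-by-n),
    so the result is n-by-m.
    """
    m, n = len(A), len(A[0])
    return [[A[j][i] for j in range(m)] for i in range(n)]

A = [[1, 2, 3],
     [4, 5, 6]]            # 2-by-3
B = transpose(A)           # 3-by-2

print(B)                   # [[1, 4], [2, 5], [3, 6]]
print(transpose(B) == A)   # True: transposing twice recovers A

a = [[1], [2], [3]]        # column vector, 3-by-1
print(transpose(a))        # [[1, 2, 3]], a row vector
```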

Reals as vectors

There is a similar, and similarly obvious, identification of scalars $a \in \R $ with the 1-vectors $\R ^{1}$ (and so with the 1 by 1 matrices $\R ^{1 \times 1}$). Given our definition of matrix-vector products, if we identify $a \in \R $ with $A \in \R ^{1 \times 1}$ where $A_{11} = a$, then $Ax = ax$ for every $x \in \R ^{1}$.
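
The agreement between the 1 by 1 matrix product and scalar multiplication can be checked numerically. The sketch below assumes matrices are stored as lists of rows; `matmul` is a hypothetical helper implementing the usual matrix product.

```python
def matmul(A, B):
    """Multiply an m-by-k matrix by a k-by-n matrix (lists of rows)."""
    k = len(B)
    assert all(len(row) == k for row in A), "inner dimensions must agree"
    return [[sum(A[i][p] * B[p][j] for p in range(k))
             for j in range(len(B[0]))]
            for i in range(len(A))]

a = 2.5
A = [[a]]        # the 1-by-1 matrix with A_11 = a
x = [[4.0]]      # a 1-vector, stored as a 1-by-1 column

print(matmul(A, x))    # [[10.0]]
print(a * x[0][0])     # 10.0
```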

Familiar concepts, new notation

These identifications and the notation of transposition allow us to write several familiar concepts compactly. We write the norm as

\[ \norm{x} = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2} = \sqrt{x^\top x}. \]
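
Here $x^\top x$ is the product of a $1 \times n$ matrix with an $n \times 1$ matrix, so it is a $1 \times 1$ matrix, which we read as a real number using the identification of $\R $ with $\R ^{1 \times 1}$. As a small worked example with values chosen only for illustration, take $x = (1, 2, 2) \in \R ^3$:

\[ x^\top x = \bmat{1 & 2 & 2} \bmat{1 \\ 2 \\ 2} = 1 + 4 + 4 = 9, \qquad \norm{x} = \sqrt{9} = 3. \]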

We write the inner product as

\[ \ip{x,y} = x_1y_1 + x_2y_2 + \cdots + x_ny_n = x^\top y. \]

We express the symmetry of the inner product by $x^\top y = y^\top x$.
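
These identities can be checked numerically with a short sketch. It assumes vectors are stored as $n \times 1$ lists of rows; `transpose` and `matmul` are hypothetical helpers, and the final `[0][0]` reads a 1 by 1 matrix as a real number.

```python
import math

def transpose(A):
    """Return B with B[i][j] = A[j][i] for a list-of-rows matrix A."""
    return [[A[j][i] for j in range(len(A))] for i in range(len(A[0]))]

def matmul(A, B):
    """Multiply an m-by-k matrix by a k-by-n matrix (lists of rows)."""
    k = len(B)
    assert all(len(row) == k for row in A), "inner dimensions must agree"
    return [[sum(A[i][p] * B[p][j] for p in range(k))
             for j in range(len(B[0]))]
            for i in range(len(A))]

x = [[3.0], [4.0]]   # column vectors in R^2
y = [[1.0], [2.0]]

xty = matmul(transpose(x), y)[0][0]                 # 3*1 + 4*2
ytx = matmul(transpose(y), x)[0][0]                 # 1*3 + 2*4
norm_x = math.sqrt(matmul(transpose(x), x)[0][0])   # sqrt(9 + 16)

print(xty, ytx, norm_x)   # 11.0 11.0 5.0
```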