Abdi Moalim

Linear algebra reference

Linear algebra concerns vector spaces and linear transformations, together with their representation through matrices and systems of linear equations.

The most basic object in linear algebra is the vector. A vector in an n-dimensional space is an ordered collection of n elements represented as a column.

$$v = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix}$$

Vectors can be added component-wise. For vectors u and v of the same dimension, their sum is defined as follows.

$$u + v = \begin{pmatrix} u_1 + v_1 \\ u_2 + v_2 \\ \vdots \\ u_n + v_n \end{pmatrix}$$

Scalar multiplication scales each component of a vector by a scalar value c.

$$cv = \begin{pmatrix} cv_1 \\ cv_2 \\ \vdots \\ cv_n \end{pmatrix}$$

These operations satisfy the following algebraic properties. Addition is commutative and associative. Scalar multiplication distributes over vector addition and scalar addition.

$$u + v = v + u$$

$$u + (v + w) = (u + v) + w$$

$$c(u + v) = cu + cv$$

$$(c + d)v = cv + dv$$
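
These rules can be checked numerically. Below is a minimal sketch using NumPy (an assumption; the document itself does not prescribe a library), with arbitrary example vectors.

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
c, d = 2.0, -1.0

# Component-wise addition and scalar multiplication
print(u + v)   # [5. 7. 9.]
print(c * v)   # [ 8. 10. 12.]

# Numerical spot-checks of the algebraic properties
assert np.allclose(u + v, v + u)                # commutativity
assert np.allclose(c * (u + v), c * u + c * v)  # distributivity over vector addition
assert np.allclose((c + d) * v, c * v + d * v)  # distributivity over scalar addition
```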

The zero vector 0 consists of all zero components and represents the additive identity for vectors.

$$v + 0 = v$$

Every vector v has an additive inverse $-v$ such that their sum equals the zero vector.

$$v + (-v) = 0$$

A vector space is a set of vectors that is closed under addition and scalar multiplication. Real coordinate space $\mathbb{R}^n$ is the prototypical example of an n-dimensional vector space.

Vectors $v_1, v_2, \ldots, v_k$ are linearly independent if the following relation has only the trivial solution $c_1 = c_2 = \cdots = c_k = 0$.

$$c_1 v_1 + c_2 v_2 + \cdots + c_k v_k = 0$$

If there exists a non-trivial solution, the vectors are linearly dependent, implying that at least one vector can be expressed as a linear combination of the others.
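
A common practical test stacks the vectors as columns of a matrix and compares the matrix rank to the number of vectors. The sketch below assumes NumPy; the helper name `are_independent` is hypothetical.

```python
import numpy as np

def are_independent(vectors):
    """Return True if the given vectors are linearly independent."""
    M = np.column_stack(vectors)
    return np.linalg.matrix_rank(M) == len(vectors)

v1 = np.array([1.0, 0.0, 0.0])
v2 = np.array([0.0, 1.0, 0.0])
v3 = np.array([1.0, 1.0, 0.0])        # v3 = v1 + v2, so the set is dependent

print(are_independent([v1, v2]))       # True
print(are_independent([v1, v2, v3]))   # False
```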

The span of a set of vectors $v_1, v_2, \ldots, v_k$ consists of all possible linear combinations of these vectors.

$$\operatorname{span}(v_1, v_2, \ldots, v_k) = \{ c_1 v_1 + c_2 v_2 + \cdots + c_k v_k : c_1, c_2, \ldots, c_k \in \mathbb{R} \}$$

A basis for a vector space is a linearly independent set of vectors that spans the entire space. Every vector in the space can be uniquely expressed as a linear combination of basis vectors.

The dimension of a vector space equals the number of vectors in any basis for that space. For instance, $\mathbb{R}^n$ has dimension n.

The standard basis for $\mathbb{R}^n$ consists of n vectors, each with a single 1 and zeros elsewhere.

$$e_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad e_2 = \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}, \quad \ldots, \quad e_n = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}$$

The dot product (inner product) of two vectors u and v in $\mathbb{R}^n$ is defined as follows.

$$u \cdot v = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n = \sum_{i=1}^{n} u_i v_i$$

The dot product is commutative and distributive over addition. It also satisfies these scaling properties.

$$u \cdot v = v \cdot u$$

$$u \cdot (v + w) = u \cdot v + u \cdot w$$

$$(cu) \cdot v = c(u \cdot v) = u \cdot (cv)$$

The dot product lets us define the norm (i.e., the length) of a vector. The Euclidean norm of a vector v is calculated using this formula.

$$\|v\| = \sqrt{v \cdot v} = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}$$

Two vectors are orthogonal if their dot product equals zero. A set of vectors is orthogonal if every pair of distinct vectors in the set is orthogonal.

$$u \cdot v = 0$$

An orthogonal set of nonzero vectors is automatically linearly independent. An orthogonal basis greatly simplifies many calculations in linear algebra.

A vector can be normalized by dividing it by its norm, resulting in a unit vector pointing in the same direction.

$$\hat{v} = \frac{v}{\|v\|}$$

The angle between two nonzero vectors u and v can be calculated using the dot product.

$$\cos\theta = \frac{u \cdot v}{\|u\| \, \|v\|}$$
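
These definitions map directly onto NumPy routines; the following sketch (with arbitrary example vectors) computes the dot product, norms, a unit vector and the angle.

```python
import numpy as np

u = np.array([1.0, 2.0, 2.0])
v = np.array([3.0, 0.0, 4.0])

dot = np.dot(u, v)            # u . v
norm_u = np.linalg.norm(u)    # Euclidean norm of u
norm_v = np.linalg.norm(v)
u_hat = u / norm_u            # unit vector in the direction of u

# Angle between u and v from the dot-product formula
theta = np.arccos(dot / (norm_u * norm_v))
print(dot, norm_u, np.degrees(theta))
```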

The Cauchy-Schwarz inequality establishes an upper bound for the dot product.

$$|u \cdot v| \le \|u\| \, \|v\|$$

Equality holds if and only if one vector is a scalar multiple of the other.

The triangle inequality follows from Cauchy-Schwarz and states that the norm of a sum of vectors cannot exceed the sum of their norms.

$$\|u + v\| \le \|u\| + \|v\|$$

The cross product is a binary operation defined only for three-dimensional vectors. For vectors $u = (u_1, u_2, u_3)$ and $v = (v_1, v_2, v_3)$, their cross product is:

$$u \times v = \begin{pmatrix} u_2 v_3 - u_3 v_2 \\ u_3 v_1 - u_1 v_3 \\ u_1 v_2 - u_2 v_1 \end{pmatrix}$$

The cross product results in a vector perpendicular to both input vectors, with magnitude equal to the area of the parallelogram they form.

$$\|u \times v\| = \|u\| \, \|v\| \sin\theta$$

Unlike the dot product, the cross product is anti-commutative. It also satisfies these distributive properties.

$$u \times v = -(v \times u)$$

$$u \times (v + w) = u \times v + u \times w$$

$$(cu) \times v = c(u \times v) = u \times (cv)$$
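
A short NumPy sketch of the cross product, checking perpendicularity, the parallelogram-area identity and anti-commutativity on arbitrary example vectors.

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

w = np.cross(u, v)

# w is perpendicular to both inputs
assert np.isclose(np.dot(w, u), 0.0)
assert np.isclose(np.dot(w, v), 0.0)

# |u x v| = |u| |v| sin(theta)
theta = np.arccos(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
assert np.isclose(np.linalg.norm(w),
                  np.linalg.norm(u) * np.linalg.norm(v) * np.sin(theta))

# Anti-commutativity
assert np.allclose(np.cross(v, u), -w)
```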

Matrices provide a way to represent linear transformations and systems of linear equations. An m×n matrix has m rows and n columns.

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$$

Matrix addition is performed element-wise for matrices of the same dimensions.

$$A + B = \begin{pmatrix} a_{11} + b_{11} & a_{12} + b_{12} & \cdots & a_{1n} + b_{1n} \\ a_{21} + b_{21} & a_{22} + b_{22} & \cdots & a_{2n} + b_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} + b_{m1} & a_{m2} + b_{m2} & \cdots & a_{mn} + b_{mn} \end{pmatrix}$$

Scalar multiplication of a matrix scales each entry by the scalar.

$$cA = \begin{pmatrix} ca_{11} & ca_{12} & \cdots & ca_{1n} \\ ca_{21} & ca_{22} & \cdots & ca_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ ca_{m1} & ca_{m2} & \cdots & ca_{mn} \end{pmatrix}$$

Matrix multiplication is more complicated. For an m×n matrix A and an n×p matrix B, their product AB is an m×p matrix with entries calculated as:

$$(AB)_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$$

Matrix multiplication is associative but generally not commutative.

$$(AB)C = A(BC)$$

$$AB \neq BA \quad \text{(in general)}$$
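
The sketch below (assuming NumPy) multiplies small example matrices and spot-checks associativity and the failure of commutativity.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])
C = np.array([[2.0, 0.0],
              [0.0, 2.0]])

# The @ operator performs matrix multiplication
print(A @ B)

assert np.allclose((A @ B) @ C, A @ (B @ C))   # associativity
print(np.allclose(A @ B, B @ A))               # False in general
```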

The identity matrix $I_n$ is an n×n square matrix with ones on the main diagonal and zeros elsewhere. It serves as the multiplicative identity.

$$A I_n = I_n A = A$$

Matrix-vector multiplication treats the vector as a column matrix. For an m×n matrix A and an n×1 vector v, their product is:

$$Av = \begin{pmatrix} a_{11}v_1 + a_{12}v_2 + \cdots + a_{1n}v_n \\ a_{21}v_1 + a_{22}v_2 + \cdots + a_{2n}v_n \\ \vdots \\ a_{m1}v_1 + a_{m2}v_2 + \cdots + a_{mn}v_n \end{pmatrix}$$

A system of linear equations can be expressed in matrix form $Ax = b$, where A is the coefficient matrix, x is the vector of unknowns and b is the constant vector.

$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}$$
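
Numerically, such a system is usually solved directly rather than by forming the inverse; a minimal NumPy sketch for a small square system with a unique solution:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

x = np.linalg.solve(A, b)    # solves A x = b for a square, invertible A
print(x)
assert np.allclose(A @ x, b)
```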

The transpose of an m×n matrix A is an n×m matrix $A^T$ obtained by reflecting A across its main diagonal.

$$(A^T)_{ij} = A_{ji}$$

The transpose operation satisfies these properties.

$$(A + B)^T = A^T + B^T$$

$$(cA)^T = cA^T$$

$$(AB)^T = B^T A^T$$

A square matrix A is symmetric if it equals its transpose.

$$A = A^T$$

The determinant is a scalar value associated with square matrices. For a 2×2 matrix, the determinant is defined as follows.

$$\det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc$$

For a 3×3 matrix, the determinant can be calculated using the Laplace expansion.

$$\det\begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix} = a\det\begin{pmatrix} e & f \\ h & i \end{pmatrix} - b\det\begin{pmatrix} d & f \\ g & i \end{pmatrix} + c\det\begin{pmatrix} d & e \\ g & h \end{pmatrix}$$

For larger matrices, the determinant can be calculated recursively using cofactor expansion along any row or column.

The determinant has important properties. It equals zero if and only if the matrix is singular (non-invertible). It also behaves multiplicatively.

$$\det(AB) = \det(A)\det(B)$$

$$\det(A^T) = \det(A)$$

$$\det(cA) = c^n \det(A) \quad \text{for an } n \times n \text{ matrix } A$$

The inverse of a square matrix A, denoted $A^{-1}$, satisfies the following relation.

$$A A^{-1} = A^{-1} A = I$$

A matrix is invertible if and only if its determinant is nonzero. The inverse of a 2×2 matrix is:

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$

For larger matrices, the inverse can be found using the adjugate matrix.

$$A^{-1} = \frac{1}{\det(A)} \operatorname{adj}(A)$$

Matrix inversion satisfies these properties.

$$(A^{-1})^{-1} = A$$

$$(AB)^{-1} = B^{-1} A^{-1}$$

$$(A^T)^{-1} = (A^{-1})^T$$
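
A NumPy sketch of the determinant and inverse on arbitrary 2×2 examples, verifying the product rule for inverses:

```python
import numpy as np

A = np.array([[4.0, 7.0],
              [2.0, 6.0]])
B = np.array([[1.0, 2.0],
              [3.0, 5.0]])

print(np.linalg.det(A))                  # 4*6 - 7*2 = 10
A_inv = np.linalg.inv(A)
assert np.allclose(A @ A_inv, np.eye(2))

# (AB)^{-1} = B^{-1} A^{-1}
assert np.allclose(np.linalg.inv(A @ B),
                   np.linalg.inv(B) @ np.linalg.inv(A))
```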

The rank of a matrix equals the dimension of the vector space spanned by its rows (or columns). It can be determined as the number of linearly independent rows or columns.

A matrix with full rank has as many linearly independent rows or columns as possible. For an m×n matrix, the maximum possible rank is min(m,n).

The nullspace (kernel) of a matrix A consists of all vectors x such that $Ax = 0$. These vectors form a subspace of $\mathbb{R}^n$.

$$\operatorname{null}(A) = \{ x \in \mathbb{R}^n : Ax = 0 \}$$

The dimension of the nullspace is related to the rank through the rank-nullity theorem.

$$\dim(\operatorname{null}(A)) + \operatorname{rank}(A) = n$$
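
The theorem can be checked numerically by counting nonzero singular values; the sketch below assumes NumPy and an arbitrary tolerance of 1e-10.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])          # second row is a multiple of the first

n = A.shape[1]
rank = np.linalg.matrix_rank(A)

# Nullspace dimension = number of columns minus the count of nonzero singular values
s = np.linalg.svd(A, compute_uv=False)
nullity = n - np.sum(s > 1e-10)

print(rank, nullity)                      # 1 2
assert rank + nullity == n                # rank-nullity theorem
```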

A consistent system of linear equations $Ax = b$ has a unique solution if and only if A has full column rank. If A has less than full column rank, the system has either infinitely many solutions or no solution.

Elementary row operations can transform a matrix without changing the solution set of the corresponding linear system. These operations include scaling a row, adding a multiple of one row to another and swapping rows.

Gaussian elimination uses elementary row operations to convert a matrix to row echelon form. In this form, all zero rows appear at the bottom and each leading entry of a nonzero row is to the right of the leading entry in the row above.

Row reduction continues to reduced row echelon form, where each leading entry is 1 and each column containing a leading 1 has zeros elsewhere. The resulting matrix is unique and reveals the solution structure of the corresponding linear system.

LU decomposition expresses a matrix A as the product of a lower triangular matrix L and an upper triangular matrix U. This factorization is used in solving linear systems and finding determinants.

$$A = LU$$

For matrices with linearly independent columns, the QR decomposition expresses A as the product of an orthogonal matrix Q and an upper triangular matrix R. It is used for least squares problems and eigenvalue algorithms.

$$A = QR$$
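
Both factorizations are available in standard numerical libraries. The sketch below assumes SciPy is installed for the LU step; QR comes from NumPy alone.

```python
import numpy as np
from scipy.linalg import lu    # assumes SciPy is available

A = np.array([[4.0, 3.0],
              [6.0, 3.0]])

# LU decomposition with partial pivoting: A = P L U
P, L, U = lu(A)
assert np.allclose(P @ L @ U, A)

# QR decomposition: A = Q R with Q orthogonal, R upper triangular
Q, R = np.linalg.qr(A)
assert np.allclose(Q @ R, A)
assert np.allclose(Q.T @ Q, np.eye(2))
```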

Eigenvalues and eigenvectors describe the behavior of linear transformations. An eigenvector v of a square matrix A is a nonzero vector that, when transformed by A, remains parallel to its original direction, scaled by a scalar λ (the eigenvalue).

$$Av = \lambda v$$

The characteristic equation helps find eigenvalues.

$$\det(A - \lambda I) = 0$$

The eigenvalues are the roots of this polynomial equation. Once the eigenvalues are found, the corresponding eigenvectors can be determined by solving $(A - \lambda I)v = 0$.
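
Numerically, eigenvalues and eigenvectors are computed together rather than by solving the characteristic polynomial explicitly; a NumPy sketch on a small example matrix:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigvals, eigvecs = np.linalg.eig(A)     # columns of eigvecs are the eigenvectors

for lam, v in zip(eigvals, eigvecs.T):
    assert np.allclose(A @ v, lam * v)  # A v = lambda v

print(eigvals)                          # eigenvalues 3 and 1 (order may vary)
```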

The eigenspace corresponding to an eigenvalue λ consists of all eigenvectors with that eigenvalue, together with the zero vector. It forms a subspace of $\mathbb{R}^n$.

$$E_\lambda = \{ v \in \mathbb{R}^n : Av = \lambda v \}$$

A matrix is diagonalizable if there exists an invertible matrix P such that $P^{-1}AP$ is a diagonal matrix D. The columns of P are eigenvectors of A and the diagonal entries of D are the corresponding eigenvalues.

$$A = P D P^{-1}$$

An n×n matrix is diagonalizable if and only if it has n linearly independent eigenvectors.

For symmetric matrices, all eigenvalues are real and eigenvectors corresponding to distinct eigenvalues are orthogonal. Every symmetric matrix is orthogonally diagonalizable, meaning there exists an orthogonal matrix Q such that $Q^T A Q$ is diagonal.

The spectral theorem for symmetric matrices states that any symmetric matrix can be diagonalized by an orthogonal matrix of eigenvectors.

$$A = Q D Q^T$$
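
For symmetric matrices, NumPy's `eigh` routine returns real eigenvalues and orthonormal eigenvectors; the sketch below reconstructs A from its spectral decomposition.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])              # symmetric example

eigvals, Q = np.linalg.eigh(A)          # Q has orthonormal eigenvector columns
D = np.diag(eigvals)

assert np.allclose(Q.T @ Q, np.eye(2))  # Q is orthogonal
assert np.allclose(Q @ D @ Q.T, A)      # A = Q D Q^T
```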

The singular value decomposition (SVD) generalizes the eigendecomposition to any m×n matrix. SVD expresses A as the product of three matrices.

$$A = U \Sigma V^T$$

U is an m×m orthogonal matrix, Σ is an m×n diagonal matrix with non-negative entries (the singular values) and V is an n×n orthogonal matrix.

The columns of U are the left singular vectors, the columns of V are the right singular vectors and the diagonal entries of Σ are the singular values of A.

SVD has numerous applications, including least squares problems, image compression and pseudoinverse computation. The pseudoinverse $A^+$ of a matrix A is defined using the SVD.

$$A^+ = V \Sigma^+ U^T$$

$\Sigma^+$ is found by taking the reciprocal of each nonzero singular value and transposing the resulting matrix.
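
A NumPy sketch of the SVD and the pseudoinverse on an arbitrary 3×2 example; note that `np.linalg.svd` returns the singular values as a vector, so Σ has to be assembled explicitly.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [0.0, 0.0]])               # 3 x 2 example

U, s, Vt = np.linalg.svd(A)               # s holds the singular values as a vector
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)
assert np.allclose(U @ Sigma @ Vt, A)     # A = U Sigma V^T

# Pseudoinverse via SVD: reciprocal of each nonzero singular value, transposed shape
Sigma_plus = np.zeros(A.shape[::-1])
Sigma_plus[:len(s), :len(s)] = np.diag(1.0 / s)   # both singular values are nonzero here
A_plus = Vt.T @ Sigma_plus @ U.T
assert np.allclose(A_plus, np.linalg.pinv(A))
```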

Vector spaces can be equipped with an inner product, generalizing the dot product in $\mathbb{R}^n$. An inner product satisfies positive definiteness, linearity in the first argument and conjugate symmetry.

$$\langle u, u \rangle > 0 \quad \text{for } u \neq 0$$

$$\langle \alpha u + \beta v, w \rangle = \alpha \langle u, w \rangle + \beta \langle v, w \rangle$$

$$\langle u, v \rangle = \overline{\langle v, u \rangle}$$

An inner product space is a vector space equipped with an inner product. The norm in an inner product space is defined using the inner product.

$$\|v\| = \sqrt{\langle v, v \rangle}$$

The Gram-Schmidt process orthogonalizes a set of vectors in an inner product space. Given linearly independent vectors $v_1, v_2, \ldots, v_k$, the process produces an orthogonal set $u_1, u_2, \ldots, u_k$ spanning the same subspace.

$$u_1 = v_1$$

$$u_j = v_j - \sum_{i=1}^{j-1} \frac{\langle v_j, u_i \rangle}{\langle u_i, u_i \rangle} u_i$$

The resulting vectors can be normalized to create an orthonormal set.
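
A direct NumPy translation of the formulas above (classical Gram-Schmidt, written for clarity rather than numerical robustness; the function name is hypothetical):

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthogonalize a list of linearly independent vectors."""
    basis = []
    for v in vectors:
        u = v.astype(float)
        for b in basis:
            u = u - (np.dot(v, b) / np.dot(b, b)) * b   # subtract the projection onto b
        basis.append(u)
    return basis

v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([1.0, 0.0, 1.0])
ortho = gram_schmidt([v1, v2])

assert np.isclose(np.dot(ortho[0], ortho[1]), 0.0)    # pairwise orthogonal
unit = [u / np.linalg.norm(u) for u in ortho]         # normalize to an orthonormal set
```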

Linear transformations map vectors from one vector space to another and preserve vector addition and scalar multiplication. For a linear transformation T from vector space V to vector space W, the following properties hold.

$$T(u + v) = T(u) + T(v)$$

$$T(cv) = cT(v)$$

Every linear transformation between finite-dimensional vector spaces can be represented by a matrix. The columns of this matrix are the images of the basis vectors under the transformation.

The kernel (nullspace) of a linear transformation T consists of all vectors that T maps to the zero vector. The range (image) consists of all vectors that can be obtained as T(v) for some vector v in the domain.

$$\ker(T) = \{ v \in V : T(v) = 0 \}$$

$$\operatorname{range}(T) = \{ T(v) : v \in V \}$$

The rank-nullity theorem applies to linear transformations.

$$\dim(\ker(T)) + \dim(\operatorname{range}(T)) = \dim(V)$$

Change of basis transforms vectors from one coordinate system to another. If B and C are bases for a vector space V, then there exists a transition matrix P such that the coordinates of a vector v in basis C can be obtained from its coordinates in basis B.

$$[v]_C = P [v]_B$$

The matrix of a linear transformation T changes with the bases chosen for the domain and codomain. If A is the matrix of T with respect to bases B for V and D for W, and P and Q are the transition matrices to the new bases, then the matrix $A'$ of T with respect to the new bases is:

$$A' = Q A P^{-1}$$

Quadratic forms generalize the concept of squares to higher dimensions. For an n×n symmetric matrix A, the quadratic form $q(x) = x^T A x$ maps vectors to scalars.

$$q(x) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j$$

A quadratic form is positive definite if $q(x) > 0$ for all nonzero vectors x. It is positive semidefinite if $q(x) \ge 0$ for all vectors x.
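
For a symmetric matrix, definiteness can be read off from the signs of its eigenvalues; a small NumPy sketch on an arbitrary example:

```python
import numpy as np

A = np.array([[2.0, -1.0],
              [-1.0, 2.0]])              # symmetric example

def quadratic_form(A, x):
    return x @ A @ x                      # q(x) = x^T A x

eigvals = np.linalg.eigvalsh(A)           # real eigenvalues of a symmetric matrix
print(eigvals)                            # [1. 3.] -> all positive, so positive definite

x = np.array([1.0, 2.0])
print(quadratic_form(A, x) > 0)           # True for nonzero x when A is positive definite
```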

The principal axis theorem (a form of the spectral theorem) diagonalizes quadratic forms through an orthogonal change of variables. For a symmetric matrix A, there exists an orthogonal matrix Q such that $Q^T A Q$ is diagonal.

$$x^T A x = y^T D y$$

Here $y = Q^T x$ and D is a diagonal matrix containing the eigenvalues of A.