Abdi Moalim

Linear algebra reference

Linear algebra concerns vector spaces and linear transformations, together with their representation through matrices and systems of linear equations.

The most basic object in linear algebra is the vector. A vector in an n-dimensional space is an ordered collection of n elements represented as a column.

$$v = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix}$$

Vectors can be added component-wise. For vectors u and v of the same dimension, their sum is defined as follows.

$$u + v = \begin{pmatrix} u_1 + v_1 \\ u_2 + v_2 \\ \vdots \\ u_n + v_n \end{pmatrix}$$

Scalar multiplication scales each component of a vector by a scalar value c.

$$cv = \begin{pmatrix} cv_1 \\ cv_2 \\ \vdots \\ cv_n \end{pmatrix}$$

These operations satisfy the following algebraic properties. Addition is commutative and associative. Scalar multiplication distributes over vector addition and scalar addition.

$$u + v = v + u$$

$$u + (v + w) = (u + v) + w$$

$$c(u + v) = cu + cv$$

$$(c + d)v = cv + dv$$
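
These rules can be checked numerically. Below is a minimal sketch using NumPy (an assumption; the document itself does not prescribe a library), with arbitrary example vectors.

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
c, d = 2.0, -1.0

# Component-wise addition and scalar multiplication
print(u + v)   # [5. 7. 9.]
print(c * v)   # [ 8. 10. 12.]

# Numerical spot-checks of the algebraic properties
assert np.allclose(u + v, v + u)                # commutativity
assert np.allclose(c * (u + v), c * u + c * v)  # distributivity over vector addition
assert np.allclose((c + d) * v, c * v + d * v)  # distributivity over scalar addition
```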

The zero vector 0 consists of all zero components and represents the additive identity for vectors.

$$v + 0 = v$$

Every vector v has an additive inverse $-v$ such that their sum equals the zero vector.

$$v + (-v) = 0$$

A vector space is a set of vectors that is closed under addition and scalar multiplication. Real coordinate space $\mathbb{R}^n$ is the prototypical example of an n-dimensional vector space.

Vectors $v_1, v_2, \ldots, v_k$ are linearly independent if the following relation has only the trivial solution $c_1 = c_2 = \cdots = c_k = 0$.

$$c_1 v_1 + c_2 v_2 + \cdots + c_k v_k = 0$$

If there exists a non-trivial solution, the vectors are linearly dependent, implying that at least one vector can be expressed as a linear combination of the others.
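
A common practical test stacks the vectors as columns of a matrix and compares the matrix rank to the number of vectors. The sketch below assumes NumPy; the helper name `are_independent` is hypothetical.

```python
import numpy as np

def are_independent(vectors):
    """Return True if the given vectors are linearly independent."""
    M = np.column_stack(vectors)
    return np.linalg.matrix_rank(M) == len(vectors)

v1 = np.array([1.0, 0.0, 0.0])
v2 = np.array([0.0, 1.0, 0.0])
v3 = np.array([1.0, 1.0, 0.0])        # v3 = v1 + v2, so the set is dependent

print(are_independent([v1, v2]))       # True
print(are_independent([v1, v2, v3]))   # False
```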

The span of a set of vectors $v_1, v_2, \ldots, v_k$ consists of all possible linear combinations of these vectors.

$$\operatorname{span}(v_1, v_2, \ldots, v_k) = \{ c_1 v_1 + c_2 v_2 + \cdots + c_k v_k : c_1, c_2, \ldots, c_k \in \mathbb{R} \}$$

A basis for a vector space is a linearly independent set of vectors that spans the entire space. Every vector in the space can be uniquely expressed as a linear combination of basis vectors.

The dimension of a vector space equals the number of vectors in any basis for that space. For instance, $\mathbb{R}^n$ has dimension n.

The standard basis for $\mathbb{R}^n$ consists of n vectors, each with a single 1 and zeros elsewhere.

$$e_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad e_2 = \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}, \quad \ldots, \quad e_n = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}$$

The dot product (inner product) of two vectors u and v in $\mathbb{R}^n$ is defined as follows.

$$u \cdot v = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n = \sum_{i=1}^{n} u_i v_i$$

The dot product is commutative and distributive over addition. It also satisfies these scaling properties.

$$u \cdot v = v \cdot u$$

$$u \cdot (v + w) = u \cdot v + u \cdot w$$

$$(cu) \cdot v = c(u \cdot v) = u \cdot (cv)$$

The dot product lets us define the norm (i.e., the length) of a vector. The Euclidean norm of a vector v is calculated using this formula.

$$\|v\| = \sqrt{v \cdot v} = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}$$

Two vectors are orthogonal if their dot product equals zero. A set of vectors is orthogonal if every pair of distinct vectors in the set is orthogonal.

$$u \cdot v = 0$$

An orthogonal set of nonzero vectors is automatically linearly independent. An orthogonal basis greatly simplifies many calculations in linear algebra.

A vector can be normalized by dividing it by its norm, resulting in a unit vector pointing in the same direction.

$$\hat{v} = \frac{v}{\|v\|}$$

The angle between two nonzero vectors u and v can be calculated using the dot product.

$$\cos\theta = \frac{u \cdot v}{\|u\| \, \|v\|}$$
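
These definitions map directly onto NumPy routines; the following sketch (with arbitrary example vectors) computes the dot product, norms, a unit vector and the angle.

```python
import numpy as np

u = np.array([1.0, 2.0, 2.0])
v = np.array([3.0, 0.0, 4.0])

dot = np.dot(u, v)            # u . v
norm_u = np.linalg.norm(u)    # Euclidean norm of u
norm_v = np.linalg.norm(v)
u_hat = u / norm_u            # unit vector in the direction of u

# Angle between u and v from the dot-product formula
theta = np.arccos(dot / (norm_u * norm_v))
print(dot, norm_u, np.degrees(theta))
```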

The Cauchy-Schwarz inequality establishes an upper bound for the dot product.

$$|u \cdot v| \le \|u\| \, \|v\|$$

Equality holds if and only if one vector is a scalar multiple of the other.

The triangle inequality follows from Cauchy-Schwarz and states that the norm of a sum of vectors cannot exceed the sum of their norms.

$$\|u + v\| \le \|u\| + \|v\|$$

The cross product is a binary operation defined only for three-dimensional vectors. For vectors $u = (u_1, u_2, u_3)$ and $v = (v_1, v_2, v_3)$, their cross product is:

$$u \times v = \begin{pmatrix} u_2 v_3 - u_3 v_2 \\ u_3 v_1 - u_1 v_3 \\ u_1 v_2 - u_2 v_1 \end{pmatrix}$$

The cross product results in a vector perpendicular to both input vectors, with magnitude equal to the area of the parallelogram they form.

$$\|u \times v\| = \|u\| \, \|v\| \sin\theta$$

Unlike the dot product, the cross product is anti-commutative. It also satisfies these distributive properties.

$$u \times v = -(v \times u)$$

$$u \times (v + w) = u \times v + u \times w$$

$$(cu) \times v = c(u \times v) = u \times (cv)$$
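
A short NumPy sketch of the cross product, checking perpendicularity, the parallelogram-area identity and anti-commutativity on arbitrary example vectors.

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

w = np.cross(u, v)

# w is perpendicular to both inputs
assert np.isclose(np.dot(w, u), 0.0)
assert np.isclose(np.dot(w, v), 0.0)

# |u x v| = |u| |v| sin(theta)
theta = np.arccos(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
assert np.isclose(np.linalg.norm(w),
                  np.linalg.norm(u) * np.linalg.norm(v) * np.sin(theta))

# Anti-commutativity
assert np.allclose(np.cross(v, u), -w)
```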

Matrices provide a way to represent linear transformations and systems of linear equations. An m×n matrix has m rows and n columns.

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$$

Matrix addition is performed element-wise for matrices of the same dimensions.

$$A + B = \begin{pmatrix} a_{11} + b_{11} & a_{12} + b_{12} & \cdots & a_{1n} + b_{1n} \\ a_{21} + b_{21} & a_{22} + b_{22} & \cdots & a_{2n} + b_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} + b_{m1} & a_{m2} + b_{m2} & \cdots & a_{mn} + b_{mn} \end{pmatrix}$$

Scalar multiplication of a matrix scales each entry by the scalar.

$$cA = \begin{pmatrix} ca_{11} & ca_{12} & \cdots & ca_{1n} \\ ca_{21} & ca_{22} & \cdots & ca_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ ca_{m1} & ca_{m2} & \cdots & ca_{mn} \end{pmatrix}$$

Matrix multiplication is more complicated. For an m×n matrix A and an n×p matrix B, their product AB is an m×p matrix with entries calculated as:

$$(AB)_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$$

Matrix multiplication is associative but generally not commutative.

$$(AB)C = A(BC)$$

$$AB \neq BA \quad \text{(in general)}$$
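
The sketch below (assuming NumPy) multiplies small example matrices and spot-checks associativity and the failure of commutativity.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])
C = np.array([[2.0, 0.0],
              [0.0, 2.0]])

# The @ operator performs matrix multiplication
print(A @ B)

assert np.allclose((A @ B) @ C, A @ (B @ C))   # associativity
print(np.allclose(A @ B, B @ A))               # False in general
```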

The identity matrix $I_n$ is an n×n square matrix with ones on the main diagonal and zeros elsewhere. It serves as the multiplicative identity.

$$A I_n = I_n A = A$$

Matrix-vector multiplication treats the vector as a column matrix. For an m×n matrix A and an n×1 vector v, their product is:

$$Av = \begin{pmatrix} a_{11}v_1 + a_{12}v_2 + \cdots + a_{1n}v_n \\ a_{21}v_1 + a_{22}v_2 + \cdots + a_{2n}v_n \\ \vdots \\ a_{m1}v_1 + a_{m2}v_2 + \cdots + a_{mn}v_n \end{pmatrix}$$

A system of linear equations can be expressed in matrix form $Ax = b$, where A is the coefficient matrix, x is the vector of unknowns and b is the constant vector.

$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}$$
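
Numerically, such a system is usually solved directly rather than by forming the inverse; a minimal NumPy sketch for a small square system with a unique solution:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

x = np.linalg.solve(A, b)    # solves A x = b for a square, invertible A
print(x)
assert np.allclose(A @ x, b)
```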

The transpose of an m×n matrix A is an n×m matrix $A^T$ obtained by reflecting A across its main diagonal.

$$(A^T)_{ij} = A_{ji}$$

The transpose operation satisfies these properties.

$$(A + B)^T = A^T + B^T$$

$$(cA)^T = cA^T$$

$$(AB)^T = B^T A^T$$

A square matrix A is symmetric if it equals its transpose.

$$A = A^T$$

The determinant is a scalar value associated with square matrices. For a 2×2 matrix, the determinant is defined as follows.

$$\det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc$$

For a 3×3 matrix, the determinant can be calculated using the Laplace expansion.

$$\det\begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix} = a\det\begin{pmatrix} e & f \\ h & i \end{pmatrix} - b\det\begin{pmatrix} d & f \\ g & i \end{pmatrix} + c\det\begin{pmatrix} d & e \\ g & h \end{pmatrix}$$

For larger matrices, the determinant can be calculated recursively using cofactor expansion along any row or column.

The determinant has important properties. It equals zero if and only if the matrix is singular (non-invertible). It also behaves multiplicatively.

$$\det(AB) = \det(A)\det(B)$$

$$\det(A^T) = \det(A)$$

$$\det(cA) = c^n \det(A) \quad \text{for an } n \times n \text{ matrix } A$$

The inverse of a square matrix A, denoted $A^{-1}$, satisfies the following relation.

$$A A^{-1} = A^{-1} A = I$$

A matrix is invertible if and only if its determinant is nonzero. The inverse of a 2×2 matrix is:

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$

For larger matrices, the inverse can be found using the adjugate matrix.

$$A^{-1} = \frac{1}{\det(A)} \operatorname{adj}(A)$$

Matrix inversion satisfies these properties.

$$(A^{-1})^{-1} = A$$

$$(AB)^{-1} = B^{-1} A^{-1}$$

$$(A^T)^{-1} = (A^{-1})^T$$
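
A NumPy sketch of the determinant and inverse on arbitrary 2×2 examples, verifying the product rule for inverses:

```python
import numpy as np

A = np.array([[4.0, 7.0],
              [2.0, 6.0]])
B = np.array([[1.0, 2.0],
              [3.0, 5.0]])

print(np.linalg.det(A))                  # 4*6 - 7*2 = 10
A_inv = np.linalg.inv(A)
assert np.allclose(A @ A_inv, np.eye(2))

# (AB)^{-1} = B^{-1} A^{-1}
assert np.allclose(np.linalg.inv(A @ B),
                   np.linalg.inv(B) @ np.linalg.inv(A))
```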

The rank of a matrix equals the dimension of the vector space spanned by its rows (or columns). It can be determined as the number of linearly independent rows or columns.

A matrix with full rank has as many linearly independent rows or columns as possible. For an m×n matrix, the maximum possible rank is min(m,n).

The nullspace (kernel) of a matrix A consists of all vectors x such that $Ax = 0$. These vectors form a subspace of $\mathbb{R}^n$.

$$\operatorname{null}(A) = \{ x \in \mathbb{R}^n : Ax = 0 \}$$

The dimension of the nullspace is related to the rank through the rank-nullity theorem.

$$\dim(\operatorname{null}(A)) + \operatorname{rank}(A) = n$$
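
The theorem can be checked numerically by counting nonzero singular values; the sketch below assumes NumPy and an arbitrary tolerance of 1e-10.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])          # second row is a multiple of the first

n = A.shape[1]
rank = np.linalg.matrix_rank(A)

# Nullspace dimension = number of columns minus the count of nonzero singular values
s = np.linalg.svd(A, compute_uv=False)
nullity = n - np.sum(s > 1e-10)

print(rank, nullity)                      # 1 2
assert rank + nullity == n                # rank-nullity theorem
```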

A consistent system of linear equations $Ax = b$ has a unique solution if and only if A has full column rank. If A has less than full column rank, the system has either infinitely many solutions or no solution.

Elementary row operations can transform a matrix without changing the solution set of the corresponding linear system. These operations include scaling a row, adding a multiple of one row to another and swapping rows.

Gaussian elimination uses elementary row operations to convert a matrix to row echelon form. In this form, all zero rows appear at the bottom and each leading entry of a nonzero row is to the right of the leading entry in the row above.

Row reduction continues to reduced row echelon form, where each leading entry is 1 and each column containing a leading 1 has zeros elsewhere. The resulting matrix is unique and reveals the solution structure of the corresponding linear system.

LU decomposition expresses a matrix A as the product of a lower triangular matrix L and an upper triangular matrix U. This factorization is used in solving linear systems and finding determinants.

$$A = LU$$

For matrices with linearly independent columns, the QR decomposition expresses A as the product of an orthogonal matrix Q and an upper triangular matrix R. It is used for least squares problems and eigenvalue algorithms.

$$A = QR$$
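
Both factorizations are available in standard numerical libraries. The sketch below assumes SciPy is installed for the LU step; QR comes from NumPy alone.

```python
import numpy as np
from scipy.linalg import lu    # assumes SciPy is available

A = np.array([[4.0, 3.0],
              [6.0, 3.0]])

# LU decomposition with partial pivoting: A = P L U
P, L, U = lu(A)
assert np.allclose(P @ L @ U, A)

# QR decomposition: A = Q R with Q orthogonal, R upper triangular
Q, R = np.linalg.qr(A)
assert np.allclose(Q @ R, A)
assert np.allclose(Q.T @ Q, np.eye(2))
```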

Eigenvalues and eigenvectors describe the behavior of linear transformations. An eigenvector v of a square matrix A is a nonzero vector that, when transformed by A, remains parallel to its original direction, scaled by a scalar λ (the eigenvalue).

$$Av = \lambda v$$

The characteristic equation helps find eigenvalues.

$$\det(A - \lambda I) = 0$$

The eigenvalues are the roots of this polynomial equation. Once the eigenvalues are found, the corresponding eigenvectors can be determined by solving $(A - \lambda I)v = 0$.
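
Numerically, eigenvalues and eigenvectors are computed together rather than by solving the characteristic polynomial explicitly; a NumPy sketch on a small example matrix:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigvals, eigvecs = np.linalg.eig(A)     # columns of eigvecs are the eigenvectors

for lam, v in zip(eigvals, eigvecs.T):
    assert np.allclose(A @ v, lam * v)  # A v = lambda v

print(eigvals)                          # eigenvalues 3 and 1 (order may vary)
```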

The eigenspace corresponding to an eigenvalue λ consists of all eigenvectors with that eigenvalue, together with the zero vector. It forms a subspace of $\mathbb{R}^n$.

$$E_\lambda = \{ v \in \mathbb{R}^n : Av = \lambda v \}$$

A matrix is diagonalizable if there exists an invertible matrix P such that $P^{-1}AP$ is a diagonal matrix D. The columns of P are eigenvectors of A and the diagonal entries of D are the corresponding eigenvalues.

$$A = P D P^{-1}$$

An n×n matrix is diagonalizable if and only if it has n linearly independent eigenvectors.

For symmetric matrices, all eigenvalues are real and eigenvectors corresponding to distinct eigenvalues are orthogonal. Every symmetric matrix is orthogonally diagonalizable, meaning there exists an orthogonal matrix Q such that $Q^T A Q$ is diagonal.

The spectral theorem for symmetric matrices states that any symmetric matrix can be diagonalized by an orthogonal matrix of eigenvectors.

$$A = Q D Q^T$$
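
For symmetric matrices, NumPy's `eigh` routine returns real eigenvalues and orthonormal eigenvectors; the sketch below reconstructs A from its spectral decomposition.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])              # symmetric example

eigvals, Q = np.linalg.eigh(A)          # Q has orthonormal eigenvector columns
D = np.diag(eigvals)

assert np.allclose(Q.T @ Q, np.eye(2))  # Q is orthogonal
assert np.allclose(Q @ D @ Q.T, A)      # A = Q D Q^T
```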

The singular value decomposition (SVD) generalizes the eigendecomposition to any m×n matrix. SVD expresses A as the product of three matrices.

$$A = U \Sigma V^T$$

U is an m×m orthogonal matrix, Σ is an m×n diagonal matrix with non-negative entries (the singular values) and V is an n×n orthogonal matrix.

The columns of U are the left singular vectors, the columns of V are the right singular vectors and the diagonal entries of Σ are the singular values of A.

SVD has numerous applications, including least squares problems, image compression and pseudoinverse computation. The pseudoinverse $A^+$ of a matrix A is defined using the SVD.

$$A^+ = V \Sigma^+ U^T$$

$\Sigma^+$ is found by taking the reciprocal of each nonzero singular value and transposing the resulting matrix.
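
A NumPy sketch of the SVD and the pseudoinverse on an arbitrary 3×2 example; note that `np.linalg.svd` returns the singular values as a vector, so Σ has to be assembled explicitly.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [0.0, 0.0]])               # 3 x 2 example

U, s, Vt = np.linalg.svd(A)               # s holds the singular values as a vector
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)
assert np.allclose(U @ Sigma @ Vt, A)     # A = U Sigma V^T

# Pseudoinverse via SVD: reciprocal of each nonzero singular value, transposed shape
Sigma_plus = np.zeros(A.shape[::-1])
Sigma_plus[:len(s), :len(s)] = np.diag(1.0 / s)   # both singular values are nonzero here
A_plus = Vt.T @ Sigma_plus @ U.T
assert np.allclose(A_plus, np.linalg.pinv(A))
```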

Vector spaces can be equipped with an inner product, generalizing the dot product in $\mathbb{R}^n$. An inner product satisfies positive definiteness, linearity in the first argument and conjugate symmetry.

$$\langle u, u \rangle > 0 \quad \text{for } u \neq 0$$

$$\langle \alpha u + \beta v, w \rangle = \alpha \langle u, w \rangle + \beta \langle v, w \rangle$$

$$\langle u, v \rangle = \overline{\langle v, u \rangle}$$

An inner product space is a vector space equipped with an inner product. The norm in an inner product space is defined using the inner product.

$$\|v\| = \sqrt{\langle v, v \rangle}$$

The Gram-Schmidt process orthogonalizes a set of vectors in an inner product space. Given linearly independent vectors $v_1, v_2, \ldots, v_k$, the process produces an orthogonal set $u_1, u_2, \ldots, u_k$ spanning the same subspace.

$$u_1 = v_1$$

$$u_j = v_j - \sum_{i=1}^{j-1} \frac{\langle v_j, u_i \rangle}{\langle u_i, u_i \rangle} u_i$$

The resulting vectors can be normalized to create an orthonormal set.
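
A direct NumPy translation of the formulas above (classical Gram-Schmidt, written for clarity rather than numerical robustness; the function name is hypothetical):

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthogonalize a list of linearly independent vectors."""
    basis = []
    for v in vectors:
        u = v.astype(float)
        for b in basis:
            u = u - (np.dot(v, b) / np.dot(b, b)) * b   # subtract the projection onto b
        basis.append(u)
    return basis

v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([1.0, 0.0, 1.0])
ortho = gram_schmidt([v1, v2])

assert np.isclose(np.dot(ortho[0], ortho[1]), 0.0)    # pairwise orthogonal
unit = [u / np.linalg.norm(u) for u in ortho]         # normalize to an orthonormal set
```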

Linear transformations map vectors from one vector space to another and preserve vector addition and scalar multiplication. For a linear transformation T from vector space V to vector space W, the following properties hold.

$$T(u + v) = T(u) + T(v)$$

$$T(cv) = cT(v)$$

Every linear transformation between finite-dimensional vector spaces can be represented by a matrix. The columns of this matrix are the images of the basis vectors under the transformation.

The kernel (nullspace) of a linear transformation T consists of all vectors that T maps to the zero vector. The range (image) consists of all vectors that can be obtained as T(v) for some vector v in the domain.

$$\ker(T) = \{ v \in V : T(v) = 0 \}$$

$$\operatorname{range}(T) = \{ T(v) : v \in V \}$$

The rank-nullity theorem applies to linear transformations.

$$\dim(\ker(T)) + \dim(\operatorname{range}(T)) = \dim(V)$$

Change of basis transforms vectors from one coordinate system to another. If B and C are bases for a vector space V, then there exists a transition matrix P such that the coordinates of a vector v in basis C can be obtained from its coordinates in basis B.

$$[v]_C = P [v]_B$$

The matrix of a linear transformation T changes with the bases chosen for the domain and codomain. If A is the matrix of T with respect to bases B for V and D for W, and P and Q are the transition matrices to the new bases, then the matrix $A'$ of T with respect to the new bases is:

$$A' = Q A P^{-1}$$

Quadratic forms generalize the concept of squares to higher dimensions. For an n×n symmetric matrix A, the quadratic form $q(x) = x^T A x$ maps vectors to scalars.

$$q(x) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j$$

A quadratic form is positive definite if $q(x) > 0$ for all nonzero vectors x. It is positive semidefinite if $q(x) \ge 0$ for all vectors x.
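
For a symmetric matrix, definiteness can be read off from the signs of its eigenvalues; a small NumPy sketch on an arbitrary example:

```python
import numpy as np

A = np.array([[2.0, -1.0],
              [-1.0, 2.0]])              # symmetric example

def quadratic_form(A, x):
    return x @ A @ x                      # q(x) = x^T A x

eigvals = np.linalg.eigvalsh(A)           # real eigenvalues of a symmetric matrix
print(eigvals)                            # [1. 3.] -> all positive, so positive definite

x = np.array([1.0, 2.0])
print(quadratic_form(A, x) > 0)           # True for nonzero x when A is positive definite
```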

The principal axis theorem (a form of the spectral theorem) diagonalizes quadratic forms through an orthogonal change of variables. For a symmetric matrix A, there exists an orthogonal matrix Q such that $Q^T A Q$ is diagonal.

$$x^T A x = y^T D y$$

Here $y = Q^T x$ and D is a diagonal matrix containing the eigenvalues of A.