A Singularly Valuable Decomposition: The SVD of a Matrix
Dan Kalman
The American University
Washington, DC 20016
February 13, 2002
Every teacher of linear algebra should be familiar with the matrix singular value decomposition (or SVD). It has interesting and attractive algebraic properties, and conveys important geometrical and theoretical insights about linear transformations. The close connection between the SVD and the well-known theory of diagonalization for symmetric matrices makes the topic immediately accessible to linear algebra teachers and, indeed, a natural extension of what these teachers already know. At the same time, the SVD has fundamental importance in several different applications of linear algebra. Strang was aware of these facts when he introduced the SVD in his now classical text [22, page 142], observing
    ... it is not nearly as famous as it should be.
Golub and Van Loan ascribe a central significance to the SVD in their definitive explication of numerical matrix methods [8, page xiv], stating
    ... perhaps the most recurring theme in the book is the practical and theoretical value of [the SVD].
Additional evidence of the significance of the SVD is its central role in a number of papers in recent years in Mathematics Magazine and The American Mathematical Monthly (for example [2, 3, 17, 23]). Although it is probably not feasible to include the SVD in the first linear algebra course, it definitely deserves a place in more advanced undergraduate courses, particularly those with a numerical or applied emphasis. My primary goals in this article are to bring the topic to the attention of a broad audience, and to reveal some of the facets that give it both practical and theoretical significance.

Theory
The SVD is intimately related to the familiar theory of diagonalizing a symmetric matrix. Recall that if $A$ is a symmetric real $n \times n$ matrix, there is an orthogonal matrix $V$ and a diagonal $D$ such that $A = VDV^T$. Here the columns of $V$ are eigenvectors for $A$ and form an orthonormal basis for $\mathbf{R}^n$; the diagonal entries of $D$ are the eigenvalues of $A$. To emphasize the connection with the SVD, we will refer to $VDV^T$ as the eigenvalue decomposition, or EVD, for $A$.
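To make this concrete, here is a minimal numerical sketch (the code examples in this article are illustrative additions, written in Python with NumPy); NumPy's `eigh` computes exactly this decomposition for a symmetric matrix:

```python
import numpy as np

# A small symmetric matrix; eigh is NumPy's EVD routine for symmetric input.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

d, V = np.linalg.eigh(A)   # eigenvalues d (in increasing order), eigenvectors V
D = np.diag(d)

print(np.allclose(A, V @ D @ V.T))      # True: A = V D V^T
print(np.allclose(V.T @ V, np.eye(2)))  # True: the columns of V are orthonormal
```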
For the SVD we begin with an arbitrary real $m \times n$ matrix $A$. As we shall see, there are orthogonal matrices $U$ and $V$ and a diagonal matrix, this time denoted $\Sigma$, such that $A = U\Sigma V^T$. In this case, $U$ is $m \times m$ and $V$ is $n \times n$, so that $\Sigma$ is rectangular with the same dimensions as $A$. The diagonal entries of $\Sigma$, that is the $\Sigma_{ii} = \sigma_i$, can be arranged to be nonnegative and in order of decreasing magnitude. The positive ones are called the singular values of $A$. The columns of $U$ and $V$ are called left and right singular vectors for $A$.
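The whole factorization is available directly in NumPy as `numpy.linalg.svd`; a quick sketch with an arbitrarily chosen $5 \times 3$ matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))          # an arbitrary 5 x 3 matrix

U, s, Vt = np.linalg.svd(A)              # s: singular values, decreasing order
Sigma = np.zeros_like(A)                 # Sigma is 5 x 3, the same shape as A
np.fill_diagonal(Sigma, s)

print(U.shape, Sigma.shape, Vt.shape)    # (5, 5) (5, 3) (3, 3)
print(np.allclose(A, U @ Sigma @ Vt))    # True: A = U Sigma V^T
```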
The analogy between the EVD for a symmetric matrix and the SVD for an arbitrary matrix can be extended a little by thinking of matrices as linear transformations. For a symmetric matrix $A$, the transformation takes $\mathbf{R}^n$ to itself, and the columns of $V$ define an especially nice basis. When vectors are expressed relative to this basis, we see that the transformation simply dilates some components and contracts others, according to the magnitudes of the eigenvalues (with a reflection through the origin tossed in for negative eigenvalues). Moreover, the basis is orthonormal, which is the best kind of basis to have.

Now let's look at the SVD for an $m \times n$ matrix $A$. Here the transformation takes $\mathbf{R}^n$ to a different space, $\mathbf{R}^m$, so it is reasonable to ask for a natural basis for each of domain and range. The columns of $V$ and $U$ provide these bases. When they are used to represent vectors in the domain and range of the transformation, the nature of the transformation again becomes transparent: it simply dilates some components and contracts others, according to the magnitudes of the singular values, and possibly discards components or appends zeros as needed to account for a change in dimension. From this perspective, the SVD tells us how to choose orthonormal bases so that the transformation is represented by a matrix with the simplest possible form, that is, diagonal.
How do we choose the bases $\{v_1, v_2, \dots, v_n\}$ and $\{u_1, u_2, \dots, u_m\}$ for the domain and range? There is no difficulty in obtaining a diagonal representation. For that, we need only $Av_i = \sigma_i u_i$, which is easily arranged. Select an orthonormal basis $\{v_1, v_2, \dots, v_n\}$ for $\mathbf{R}^n$ so that the first $k$ elements span the row space of $A$ and the remaining $n - k$ elements span the null space of $A$, where $k$ is the rank of $A$. Then for $1 \le i \le k$ define $u_i$ to be a unit vector parallel to $Av_i$, and extend this to a basis for $\mathbf{R}^m$. Relative to these bases, $A$ will have a diagonal representation. But in general, although the $v$'s are orthogonal, there is no reason to expect the $u$'s to be. The possibility of choosing the $v$-basis so that its orthogonality is preserved under $A$ is the key point. We show next that the EVD of the $n \times n$ symmetric matrix $A^TA$ provides just such a basis, namely, the eigenvectors of $A^TA$.
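The failure of a generic orthonormal basis, and the success of the eigenvectors of $A^TA$, are easy to observe numerically. A sketch, with random matrices standing in for a generic example:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))       # almost surely invertible: row space = R^3

# A randomly chosen orthonormal basis: the images of its columns under A
# are generally NOT mutually orthogonal (nonzero off-diagonal dot products).
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
print(np.round((A @ Q).T @ (A @ Q), 3))

# The eigenvectors of A^T A: their images ARE mutually orthogonal
# (the Gram matrix of the images is diagonal).
_, V = np.linalg.eigh(A.T @ A)
print(np.round((A @ V).T @ (A @ V), 3))
```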
Let $A^TA = VDV^T$, with the diagonal entries $\lambda_i$ of $D$ arranged in nonincreasing order, and let the columns of $V$ (which are eigenvectors of $A^TA$) be the orthonormal basis $\{v_1, v_2, \dots, v_n\}$. Then

$$Av_i \cdot Av_j = (Av_i)^T (Av_j) = v_i^T A^T A v_j = v_i^T (\lambda_j v_j) = \lambda_j \, v_i \cdot v_j$$

so the image set $\{Av_1, Av_2, \dots, Av_n\}$ is orthogonal, and the nonzero vectors in this set form a basis for the range of $A$. Thus, the eigenvectors of $A^TA$ and their images under $A$ provide orthogonal bases allowing $A$ to be expressed in a diagonal form.
To complete the construction, we normalize the vectors $Av_i$. The eigenvalues of $A^TA$ again appear in this step. Taking $i = j$ in the calculation above gives $\|Av_i\|^2 = \lambda_i$, which means $\lambda_i \ge 0$. Since these eigenvalues were assumed to be arranged in nonincreasing order, we conclude that $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_k > 0$ and, since the rank of $A$ equals $k$, $\lambda_i = 0$ for $i > k$. The orthonormal basis for the range is therefore defined by

$$u_i = \frac{Av_i}{\|Av_i\|} = \frac{1}{\sqrt{\lambda_i}}\, Av_i, \qquad 1 \le i \le k.$$

If $k < m$, we extend this to an orthonormal basis for $\mathbf{R}^m$.
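The two facts used here, $\|Av_i\|^2 = \lambda_i$ and the orthonormality of the resulting $u_i$, can be checked in a few lines (again a NumPy sketch):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 3))     # almost surely full column rank, so k = n = 3

lam, V = np.linalg.eigh(A.T @ A)
lam, V = lam[::-1], V[:, ::-1]      # reorder to nonincreasing eigenvalues

# ||Av_i||^2 = lambda_i for each column v_i of V.
print(np.allclose(np.linalg.norm(A @ V, axis=0) ** 2, lam))   # True

U = (A @ V) / np.sqrt(lam)          # u_i = Av_i / sqrt(lambda_i), columnwise
print(np.allclose(U.T @ U, np.eye(3)))                        # True: orthonormal
```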
This completes the construction of the desired orthonormal bases for $\mathbf{R}^n$ and $\mathbf{R}^m$. Setting $\sigma_i = \sqrt{\lambda_i}$, we have $Av_i = \sigma_i u_i$ for all $i \le k$. Assembling the $v_i$ as the columns of a matrix $V$ and the $u_i$ to form $U$, this shows that $AV = U\Sigma$, where $\Sigma$ has the same dimensions as $A$, has the entries $\sigma_i$ along the main diagonal, and has all other entries equal to zero. Hence, $A = U\Sigma V^T$, which is the singular value decomposition of $A$.
In summary, an $m \times n$ real matrix $A$ can be expressed as the product $U\Sigma V^T$, where $V$ and $U$ are orthogonal matrices and $\Sigma$ is a diagonal matrix, as follows. The matrix $V$ is obtained from the diagonal factorization $A^TA = VDV^T$, in which the diagonal entries of $D$ appear in nonincreasing order; the columns of $U$ come from normalizing the nonvanishing images under $A$ of the columns of $V$, and extending (if necessary) to an orthonormal basis for $\mathbf{R}^m$; the nonzero entries of $\Sigma$ are the respective square roots of corresponding diagonal entries of $D$.
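This summary is essentially an algorithm, and it can be transcribed almost line for line into code. The sketch below is illustrative only (as discussed next, forming $A^TA$ is not how practical SVD routines work); it follows the construction in the text, completing the basis for $\mathbf{R}^m$ by Gram-Schmidt against the standard basis when $k < m$:

```python
import numpy as np

def svd_from_evd(A, tol=1e-12):
    """Construct an SVD of A from the EVD of A^T A, following the text."""
    m, n = A.shape
    lam, V = np.linalg.eigh(A.T @ A)
    lam, V = lam[::-1], V[:, ::-1]                      # nonincreasing order
    sigma = np.sqrt(np.maximum(lam, 0.0))               # clip tiny negative roundoff
    k = int(np.sum(sigma > tol * max(sigma[0], 1.0)))   # numerical rank
    sigma[k:] = 0.0

    # Columns of U: normalized images under A of the first k columns of V.
    U = (A @ V[:, :k]) / sigma[:k]

    # If k < m, extend to an orthonormal basis for R^m (Gram-Schmidt
    # against the standard basis vectors).
    for e in np.eye(m):
        if U.shape[1] == m:
            break
        w = e - U @ (U.T @ e)                  # project out the current span
        if np.linalg.norm(w) > 1e-10:
            U = np.column_stack([U, w / np.linalg.norm(w)])

    Sigma = np.zeros((m, n))
    np.fill_diagonal(Sigma, sigma)
    return U, Sigma, V

A = np.random.default_rng(4).standard_normal((5, 3))
U, Sigma, V = svd_from_evd(A)
print(np.allclose(A, U @ Sigma @ V.T))    # True
```

In practice one would call a library SVD routine directly; the point here is only that the construction in the text is completely effective in exact arithmetic.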
The preceding construction demonstrates that the SVD exists, and gives some idea of what it tells about a matrix. There are a number of additional algebraic and geometric insights about the SVD that are derived with equal ease. Before proceeding to these insights, two remarks should be made.

First, the SVD encapsulates the most appropriate bases for the domain and range of the linear transformation defined by the matrix $A$. There is a beautiful relationship between these bases and the four fundamental subspaces associated with $A$: the range and nullspace, and their orthogonal complements. It is the full picture provided by the SVD and these subspaces that Strang has termed the fundamental theorem of linear algebra. He also invented a diagram schematically illustrating the relationship of the bases and the four subspaces (see Figure 1). Strang's article [23] is recommended for a detailed discussion of this topic.

[Figure 1: Strang's diagram, showing the bases $v_1, \dots, v_k, v_{k+1}, \dots, v_n$ and $u_1, \dots, u_k, u_{k+1}, \dots, u_m$ together with the four subspaces $\mathrm{null}(A)$, $\mathrm{null}(A)^\perp$, $\mathrm{range}(A)$, and $\mathrm{range}(A)^\perp$.]

The second remark concerns computation. There is often a gap between mathematical theory and computational practice. In theory, we envision arithmetic operations being carried out on real numbers in infinite precision. But when we carry out arithmetic on a digital computer we compute not with reals, but with a finite set of rationals, and the results can only approximate with limited precision the real computations. Procedures that seem elegant and direct in theory can sometimes be dramatically flawed as prescriptions for computational algorithms. The SVD illustrates this point. Our construction appears to offer a straightforward algorithm for the SVD: form $A^TA$, compute its eigenvalues and eigenvectors, and then find the SVD as described above. Here practice and theory go their separate ways. As we shall see later, the computation of $A^TA$ can be subject to a serious loss of precision. It turns out that direct methods exist for finding the SVD of $A$ without forming $A^TA$, and indeed in many applications the practical importance of the SVD is that it allows one to find the EVD of $A^TA$ without ever forming this numerically treacherous product.
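The loss of precision is easy to demonstrate. The sketch below uses the standard textbook example often attributed to Läuchli: with $\epsilon = 10^{-8}$, the entry $1 + \epsilon^2$ in $A^TA$ rounds to $1$ in double precision, so the small singular value is annihilated before the eigenvalue computation even begins:

```python
import numpy as np

eps = 1e-8
A = np.array([[1.0, 1.0],
              [eps, 0.0],
              [0.0, eps]])     # exact singular values: sqrt(2 + eps^2) and eps

# Via A^T A: the computed 1 + eps**2 rounds to 1.0, so A^T A is exactly
# singular in floating point and the small singular value comes out as 0.
lam = np.linalg.eigvalsh(A.T @ A)
print(np.sqrt(np.maximum(lam, 0.0)))        # [0.0, 1.41421356...]

# A direct SVD of A, which never forms A^T A, recovers it.
print(np.linalg.svd(A, compute_uv=False))   # [1.41421356..., 1.0e-08]
```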
Let us now proceed to several additional aspects of the SVD.

The SVD and EVD. Our construction showed how to determine the SVD of $A$ from the EVD of the symmetric matrix $A^TA$. Conversely, it is easy to recover the EVD of $A^TA$ from the SVD of $A$. For suppose the singular value decomposition $A = U\Sigma V^T$ is given. Clearly, $A^TA = V\Sigma^T\Sigma V^T$ and