Gidon Eshel
491 Hinds
Dept. of the Geophysical Sciences,
5734 S. Ellis Ave., The Univ. of Chicago,
Chicago, IL 60637
(773) 702-0440,
geshel@midway.uchicago.edu
Figure 1 represents the operation of a matrix $\mathbf{A} \in \mathbb{R}^{M \times N}$ on a vector $\mathbf{x} \in \mathbb{R}^N$ (the upper-left space). That is, it shows schematically what happens when an arbitrary vector from $\mathbf{A}$'s domain (the space corresponding dimensionally to $\mathbf{A}$'s row dimension $N$) is mapped by $\mathbf{A}$ onto the range space (the space corresponding dimensionally to $\mathbf{A}$'s column dimension $M$). Hence the schematic shows what happens to $\mathbf{x}$ from the upper-left space as $\mathbf{A}$ transforms it to the range, the lower-right space. Put differently, this schematic represents the forward problem, while the inverse problem will be shown later.
To enable visualization, both the range and the domain spaces are depicted as three-dimensional. This, of course, is just a metaphor, meant to make the schematic comprehensible to our feeble sense of space; both the domain and the range can be of arbitrary dimensions.
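Both spaces are divided into 2 orthogonal subspaces; $\mathbb{R}^N$ (upper left) comprises the row and null spaces, $\mathcal{R}(\mathbf{A}^T)$ and $\mathcal{N}(\mathbf{A})$, respectively. The row space is shown in red, while $\mathbf{A}$'s nullspace is blue. [Note that $\mathcal{R}(\mathbf{A}^T)$ is a shorthand for the span of $\mathbf{A}$'s rows, i.e., the column space of $\mathbf{A}^T$.] An arbitrary vector $\mathbf{x} \in \mathbb{R}^N$ comprises 2 orthogonal contributions; $\mathbf{x} = \mathbf{x}_r + \mathbf{x}_n$, where $\mathbf{x}_n \in \mathcal{N}(\mathbf{A})$ is the nullspace part, while $\mathbf{x}_r \in \mathcal{R}(\mathbf{A}^T)$ is the row-space part.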
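The split of a domain vector into its row-space and nullspace parts can be sketched numerically. This is a minimal illustration, assuming NumPy; the 4 x 3 rank-2 matrix, the test vector, the tolerance, and the helper name `split_domain_vector` are all made up for the example and do not come from the figure.

```python
import numpy as np

def split_domain_vector(A, x, tol=1e-10):
    """Split x into a row-space part x_r and a nullspace part x_n of A."""
    U, s, Vt = np.linalg.svd(A)          # full SVD; Vt is N x N
    q = int(np.sum(s > tol * s[0]))      # numerical rank of A
    Vq = Vt[:q].T                        # orthonormal basis of the row space
    x_r = Vq @ (Vq.T @ x)                # orthogonal projection onto the row space
    x_n = x - x_r                        # what is left lies in the nullspace
    return x_r, x_n

# Hypothetical example: third column = first + second, so the rank is 2.
A = np.array([[1., 0., 1.],
              [0., 1., 1.],
              [1., 1., 2.],
              [2., 1., 3.]])
x = np.array([1., 2., 0.])

x_r, x_n = split_domain_vector(A, x)
print("row-space part :", x_r)
print("nullspace part :", x_n)
print("orthogonality  :", x_r @ x_n)
```

The two parts add back up to the original vector and are mutually orthogonal, which is all the decomposition claims.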
You might find it puzzling (or overly restrictive) that the 2 spaces are presented as aligned with the principal coordinates. This is a fair point: there is nothing to guarantee that this will prove the case! However, we can always transform the coordinate system so that this will be true. This will be the case when $\mathbb{R}^N$ (domain, upper-left) is spanned by $\{\mathbf{v}_i\}$, the columns of $\mathbf{V}$, while $\mathbb{R}^M$ (range, lower-right) is spanned by $\{\mathbf{u}_i\}$, the columns of $\mathbf{U}$, $\mathbf{U}$ and $\mathbf{V}$ arising from $\mathbf{A}$'s SVD representation $\mathbf{A} = \mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^T$.
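A short numerical sketch of this change of coordinates, assuming NumPy and a made-up rank-2 matrix: expressed in the bases supplied by the SVD, the operator becomes diagonal, and the coordinates of a domain vector separate cleanly into row-space and nullspace entries.

```python
import numpy as np

# Hypothetical 4 x 3 matrix of rank 2 (third column = first + second).
A = np.array([[1., 0., 1.],
              [0., 1., 1.],
              [1., 1., 2.],
              [2., 1., 3.]])

U, s, Vt = np.linalg.svd(A)             # U is M x M, Vt is N x N
q = int(np.sum(s > 1e-10 * s[0]))       # numerical rank

# The operator expressed in the rotated coordinates: U^T A V is diagonal.
Sigma = U.T @ A @ Vt.T

# Coordinates of a domain vector with respect to the columns of V.
x = np.array([1., 2., 0.])
coords = Vt @ x
row_space_coords = coords[:q]           # along the first q columns of V
nullspace_coords = coords[q:]           # along the remaining N - q columns

print(np.round(Sigma, 12))
print(row_space_coords, nullspace_coords)
```

The signs of individual singular vectors are arbitrary, so only the magnitudes of the coordinates are meaningful here.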
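When we take $\mathbf{A}\mathbf{x}$, a number of things happen. First, $\mathbf{x}_n$ is mapped to the zero vector in the 'target' space (the adjoint space, in $\mathbb{R}^M$); $\mathbf{A}\mathbf{x}_n = \mathbf{0}$ by construction. This is shown by the uppermost arrow, from $\mathbf{x}_n$'s tip in the upper-left space to $\mathbf{0}$ in the lower-right space. Conversely, $\mathbf{x}$'s row-space part, $\mathbf{x}_r$, is mapped by $\mathbf{A}$ onto the corresponding point in the adjoint space $\mathbb{R}^M$; $\mathbf{A}\mathbf{x}_r = \mathbf{b}$. Note that this is exactly the same as $\mathbf{A}$ operating on the entire $\mathbf{x}$, because the nullspace component $\mathbf{x}_n$ is mapped to zero, by construction, thus not changing the end result in any way. Note also that there is nothing the combination of $\mathbf{A}$ and $\mathbf{x}$ can do to give rise to a nonzero left-nullspace component, $\mathbf{n} \in \mathcal{N}(\mathbf{A}^T)$.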
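The three statements above (the nullspace part maps to zero, the row-space part alone reproduces the full product, and the product has no left-nullspace component) can be checked on a small example. This is a sketch assuming NumPy; the matrix and vector are invented for illustration.

```python
import numpy as np

# Hypothetical 4 x 3 matrix of rank 2 (third column = first + second).
A = np.array([[1., 0., 1.],
              [0., 1., 1.],
              [1., 1., 2.],
              [2., 1., 3.]])

U, s, Vt = np.linalg.svd(A)
q = int(np.sum(s > 1e-10 * s[0]))       # numerical rank

x = np.array([1., 2., 0.])
x_r = Vt[:q].T @ (Vt[:q] @ x)           # row-space part of x
x_n = x - x_r                           # nullspace part of x

image_of_x_n = A @ x_n                  # expected: the zero vector
b_from_x = A @ x                        # forward problem with the full x
b_from_x_r = A @ x_r                    # forward problem with x_r only

# Coordinates of the product along the left nullspace (last M - q columns of U).
left_null_part = U[:, q:].T @ b_from_x

print(image_of_x_n, b_from_x, b_from_x_r, left_null_part)
```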
You might ask yourself the following question: if we are solving the forward problem here, why in the world would we be so stupid as to choose an $\mathbf{x}$ which has a nonzero nullspace component? This is a very sensible question, but one which can be answered very easily. The forward problem suggests a physical model. That is, we envision certain physics, which collectively account for the particular $\mathbf{A}$ of our problem, and with which the problem is integrated forward by repeated application of $\mathbf{A}$ (e.g., $\mathbf{x}^{(1)} = \mathbf{A}\mathbf{x}^{(0)}$, $\mathbf{x}^{(2)} = \mathbf{A}\mathbf{x}^{(1)}$, etc.). Without boundary conditions, however, $\mathbf{A}$ will be rank-deficient. In other words, it will have a nontrivial nullspace, a situation also explicitly addressed by our schematic.
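As one hypothetical illustration of how missing boundary conditions produce rank deficiency (this operator is an assumption chosen for the example, not the model discussed in the text), consider a 1-D second-difference stencil applied only at interior points. Constant and linear profiles are annihilated by it, and appending two boundary rows removes that nullspace. The sketch assumes NumPy.

```python
import numpy as np

N = 6  # number of unknowns (illustrative)

# Interior second-difference stencil (1, -2, 1): N - 2 equations, N unknowns.
D = np.zeros((N - 2, N))
for i in range(N - 2):
    D[i, i:i + 3] = [1.0, -2.0, 1.0]

rank_without_bc = np.linalg.matrix_rank(D)

# Two profiles the stencil cannot see: they span the nullspace of D.
constant = np.ones(N)
linear = np.arange(N, dtype=float)

# Append two boundary rows that pin the first and last values.
bc_rows = np.zeros((2, N))
bc_rows[0, 0] = 1.0
bc_rows[1, -1] = 1.0
D_bc = np.vstack([D, bc_rows])
rank_with_bc = np.linalg.matrix_rank(D_bc)

print("rank without boundary rows:", rank_without_bc, "of", N)
print("rank with boundary rows   :", rank_with_bc, "of", N)
```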
Now we turn to the inverse problem, shown in figure 2. While the spaces appear the same, the problem is different. Here we start by obtaining a set of observations, which we put in the components of the vector $\mathbf{b} \in \mathbb{R}^M$. Next, we envision certain physics which gave rise to $\mathbf{b}$, and, as before, construct $\mathbf{A}$ according to those physics. However, the parameters on which the solution depends (the components of $\mathbf{x}$) are now solved for optimally. The optimality of the solution means that the left nullspace part of $\mathbf{b}$, $\mathbf{n}$ (which we think of as 'noise', hence the $\mathbf{n}$ notation), is as small as possible. However the solution is obtained [i.e., whether the mapping from $\mathbb{R}^M$ (lower-right) to $\mathbb{R}^N$ (upper-left) is achieved using the generalized inverse $\mathbf{A}^{+}$ derived from $\mathbf{A}$'s SVD representation, or simply by $\mathbf{A}^{-1}$ if $\mathbf{A}$ is both square and full-rank], there is nothing we can do about the (hopefully minor) inconsistencies $\mathbf{n}$ in the set of linear equations $\mathbf{A}\mathbf{x} = \mathbf{b}$; they are simply mapped onto the zero vector, $\mathbf{0} \in \mathbb{R}^N$, as the uppermost arrow shows.
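A minimal sketch of an SVD-based generalized inverse, and of the way it sends the left-nullspace part of the data to the zero vector. It assumes NumPy; the helper name `generalized_inverse`, the truncation tolerance, the matrix, and the data are all invented for the example.

```python
import numpy as np

def generalized_inverse(A, tol=1e-10):
    """Generalized inverse built from the q nonzero singular triplets of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    q = int(np.sum(s > tol * s[0]))                     # numerical rank
    return Vt[:q].T @ np.diag(1.0 / s[:q]) @ U[:, :q].T

# Hypothetical 4 x 3 matrix of rank 2 (third column = first + second).
A = np.array([[1., 0., 1.],
              [0., 1., 1.],
              [1., 1., 2.],
              [2., 1., 3.]])
A_plus = generalized_inverse(A)

b_c = np.array([1., 2., 3., 4.])    # lies in the column space of A
n = np.array([1., 1., -1., 0.])     # lies in the left nullspace: A^T n = 0
b = b_c + n                         # "observations" = consistent part + noise

image_of_n = A_plus @ n             # the inconsistency maps to zero
x_hat = A_plus @ b                  # same answer as A_plus @ b_c

print(image_of_n, x_hat)
```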
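Given this, the best we can do is solve the closest problem to $\mathbf{A}\mathbf{x} = \mathbf{b}$ we know how to solve, $\mathbf{A}\mathbf{x} = \mathbf{b}_c$, where $\mathbf{b}_c$ is the projection of the data vector $\mathbf{b}$ onto $\mathbf{A}$'s column space; $\mathbf{b}_c = \mathbf{b} - \mathbf{n}$, with $\mathbf{n}$ the left-nullspace part of $\mathbf{b}$.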
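The projection of the data onto the column space, and the fact that a least-squares solver effectively solves the projected problem, can be sketched as follows. NumPy is assumed, the matrix and data are made up, and `numpy.linalg.lstsq` is used here as one convenient solver rather than as the method of the text.

```python
import numpy as np

# Hypothetical 4 x 3 matrix of rank 2 (third column = first + second).
A = np.array([[1., 0., 1.],
              [0., 1., 1.],
              [1., 1., 2.],
              [2., 1., 3.]])
b = np.array([2., 3., 2., 4.])          # inconsistent data vector

U, s, Vt = np.linalg.svd(A, full_matrices=False)
q = int(np.sum(s > 1e-10 * s[0]))       # numerical rank
Uq = U[:, :q]                           # orthonormal basis of the column space

b_c = Uq @ (Uq.T @ b)                   # projection of b onto the column space
n = b - b_c                             # left-nullspace remainder

# A least-squares solve of A x = b reproduces b_c, not b.
x_hat, _, rank, _ = np.linalg.lstsq(A, b, rcond=1e-10)

print("b_c     :", b_c)
print("n       :", n)
print("A x_hat :", A @ x_hat)
```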
If you think our troubles are now over, hold your horses! There still exists the all-too-realistic possibility that while we have reduced the insoluble problem $\mathbf{A}\mathbf{x} = \mathbf{b}$ to the tractable $\mathbf{A}\mathbf{x} = \mathbf{b}_c$ (with $\mathbf{b}_c$ the projection of $\mathbf{b}$ onto $\mathbf{A}$'s column space), the solution may not be unique! This will happen when $\mathbf{A}$'s rank $q$ is smaller than its row dimension; $q<N$. In this case, the last $N-q$ of $\mathbf{V}$'s columns, $\{\mathbf{v}_i\}_{i=q+1}^{N}$, are going to span a nontrivial nullspace in $\mathbb{R}^N$. In such a case, any nullspace component $\mathbf{x}_n$ can be added to the solution without any change in the goodness of the fit. Then, the solution $\mathbf{x} = \mathbf{A}^{+}\mathbf{b}$ performs a second optimization, much less straightforward than minimizing $\|\mathbf{n}\|$: it simply picks the shortest possible $\mathbf{x}$, the one with no nullspace component. That is, it sets $\mathbf{x}_n = \mathbf{0}$. This may be justified as the least conjectural, or most parsimonious, solution (given our imperfect information), but it is far from clear that these are the solution's most desirable attributes.
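The non-uniqueness and the minimum-length choice can be seen in a small numerical sketch. It assumes NumPy and uses `numpy.linalg.pinv` as a stand-in for the SVD-based generalized inverse; the matrix, the data, and the size of the added nullspace component are invented for the example.

```python
import numpy as np

# Hypothetical 4 x 3 matrix of rank 2, so q = 2 < N = 3.
A = np.array([[1., 0., 1.],
              [0., 1., 1.],
              [1., 1., 2.],
              [2., 1., 3.]])
b = np.array([2., 3., 2., 4.])

# Minimum-length least-squares solution from the pseudoinverse.
x_min = np.linalg.pinv(A, rcond=1e-10) @ b

# Unit vector spanning the nullspace of this particular A.
v_null = np.array([1., 1., -1.]) / np.sqrt(3.0)

# Another solution: same fit, but with an arbitrary nullspace component added.
x_other = x_min + 2.0 * v_null

misfit_min = np.linalg.norm(A @ x_min - b)
misfit_other = np.linalg.norm(A @ x_other - b)

print("misfits :", misfit_min, misfit_other)
print("lengths :", np.linalg.norm(x_min), np.linalg.norm(x_other))
print("nullspace component of x_min:", v_null @ x_min)
```

Both candidates fit the data equally well; only their lengths differ, which is why the choice between them rests on a criterion other than goodness of fit.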