Section 12 The Structure of \(\R^n\)
Focus Questions
By the end of this section, you should be able to give precise and thorough answers to the questions listed below. You may want to keep these questions in mind to focus your thoughts as you complete the section.
What properties make \(\R^n\) a vector space?
What is a subspace of \(\R^n\text{?}\)
What properties do we need to verify to show that a set of vectors is a subspace of \(\R^n\text{?}\) Why?
What important structure does the span of a set of vectors in \(\R^n\) have?
Subsection Application: Connecting GDP and Consumption in Romania
It is common practice in the sciences to run experiments and collect data. Once data is collected it is necessary to find some way to analyze the data and predict future behavior from the data. One method is to find a curve that best “fits” the data, and one widely used method for curve fitting is called the least squares method.
For example, economists are often interested in consumption, which is the purchase of goods and services for use by households. In “A Statistical Analysis of GDP and Final Consumption Using Simple Linear Regression, the Case of Romania 1990-2010”, the authors collect data and then use simple linear regression to compare GDP (gross domestic product) to consumption in Romania. The data they used is shown in Table 12.1, with a corresponding scatterplot of the data (with consumption as the independent variable and GDP as the dependent variable). The units for GDP and consumption are millions of leu (the currency of Romania is the leu; on December 21, 2018, one leu was worth approximately $0.25 U.S.). The authors conclude their paper with the following statement:
However, we can appreciate that linear regression model describes the correlation between the value of gross domestic product and the value of final consumption and may be transcribed following form: PIB = -3127.51+ 1.22 CF. Analysis of correlation between GDP and final consumption (private consumption and public consumption) will result in an increase of 1.22 units of monetary value of gross domestic product. We can conclude that the Gross Domestic Product of our country is strongly influenced by the private and public consumption.
Table 12.1. GDP and consumption in Romania, 1990-2010 (millions of leu)
Year | GDP | Consumption |
\(1990\) | \(85.8\) | \(68.0\) |
\(1991\) | \(220.4\) | \(167.3\) |
\(1992\) | \(602.9\) | \(464.3\) |
\(1993\) | \(2003.9\) | \(1523.6\) |
\(1994\) | \(4977.3\) | \(3845.2\) |
\(1995\) | \(7648.9\) | \(6257.7\) |
\(1996\) | \(11384.2\) | \(9713.8\) |
\(1997\) | \(25529.8\) | \(21972.2\) |
\(1998\) | \(37055.1\) | \(33311.2\) |
\(1999\) | \(55191.4\) | \(49311.9\) |
\(2000\) | \(80984.6\) | \(69587.4\) |
\(2001\) | \(117945.8\) | \(100731.7\) |
\(2002\) | \(152017.0\) | \(127118.8\) |
\(2003\) | \(197427.6\) | \(168818.7\) |
\(2004\) | \(247368.0\) | \(211054.6\) |
\(2005\) | \(288954.6\) | \(251038.1\) |
\(2006\) | \(344650.6\) | \(294867.6\) |
\(2007\) | \(416006.8\) | \(344937.0\) |
\(2008\) | \(514700.0\) | \(420917.5\) |
\(2009\) | \(498007.5\) | \(402246.0\) |
\(2010\) | \(513640.8\) | \(405422.4\) |
As we can see from the scatterplot, the relationship between the GDP and consumption is not exactly linear, but looks to be very close. To make correlations between GDP and consumption as the authors did, we need to understand how they determined their approximate linear relationship between the variables. With a good approximation function we can then compare the variables, extrapolate from the data, and make predictions or interpolate and estimate between data points. For example, we could use our approximation function to predict, as the authors did, how changes in consumption affect GDP (or vice versa). Later in this section we will see how to find the least squares line to fit this data — the best linear approximation to the data. This involves finding a vector in a certain subspace of \(\R^2\) that is closest to a given vector. Linear least squares approximation is a special case of a more general process that we will encounter in later sections where we learn how to project sets onto subspaces.
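As a numerical preview of the fit discussed above, the following sketch computes a least squares line for the data in Table 12.1, with consumption as the independent variable. It is not the authors' computation: it assumes the standard closed-form formulas for simple linear regression (slope equal to the ratio of the summed cross-deviations to the summed squared deviations, and intercept chosen so the line passes through the point of means). The name `least_squares_line` is chosen here only for illustration.

```python
# GDP and consumption from Table 12.1 (millions of leu), years 1990-2010.
consumption = [68.0, 167.3, 464.3, 1523.6, 3845.2, 6257.7, 9713.8,
               21972.2, 33311.2, 49311.9, 69587.4, 100731.7, 127118.8,
               168818.7, 211054.6, 251038.1, 294867.6, 344937.0,
               420917.5, 402246.0, 405422.4]
gdp = [85.8, 220.4, 602.9, 2003.9, 4977.3, 7648.9, 11384.2,
       25529.8, 37055.1, 55191.4, 80984.6, 117945.8, 152017.0,
       197427.6, 247368.0, 288954.6, 344650.6, 416006.8,
       514700.0, 498007.5, 513640.8]


def least_squares_line(xs, ys):
    """Return (intercept, slope) of the line minimizing the sum of squared residuals."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Summed cross-deviations and summed squared deviations of x.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


intercept, slope = least_squares_line(consumption, gdp)
print(f"GDP is approximately {intercept:.2f} + {slope:.2f} * consumption")
```

The printed coefficients can be compared with the equation quoted from the paper; small differences could arise from rounding in the tabulated data.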
Subsection Introduction
The set \(\R^n\) with vector addition and scalar multiplication has a nice algebraic structure. These operations satisfy a number of properties, such as associativity and commutativity of vector addition, the existence of an additive identity and additive inverse, distribution of scalar multiplication over vector addition, and others. These properties make it easier to work with the whole space since we can express the vectors as linear combinations of basis vectors in a unique way. This algebraic structure makes \(\R^n\) a vector space.
There are many subsets of \(\R^n\) that have this same structure. These subsets are called subspaces of \(\R^n\text{.}\) These are the sets of vectors for which the sum of any two vectors in the set is again in the set, every scalar multiple of a vector in the set is again in the set, and the set contains the zero vector. One type of subset with this structure is the span of a set of vectors.
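Recall that the span of a set of vectors \(\{\vv_1, \vv_2, \ldots, \vv_k\}\) in \(\R^n\) is the set of all linear combinations of the vectors. For example, if \(\vv_1=\left[ \begin{array}{c} 1\\1\\0 \end{array} \right]\) and \(\vv_2=\left[ \begin{array}{c} 1\\0\\1 \end{array} \right]\text{,}\) then a linear combination of these two vectors, with scalars \(c_1\) and \(c_2\text{,}\) is of the form
\[c_1\vv_1 + c_2\vv_2 = c_1\left[ \begin{array}{c} 1\\1\\0 \end{array} \right] + c_2\left[ \begin{array}{c} 1\\0\\1 \end{array} \right] = \left[ \begin{array}{c} c_1+c_2\\c_1\\c_2 \end{array} \right]\text{.}\]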
One linear combination can be obtained by letting \(c_1=2, c_2=-3\text{,}\) which gives the vector \(2\vv_1-3\vv_2=\left[ \begin{array}{r} -1\\2\\-3 \end{array} \right]\text{.}\) All such linear combinations form the span of the vectors \(\vv_1\) and \(\vv_2\text{.}\) In this case, these linear combinations form a plane through the origin in \(\R^3\text{.}\)
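The arithmetic of a linear combination can be checked with a few lines of code. The sketch below is only an illustration; the helper name `linear_combination` is not part of the text, and vectors are represented as plain Python lists.

```python
def linear_combination(coeffs, vectors):
    """Return the sum of coeffs[i] * vectors[i], computed component by component."""
    n = len(vectors[0])
    return [sum(c * v[i] for c, v in zip(coeffs, vectors)) for i in range(n)]


v1 = [1, 1, 0]
v2 = [1, 0, 1]

# The combination 2*v1 - 3*v2 discussed in the text.
combo = linear_combination([2, -3], [v1, v2])
print(combo)  # prints [-1, 2, -3]
```

Every vector produced this way has the form \([c_1+c_2,\ c_1,\ c_2]\), which is one way to see that the span of these two vectors is a plane through the origin.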
Now we will investigate whether the span of two vectors forms a subspace, i.e., whether it has the same structure as a vector space.
Preview Activity 12.1.
Let \(\vw_1\) and \(\vw_2\) be two vectors in \(\R^n\text{.}\) Let \(W = \Span \{\vw_1, \vw_2\}\text{.}\)
(a)
For \(W\) to be a subspace of \(\R^n\text{,}\) the sum of any two vectors in \(W\) must also be in \(W\text{.}\)
(i)
Pick two specific examples of vectors \(\vu, \vy\) in \(W\) (keeping \(\vw_1, \vw_2\) unknown/general vectors). For example, one specific \(\vu\) would be \(2\vw_1-3\vw_2\) as we used in the above example. Find the sum of \(\vu, \vy\text{.}\) Is the sum also in \(W\text{?}\) Explain.
Hint: What does it mean for a vector to be in \(W\text{?}\)
(ii)
Now let \(\vu\) and \(\vy\) be arbitrary vectors in \(W\text{.}\) Explain why \(\vu + \vy\) is in \(W\text{.}\)
(b)
For \(W\) to be a subspace of \(\R^n\text{,}\) any scalar multiple of any vector in \(W\) must also be in \(W\text{.}\)
(i)
Pick a specific example \(\vu\) in \(W\text{.}\) Explain why \(2\vu, -3\vu, \pi\vu\) are all also in \(W\text{.}\)
(ii)
Now let \(a\) be an arbitrary scalar and let \(\vu\) be an arbitrary vector in \(W\text{.}\) Explain why the vector \(a \vu\) is in \(W\text{.}\)
(c)
For \(W\) to be a subspace of \(\R^n\text{,}\) the zero vector must also be in \(W\text{.}\) Explain why the zero vector is in \(W\text{.}\)
(d)
Does vector addition being commutative for vectors in \(\R^n\) imply that vector addition is also commutative for vectors in \(W\text{?}\) Explain your reasoning.
(e)
Suppose we have an arbitrary \(\vu\) in \(W\text{.}\) There is an additive inverse of \(\vu\) in \(\R^n\text{.}\) In other words, there is a \(\vu'\) such that \(\vu+ \vu'=\vzero\text{.}\) Should this \(\vu'\) be also in \(W\text{?}\) If so, explain why. If not, give a counterexample.
(f)
Look at the other properties of vector addition and scalar multiplication of vectors in \(\R^n\) listed in Theorem 4.4 in Section 4. Which of these properties should also hold for vectors in \(W\text{?}\)
Subsection Vector Spaces
The set of \(n\)-dimensional vectors with vector addition and scalar multiplication satisfies many properties, such as addition being commutative and associative, the existence of an additive identity, and others. The set \(\R^n\) with these properties is an example of a vector space, a general structure that, as we will see later, includes many other algebraic objects as examples.
Definition 12.3.
A set \(V\) on which an operation of addition and a multiplication by scalars are defined is a vector space if for all \(\vu\text{,}\) \(\vv\text{,}\) and \(\vw\) in \(V\) and all scalars \(a\) and \(b\text{:}\)
(1) \(\vu + \vv\) is an element of \(V\) (we say that \(V\) is closed under addition in \(V\)),
(2) \(\vu + \vv = \vv + \vu\) (we say that addition in \(V\) is commutative),
(3) \((\vu + \vv) + \vw = \vu + (\vv + \vw)\) (we say that addition in \(V\) is associative),
(4) there is a vector \(\vzero\) in \(V\) so that \(\vu + \vzero = \vu\) (we say that \(V\) contains an additive identity or zero vector \(\vzero\)),
(5) for each \(\vx\) in \(V\) there is an element \(\vy\) in \(V\) so that \(\vx + \vy = \vzero\) (we say that \(V\) contains an additive inverse \(\vy\) for each element \(\vx\) in \(V\)),
(6) \(a \vu\) is an element of \(V\) (we say that \(V\) is closed under multiplication by scalars),
(7) \((a+b) \vu = a\vu + b\vu\) (we say that multiplication by scalars distributes over scalar addition),
(8) \(a(\vu + \vv) = a\vu + a\vv\) (we say that multiplication by scalars distributes over addition in \(V\)),
(9) \((ab) \vu = a(b\vu)\text{,}\)
(10) \(1 \vu = \vu\text{.}\)
Theorem 4.4 in Section 4 shows that \(\R^n\) is itself a vector space. As we will see, there are many other sets that have the same algebraic structure. By focusing on this structure and the properties of these operations, we can extend the theory of vectors we have developed so far to a broad range of objects, making it easier to work with them. For example, we can consider linear combinations of functions or matrices, or define a basis for different types of sets of objects. Such algebraic tools provide us with new ways of looking at these sets of objects, including a geometric intuition when working with these sets. In this section, we will analyze subsets of \(\R^n\) which behave similarly to \(\R^n\) algebraically. We will call such sets subspaces. In a later chapter we will encounter different kinds of sets that are also vector spaces.
Definition 12.4.
A subset \(W\) of \(\R^n\) is a subspace of \(\R^n\) if \(W\) itself is a vector space using the same operations as in \(\R^n\text{.}\)
The following example illustrates the process for demonstrating that a subset of \(\R^n\) is a subspace of \(\R^n\text{.}\)
Example 12.5.
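There are many subsets of \(\R^n\) that are themselves vector spaces. Consider as an example the set \(W\) of vectors in \(\R^2\) defined by
\[W = \left\{ \left[ \begin{array}{c} x \\ 0 \end{array} \right] : x \text{ is a real number} \right\}\text{.}\]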
In other words, \(W\) is the set of vectors in \(\R^2\) whose second component is 0. To see that \(W\) is itself a vector space, we need to demonstrate that \(W\) satisfies all of the properties listed in Definition 12.3.
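To prove the first property, we need to show that the sum of any two vectors in \(W\) is again in \(W\text{.}\) So we need to choose two arbitrary vectors in \(W\text{.}\) Let \(\vu = \left[ \begin{array}{c} x \\ 0 \end{array} \right]\) and \(\vv = \left[ \begin{array}{c} y \\ 0 \end{array} \right]\) be vectors in \(W\text{.}\) Note that
\[\vu + \vv = \left[ \begin{array}{c} x \\ 0 \end{array} \right] + \left[ \begin{array}{c} y \\ 0 \end{array} \right] = \left[ \begin{array}{c} x+y \\ 0 \end{array} \right]\text{.}\]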
Since the second component of \(\vu + \vv\) is 0, it follows that \(\vu + \vv\) is in \(W\text{.}\) Thus, the set \(W\) is closed under addition.
For the second property, that addition is commutative in \(W\text{,}\) we can just use the fact that if \(\vu\) and \(\vv\) are in \(W\text{,}\) they are also vectors in \(\R^2\) and \(\vu+\vv=\vv+\vu\) is satisfied in \(\R^2\text{.}\) So the property also holds in \(W\text{.}\)
A similar argument can be made for property (3).
Property (4) states the existence of the additive identity in \(W\text{.}\) Note that \(\vzero\) is an additive identity in \(\R^2\) and if it is also an element in \(W\text{,}\) then it will automatically be the additive identity of \(W\text{.}\) Since the zero vector can be written as \(\vzero=\left[ \begin{array}{c} x \\ 0 \end{array} \right]\) with \(x=0\text{,}\) \(\vzero\) is in \(W\text{.}\) Thus, \(W\) satisfies property (4).
We will postpone property (5) for a bit since we can show that other properties imply property (5).
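Property (6) is a closure property, just like property (1). We need to verify that any scalar multiple of any vector in \(W\) is again in \(W\text{.}\) Consider an arbitrary vector \(\vu = \left[ \begin{array}{c} x \\ 0 \end{array} \right]\) in \(W\) and an arbitrary scalar \(a\text{.}\) Now
\[a\vu = a\left[ \begin{array}{c} x \\ 0 \end{array} \right] = \left[ \begin{array}{c} ax \\ 0 \end{array} \right]\text{.}\]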
Since the vector \(a \vu\) has a 0 as its second component, we see that \(a \vu\) is in \(W\text{.}\) Thus, \(W\) is closed under scalar multiplication.
Properties (7), (8), (9) and (10) only depend on the operations of addition and multiplication by scalars in \(\R^2\text{.}\) Since these properties depend on the operations and not the vectors, these properties will transfer to \(W\text{.}\)
We still have to justify property (5) though. Note that since \(1-1=0\) in the real numbers, by applying property (10) and then property (7) with \(a=1\text{,}\) \(b=-1\text{,}\) we find that
\[\vu + (-1)\vu = 1\vu + (-1)\vu = (1 + (-1))\vu = 0\vu = \vzero\text{.}\]
Therefore, \((-1)\vu\) is an additive inverse for \(\vu\text{.}\) So to show that the additive inverse of any \(\vu\) in \(W\) is also in \(W\text{,}\) we simply note that any scalar multiple of \(\vu\) is also in \(W\) and hence \((-1)\vu\) must also be in \(W\text{.}\)
Since \(W\) satisfies all of the properties of a vector space, \(W\) is a vector space. Any subset of \(\R^n\) that is itself a vector space using the same operations as in \(\R^n\) is called a subspace of \(\R^n\text{.}\)
Example 12.5 and our work in Preview Activity 12.1 bring out some important ideas. When checking that a subset \(W\) of the vector space \(\R^n\) is also a vector space, we can use the fact that all of the properties of the operations in \(\R^n\) are transferred to any closed subset \(W\text{.}\) This implies that properties (2), (3), and (7)-(10) are all automatically satisfied for \(W\) as well. Property (5) follows from the others. So we only need to check properties (1), (4) and (6). In fact, as we argued in the above example, property (4) can be checked simply by verifying that the zero vector \(\vzero\) of \(\R^n\) is in \(W\text{.}\) We summarize this result in the following theorem.
Theorem 12.6.
A subset \(W\) of \(\R^n\) is a subspace of \(\R^n\) if
whenever \(\vu\) and \(\vv\) are in \(W\) it is also true that \(\vu + \vv\) is in \(W\) (that is, \(W\) is closed under addition),
whenever \(\vu\) is in \(W\) and \(a\) is a scalar it is also true that \(a\vu\) is in \(W\) (that is, \(W\) is closed under scalar multiplication),
\(\vzero\) is in \(W\text{.}\)
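The three conditions of the theorem can be probed numerically, with an important caveat: random testing can only find a violation, it can never prove that a set is a subspace. The sketch below is a hypothetical helper (the names `find_counterexample`, `contains`, and `sample` are assumptions made for this illustration) that uses integer vectors and integer scalars so that all arithmetic is exact. It is applied to the set \(W\) of Example 12.5 and to two made-up sets: vectors with second component 1, and vectors with both components nonnegative.

```python
import random


def find_counterexample(contains, sample, dim, trials=1000, seed=0):
    """Search for a violation of the three subspace conditions.

    contains(v) decides membership; sample(rng) returns a random member.
    Returns a description of the first violation found, or None.
    A result of None does NOT prove the set is a subspace.
    """
    rng = random.Random(seed)
    if not contains([0] * dim):
        return "the zero vector is not in the set"
    for _ in range(trials):
        u, v = sample(rng), sample(rng)
        a = rng.randint(-5, 5)
        total = [ui + vi for ui, vi in zip(u, v)]
        if not contains(total):
            return f"not closed under addition: {u} + {v} = {total}"
        scaled = [a * ui for ui in u]
        if not contains(scaled):
            return f"not closed under scalar multiplication: {a} * {u} = {scaled}"
    return None


# W from Example 12.5: vectors in R^2 whose second component is 0.
def in_W(v):
    return v[1] == 0

def sample_W(rng):
    return [rng.randint(-10, 10), 0]

# Hypothetical set: vectors in R^2 whose second component is 1.
def in_shifted(v):
    return v[1] == 1

def sample_shifted(rng):
    return [rng.randint(-10, 10), 1]

# Hypothetical set: vectors in R^2 with both components nonnegative.
def in_quadrant(v):
    return v[0] >= 0 and v[1] >= 0

def sample_quadrant(rng):
    return [rng.randint(0, 10), rng.randint(0, 10)]


print(find_counterexample(in_W, sample_W, dim=2))
print(find_counterexample(in_shifted, sample_shifted, dim=2))
print(find_counterexample(in_quadrant, sample_quadrant, dim=2))
```

No violation is found for \(W\), which is consistent with (but does not replace) the proof in Example 12.5. The second set fails because it does not contain the zero vector, and the third fails closure under scalar multiplication once a negative scalar is drawn.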
The next activity provides some practice using Theorem 12.6.
Activity 12.2.
Use Theorem 12.6 to answer the following questions. Justify your responses. For sets which lie inside \(\R^2\text{,}\) sketch a pictorial representation of the set and explain why your picture confirms your answer.
(a)
Is the set \(W = \left\{ \left[ \begin{array}{c} x \\ y \end{array} \right] \middle| y = 2x\right\}\) a subspace of \(\R^2\text{?}\)
(b)
Is the set \(W = \left\{ \left[ \begin{array}{c} x \\ 0 \\ 1 \end{array} \right] \middle| x \text{ is a scalar } \right\}\) a subspace of \(\R^3\text{?}\)
(c)
Is the set \(W = \left\{ \left[ \begin{array}{c} x \\ x+y \end{array} \right] \middle| x, y \text{ are scalars } \right\}\) a subspace of \(\R^2\text{?}\)
(d)
Is the set \(W = \left\{ \left[ \begin{array}{c} x \\ y \end{array} \right] \middle| y = 2x+1\right\}\) a subspace of \(\R^2\text{?}\)
(e)
Is the set \(W = \left\{ \left[ \begin{array}{c} x \\ y \end{array} \right] \middle| y=x^2\right\}\) a subspace of \(\R^2\text{?}\)
(f)
Is the set \(W = \left\{ \left[ \begin{array}{c} 0 \\ 0 \\ 0 \\ 0 \end{array} \right]\right\}\) a subspace of \(\R^4\text{?}\)
(g)
Is the set \(W = \left\{ \left[ \begin{array}{c} x \\ y \\ z \end{array} \right] \middle| x^2+y^2+z^2 \leq 1\right\}\) a subspace of \(\R^3\text{?}\) Note that \(W\) is the closed unit ball (the unit sphere together with its interior) in \(\R^3\text{.}\)
(h)
Is the set \(W = \R^2\) a subspace of \(\R^3\text{?}\)
There are several important points that we can glean from Activity 12.2.
A subspace is a vector space within a larger vector space, similar to a subset being a set within a larger set.
The set containing only the zero vector in \(\R^n\) is a subspace of \(\R^n\text{,}\) and it is the only finite subspace of \(\R^n\text{.}\)
Every subspace of \(\R^n\) must contain the zero vector.
No nonzero subspace is bounded — since a subspace must include all scalar multiples of its vectors, a subspace cannot be contained in a finite sphere or box.
Since vectors in \(\R^k\) have \(k\) components, vectors in \(\R^k\) are not contained in \(\R^n\) when \(n \neq k\text{.}\) However, if \(n > k\text{,}\) then we can think of \(\R^n\) as containing a copy (what we call an isomorphic image) of \(\R^k\) as the set of vectors with zeros as the last \(n-k\) components.
Subsection The Subspace Spanned by a Set of Vectors
One of the most convenient ways to represent a subspace of \(\R^n\) is as the span of a set of vectors. In Preview Activity 12.1 we saw that the span of two vectors is a subspace of \(\R^n\text{.}\) In the next theorem we verify this result for the span of an arbitrary number of vectors, extending the ideas you used in Preview Activity 12.1. Expressing a set of vectors as the span of some number of vectors is a quick way of justifying that this set is a subspace and it also provides us a geometric intuition for the set of vectors.
Theorem 12.7.
Let \(\vv_1\text{,}\) \(\vv_2\text{,}\) \(\ldots\text{,}\) \(\vv_k\) be vectors in \(\R^n\text{.}\) Then \(\Span \{\vv_1, \vv_2, \ldots, \vv_k\}\) is a subspace of \(\R^n\text{.}\)
Proof.
Let \(\vv_1\text{,}\) \(\vv_2\text{,}\) \(\ldots\text{,}\) \(\vv_k\) be vectors in \(\R^n\text{.}\) Let \(W = \Span\{\vv_1, \vv_2, \ldots, \vv_k\}\text{.}\) To show that \(W\) is a subspace of \(\R^n\) we need to show that \(W\) is closed under addition and multiplication by scalars and that \(\vzero\) is in \(W\text{.}\)
First we show that \(W\) is closed under addition. Let \(\vu\) and \(\vw\) be vectors in \(W\text{.}\) This means that \(\vu\) and \(\vw\) are linear combinations of \(\vv_1\text{,}\) \(\vv_2\text{,}\) \(\ldots\text{,}\) \(\vv_k\text{.}\) So there are scalars \(a_1\text{,}\) \(a_2\text{,}\) \(\ldots\text{,}\) \(a_k\) and \(b_1\text{,}\) \(b_2\text{,}\) \(\ldots\text{,}\) \(b_k\) so that
\[\vu = a_1\vv_1 + a_2\vv_2 + \cdots + a_k\vv_k \ \text{ and } \ \vw = b_1\vv_1 + b_2\vv_2 + \cdots + b_k\vv_k\text{.}\]
To demonstrate that \(\vu + \vw\) is in \(W\text{,}\) we need to show that \(\vu + \vw\) is a linear combination of \(\vv_1\text{,}\) \(\vv_2\text{,}\) \(\ldots\text{,}\) \(\vv_k\text{.}\) Using the properties of vector addition and scalar multiplication, we find
\[\vu + \vw = (a_1\vv_1 + a_2\vv_2 + \cdots + a_k\vv_k) + (b_1\vv_1 + b_2\vv_2 + \cdots + b_k\vv_k) = (a_1+b_1)\vv_1 + (a_2+b_2)\vv_2 + \cdots + (a_k+b_k)\vv_k\text{.}\]
Thus \(\vu + \vw\) is a linear combination of \(\vv_1\text{,}\) \(\vv_2\text{,}\) \(\ldots\text{,}\) \(\vv_k\) and \(W\) is closed under vector addition.
Next we show that \(W\) is closed under scalar multiplication. Let \(\vu = a_1\vv_1 + a_2\vv_2 + \cdots + a_k\vv_k\) be in \(W\) and \(c\) be a scalar. Then
\[c\vu = c(a_1\vv_1 + a_2\vv_2 + \cdots + a_k\vv_k) = (ca_1)\vv_1 + (ca_2)\vv_2 + \cdots + (ca_k)\vv_k\]
and \(c\vu\) is a linear combination of \(\vv_1\text{,}\) \(\vv_2\text{,}\) \(\ldots\text{,}\) \(\vv_k\text{,}\) so \(W\) is closed under multiplication by scalars.
Finally, we show that \(\vzero\) is in \(W\text{.}\) Since
\[\vzero = 0\vv_1 + 0\vv_2 + \cdots + 0\vv_k\text{,}\]
\(\vzero\) is in \(W\text{.}\)
Since \(W\) satisfies all three conditions of Theorem 12.6, we conclude that \(W\) is a subspace of \(\R^n\text{.}\)
The subspace \(W=\Span\{\vv_1, \vv_2, \ldots, \vv_k\}\) is called the subspace of \(\R^n\) spanned by \(\vv_1, \vv_2, \ldots, \vv_k\). We also use the phrase “subspace generated by \(\vv_1, \vv_2, \ldots, \vv_k\)” since the vectors \(\vv_1, \vv_2, \ldots, \vv_k\) are the building blocks of all vectors in \(W\text{.}\)
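The coefficient bookkeeping in the proof of Theorem 12.7 can be illustrated with a concrete computation. The sketch below uses made-up vectors in \(\R^4\) and made-up scalars (none of them come from the text), with Python's `fractions.Fraction` so that the comparisons are exact. It checks that adding two linear combinations gives the combination with coefficients \(a_i + b_i\), that scaling gives coefficients \(ca_i\), and that all-zero coefficients give the zero vector.

```python
from fractions import Fraction


def combine(coeffs, vectors):
    """Return the linear combination sum of coeffs[i] * vectors[i]."""
    n = len(vectors[0])
    return [sum(c * v[i] for c, v in zip(coeffs, vectors)) for i in range(n)]


# Hypothetical spanning vectors in R^4 and hypothetical scalars.
vs = [[1, 0, 2, -1], [0, 3, 1, 1], [2, 2, 0, 5]]
a = [Fraction(1, 2), -2, 3]
b = [4, Fraction(1, 3), -1]
c = Fraction(-7, 5)

u = combine(a, vs)
w = combine(b, vs)

# Closure under addition: u + w has coefficients a_i + b_i.
sum_direct = [ui + wi for ui, wi in zip(u, w)]
sum_by_coeffs = combine([ai + bi for ai, bi in zip(a, b)], vs)

# Closure under scalar multiplication: c*u has coefficients c*a_i.
scaled_direct = [c * ui for ui in u]
scaled_by_coeffs = combine([c * ai for ai in a], vs)

# The zero vector is the combination with all coefficients 0.
zero = combine([0, 0, 0], vs)

print(sum_direct == sum_by_coeffs)        # prints True
print(scaled_direct == scaled_by_coeffs)  # prints True
print(zero == [0, 0, 0, 0])               # prints True
```

A check like this is only a sanity test for one choice of vectors and scalars; the proof is what establishes the result in general.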
Activity 12.3.
(a)
Describe geometrically as best as you can the subspaces of \(\R^3\) spanned by the following two sets of vectors. \(\left\{\left[ \begin{array}{c} 1 \\ 0 \\0 \end{array} \right]\right\}\text{,}\) \(\left\{\left[ \begin{array}{c} 1 \\ 0\\0 \end{array} \right], \left[ \begin{array}{r} 0 \\ 1\\0 \end{array} \right]\right\}\)
(b)
Express the following set of vectors as the span of some vectors to show that this set is a subspace. Can you give a geometric description of the set?
One additional conclusion we can draw from Activity 12.2 and Activity 12.3 is that subspaces of \(\R^n\) are made up of “flat” subsets. The span of a single nonzero vector is a line (which is flat), and the span of two nonzero vectors that are not scalar multiples of each other is a plane (which is also flat). So subspaces of \(\R^n\) are linear (or “flat”) subsets of \(\R^n\text{.}\) That is why we can recognize that the non-flat parabola in Activity 12.2 is not a subspace of \(\R^2\text{.}\)
Subsection Examples
What follows are worked examples that use the concepts from this section.
Example 12.8.
Let \(W = \left\{\left[ \begin{array}{c} 2r+s+t \\ r+t \\ r+s \end{array} \right] : r,s,t \in \R \right\}\text{.}\)
(a)
Show that \(W\) is a subspace of \(\R^3\text{.}\)
Solution.
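Every vector in \(W\) has the form
\[\left[ \begin{array}{c} 2r+s+t \\ r+t \\ r+s \end{array} \right] = r\left[ \begin{array}{c} 2 \\ 1 \\ 1 \end{array} \right] + s\left[ \begin{array}{c} 1 \\ 0 \\ 1 \end{array} \right] + t\left[ \begin{array}{c} 1 \\ 1 \\ 0 \end{array} \right]\]
for some real numbers \(r\text{,}\) \(s\text{,}\) and \(t\text{.}\) Thus,
\[W = \Span\left\{ \left[ \begin{array}{c} 2 \\ 1 \\ 1 \end{array} \right], \left[ \begin{array}{c} 1 \\ 0 \\ 1 \end{array} \right], \left[ \begin{array}{c} 1 \\ 1 \\ 0 \end{array} \right] \right\}\text{.}\]
As a span of a set of vectors, we know that \(W\) is a subspace of \(\R^3\text{.}\)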
(b)
Describe in detail the geometry of the subspace \(W\) (e.g., is it a line, a union of lines, a plane, a union of planes, etc.)
Solution.
Let \(\vv_1 = \left[ \begin{array}{c} 2 \\ 1 \\ 1 \end{array} \right]\text{,}\) \(\vv_2 = \left[ \begin{array}{c} 1 \\ 0 \\ 1 \end{array} \right]\text{,}\) and \(\vv_3 = \left[ \begin{array}{c} 1 \\ 1 \\ 0 \end{array} \right]\text{.}\) The reduced row echelon form of \([\vv_1 \ \vv_2 \ \vv_3]\) is \(\left[ \begin{array}{ccr} 1\amp 0\amp 1\\0\amp 1\amp -1\\0\amp 0\amp 0 \end{array} \right]\text{.}\) The pivot columns of \([\vv_1 \ \vv_2 \ \vv_3]\) form a linearly independent set with the same span as \(\{\vv_1, \vv_2, \vv_3\}\text{,}\) so \(W = \Span\{\vv_1, \vv_2\}\) and \(W\) is the plane in \(\R^3\) through the origin and the points \((2,1,1)\) and \((1,0,1)\text{.}\)
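The row reduction in this solution can be reproduced with exact arithmetic. The following is a minimal sketch of Gauss-Jordan elimination written for this illustration (the function name `rref` and its return format are assumptions, not something defined in the text); it uses `fractions.Fraction` to avoid floating-point roundoff.

```python
from fractions import Fraction


def rref(rows):
    """Return (reduced row echelon form, list of pivot column indices)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0  # index of the row where the next pivot will be placed
    for c in range(len(m[0])):
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        pivot = m[r][c]
        m[r] = [x / pivot for x in m[r]]
        # Clear the rest of column c.
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots


# The matrix [v1 v2 v3] from Example 12.8, entered row by row.
A = [[2, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]

R, pivot_cols = rref(A)
print([[int(x) for x in row] for row in R])  # prints [[1, 0, 1], [0, 1, -1], [0, 0, 0]]
print(pivot_cols)                            # prints [0, 1]
```

The pivot columns are the first two (indices 0 and 1), and the third column of the reduced matrix records that \(\vv_3 = \vv_1 - \vv_2\text{,}\) which is why \(\vv_3\) adds nothing to the span.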
Example 12.9.
(a)
Let \(X = \Span\left\{ \left[ \begin{array}{c} 1\\0\\0 \end{array} \right] \right\}\) and let \(Y = \Span\left\{ \left[ \begin{array}{c} 0\\1\\0 \end{array} \right] \right\}\text{.}\) That is, \(X\) is the \(x\)-axis and \(Y\) the \(y\)-axis in three-space. Let
\[X+Y = \{\vx + \vy : \vx \in X \text{ and } \vy \in Y\}\text{.}\]
(i)
Is \(\left[ \begin{array}{c} 2\\3\\0 \end{array} \right]\) in \(X+Y\text{?}\) Justify your answer.
Solution.
Let \(\vw = \left[ \begin{array}{c} 2\\3\\0 \end{array} \right]\text{,}\) \(\vx = 2\left[ \begin{array}{c} 1\\0\\0 \end{array} \right]\text{,}\) and \(\vy = 3\left[ \begin{array}{c} 0\\1\\0 \end{array} \right]\text{.}\) Since \(\vw = \vx+\vy\) with \(\vx \in X\) and \(\vy \in Y\) we conclude that \(\vw \in X + Y\text{.}\)
(ii)
Is \(\left[ \begin{array}{c} 1\\1\\1 \end{array} \right]\) in \(X+Y\text{?}\) Justify your answer.
Solution.
Every vector in \(X\) has the form \(a \ve_1\) for some scalar \(a\) (where \(\ve_1 = \left[ \begin{array}{c} 1\\0\\0 \end{array} \right]\)), and every vector in \(Y\) has the form \(b \ve_2\) for some scalar \(b\) (where \(\ve_2 = \left[ \begin{array}{c} 0\\1\\0 \end{array} \right]\)). So every vector in \(X+Y\) is of the form \(a\ve_1 + b\ve_2 = \left[ \begin{array}{c} a\\b\\0 \end{array} \right]\text{.}\) Since the vector \(\left[ \begin{array}{c} 1\\1\\1 \end{array} \right]\) does not have a \(0\) in the third component, we conclude that \(\left[ \begin{array}{c} 1\\1\\1 \end{array} \right]\) is not in \(X+Y\text{.}\)
(iii)
Assume that \(X+Y\) is a subspace of \(\R^3\text{.}\) Describe in detail the geometry of this subspace.
Solution.
As we just argued, every vector in \(X+Y\) has the form \(a\ve_1+b\ve_2\text{.}\) So \(X+Y = \Span\{\ve_1,\ve_2\}\text{,}\) which is the \(xy\)-plane in \(\R^3\text{.}\)
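Since \(X+Y\) is the span of the two standard basis vectors along the \(x\)- and \(y\)-axes, membership questions like the two above reduce to asking whether a vector is a linear combination of given vectors. One standard way to decide this, sketched below under the assumption that the reader is comfortable with row reduction, is to row reduce the augmented matrix and check that the last column is not a pivot column. The helper names `pivot_columns` and `in_span` are hypothetical and exist only for this illustration.

```python
from fractions import Fraction


def pivot_columns(rows):
    """Return the pivot column indices found by exact row reduction."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        # Eliminate the entries below the pivot (enough to locate pivots).
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return pivots


def in_span(vectors, w):
    """Decide whether w is a linear combination of the given vectors."""
    n = len(w)
    # Build the augmented matrix [v_1 ... v_k | w] row by row.
    augmented = [[v[i] for v in vectors] + [w[i]] for i in range(n)]
    last_col = len(vectors)
    return last_col not in pivot_columns(augmented)


e1 = [1, 0, 0]
e2 = [0, 1, 0]

print(in_span([e1, e2], [2, 3, 0]))  # prints True
print(in_span([e1, e2], [1, 1, 1]))  # prints False
```

Both answers agree with the solutions above: the first vector has third component 0 and the second does not.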
(b)
Now let \(W_1\) and \(W_2\) be arbitrary subspaces of \(\R^n\) for some positive integer \(n\text{.}\) Let
\[W_1+W_2 = \{\vw_1 + \vw_2 : \vw_1 \in W_1 \text{ and } \vw_2 \in W_2\}\text{.}\]
Show that \(W_1+W_2\) is a subspace of \(\R^n\text{.}\) The set \(W_1+W_2\) is called the sum of the subspaces \(W_1\) and \(W_2\text{.}\)
Solution.
To see why the set \(W_1+W_2\) is a subspace of \(\R^n\text{,}\) suppose that \(\vx\) and \(\vy\) are in \(W_1+W_2\text{.}\) Then \(\vx = \vu_1+\vu_2\) and \(\vy = \vz_1+\vz_2\) for some \(\vu_1, \vz_1\) in \(W_1\) and some \(\vu_2, \vz_2\) in \(W_2\text{.}\) Then
\[\vx + \vy = (\vu_1+\vu_2) + (\vz_1+\vz_2) = (\vu_1+\vz_1) + (\vu_2+\vz_2)\text{.}\]
Since \(W_1\) is a subspace of \(\R^n\) it follows that \(\vu_1+\vz_1 \in W_1\text{.}\) Similarly, \(\vu_2+\vz_2 \in W_2\text{.}\) This makes \(\vx + \vy\) an element of \(W_1+W_2\text{.}\) Also, suppose that \(a\) is a scalar. Then
\[a\vx = a(\vu_1+\vu_2) = a\vu_1 + a\vu_2\text{.}\]
Since \(W_1\) is a subspace of \(\R^n\) it follows that \(a\vu_1 \in W_1\text{.}\) Similarly, \(a\vu_2 \in W_2\text{.}\) This makes \(a\vx\) an element of \(W_1+W_2\text{.}\) Finally, since \(\vzero\) is in both \(W_1\) and \(W_2\text{,}\) and \(\vzero = \vzero + \vzero\text{,}\) it follows that \(\vzero\) is an element of \(W_1+W_2\text{.}\) We conclude that \(W_1+W_2\) is a subspace of \(\R^n\text{.}\)
Subsection Summary
-
A vector space is a set \(V\) with operations of addition and scalar multiplication defined on \(V\) such that for all \(\vu\text{,}\) \(\vv\text{,}\) and \(\vw\) in \(V\) and all scalars \(a\) and \(b\text{:}\)
\(\vu + \vv\) is an element of \(V\) (we say that \(V\) is closed under the addition in \(V\)),
\(\vu + \vv = \vv + \vu\) (we say that the addition in \(V\) is commutative),
\((\vu + \vv) + \vw = \vu + (\vv + \vw)\) (we say that the addition in \(V\) is associative),
there is a vector \(\vzero\) in \(V\) so that \(\vu + \vzero = \vu\) (we say that \(V\) contains an additive identity or zero vector \(\vzero\)),
for each \(\vx\) in \(V\) there is an element \(\vy\) in \(V\) so that \(\vx + \vy = \vzero\) (we say that \(V\) contains an additive inverse \(\vy\) for each element \(\vx\) in \(V\)),
\(a \vu\) is an element of \(V\) (we say that \(V\) is closed under multiplication by scalars),
\((a+b) \vu = a\vu + b\vu\) (we say that multiplication by scalars distributes over scalar addition),
\(a(\vu + \vv) = a\vu + a\vv\) (we say that multiplication by scalars distributes over addition in \(V\)),
\((ab) \vu = a(b\vu)\text{,}\)
\(1 \vu = \vu\text{.}\)
For every \(n\text{,}\) \(\R^n\) is a vector space.
A subset \(W\) of \(\R^n\) is a subspace of \(\R^n\) if \(W\) is a vector space using the same operations as in \(\R^n\text{.}\)
-
To show that a subset \(W\) of \(\R^n\) is a subspace of \(\R^n\text{,}\) we need to prove the following:
\(\vu + \vv\) is in \(W\) whenever \(\vu\) and \(\vv\) are in \(W\) (when this property is satisfied we say that \(W\) is closed under addition),
\(a \vu\) is in \(W\) whenever \(a\) is a scalar and \(\vu\) is in \(W\) (when this property is satisfied we say that \(W\) is closed under multiplication by scalars),
\(\vzero\) is in \(W\text{.}\)
The remaining properties of a vector space are properties of the operations, and as long as we use the same operations as in \(\R^n\text{,}\) these properties are inherited by \(W\text{.}\)
The span of any set of vectors in \(\R^n\) is a subspace of \(\R^n\text{.}\)
Exercises
1.
Each of the following regions or graphs determines a subset \(W\) of \(\R^2\text{.}\) For each region, discuss each of the subspace properties of Theorem 12.6 and explain with justification whether or not the set \(W\) satisfies each property.
(a)
(b)
(c)
(d)
2.
Determine which of the following sets \(W\) is a subspace of \(\R^n\) for the indicated value of \(n\text{.}\) Justify your answer.
(a)
\(W = \{[x \ 0]^{\tr} : x \text{ is a scalar } \}\)
(b)
\(W = \{[2x+y \ x-y \ x+y]^{\tr} : x,y \text{ are scalars } \}\)
(c)
\(W = \{[x+1 \ x-1]^{\tr} : x \text{ is a scalar } \}\)
(d)
\(W = \{[xy \ xz \ yz ]^{\tr} : x,y,z \text{ are scalars } \}\)
3.
Find a subset of \(\R^2\) that is closed under addition and scalar multiplication, but that does not contain the zero vector, or explain why no such subset exists.
4.
Let \(\vv\) be a vector in \(\R^2\text{.}\) What is the smallest subspace of \(\R^2\) that contains \(\vv\text{?}\) Explain. Describe this space geometrically.
5.
What is the smallest subspace of \(\R^2\) containing the first quadrant? Justify your answer.
6.
Let \(\vu\text{,}\) \(\vv\text{,}\) and \(\vw\) be vectors in \(\R^3\) with \(\vw = \vu+\vv\text{.}\) Let \(W_1 = \Span\{\vu,\vv\}\) and \(W_2 = \Span\{\vu,\vv,\vw\}\text{.}\)
(a)
If \(\vx\) is in \(W_1\text{,}\) must \(\vx\) be in \(W_2\text{?}\) Explain.
(b)
If \(\vy\) is in \(W_2\text{,}\) must \(\vy\) be in \(W_1\text{?}\) Explain.
(c)
What is the relationship between \(\Span\{\vu,\vv\}\) and \(\Span\{\vu,\vv,\vw\}\text{?}\) Be specific.
7.
Let \(m\) and \(n\) be positive integers, and let \(\vv\) be in \(\R^n\text{.}\) Let \(W = \{A\vv : A \in \M_{m \times n}\}\text{.}\)
(a)
As an example, let \(\vv = [2 \ 1]^{\tr}\) in \(\R^2\) with \(W = \{A\vv : A \in \M_{2 \times 2}\}\text{.}\)
(i)
Show that the vector \([2 \ 1]^{\tr}\) is in \(W\) by finding a matrix \(A\) that places \([2 \ 1]^{\tr}\) in \(W\text{.}\)
(ii)
Show that the vector \([4 \ 2]^{\tr}\) is in \(W\) by finding a matrix \(A\) that places \([4 \ 2]^{\tr}\) in \(W\text{.}\)
(iii)
Show that the vector \([6 \ -1]^{\tr}\) is in \(W\) by finding a matrix \(A\) that places \([6 \ -1]^{\tr}\) in \(W\text{.}\)
(iv)
Show that \(W = \R^2\text{.}\)
(b)
Show that, regardless of the vector \(\vv\) selected, \(W\) is a subspace of \(\R^m\text{.}\)
(c)
Characterize all of the possibilities for what the subspace \(W\) can be.
Hint: There is more than one possibility.
8.
Let \(S_1\) and \(S_2\) be subsets of \(\R^3\) such that \(\Span \ S_1 = \Span \ S_2\text{.}\) Must it be the case that \(S_1\) and \(S_2\) contain at least one vector in common? Justify your answer.
9.
Assume \(W_1\) and \(W_2\) are two subspaces of \(\R^n\text{.}\) Is \(W_1 \cap W_2\) also a subspace of \(\R^n\text{?}\) Is \(W_1 \cup W_2\) also a subspace of \(\R^n\text{?}\) Justify your answer. (Note: The notation \(W_1 \cap W_2\) refers to the vectors common to both \(W_1, W_2\text{,}\) while the notation \(W_1 \cup W_2\) refers to the vectors that are in at least one of \(W_1, W_2\text{.}\))
10.
Determine whether the plane defined by the equation \(5x+3y-2z=0\) is a subspace in \(\R^3\text{.}\)
11.
If \(W\) is a subspace of \(\R^n\) and \(\vu\) is a vector in \(\R^n\) not in \(W\text{,}\) determine whether
is a subspace of \(\R^n\text{.}\)
12.
Two students are talking about examples of subspaces.
Student 1: The \(x\)-axis in \(\R^2\) is a subspace. It is generated by the vector \(\left[ \begin{array}{c} 1\\0 \end{array} \right]\text{.}\)
Student 2: Similarly \(\R^2\) is a subspace of \(\R^3\text{.}\)
Student 1: I'm not sure if that will work. Can we fit \(\R^2\) inside \(\R^3\text{?}\) Don't we need \(W\) to be a subset of \(\R^3\) if it is a subspace of \(\R^3\text{?}\)
Student 2: Of course we can fit \(\R^2\) inside \(\R^3\text{.}\) We can think of \(\R^2\) as vectors \(\left[ \begin{array}{c} a\\b\\0 \end{array} \right]\text{.}\) That's the \(xy\)-plane.
Student 1: I don't know. The vector \(\left[ \begin{array}{c} a\\b\\0 \end{array} \right]\) is not exactly same as \(\left[ \begin{array}{c} a\\b \end{array} \right]\text{.}\)
Student 2: Well, \(\R^2\) is a plane and so is the \(xy\)-plane. So they must be equal, shouldn't they?
Student 1: But there are infinitely many planes in \(\R^3\text{.}\) They can't all be equal to \(\R^2\text{.}\) They all “look like” \(\R^2\) but I don't think we can say they are equal.
Which student is correct? Is \(\R^2\) a subspace of \(\R^3\text{,}\) or not? Justify your answer.
13.
Given two subspaces \(H_1, H_2\) of \(\R^n\text{,}\) define
\[H_1+H_2 = \{\vx + \vy : \vx \in H_1 \text{ and } \vy \in H_2\}\text{.}\]
Show that \(H_1+H_2\) is a subspace of \(\R^n\) containing both \(H_1, H_2\) as subspaces. The space \(H_1+H_2\) is the sum of the subspaces \(H_1\) and \(H_2\text{.}\)
14.
Label each of the following statements as True or False. Provide justification for your response.
(a) True/False.
Any line in \(\R^n\) is a subspace in \(\R^n\text{.}\)
(b) True/False.
Any line through the origin in \(\R^n\) is a subspace in \(\R^n\text{.}\)
(c) True/False.
Any plane through the origin in \(\R^n\) is a subspace in \(\R^n\text{.}\)
(d) True/False.
In \(\R^4\text{,}\) the points satisfying \(xy=2t+z\) form a subspace.
(e) True/False.
In \(\R^4\text{,}\) the points satisfying \(x+3y=2z\) form a subspace.
(f) True/False.
Any two nonzero vectors generate a plane subspace in \(\R^3\text{.}\)
(g) True/False.
The space \(\R^2\) is a subspace of \(\R^3\text{.}\)
(h) True/False.
If \(W\) is a subspace of \(\R^n\) and \(\vu\) is in \(W\text{,}\) then the line through the origin and \(\vu\) is in \(W\text{.}\)
(i) True/False.
There are four types of subspaces in \(\R^3\text{:}\) \(\{\vzero\}\text{,}\) line through origin, plane through origin and the whole space \(\R^3\text{.}\)
(j) True/False.
There are four types of subspaces in \(\R^4\text{:}\) \(\{\vzero\}\text{,}\) line through origin, plane through origin and the whole space \(\R^4\text{.}\)
(k) True/False.
The vectors \(\left[ \begin{array}{c} 1\\1\\1 \end{array} \right]\text{,}\) \(\left[ \begin{array}{c} 1\\2\\1 \end{array} \right]\) and \(\left[ \begin{array}{c} 2\\3\\2 \end{array} \right]\) form a subspace in \(\R^3\text{.}\)
(l) True/False.
The vectors \(\left[ \begin{array}{c} 1\\1\\1 \end{array} \right]\) and \(\left[ \begin{array}{c} 1\\2\\1 \end{array} \right]\) form a basis of a subspace in \(\R^3\text{.}\)
Subsection Project: Least Squares Linear Approximation
We return to the problem of finding the least squares line to fit the GDP-consumption data. We will start our work in a more general setting, determining a method for fitting a linear function to any data set, like the GDP-consumption data, in the least squares sense. Then we will apply our result to the GDP-consumption data.
Project Activity 12.4.
Suppose we want to fit a linear function \(p(x) = mx+b\) to our data. For the sake of our argument, let us assume the general case where we have \(n\) data points labeled as \((x_1,y_1)\text{,}\) \((x_2, y_2)\text{,}\) \((x_3, y_3)\text{,}\) \(\ldots\text{,}\) \((x_n, y_n)\text{.}\) (In the GDP-consumption data \(n = 21\text{.}\)) In the unlikely event that the graph of \(p(x)\) actually passes through all of these data points, we would have the system of equations
\[ \begin{aligned} b + mx_1 &= y_1 \\ b + mx_2 &= y_2 \\ &\ \,\vdots \\ b + mx_n &= y_n \end{aligned} \tag{12.1} \]
in the unknowns \(b\) and \(m\text{.}\)
(a)
As a small example to illustrate, write the system (12.1) using the three points \((x_1,y_1)= (1,2)\text{,}\) \((x_2,y_2) = (3,4)\text{,}\) and \((x_3,y_3) = (5,6)\text{.}\) Identify the unknowns and then write this system in the form \(M \va = \vy\text{.}\) Explicitly identify the matrix \(M\) and the vectors \(\va\) and \(\vy\text{.}\)
(b)
Identify the specific matrix \(M\) and the specific vectors \(\va\) and \(\vy\) using the data in Table 12.1. Explain why the system \(M \va = \vy\) is inconsistent. (Remember, we are treating consumption as the independent variable and GDP as the dependent variable.) What does the result tell us about the data?
Project Activity 12.4 shows that the GDP-consumption data does not lie on a line. So rather than attempting to find coefficients \(b\) and \(m\) that give an exact solution to this system, which is impossible, we look for a vector \(\va^*\) that provides us with something that is “close” to a solution.
If we could find \(b\) and \(m\) that give a solution to the system \(M\va = \vy\text{,}\) then \(M\va - \vy\) would be zero. If we can't make \(M\va - \vy\) exactly equal to the vector \(\vzero\text{,}\) we could instead try to minimize \(M\va-\vy\) in some way. One way is to minimize the length \(||M\va-\vy||\) of the vector \(M\va - \vy\text{.}\)
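If we minimize the quantity \(||M\va-\vy||\text{,}\) then we will have minimized a function given by a sum of squares. That is, since the \(i\)th entry of \(M\va - \vy\) is \(b + mx_i - y_i\text{,}\) the length \(||M\va-\vy||\) is calculated to be
\[ \sqrt{\sum_{i=1}^n (b + mx_i - y_i)^2}\text{.} \tag{12.2} \]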
This is why the method we will derive is called the method of least squares. This method provides us with a vector “solution” in a subspace that is related to \(M\text{.}\) We can visualize \(||M \va - \vy||\) as in Figure 12.10. In this figure the data points are shown along with a linear approximation (deliberately not the best one, for illustrative purposes). The lengths of the vertical line segments are the absolute values \(|b+mx_i-y_i|\) of the quantities that are squared in (12.2). So we are trying to minimize the sum of the squares of the lengths of these line segments.
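To make the quantity being minimized concrete, here is a minimal Python sketch, assuming NumPy is available. The four data points and the candidate line are made up for illustration and are not taken from Table 12.1; the helper names `residual_norm` and `root_sum_of_squares` are hypothetical. The sketch checks that the length \(||M\va - \vy||\) agrees with the square root of the sum of the squared vertical offsets.

```python
import numpy as np

# Hypothetical data points (x_i, y_i), chosen only for illustration.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 2.0, 4.0])

# Each row of M is [1, x_i], so the i-th entry of M @ [b, m] is b + m*x_i.
M = np.column_stack([np.ones_like(x), x])

def residual_norm(b, m):
    """Return the length ||M a - y|| for the coefficient vector a = [b, m]."""
    a = np.array([b, m])
    return np.linalg.norm(M @ a - y)

def root_sum_of_squares(b, m):
    """Return the square root of the sum of the squared vertical offsets."""
    return np.sqrt(np.sum((b + m * x - y) ** 2))

# An arbitrary candidate line y = 1 + 0.5x (a guess, not the best fit).
candidate = residual_norm(1.0, 0.5)
same_value = root_sum_of_squares(1.0, 0.5)
print(candidate, same_value)
```

Both computations describe the same number, which is why minimizing the length of \(M\va - \vy\) is the same as minimizing a sum of squares.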
Suppose that \(\va^*\) minimizes \(||M \va - \vy||\text{.}\) Then the vector \(M \va^*\) is the vector that is closest to \(\vy\) of all of the vectors of the form \(M \vx\text{.}\) The fact that the vectors of the form \(M \vx\) make a subspace will be useful in what follows. We verify that fact in the next project activity.
Project Activity 12.5.
Let \(A\) be an arbitrary \(m \times k\) matrix. Explain why the set \(C = \{A \vx : \vx \in \R^k\}\) is a subspace of \(\R^m\text{.}\)
Project Activity 12.5 shows us that the vectors of the form \(M \vx\) form the subspace \(\{M \vx : \vx \in \R^2\}\) of \(\R^{21}\text{.}\) So even though the GDP-consumption system \(M \va = \vy\) does not have a solution, we can look for a vector in this subspace that is close to \(\vy\text{.}\) That is, we want to find a vector \(\va^*\) in \(\R^{2}\) such that \(M \va^*\) is as close (in the least squares sense) to \(\vy\) as we can get. In other words, the error \(||M \va^* - \vy||\) is as small as possible. In the following activity we see how to find \(\va^*\text{.}\)
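As a numerical illustration of the idea of a closest vector, the following Python sketch uses NumPy's `np.linalg.lstsq` routine, which is a library method and not the derivation developed in this project. The data points are again made up and are not the GDP-consumption data. The sketch finds a vector \(\va^*\) minimizing \(||M\va - \vy||\) and compares the resulting error with the errors for a few arbitrary coefficient vectors.

```python
import numpy as np

# Hypothetical data points (x_i, y_i), chosen only for illustration.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 2.0, 4.0])

# Rows of M are [1, x_i], matching the unknowns ordered as a = [b, m].
M = np.column_stack([np.ones_like(x), x])

# np.linalg.lstsq returns a vector a_star that minimizes ||M a - y||.
a_star, _, rank, _ = np.linalg.lstsq(M, y, rcond=None)
b_star, m_star = a_star

# M @ a_star is the vector in {M x : x in R^2} that is closest to y.
closest = M @ a_star
best_error = np.linalg.norm(closest - y)

# Errors for a few other (arbitrary) coefficient vectors, for comparison.
others = [np.array([1.0, 0.5]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
other_errors = [np.linalg.norm(M @ a - y) for a in others]

print("b* =", b_star, "m* =", m_star, "error =", best_error)
print("other errors:", other_errors)
```

None of the other coefficient vectors can give a smaller error than `best_error`, since \(M\va^*\) is the closest vector to \(\vy\) among all vectors of the form \(M\vx\text{.}\)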
Project Activity 12.6.
Let
\[ S = ||M\va - \vy|| = \sqrt{\sum_{i=1}^n (b + mx_i - y_i)^2}\text{,} \]
the quantity we want to minimize. The variables in \(S\) are \(m\) and \(b\text{,}\) so we can think of \(S\) as a function of the two independent variables \(m\) and \(b\text{.}\) The square root makes calculations more complicated, so it is helpful to notice that, because \(S \geq 0\text{,}\) \(S\) will be a minimum when \(S^2\) is a minimum. Since \(S^2\) is also a function of the two variables \(b\) and \(m\text{,}\) the minimum value of \(S^2\) will occur when the partial derivatives of \(S^2\) with respect to \(b\) and \(m\) are both \(0\) (if you haven't yet taken a multivariable calculus course, you can just assume that this is correct). This yields the equations
\[ 0 = \sum_{i=1}^n 2(b + mx_i - y_i) \tag{12.3} \]
\[ 0 = \sum_{i=1}^n 2x_i(b + mx_i - y_i)\text{.} \tag{12.4} \]
In this activity we solve equations (12.3) and (12.4) for the unknowns \(b\) and \(m\text{.}\) (Do this in a general setting without using specific values for the \(x_i\) and \(y_i\text{.}\))
(a)
Let \(r=\sum_{i=1}^n x_i^2\text{,}\) \(s=\sum_{i=1}^n x_i\text{,}\) \(t = \sum_{i=1}^n y_i\text{,}\) and \(u =\sum_{i=1}^n x_iy_i\text{.}\) Show that the equations (12.3) and (12.4) can be written in the form
\[ \begin{aligned} nb + sm &= t \\ sb + rm &= u\text{.} \end{aligned} \]
Note that this is a system of two linear equations in the unknowns \(b\) and \(m\text{.}\)
(b)
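Write the system from part (a) in matrix form \(A \vx = \vb\text{.}\) Then use techniques from linear algebra to solve the linear system to show that
\[ b = \frac{tr - us}{nr - s^2} \tag{12.5} \]
and
\[ m = \frac{nu - ts}{nr - s^2}\text{.} \tag{12.6} \]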
Project Activity 12.7.
Use the formulas (12.5) and (12.6) to find the values of \(b\) and \(m\) for the regression line to fit the GDP-consumption data in Table 12.1. You may use the fact that the sum of the GDP data is \(3.5164030 \times 10^6\text{,}\) the sum of the consumption data is \(2.9233750 \times 10^6\text{,}\) the sum of the squares of the consumption data is \(8.806564894 \times 10^{11}\text{,}\) and the sum of the products of the GDP and consumption data is \(1.069946378 \times 10^{12}\text{.}\) Compare to the results the authors obtained in the paper “A Statistical Analysis of GDP and Final Consumption Using Simple Linear Regression, the Case of Romania 1990-2010” (researchgate.net/publication/227382939_A_STATISTICAL_ANALYSIS_OF_GDP_AND_FINAL_CONSUMPTION_USING_SIMPLE_LINEAR_REGRESSION_THE_CASE_OF_ROMANIA_1990-2010).
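One way to check a hand computation is the short Python sketch below. It assumes that the closed forms obtained in Project Activity 12.6 are \(b = \frac{tr-us}{nr-s^2}\) and \(m = \frac{nu-ts}{nr-s^2}\text{,}\) that is, the solution of the system \(nb + sm = t\text{,}\) \(sb + rm = u\text{.}\) The function name `least_squares_line` is hypothetical, and the inputs are the rounded sums quoted in this activity, with consumption as \(x\) and GDP as \(y\text{.}\)

```python
def least_squares_line(n, r, s, t, u):
    """Solve n*b + s*m = t and s*b + r*m = u for the pair (b, m).

    n: number of data points
    r: sum of x_i^2,  s: sum of x_i
    t: sum of y_i,    u: sum of x_i * y_i
    """
    det = n * r - s ** 2
    if det == 0:
        # n*r = s^2 happens exactly when all x-values are equal.
        raise ValueError("all x-values are equal; the line is not determined")
    b = (t * r - u * s) / det
    m = (n * u - t * s) / det
    return b, m

# Sums quoted in the activity (x = consumption, y = GDP).
n = 21
s = 2.9233750e6        # sum of the consumption data
t = 3.5164030e6        # sum of the GDP data
r = 8.806564894e11     # sum of the squares of the consumption data
u = 1.069946378e12     # sum of the products of GDP and consumption

b, m = least_squares_line(n, r, s, t, u)
print(f"b = {b:.2f}, m = {m:.4f}")
```

Because the quoted sums are rounded, the last digits of the computed intercept may differ slightly from a value computed directly from the table.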