Section 19 Diagonalization
Focus Questions
By the end of this section, you should be able to give precise and thorough answers to the questions listed below. You may want to keep these questions in mind to focus your thoughts as you complete the section.
What is a diagonal matrix?
What does it mean to diagonalize a matrix?
What does it mean for two matrices to be similar?
What important properties do similar matrices share?
Under what conditions is a matrix diagonalizable?
When a matrix \(A\) is diagonalizable, what is the structure of a matrix \(P\) that diagonalizes \(A\text{?}\)
Why is diagonalization useful?
Subsection Application: The Fibonacci Numbers
In 1202 Leonardo of Pisa (better known as Fibonacci) published Liber Abaci (roughly translated as The Book of Calculation), in which he constructed a mathematical model of the growth of a rabbit population. The problem Fibonacci considered is that of determining the number of pairs of rabbits produced in a given time period beginning with an initial pair of rabbits. Fibonacci made the assumptions that each pair of rabbits more than one month old produces a new pair of rabbits each month, and that no rabbits die. (We ignore any issues about that might arise concerning the gender of the offspring.) If we let \(F_n\) represent the number of rabbits in month \(n\text{,}\) Fibonacci produced the model
for \(n \geq 0\) where \(F_0 = 0\) and \(F_1 = 1\text{.}\) The resulting sequence
is a very famous sequence in mathematics and is called the Fibonacci sequence. This sequence is thought to model many natural phenomena such as number of seeds in a sunflower and anything which grows in a spiral form. It is so famous in fact that it has a journal devoted entirely to it. As a note, while Fibonacci's work Liber Abaci introduced this sequence to the western world, it had been described earlier Sanskrit texts going back as early as the sixth century.
By definition, the Fibonacci numbers are calculated by recursion. This is a very ineffective way to determine entries \(F_n\) for large \(n\text{.}\) Later in this section we will derive a fascinating and unexpected formula for the Fibonacci numbers using the idea of diagonalization.
Subsection Introduction
As we have seen when studying Markov processes, each state is dependent on the previous state. If \(\vx_0\) is the initial state and \(A\) is the transition matrix, then the \(n\)th state is found by \(A^n \vx_0\text{.}\) In these situations, and others, it is valuable to be able to quickly and easily calculate powers of a matrix. We explore a way to do that in this section.
Preview Activity 19.1.
Consider a very simplified weather forecast. Let us assume there are two possible states for the weather: rainy (\(R\)) or sunny(\(S\)). Let us also assume that the weather patterns are stable enough that we can reasonably predict the weather tomorrow based on the weather today. If it is sunny today, then there is a 70% chance that it will be sunny tomorrow, and if it is rainy today then there is a 40% chance that it will be rainy tomorrow. If \(\vx_0 = \left[ \begin{array}{c} s \\ r \end{array} \right]\) is a state vector that indicates a probability \(s\) that it is sunny and probability \(r\) that it is rainy on day \(0\text{,}\) then
tells us the likelihood of it being sunny or rainy on day 1. Let \(A = \left[ \begin{array}{cc} 0.70\amp 0.40 \\ 0.30\amp 0.60 \end{array} \right]\text{.}\)
(a)
Suppose it is sunny today, that is \(\vx_0 = \left[ \begin{array}{c} 1 \\ 0 \end{array} \right]\text{.}\) Calculate \(\vx_1 = A \vx_0\) and explain how this matrix-vector product tells us the probability that it will be sunny tomorrow.
(b)
Calculate \(\vx_2 = A\vx_1\) and interpret the meaning of each component of the product.
(c)
Explain why \(\vx_2 = A^2 \vx_0\text{.}\) Then explain in general why \(\vx_n = A^n \vx_0\text{.}\)
(d)
The previous result demonstrates that to determine the long-term probability of a sunny or rainy day, we want to be able to easily calculate powers of the matrix \(A\text{.}\) Use a computer algebra system (e.g., Maple, Mathematica, Wolfram\(|\)Alpha) to calculate the entries of \(\vx_{10}\text{,}\) \(\vx_{20}\text{,}\) and \(\vx_{30}\text{.}\) Based on this data, what do you expect the long term probability of any day being a sunny one?
Subsection Diagonalization
In Preview Activity 19.1 we saw how if we can powers of a matrix we can make predictions about the long-term behavior of some systems. In general, calculating powers of a matrix can be a very difficult thing, but there are times when the process is straightforward.
Activity 19.2.
Let \(D = \left[ \begin{array}{cc} 2\amp 0 \\ 0\amp 3 \end{array} \right]\text{.}\)
(a)
Show that \(D^2 = \left[ \begin{array}{cc} 2^2\amp 0 \\ 0\amp 3^2 \end{array} \right]\text{.}\)
(b)
Show that \(D^3 = \left[ \begin{array}{cc} 2^3\amp 0 \\ 0\amp 3^3 \end{array} \right]\text{.}\) Hint.
\(D^3 = DD^2\text{.}\)
(c)
Explain in general why \(D^n = \left[ \begin{array}{cc} 2^n\amp 0 \\ 0\amp 3^n \end{array} \right]\) for any positive integer \(n\text{.}\)
Activity 19.2 illustrates that calculating powers of square matrices whose only nonzero entries are along the diagonal is rather simple. In general, if
then
for any positive integer \(k\text{.}\) Recall that a diagonal matrix is a matrix whose only nonzero elements are along the diagonal (see Definition 8.6). In this section we will see that matrices that are similar to diagonal matrices have some very nice properties, and that diagonal matrices are useful in calculations of powers of matrices.
We can utilize the method of calculating powers of diagonal matrices to also easily calculate powers of other types of matrices.
Activity 19.3.
Let \(D\) be any matrix, \(P\) an invertible matrix, and let \(A = P^{-1}DP\text{.}\)
(a)
Show that \(A^2 = P^{-1}D^2P\text{.}\)
(b)
Show that \(A^3 = P^{-1}D^3P\text{.}\)
(c)
Explain in general why \(A^n = P^{-1}D^nP\) for positive integers \(n\text{.}\)
As Activity 19.3 illustrates, to calculate the powers of a matrix of the form \(P^{-1}DP\) we only need determine the powers of the matrix \(D\text{.}\) If \(D\) is a diagonal matrix, this is especially straightforward.
Subsection Similar Matrices
Similar matrices play an important role in certain calculations. For example, Activity 19.3 showed that if we can write a square matrix \(A\) in the form \(A = P^{-1}DP\) for some invertible matrix \(P\) and diagonal matrix \(D\text{,}\) then finding the powers of \(A\) is straightforward. As we will see, the relation \(A = P^{-1}DP\) will imply that the matrices \(A\) and \(D\) share many properties.
Definition 19.1.
The \(n \times n\) matrix \(A\) is similar to the \(n \times n\) matrix \(B\) if there is an invertible matrix \(P\) such that \(A = P^{-1}BP\text{.}\)
Activity 19.4.
Let \(A = \left[ \begin{array}{cc} 1\amp 1\\2\amp 0 \end{array} \right]\) and \(B = \left[ \begin{array}{cr} 2\amp 2\\0\amp -1 \end{array} \right]\text{.}\) Assume that \(A\) is similar to \(B\) via the matrix \(P = \left[ \begin{array}{cc} 2\amp 1\\2\amp 2 \end{array} \right]\text{.}\)
(a)
Calculate \(\det(A)\) and \(\det(B)\text{.}\) What do you notice?
(b)
Find the characteristic polynomials of \(A\) and \(B\text{.}\) What do you notice?
(c)
What can you say about the eigenvalues of \(A\) and \(B\text{?}\) Explain.
(d)
Explain why \(\vx = \left[ \begin{array}{c} 1\\1 \end{array} \right]\) is an eigenvector for \(A\) with eigenvalue 2. Is \(\vx\) an eigenvector for \(B\) with eigenvalue 2? Why or why not?
Activity 19.4 suggests that similar matrices share some, but not all, properties. Note that if \(A = P^{-1}BP\text{,}\) then \(B = Q^{-1}AQ\) with \(Q = P^{-1}\text{.}\) So if \(A\) is similar to \(B\text{,}\) then \(B\) is similar to \(A\text{.}\) Similarly (no pun intended), since \(A = I^{-1}AI\) (where \(I\) is the identity matrix), then any square matrix is similar to itself. Also, if \(A = P^{-1}BP\) and \(B = M^{-1}CM\text{,}\) then \(A = (MP)^{-1}C(MP)\text{.}\) So if \(A\) is similar to \(B\) and \(B\) is similar to \(C\text{,}\) then \(A\) is similar to \(C\text{.}\) If you have studied relations, these three properties show that similarity is an equivalence relation on the set of all \(n \times n\) matrices. This is one reason why similar matrices share many important traits, as the next activity highlights.
Activity 19.5.
Let \(A\) and \(B\) be similar matrices with \(A = P^{-1}BP\text{.}\)
(a)
Use the multiplicative property of the determinant to explain why \(\det(A) = \det(B)\text{.}\) So similar matrices have the same determinants.
(b)
Use the fact that \(P^{-1}IP = I\) to show that \(A-\lambda I\) is similar to \(B - \lambda I\text{.}\)
(c)
Explain why it follows from (a) and (b) that
So similar matrices have the same characteristic polynomial, and the same eigenvalues.
We summarize some properties of similar matrices in the following theorem.
Theorem 19.2.
Let \(A\) and \(B\) be similar \(n \times n\) matrices and \(I\) the \(n \times n\) identity matrix. Then
\(\det(A) = \det(B)\text{,}\)
\(A-\lambda I\) is similar to \(B - \lambda I\text{,}\)
\(A\) and \(B\) have the same characteristic polynomial,
\(A\) and \(B\) have the same eigenvalues.
Subsection Similarity and Matrix Transformations
When a matrix is similar to a diagonal matrix, we can gain insight into the action of the corresponding matrix transformation. As an example, consider the matrix transformation \(T\) from \(\R^2\) to \(\R^2\) defined by \(T(\vx) = A \vx\text{,}\) where
We are interested in understanding what this matrix transformation does to vectors in \(\R^2\text{.}\) First we note that \(A\) has eigenvalues \(\lambda_1 = 2\) and \(\lambda_2 = 4\) with corresponding eigenvectors \(\vv_1 = \left[ \begin{array}{r} -1\\1 \end{array} \right]\) and \(\vv_2 = \left[ \begin{array}{c} 1\\1 \end{array} \right]\text{.}\) If we let \(P = [\vv_1 \ \vv_2]\text{,}\) then you can check that
and
where
Thus,
A simple calculation shows that
Let us apply \(T\) to the unit square whose sides are formed by the vectors \(\ve_1 = \left[ \begin{array}{c} 1 \\ 0 \end{array} \right]\) and \(\ve_2 = \left[ \begin{array}{c} 0 \\ 1 \end{array} \right]\) as shown in the first picture in Figure 19.3.
To apply \(T\) we first multiply \(\ve_1\) and \(\ve_2\) by \(P^{-1}\text{.}\) This gives us
So \(P^{-1}\) transforms the standard coordinate system into a coordinate system in which columns of \(P^{-1}\) determine the axes, as illustrated in the second picture in Figure 19.3. Applying \(D\) to the output scales by 2 in the first component and by \(4\) in the second component as depicted in the third picture in Figure 19.3. Finally, we apply \(P\) to translate back into the standard \(xy\) coordinate system as shown in the last picture in Figure 19.3. In this case, we can visualize that when we apply the transformation \(T\) to a vector in this system it is just scaled in the \(P^{-1}\ve_1-P^{-1}\ve_2\) system by the matrix \(D\text{.}\) Then the matrix \(P\) translates everything back to the standard \(xy\) coordinate system.
This geometric perspective provides another example of how having a matrix similar to a diagonal matrix informs us about the situation. In what follows we determine the conditions that determine when a matrix is similar to a diagonal matrix.
Subsection Diagonalization in General
In Preview Activity 19.1 and in the matrix transformation example we found that a matrix \(A\) was similar to a diagonal matrix whose columns were eigenvectors of \(A\text{.}\) This will work for a general \(n \times n\) matrix \(A\) as long as we can find an invertible matrix \(P\) whose columns are eigenvectors of \(A\text{.}\) More specifically, suppose \(A\) is an \(n \times n\) matrix with \(n\) linearly independent eigenvectors \(\vv_1\text{,}\) \(\vv_1\text{,}\) \(\ldots\text{,}\) \(\vv_n\) with corresponding eigenvalues \(\lambda_1\text{,}\) \(\lambda_1\text{,}\) \(\ldots\text{,}\) \(\lambda_n\) (not necessarily distinct). Let
Then
where
Since the columns of \(P\) are linearly independent, we know \(P\) is invertible, and so
Definition 19.4.
An \(n \times n\) matrix \(A\) is diagonalizable if there is an invertible \(n \times n\) matrix \(P\) so that \(P^{-1}AP\) is a diagonal matrix.
In other words, a matrix \(A\) is diagonalizable if \(A\) is similar to a diagonal matrix.
IMPORTANT NOTE.
The key notion to the process described above is that in order to diagonalize an \(n\times n\) matrix \(A\text{,}\) we have to find \(n\) linearly independent eigenvectors for \(A\text{.}\) When \(A\) is diagonalizable, a matrix \(P\) so that \(P^{-1}AP\) is diagonal is said to diagonalize \(A\text{.}\)
Activity 19.6.
Find an invertible matrix \(P\) that diagonalizes \(A\text{.}\)
(a)
\(A = \left[ \begin{array}{cc} 1\amp 1 \\ 0\amp 2 \end{array} \right]\)
(b)
\(A = \left[ \begin{array}{ccc} 3\amp 2\amp 4 \\ 2\amp 0\amp 2 \\ 4\amp 2\amp 3 \end{array} \right]\text{.}\)
The eigenvalues of \(A\) are 8 and \(-1\text{.}\)
It should be noted that there are square matrices that are not diagonalizable. For example, the matrix \(A = \left[ \begin{array}{cc} 1\amp 1 \\ 0\amp 1 \end{array} \right]\) has 1 as its only eigenvalue and the dimension of the eigenspace of \(A\) corresponding to the eigenvalue is one. Therefore, it will be impossible to find two linearly independent eigenvectors for \(A\text{.}\)
We showed previously that eigenvectors corresponding to distinct eigenvalue are always linearly independent, so if an \(n \times n\) matrix \(A\) has \(n\) distinct eigenvalues then \(A\) is diagonalizable. Activity 19.6 (b) shows that it is possible to diagonalize an \(n \times n\) matrix even if the matrix does not have \(n\) distinct eigenvalues. In general, we can diagonalize a matrix as long as the dimension of each eigenspace is equal to the multiplicity of the corresponding eigenvalue. In other words, a matrix is diagonalizable if the geometric multiplicity is the same is the algebraic multiplicity for each eigenvalue.
At this point we might ask one final question. We argued that if an \(n \times n\) matrix \(A\) has \(n\) linearly independent eigenvectors, then \(A\) is diagonalizable. It is reasonable to wonder if the converse is true — that is, if \(A\) is diagonalizable, must \(A\) have \(n\) linearly independent eigenvectors? The answer is yes, and you are asked to show this in Exercise 6. We summarize the result in the following theorem.
Theorem 19.5. The Diagonalization Theorem.
An \(n \times n\) matrix \(A\) is diagonalizable if and only if \(A\) has \(n\) linearly independent eigenvectors. If \(A\) is diagonalizable and has linearly independent eigenvectors \(\vv_1\text{,}\) \(\vv_2\text{,}\) \(\ldots\text{,}\) \(\vv_n\) with \(A\vv_i = \lambda_i \vv_i\) for each \(i\text{,}\) then \(n \times n\) matrix \(P [ \vv_1 \ \vv_2 \ \cdots \ \vv_n]\) whose columns are linearly independent eigenvectors of \(A\) satisfies \(P^{-1}AP = D\text{,}\) where \(D = [d_{ij}]\) is the diagonal matrix with diagonal entries \(d_{ii} = \lambda_i\) for each \(i\text{.}\)
Subsection Examples
What follows are worked examples that use the concepts from this section.
Example 19.6.
Let \(A = \left[ \begin{array}{crr} 1\amp -2\amp 1 \\ 0\amp 3\amp -1 \\ 0\amp -2\amp 2 \end{array} \right]\) and \(B = \left[ \begin{array}{ccc} 1\amp 2\amp 0 \\ 0\amp 1\amp 0 \\ 0\amp 0\amp 4 \end{array} \right]\text{.}\) You should use appropriate technology to calculate determinants, perform any row reductions, or solve any polynomial equations.
(a)
Determine if \(A\) is diagonalizable. If diagonalizable, find a matrix \(P\) that diagonalizes \(A\text{.}\)
Solution.
Technology shows that the characteristic polynomial of \(A\) is
The eigenvalues of \(A\) are the solutions to the characteristic equation \(p(\lambda) = 0\text{.}\) Thus, the eigenvalues of \(A\) are \(1\) and \(4\text{.}\) To find a basis for the eigenspace of \(A\) corresponding to the eigenvalue \(1\text{,}\) we find the general solution to the homogeneous system \((A - I_3)\vx = \vzero\text{.}\) Using technology we see that the reduced row echelon form of \(A - I_3 = \left[ \begin{array}{crr} 0\amp -2\amp 1 \\ 0\amp 2\amp -1 \\ 0\amp -2\amp 1 \end{array} \right]\) is \(\left[ \begin{array}{ccr} 0\amp 1\amp -\frac{1}{2} \\ 0\amp 0\amp 0 \\ 0\amp 0\amp 0 \end{array} \right]\text{.}\) So if \(\vx = \left[ \begin{array}{c} x_1 \\ x_2 \\ x_3 \end{array} \right]\text{,}\) then the general solution to \((A-I_3)\vx = \vzero\) is
So a basis for the eigenspace of \(A\) corresponding to the eigenvalue \(1\) is
To find a basis for the eigenspace of \(A\) corresponding to the eigenvalue \(4\text{,}\) we find the general solution to the homogeneous system \((A - 4I_3)\vx = \vzero\text{.}\) Using technology we see that the reduced row echelon form of \(A - 4I_3 = \left[ \begin{array}{rrr} -3\amp -2\amp 1 \\ 0\amp -1\amp -1 \\ 0\amp -2\amp -2 \end{array} \right]\) is \(\left[ \begin{array}{ccr} 0\amp 1\amp -1 \\ 0\amp 1\amp 1 \\ 0\amp 0\amp 0 \end{array} \right]\text{.}\) So if \(\vx = \left[ \begin{array}{c} x_1 \\ x_2 \\ x_3 \end{array} \right]\text{,}\) then the general solution to \((A-4I_3)\vx = \vzero\) is
So a basis for the eigenspace of \(A\) corresponding to the eigenvalue \(4\) is
Eigenvectors corresponding to different eigenvalues are linearly independent, so the set
is a basis for \(\R^3\text{.}\) Since we can find a basis for \(\R^3\) consisting of eigenvectors of \(A\text{,}\) we conclude that \(A\) is diagonalizable. Letting
gives us
(b)
Determine if \(B\) is diagonalizable. If diagonalizable, find a matrix \(Q\) that diagonalizes \(B\text{.}\)
Solution.
Technology shows that the characteristic polynomial of \(B\) is
The eigenvalues of \(B\) are the solutions to the characteristic equation \(p(\lambda) = 0\text{.}\) Thus, the eigenvalues of \(B\) are \(1\) and \(4\text{.}\) To find a basis for the eigenspace of \(B\) corresponding to the eigenvalue \(1\text{,}\) we find the general solution to the homogeneous system \((B - I_3)\vx = \vzero\text{.}\) Using technology we see that the reduced row echelon form of \(B - I_3 = \left[ \begin{array}{ccc} 0\amp 2\amp 0 \\ 0\amp 0\amp 0 \\ 0\amp 0\amp 3 \end{array} \right]\) is \(\left[ \begin{array}{ccc} 0\amp 1\amp 0 \\ 0\amp 0\amp 1 \\ 0\amp 0\amp 0 \end{array} \right]\text{.}\) So if \(\vx = \left[ \begin{array}{c} x_1 \\ x_2 \\ x_3 \end{array} \right]\text{,}\) then the general solution to \((B-I_3)\vx = \vzero\) is
So a basis for the eigenspace of \(B\) corresponding to the eigenvalue \(1\) is
To find a basis for the eigenspace of \(B\) corresponding to the eigenvalue \(4\text{,}\) we find the general solution to the homogeneous system \((B - 4I_3)\vx = \vzero\text{.}\) Using technology we see that the reduced row echelon form of \(B - 4I_3 = \left[ \begin{array}{rrc} -3\amp 2\amp 0 \\ 0\amp -3\amp 0 \\ 0\amp 0\amp 0 \end{array} \right]\) is \(\left[ \begin{array}{ccc} 1\amp 0\amp 0 \\ 0\amp 1\amp 0 \\ 0\amp 0\amp 0 \end{array} \right]\text{.}\) So if \(\vx = \left[ \begin{array}{c} x_1 \\ x_2 \\ x_3 \end{array} \right]\text{,}\) then the general solution to \((B-4I_3)\vx = \vzero\) is
So a basis for the eigenspace of \(B\) corresponding to the eigenvalue \(4\) is
Since each eigenspace is one-dimensional, we cannot find a basis for \(\R^3\) consisting of eigenvectors of \(B\text{.}\) We conclude that \(B\) is not diagonalizable.
(c)
Is it possible for two matrices \(R\) and \(S\) to have the same eigenvalues with the same algebraic multiplicities, but one matrix is diagonalizable and the other is not? Explain.
Solution.
Yes it is possible for two matrices \(R\) and \(S\) to have the same eigenvalues with the same multiplicities, but one matrix is diagonalizable and the other is not. An example is given by the matrices \(A\) and \(B\) in this problem.
Example 19.7.
(a)
Is it possible to find diagonalizable matrices \(A\) and \(B\) such that \(AB\) is not diagonalizable? If yes, provide an example. If no, explain why.
Solution.
Let \(A = \left[ \begin{array}{cc} 1\amp 1\\0\amp 2 \end{array} \right]\) and \(B = \left[ \begin{array}{cr} 2\amp -2 \\ 0\amp 1 \end{array} \right]\text{.}\) Since \(A\) and \(B\) are both diagonal matrices, their eigenvalues are their diagonal entries. With \(2\) distinct eigenvalues, both \(A\) and \(B\) are diagonalizable. In this case we have \(AB = \left[ \begin{array}{cr} 2\amp -1\\0\amp 2 \end{array} \right]\text{,}\) whose only eigenvector is \(2\text{.}\) The reduced row echelon form of \(AB - 2I_2\) is \(\left[ \begin{array}{cr} 0\amp 1\\0\amp 0 \end{array} \right]\text{.}\) So a basis for the eigenspace of \(AB\) is \(\{[1 \ 0]^{\tr}\}\text{.}\) Since there is no basis for \(\R^2\) consisting of eigenvectors of \(AB\text{,}\) we conclude that \(AB\) is not diagonalizable.
(b)
Is it possible to find diagonalizable matrices \(A\) and \(B\) such that \(A+B\) is not diagonalizable? If yes, provide an example. If no, explain why.
Solution.
Let \(A = \left[ \begin{array}{cc} 1\amp 3\\0\amp 2 \end{array} \right]\) and \(B = \left[ \begin{array}{rr} 2\amp 0 \\ 0\amp 1 \end{array} \right]\text{.}\) Since \(A\) and \(B\) are both diagonal matrices, their eigenvalues are their diagonal entries. With \(2\) distinct eigenvalues, both \(A\) and \(B\) are diagonalizable. In this case we have \(A+B = \left[ \begin{array}{cc} 3\amp 3\\0\amp 3 \end{array} \right]\text{,}\) whose only eigenvector is \(3\text{.}\) The reduced row echelon form of \((A+B) - 3I_2\) is \(\left[ \begin{array}{cc} 0\amp 1\\0\amp 0 \end{array} \right]\text{.}\) So a basis for the eigenspace of \(A+B\) is \(\{[1 \ 0]^{\tr}\}\text{.}\) Since there is no basis for \(\R^2\) consisting of eigenvectors of \(A+B\text{,}\) we conclude that \(A+B\) is not diagonalizable.
(c)
Is it possible to find a diagonalizable matrix \(A\) such that \(A^{\tr}\) is not diagonalizable? If yes, provide an example. If no, explain why.
Solution.
It is not possible to find a diagonalizable matrix \(A\) such that \(A^{\tr}\) is not diagonalizable. To see why, suppose that matrix \(A\) is diagonalizable. That is, there exists a matrix \(P\) such that \(P^{-1}AP = D\text{,}\) where \(D\) is a diagonal matrix. Recall that \(\left(P^{-1}\right)^{\tr} = \left(P^{\tr}\right)^{-1}\text{.}\) So
Letting \(A = \left(P^{\tr}\right)^{-1}\text{,}\) we conclude that
Therefore, \(Q\) diagonalizes \(A^{\tr}\text{.}\)
(d)
Is it possible to find an invertible diagonalizable matrix \(A\) such that \(A^{-1}\) is not diagonalizable? If yes, provide an example. If no, explain why.
Solution.
It is not possible to find an invertible diagonalizable matrix \(A\) such that \(A^{-1}\) is not diagonalizable. To see why, suppose that matrix \(A\) is diagonalizable. That is, there exists a matrix \(P\) such that \(P^{-1}AP = D\text{,}\) where \(D\) is a diagonal matrix. Thus, \(A = PDP^{-1}\text{.}\) Since \(A\) is invertible, \(\det(A) \neq 0\text{.}\) It follows that \(\det(D) \neq 0\text{.}\) So none of the diagonal entries of \(D\) can be \(0\text{.}\) Thus, \(D\) is invertible and \(D^{-1}\) is a diagonal matrix. Then
and so \(P^{-1}\) diagonalizes \(A^{-1}\text{.}\)
Subsection Summary
A matrix \(D = [d_{ij}]\) is a diagonal matrix if \(d_{ij} = 0\) whenever \(i \neq j\text{.}\)
A matrix \(A\) is diagonalizable if there is an invertible matrix \(P\) so that \(P^{-1}AP\) is a diagonal matrix.
-
Two matrices \(A\) and \(B\) are similar if there is an invertible matrix \(P\) so that
\begin{equation*} B = P^{-1}AP\text{.} \end{equation*} Similar matrices have the same determinants, same characteristic polynomials, and same eigenvalues. Note that similar matrices do not necessarily have the same eigenvectors corresponding to the same eigenvalues.
An \(n \times n\) matrix \(A\) is diagonalizable if and only if \(A\) has \(n\) linearly independent eigenvectors.
When an \(n \times n\) matrix \(A\) is diagonalizable, then \(P = [ \vv_1 \ \vv_2 \ \vv_3 \ \cdots \ \vv_n]\) is invertible and \(P^{-1}AP\) is diagonal, where \(\vv_1\text{,}\) \(\vv_2\text{,}\) \(\ldots\text{,}\) \(\vv_n\) are \(n\) linearly independent eigenvectors for \(A\text{.}\)
One use for diagonalization is that once we have diagonalized a matrix \(A\) we can quickly and easily compute powers of \(A\text{.}\) Diagonalization can also help us understand the actions of matrix transformations.
Exercises Exercises
1.
Determine if each of the following matrices is diagonalizable or not. For diagonalizable matrices, clearly identify a matrix \(P\) which diagonalizes the matrix, and what the resulting diagonal matrix is.
(a)
\(A=\left[ \begin{array}{cr} 2\amp -1\\ 1\amp 4 \end{array} \right]\)
(b)
\(A=\left[ \begin{array}{rcr} -1 \amp 4 \amp -2 \\ -3 \amp 4 \amp 0 \\ -3 \amp 1 \amp 3 \end{array} \right]\)
2.
The \(3\times 3\) matrix \(A\) has two eigenvalues \(\lambda_1=2\) and \(\lambda_2=3\text{.}\) The vectors \(\left[ \begin{array}{c} 1\\2\\1 \end{array} \right]\text{,}\) \(\left[ \begin{array}{r} 1\\-1\\2 \end{array} \right]\text{,}\) and \(\left[ \begin{array}{c} 2\\4\\2 \end{array} \right]\) are eigenvectors for \(\lambda_1=2\text{,}\) while the vectors \(\left[ \begin{array}{c} 1\\1\\1 \end{array} \right], \left[ \begin{array}{c} 2\\2\\2 \end{array} \right]\) are eigenvectors for \(\lambda_2=3\text{.}\) Find the matrix \(A\text{.}\)
3.
Find a \(2\times 2\) non-diagonal matrix \(A\) and two different pairs of \(P\) and \(D\) matrices for which \(A=PDP^{-1}\text{.}\)
4.
Find a \(2\times 2\) non-diagonal matrix \(A\) and two different \(P\) matrices for which \(A=PDP^{-1}\) with the same \(D\text{.}\)
5.
Suppose a \(4\times 4\) matrix \(A\) has eigenvalues 2, 3 and 5 and the eigenspace for the eigenvalue 3 has dimension 2. Do we have enough information to determine if \(A\) is diagonalizable? Explain.
6.
Let \(A\) be a diagonalizable \(n \times n\) matrix. Show that \(A\) has \(n\) linearly independent eigenvectors.
7.
(a)
Let \(A = \left[ \begin{array}{cc} 1\amp 1\\0\amp 1 \end{array} \right]\) and \(B = \left[ \begin{array}{cc} 1\amp 2\\0\amp 1 \end{array} \right]\text{.}\) Find the eigenvalues and eigenvectors of \(A\) and \(B\text{.}\) Conclude that it is possible for two different \(n \times n\) matrices \(A\) and \(B\) to have exactly the same eigenvectors and corresponding eigenvalues.
(b)
A natural question to ask is if there are any conditions under which \(n \times n\) matrices that have exactly the same eigenvectors and corresponding eigenvalues must be equal. Determine the answer to this question if \(A\) and \(B\) are both diagonalizable.
8.
(a)
Show that if \(D\) and \(D'\) are \(n \times n\) diagonal matrices, then \(DD' = D'D\text{.}\)
(b)
Show that if \(A\) and \(B\) are \(n \times n\) matrices and \(P\) is an invertible \(n \times n\) matrix such that \(P^{-1}AP = D\) and \(P^{-1}BP = D'\) with \(D\) and \(D'\) diagonal matrices, then \(AB = BA\text{.}\)
9.
Exercise 1 in Section 18 shows that the determinant of a matrix is the product of its eigenvalues. In this exercise we show that the trace of a diagonalizable matrix is the sum of its eigenvalues. 35 First we define the trace of a matrix.
Definition 19.8.
The trace of an \(n \times n\) matrix \(A = [a_{ij}]\) is the sum of the diagonal entries of \(A\text{.}\) That is,
(a)
Show that if \(R = [r_{ij}]\) and \(S = [s_{ij}]\) are \(n \times n\) matrices, then \(\trace(RS) = \trace(SR)\text{.}\)
(b)
Let \(A\) be a diagonalizable \(n \times n\) matrix, and let \(p(\lambda) = \det(A - \lambda I_n)\) be the characteristic polynomial of \(A\text{.}\) Let \(P\) be an invertible matrix such that \(P^{-1}AP = D\text{,}\) where \(D\) is the diagonal matrix whose diagonal entries are \(\lambda_1\text{,}\) \(\lambda_2\text{,}\) \(\ldots\text{,}\) \(\lambda_n\text{,}\) the eigenvalues of \(A\) (note that these eigenvalues may not all be distinct).
(i)
Explain why \(\trace(A) = \trace(D)\text{.}\)
(ii)
Show that the trace of an \(n \times n\) diagonalizable matrix is the sum of the eigenvalues of the matrix.
10.
In this exercise we generalize the result of Exercise 12 in Section 8 to arbitrary diagonalizable matrices.
(a)
Show that if
then
(b)
Now suppose that an \(n \times n\) matrix \(A\) is diagonalizable, with \(P^{-1}AP\) equal to a diagonal matrix \(D\text{.}\) Show that \(e^A = Pe^DP^{-1}\text{.}\)
11.
Let \(A = \left[ \begin{array}{cc} 1\amp 1\\0\amp 0 \end{array} \right]\) and let \(B = \left[ \begin{array}{cr} 0\amp -1 \\ 0\amp 0 \end{array} \right]\text{.}\)
(a)
Use the result of Exercise 10 to calculate \(e^A\text{.}\)
(b)
Calculate \(e^B\text{.}\)
Explain why \(B\) is not diagonalizable.
(c)
Use the result of Exercise 10 to calculate \(e^{A+B}\text{.}\)
(d)
The real exponential function satisfies some familiar properties. For example, \(e^xe^y = e^ye^x\) and \(e^{x+y} = e^x e^y\) for any real numbers \(x\) and \(y\text{.}\) Does the matrix exponential satisfy the corresponding properties. That is, if \(X\) and \(Y\) are \(n \times n\) matrices, must \(e^Xe^Y = e^Ye^X\) and \(e^{X+Y} = e^X e^Y\text{?}\) Explain.
12.
In Exercise 11 we see that we cannot conclude that \(e^{X+Y} = e^X e^Y\) for \(n \times n\) matrices \(X\) and \(Y\text{.}\) However, a more limited property is true.
(a)
Follow the steps indicated to show that if \(A\) is an \(n \times n\) matrix and \(s\) and \(t\) are any scalars, then \(e^{As} e^{At} = e^{A(s+t)}\text{.}\) (Although we will not use it, you may assume that the series for \(e^A\) converges for any square matrix \(A\text{.}\))
(i)
Use the definition to show that
(ii)
Relabel and reorder terms with \(n = k+m\) to show that
(iii)
Complete the problem using the Binomial Theorem that says
(b)
Use the result of part (a) to show that \(e^A\) is an invertible matrix for any \(n \times n\) matrix \(A\text{.}\)
13.
There is an interesting connection between the determinant of a matrix exponential and the trace of the matrix. Let \(A\) be a diagonalizable \(n \times n\) matrix with real entries. Let \(D = P^{-1}AP\) for some invertible matrix \(P\text{,}\) where \(D\) is the diagonal matrix with entries \(\lambda_1\text{,}\) \(\lambda_2\text{,}\) \(\ldots\text{,}\) \(\lambda_n\) the eigenvalues of \(A\text{.}\)
(a)
Show that \(e^A = Pe^DP^{-1}\text{.}\)
(b)
Use Exercise 9 to show that
14.
There is interesting relationship between a matrix and its characteristic equation that we explore in this exercise.
(a)
We first illustrate with an example. Let \(B = \left[ \begin{array}{cr} 1\amp 2\\ 1\amp -2 \end{array} \right]\text{.}\)
(i)
Show that \(\lambda^2 + \lambda - 4\) is the characteristic polynomial for \(B\text{.}\)
(ii)
Calculate \(B^2\text{.}\) Then compute \(B^2 + B - 4I_2\text{.}\) What do you get?
(b)
The first part of this exercise presents an example of a matrix that satisfies its own characteristic equation. Show that if \(A\) is an \(n \times n\) diagonalizable matrix with characteristic polynomial \(p(x)\text{,}\) then \(p(A) = 0\text{.}\) 36 That is, if \(p(x) = a_nx^n+a_{n-1}x^{n-1} + \cdots + a_1x + a_0\text{,}\) then \(p(A) = a_nA^n+a_{n-1}A^{n-1} + \cdots + a_1A + a_0 = 0\text{.}\) (Hint: If \(A = PDP^{-1}\) for some diagonal matrix \(D\text{,}\) show that \(p(A) = Pp(D)P^{-1}\text{.}\) Then determine \(p(D)\text{.}\))
15.
Label each of the following statements as True or False. Provide justification for your response.
(a) True/False.
If matrix \(A\) is diagonalizable, then so is \(A^T\text{.}\)
(b) True/False.
If matrix \(A\) is diagonalizable, then \(A\) is invertible.
(c) True/False.
If an \(n\times n\) matrix \(A\) is diagonalizable, then \(A\) has \(n\) distinct eigenvalues.
(d) True/False.
If matrix \(A\) is invertible and diagonalizable, then so is \(A^{-1}\text{.}\)
(e) True/False.
If an \(n \times n\) matrix \(C\) is diagonalizable, then there exists a basis of \(\R^n\) consisting of the eigenvectors of \(C\text{.}\)
(f) True/False.
An \(n\times n\) matrix with \(n\) distinct eigenvalues is diagonalizable.
(g) True/False.
If \(A\) is an \(n\times n\) diagonalizable matrix, then there is a unique diagonal matrix such that \(P^{-1}AP = D\) for some invertible matrix \(P\text{.}\)
(h) True/False.
If \(A\) is an \(n\times n\) matrix with eigenvalue \(\lambda\text{,}\) then the dimension of the eigenspace of \(A\) corresponding to the eigenvalue \(\lambda\) is \(n - \rank(A - \lambda I_n)\text{.}\)
(i) True/False.
If \(\lambda\) is an eigenvalue of an \(n \times n\) matrix \(A\text{,}\) then \(e^\lambda\) is an eigenvalue of \(e^A\text{.}\) (See Exercise 12 in Section 8 for information on the matrix exponential.)
Subsection Project: Binet's Formula for the Fibonacci Numbers
We return to the Fibonacci sequence \(F_n\) where \(F_{n+2} = F_{n+1} + F_{n}\text{,}\) for \(n \geq 0\text{,}\) \(F_0 = 0\text{,}\) and \(F_1=1\text{.}\) Since \(F_{n+2}\) is determined by previous values \(F_{n+1}\) and \(F_n\text{,}\) the relation \(F_{n+2} = F_{n+1} + F_{n}\) is called a recurrence relation. The recurrence relation \(F_{n+2} = F_{n+1} + F_{n}\) is very time consuming to use to compute \(F_n\) for large values of \(n\text{.}\) It turns out that there is a fascinating formula that gives the \(n\)th term of the Fibonacci sequence directly, without using the relation \(F_{n+2} = F_{n+1} + F_{n}\text{.}\)
Project Activity 19.7.
The recurrence relation\(F_{n+2} = F_{n+1} + F_{n}\) gives the equations
Let \(\vx_{n} = \left[ \begin{array}{c} F_{n+1} \\ F_{n} \end{array} \right]\) for \(n \geq 0\text{.}\) Explain how the equations (19.3) and (19.3) can be described with the matrix equation
where \(A = \left[ \begin{array}{cc} 1\amp 1 \\ 1\amp 0 \end{array} \right]\text{.}\)
The matrix equation (19.5) shows us how to find the vectors \(\vx_n\) using powers of the matrix \(A\text{:}\)
So if we can somehow easily find the powers of the matrix \(A\text{,}\) then we can find a convenient formula for \(F_n\text{.}\) As we have seen, we know how to do this if \(A\) is diagonalizable
Project Activity 19.8.
Let \(A = \left[ \begin{array}{cc} 1\amp 1 \\ 1\amp 0 \end{array} \right]\text{.}\)
(a)
Show that the eigenvalues of \(A\) are \(\varphi = \frac{1 + \sqrt{5}}{2}\) and \(\overline{\varphi} = \frac{1 - \sqrt{5}}{2}\text{.}\)
(b)
Find bases for each eigenspace of \(A\text{.}\)
Now that we have the eigenvalues and know corresponding eigenvectors for \(A\text{,}\) we can return to the problem of diagonalizing \(A\text{.}\)
Project Activity 19.9.
(a)
Why do we know that \(A\) is diagonalizable?
(b)
Find a matrix \(P\) such that \(P^{-1}AP\) is a diagonal matrix. What is the diagonal matrix?
Now we can find a formula for the \(n\)th Fibonacci number.
Project Activity 19.10.
Since \(P^{-1}AP = D\text{,}\) where \(D\) is a diagonal matrix, we also have \(A = PDP^{-1}\text{.}\) Recall that when \(A = PDP^{-1}\text{,}\) it follows that \(A^n = PD^nP^{-1}\text{.}\) Use the equation \(A^n = PD^nP^{-1}\) to show that
We just need to calculate the second component of \(A^n \vx_0\text{.}\)
Formula (19.6) is called Binet's formula. It is a very surprising formula in the fact that the expression on the right hand side of (19.6) is an integer for each positive integer \(n\text{.}\) Note that with Binet's formula we can quickly compute \(F_n\) for very large values of \(n\text{.}\) For example,
The number \(\varphi = \frac{1+\sqrt{5}}{2}\text{,}\) called the golden mean or golden ratio is intimately related to the Fibonacci sequence. Binet's formula provides a fascinating relationship between the Fibonacci numbers and the golden ratio. The golden ratio also occurs often in other areas of mathematics. It was an important number to the ancient Greek mathematicians who felt that the most aesthetically pleasing rectangles had sides in the ratio of \(\varphi:1\text{.}\)
Project Activity 19.11.
You might wonder what happens if we use negative integer exponents in Binet's formula. In other words, are there negatively indexed Fibonacci numbers? For any integer \(n\text{,}\) including negative integers, let
There is a specific relationship between \(F_{-n}\) and \(F_n\text{.}\) Find it and verify it.