Patterns, WIP
ETA: About a month. This post was delayed due to personal reasons and COVID19. Sorry.
I recently solved an interesting albeit fundamental problem which made me realize (again) that patterns in mathematics can be found pretty much anywhere.
I hope you are familiar with basic linear algebra and understand what a matrix is and how matrix multiplication, inverse and dot products are calculated and finally know about the determinant of a matrix. If not – just brush up it up on Khan Academy, MIT OCW or literally any linear algebra textbook.
Let’s move ahead with the question:
- $M$ is not diagonalizable.
- $M$ is idempotent.
- $M$ is nilpotent.
- The minimal polynomial and the characteristic polynomial of $M$ are different.
Click here for the solution if you have tried to solve it yourself.
The minimal polynomial and the characteristic polynomial of $M$ are different.Before we delve into the question and solve it, let’s look especially on these five terms: (1) Diagonalization (2) Idempotent (3) Nilpotent (4) Characteristic Polynomial of a matrix and (5) Minimal polynomial a matrix and how it differs from the characteristic polynomial.
Diagonalization
To understand diagonalization, we need to understand what eigenvalues or more precisely what eigenvectors do?
Eigenvectors and Eigenvalues
Eigenvectors are quite difficult to comprehend if you do not understand what they do. I mean why are we interested in finding them?
3b1b does an excellent job of explaining them in his video at Youtube – please watch this as I’ll explain it very briefly below.
You tried a lot of different transforms on your matrix and found out that a particular1 transform ($\mathbf{v}$) when applied (in this context you multiplied your matrix with the transform a.k.a vector) makes your original matrix ($A_{n \times n} = A$) change exactly the same it would if multiplied by a scalar ($\lambda$). $$ A \mathbf{v} = \lambda \mathbf{v} $$ Looking at it geometrically, we see that the transformed vector we get must lie on the same line of the original vector as it is just being magnified or shrunk exactly as if it is being multiplied by a scalar.
You can zoom in or out, play with with the axis in the above graph and verify it yourself.
That transform is the eigenvector and the magnitude of change is the eigenvalue. This is particularly interesting because that transform which is a vector is working exactly similar to a scalar number.
We can perform a few manipulations on this equation to get: $$ \begin{align} A \mathbf{v} = \lambda \mathbf{v} \\ \\ A \mathbf{v} - \lambda \mathbf{v} = 0 \\ \\ A \mathbf{v} - \lambda\mathbf{I}\mathbf{v} = 0 \\ \\ (A - \lambda \mathbf{I}) \mathbf{v} = 0 \end{align} $$
( Here $\mathbf{I}_{n\times n} = \mathbf{I}$ is the multiplicative indentity in the space where $A$ exists thus $A \mathbf{I} = A $)
We can solve this by either making $\det | A - \lambda \mathbf{I} | = 0$ or $\mathbf{v} =0 $. The later is not allowed (and obviously not useful — I mean it's trivial ) because we had defined the linear transform be a nonzero vector. Thus we are left to solve for $\det | A - \lambda \mathbf{I} | = 0$. The polynomial (of the variable $\lambda$ — the indeterminate) which we get after simplifying $\det | A - \lambda \mathbf{I} | = 0$ is called the characteristic polynomial of $A$.
Let's solve a simple question on this to understand it better.
Note: I took this example as it is relevant to the question that we are solving — the answer might not be surprising for you but just keep reading as it the beginning of an interesting pattern.
Answer.
Let's start with the equation $(A - \lambda \mathbf{I}) \mathbf{v} = 0$ solving which gives $\det | A - \lambda \mathbf{I} | = 0$. We are given $A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \\ \end{bmatrix} $. $$ \begin{gather} \det | A - \lambda \mathbf{I} | = \det \bigg( \begin{bmatrix} 1 & 1 \\ 1 & 1 \\ \end{bmatrix} - \lambda \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ \end{bmatrix} \bigg) \\ \\ = \det \bigg( \begin{bmatrix} 1 & 1 \\ 1 & 1 \\ \end{bmatrix} - \begin{bmatrix} \lambda & 0 \\ 0 & \lambda \\ \end{bmatrix} \bigg) \\ \\ = \bigg| \begin{matrix} 1 - \lambda & 1 \\ 1 & 1 - \lambda \\ \end{matrix} \bigg| \\ \\ = (1-\lambda)^2 - 1 \\ \\ = 1 - 2 \lambda + \lambda^2 - 1 \\ \\ = \lambda^2 - 2\lambda \\ \\ = \lambda(\lambda - 2) \end{gather} $$ The eigenvalues of $A$ are the solution to the polynomial $\lambda(\lambda - 2) = 0 \implies \lambda_1 = 0, \lambda_2 = 2$.Phew, finally we got them! So, what are the eigenvectors now? Just substitute the eigenvalues in the initial equation to find $\mathbf{v}$.
We have (substitute $\lambda_2 = 2 = \lambda$) and also let $\mathbf{v} = \begin{bmatrix} x_1 \\ x_2 \\ \end{bmatrix}$: (for clarity, I'm skipping $=$ signs in each step) $$ \begin{gather} A \mathbf{v} = \lambda \mathbf{v}\\ \\ A \mathbf{v} = 2\mathbf{v}\\ \\ A \begin{bmatrix} x_1 \\ x_2 \\ \end{bmatrix} = 2 \begin{bmatrix} x_1 \\ x_2 \\ \end{bmatrix}\\ \\ A \begin{bmatrix} x_1 \\ x_2 \\ \end{bmatrix} = \begin{bmatrix} 2x_1 \\ 2x_2 \\ \end{bmatrix}\\ \\ \begin{bmatrix} 1 & 1 \\ 1 & 1 \\ \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \end{bmatrix} = \begin{bmatrix} 2x_1 \\ 2x_2 \\ \end{bmatrix}\\ \\ \begin{bmatrix} x_1 + x_2 \\ x_1 + x_2 \\ \end{bmatrix} = \begin{bmatrix} 2x_1 \\ 2x_2 \\ \end{bmatrix}\\ \\ \end{gather} $$ Comparing we get, $x_1 + x_2 = 2x_1$ and $x_1 + x_2 = 2x_2 \implies x_2 = x_1$ and $x_1 = x_2$. Thus, $$ \mathbf{v} = \begin{bmatrix} x_1 \\ x_1\\ \end{bmatrix} \text{or} \begin{bmatrix} x_2 \\ x_2 \\ \end{bmatrix} $$ What does this mean? It means while there are infinitely many solution vectors of $A \mathbf{v} = 2 \mathbf{v} $ they all need to satisfy $x_1 = x_2$. We can represent all the solutions of the equation by parametrizing relation using $p$ which is any real number. $$ \begin{bmatrix} p \\ p \\ \end{bmatrix} = p \begin{bmatrix} 1 \\ 1 \\ \end{bmatrix} $$ So, the nonzero vectors $\mathbf{v}$ that satisfy $A \mathbf{v} = 2 \mathbf{v}$ are called eigenvectors associated with the eigenvalue $\lambda = 2$. For example making $p = 1$ gives us the eigenvector: $$ u_1 = \begin{bmatrix} 1 \\ 1 \\ \end{bmatrix} $$ All other eigenvectors are just scalar multiples of this eigenvector, $u_1$. Let's check out $4 u_1 = 4 \begin{bmatrix} 1 \\ 1 \\ \end{bmatrix} = \begin{bmatrix} 4 \\ 4 \\ \end{bmatrix}$ Does it satisfy $A \mathbf{v} = 2 \mathbf{v}$? $$ \begin{gather} A \mathbf{v} = 2 \mathbf{v} \\ \\ A \begin{bmatrix} 4 \\ 4 \\ \end{bmatrix} = 2 \begin{bmatrix} 4 \\ 4 \\ \end{bmatrix} \\ \\ \begin{bmatrix} 1 & 1 \\ 1 & 1 \\ \end{bmatrix} \begin{bmatrix} 4 \\ 4 \\ \end{bmatrix} = 2 \begin{bmatrix} 4 \\ 4 \\ \end{bmatrix} \\ \\ \begin{bmatrix} 4 + 4\\ 4 + 4\\ \end{bmatrix} = \begin{bmatrix} 2 \times 4 \\ 2 \times 4 \\ \end{bmatrix} \\ \\ \begin{bmatrix} 8 \\ 8 \\ \end{bmatrix} = \begin{bmatrix} 8 \\ 8 \\ \end{bmatrix} \end{gather} $$ Well, of course it does!
Let's recap the most important points we found:
- The transform ($\mathbf{v}$) when applied makes your original matrix ($A_{n \times n} = A$) change exactly the same it would if multiplied by a scalar ($\lambda$). $$A \mathbf{v} = \lambda \mathbf{v} $$
- The transformed vector we get, must lie on the same line of the original vector as it is just being magnified or shrunk exactly as if it is being multiplied by a scalar.
- The polynomial (of the variable $\lambda$ ) which we get after simplifying $\det | A - \lambda \mathbf{I} | = 0$ is called the characteristic polynomial of $A$.
- When talking about eigenvectors you must specify the domain (field) you're in. Every complex matrix has at least one eigenvector and every matrix has at least one eigenvalue — think this way, the eigenvalues are the roots of the characteristics polynomial. Does every polynomial have roots?2
You could find the eigenvalues and eigenvectors using scipy in Python like,
import numpy as np
import scipy.linalg as la
A = np.array([[1,1],[1,1]])
eigvals, eigvecs = la.eig(A)
print(eigvals.real)
# [2. 0.]
print(eigvecs)
#array([[ 0.70710678, -0.70710678],
# [ 0.70710678, 0.70710678]])
As I don’t need to cover this deeply to help solve the problem I suggest you refer the Brilliant wiki, Wikipedia or any good linear algebra textbook such as Introduction to Linear Algebra by Gilbert Strang or Linear Algebra Done Right by Sheldon Axler, etc for a more systematic and rigorous read on eigenvectors and eigenvalues.
Diagonalization
WIP
Idempotent Matrices
Let us define what an idempotent matrix is or more specifically what idempotence is all about.
Similarly we can define an idempotent matrix as follows:
Now is a $3\times3$ matrix whose all entries are $1$, idempotent?
….which is equal to $3A$. As $A \times A = 3A $ for the matrix, $A$ above, we can rule out the second option (2) from our question, which is M is idempotent, is false. It is definitely not idempotent.
But wait, are we onto something? If $A_{3 \times 3} \times A_{3 \times 3} = 3 A_{3 \times 3} $, is is true for $A_{4 \times 4}, A_{5 \times 5} \dots A_{n \times n} $ also?
Is $A_{n \times n} \times A_{n \times n} = n A $?
And a $12 \times 12$ matrix whose all are entries are 1 is – you can easily visualize this as I really didn’t want to type such a huge matrix.
Nilpotent Matrices
Stiching Up Everything
work in progress
Conclusion
lol, u suck
References
This post couldn’t be written without the help of 3b1b’s YouTube Channel, Wikipedia pages on Linear Transformations, Idempotent matrices, nilpotent matrices, vectorplot.js library by ?? add citations, more refs
Footnotes
-
A matrix can have infinite eigenvectors as you can keep multiplying scalars to $\mathbf{v}$ to get another eigenvector. For example, if $\mathbf{v}$ is a eigenvector, $2\mathbf{v}, 3\mathbf{v}, 4\mathbf{v} \ldots$ are all eigenvectors. Interestingly, there can be only at most $n$ (order of matrix) eigenvalues. Why? Think eigenvalues are roots of the polynomial $\det | A - \lambda \mathbf{I} | = 0$ and how many roots can a $n$-degree polynomial have? ↩
-
Yes, you can easily prove it by the fundamental theorem of algebra. ↩
-
You can’t multiply a $n\times m$ matrix by itself – matrix multiplication is illegal as $n \neq m$, it’s only possible if $n = m$. ↩