Patterns, WIP

ETA: About a month. This post was delayed due to personal reasons and COVID19. Sorry.

I recently solved an interesting albeit fundamental problem which made me realize (again) that patterns in mathematics can be found pretty much anywhere.

I hope you are familiar with basic linear algebra and understand what a matrix is and how matrix multiplication, inverse and dot products are calculated and finally know about the determinant of a matrix. If not – just brush up it up on Khan Academy, MIT OCW or literally any linear algebra textbook.

Let’s move ahead with the question:

$M$ is a $12 \times 12 $ matrix whose all entries are $1$, then which one is correct?

$M$ is not diagonalizable.
$M$ is idempotent.
$M$ is nilpotent.
The minimal polynomial and the characteristic polynomial of $M$ are different.

Click here for the solution if you have tried to solve it yourself.

The minimal polynomial and the characteristic polynomial of $M$ are different.

Before we delve into the question and solve it, let’s look especially on these five terms: (1) Diagonalization (2) Idempotent (3) Nilpotent (4) Characteristic Polynomial of a matrix and (5) Minimal polynomial a matrix and how it differs from the characteristic polynomial.

Diagonalization

To understand diagonalization, we need to understand what eigenvalues or more precisely what eigenvectors do?

Eigenvectors and Eigenvalues

Eigenvectors are quite difficult to comprehend if you do not understand what they do. I mean why are we interested in finding them?

3b1b does an excellent job of explaining them in his video at Youtube – please watch this as I’ll explain it very briefly below.

Definition 1.3. An eigenvector or characteristic vector of a linear transformation is a nonzero vector that changes at most by a scalar factor when that linear transformation is applied to it.

Here's an easy definition: You got a matrix of $ n \times n $ which you want to transform or make it do something like rotate it $180^\circ$, maginify it by a factor of $2$ or just move it to another place in the plane. Think a trasform as a function taking in a vector/matrix and spitting out another. The function subjects the input to a set of rules you define to create the output.

You tried a lot of different transforms on your matrix and found out that a particular¹ transform ($\mathbf{v}$) when applied (in this context you multiplied your matrix with the transform a.k.a vector) makes your original matrix ($A_{n \times n} = A$) change exactly the same it would if multiplied by a scalar ($\lambda$). $$ A \mathbf{v} = \lambda \mathbf{v} $$ Looking at it geometrically, we see that the transformed vector we get must lie on the same line of the original vector as it is just being magnified or shrunk exactly as if it is being multiplied by a scalar.

Here, $\mathbf{v_1}$, $\mathbf{v_2}$ are eigenvetors whereas $\mathbf{v_3}$ is not an eigenvector of the original vector $\mathbf{v_{0}} = \begin{bmatrix} 1 \\ 1 \\ \end{bmatrix} $ (black).
You can zoom in or out, play with with the axis in the above graph and verify it yourself.

That transform is the eigenvector and the magnitude of change is the eigenvalue. This is particularly interesting because that transform which is a vector is working exactly similar to a scalar number.

We can perform a few manipulations on this equation to get: $$ \begin{align} A \mathbf{v} = \lambda \mathbf{v} \\ \\ A \mathbf{v} - \lambda \mathbf{v} = 0 \\ \\ A \mathbf{v} - \lambda\mathbf{I}\mathbf{v} = 0 \\ \\ (A - \lambda \mathbf{I}) \mathbf{v} = 0 \end{align} $$
( Here $\mathbf{I}_{n\times n} = \mathbf{I}$ is the multiplicative indentity in the space where $A$ exists thus $A \mathbf{I} = A $)

We can solve this by either making $\det | A - \lambda \mathbf{I} | = 0$ or $\mathbf{v} =0 $. The later is not allowed (and obviously not useful — I mean it's trivial ) because we had defined the linear transform be a nonzero vector. Thus we are left to solve for $\det | A - \lambda \mathbf{I} | = 0$. The polynomial (of the variable $\lambda$ — the indeterminate) which we get after simplifying $\det | A - \lambda \mathbf{I} | = 0$ is called the characteristic polynomial of $A$.

Let's solve a simple question on this to understand it better.

Example 1. Find the eigenvectors and eigenvalues of the matrix $A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \\ \end{bmatrix} $.

Note: I took this example as it is relevant to the question that we are solving — the answer might not be surprising for you but just keep reading as it the beginning of an interesting pattern.

Answer.

Let's start with the equation $(A - \lambda \mathbf{I}) \mathbf{v} = 0$ solving which gives $\det | A - \lambda \mathbf{I} | = 0$. We are given $A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \\ \end{bmatrix} $. $$ \begin{gather} \det | A - \lambda \mathbf{I} | = \det \bigg( \begin{bmatrix} 1 & 1 \\ 1 & 1 \\ \end{bmatrix} - \lambda \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ \end{bmatrix} \bigg) \\ \\ = \det \bigg( \begin{bmatrix} 1 & 1 \\ 1 & 1 \\ \end{bmatrix} - \begin{bmatrix} \lambda & 0 \\ 0 & \lambda \\ \end{bmatrix} \bigg) \\ \\ = \bigg| \begin{matrix} 1 - \lambda & 1 \\ 1 & 1 - \lambda \\ \end{matrix} \bigg| \\ \\ = (1-\lambda)^2 - 1 \\ \\ = 1 - 2 \lambda + \lambda^2 - 1 \\ \\ = \lambda^2 - 2\lambda \\ \\ = \lambda(\lambda - 2) \end{gather} $$ The eigenvalues of $A$ are the solution to the polynomial $\lambda(\lambda - 2) = 0 \implies \lambda_1 = 0, \lambda_2 = 2$.
Phew, finally we got them! So, what are the eigenvectors now? Just substitute the eigenvalues in the initial equation to find $\mathbf{v}$.

We have (substitute $\lambda_2 = 2 = \lambda$) and also let $\mathbf{v} = \begin{bmatrix} x_1 \\ x_2 \\ \end{bmatrix}$: (for clarity, I'm skipping $=$ signs in each step) $$ \begin{gather} A \mathbf{v} = \lambda \mathbf{v}\\ \\ A \mathbf{v} = 2\mathbf{v}\\ \\ A \begin{bmatrix} x_1 \\ x_2 \\ \end{bmatrix} = 2 \begin{bmatrix} x_1 \\ x_2 \\ \end{bmatrix}\\ \\ A \begin{bmatrix} x_1 \\ x_2 \\ \end{bmatrix} = \begin{bmatrix} 2x_1 \\ 2x_2 \\ \end{bmatrix}\\ \\ \begin{bmatrix} 1 & 1 \\ 1 & 1 \\ \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \end{bmatrix} = \begin{bmatrix} 2x_1 \\ 2x_2 \\ \end{bmatrix}\\ \\ \begin{bmatrix} x_1 + x_2 \\ x_1 + x_2 \\ \end{bmatrix} = \begin{bmatrix} 2x_1 \\ 2x_2 \\ \end{bmatrix}\\ \\ \end{gather} $$ Comparing we get, $x_1 + x_2 = 2x_1$ and $x_1 + x_2 = 2x_2 \implies x_2 = x_1$ and $x_1 = x_2$. Thus, $$ \mathbf{v} = \begin{bmatrix} x_1 \\ x_1\\ \end{bmatrix} \text{or} \begin{bmatrix} x_2 \\ x_2 \\ \end{bmatrix} $$ What does this mean? It means while there are infinitely many solution vectors of $A \mathbf{v} = 2 \mathbf{v} $ they all need to satisfy $x_1 = x_2$. We can represent all the solutions of the equation by parametrizing relation using $p$ which is any real number. $$ \begin{bmatrix} p \\ p \\ \end{bmatrix} = p \begin{bmatrix} 1 \\ 1 \\ \end{bmatrix} $$ So, the nonzero vectors $\mathbf{v}$ that satisfy $A \mathbf{v} = 2 \mathbf{v}$ are called eigenvectors associated with the eigenvalue $\lambda = 2$. For example making $p = 1$ gives us the eigenvector: $$ u_1 = \begin{bmatrix} 1 \\ 1 \\ \end{bmatrix} $$ All other eigenvectors are just scalar multiples of this eigenvector, $u_1$. Let's check out $4 u_1 = 4 \begin{bmatrix} 1 \\ 1 \\ \end{bmatrix} = \begin{bmatrix} 4 \\ 4 \\ \end{bmatrix}$ Does it satisfy $A \mathbf{v} = 2 \mathbf{v}$? $$ \begin{gather} A \mathbf{v} = 2 \mathbf{v} \\ \\ A \begin{bmatrix} 4 \\ 4 \\ \end{bmatrix} = 2 \begin{bmatrix} 4 \\ 4 \\ \end{bmatrix} \\ \\ \begin{bmatrix} 1 & 1 \\ 1 & 1 \\ \end{bmatrix} \begin{bmatrix} 4 \\ 4 \\ \end{bmatrix} = 2 \begin{bmatrix} 4 \\ 4 \\ \end{bmatrix} \\ \\ \begin{bmatrix} 4 + 4\\ 4 + 4\\ \end{bmatrix} = \begin{bmatrix} 2 \times 4 \\ 2 \times 4 \\ \end{bmatrix} \\ \\ \begin{bmatrix} 8 \\ 8 \\ \end{bmatrix} = \begin{bmatrix} 8 \\ 8 \\ \end{bmatrix} \end{gather} $$ Well, of course it does!

Key Takeouts.
Let's recap the most important points we found:

The transform ($\mathbf{v}$) when applied makes your original matrix ($A_{n \times n} = A$) change exactly the same it would if multiplied by a scalar ($\lambda$). $$A \mathbf{v} = \lambda \mathbf{v} $$
The transformed vector we get, must lie on the same line of the original vector as it is just being magnified or shrunk exactly as if it is being multiplied by a scalar.
The polynomial (of the variable $\lambda$ ) which we get after simplifying $\det | A - \lambda \mathbf{I} | = 0$ is called the characteristic polynomial of $A$.
When talking about eigenvectors you must specify the domain (field) you're in. Every complex matrix has at least one eigenvector and every matrix has at least one eigenvalue — think this way, the eigenvalues are the roots of the characteristics polynomial. Does every polynomial have roots?²

You could find the eigenvalues and eigenvectors using scipy in Python like,

import numpy as np
import scipy.linalg as la 

A = np.array([[1,1],[1,1]])

eigvals, eigvecs = la.eig(A)

print(eigvals.real)
# [2. 0.]

print(eigvecs)
#array([[ 0.70710678, -0.70710678],
#       [ 0.70710678,  0.70710678]])

As I don’t need to cover this deeply to help solve the problem I suggest you refer the Brilliant wiki, Wikipedia or any good linear algebra textbook such as Introduction to Linear Algebra by Gilbert Strang or Linear Algebra Done Right by Sheldon Axler, etc for a more systematic and rigorous read on eigenvectors and eigenvalues.

Diagonalization

WIP

Idempotent Matrices

Let us define what an idempotent matrix is or more specifically what idempotence is all about.

Definition 1.1. An element $x$ of a magma $(M, •)$ is said to be idempotent if: $$x • x = x$$ A magma is a set closed under one binary operation which means that given two elements $a, b \in M$, then $a • b \in M$. Note that $•$ can be any binary operation such as $ +, -$, etc.

Simply, if you take two numbers and multiply, add, subtract (any binary operation) you will get the same number again. Eg, $0 + 0 = 0, 1 \times 1 = 1, 1 \div 1 = 1$, etc.

Similarly we can define an idempotent matrix as follows:

Definition 1.2. A $n \times n $ matrix $A$ (the matrix must be a square matrix, why?³ ) is said to be idempotent if multiplied by itself, yields itself. $$A \cdot A = A^2 = A $$

Now is a $3\times3$ matrix whose all entries are $1$, idempotent?

$\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \times \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 3 & 3 & 3 \\ 3 & 3 & 3 \\ 3 & 3 & 3 \end{bmatrix}$

….which is equal to $3A$. As $A \times A = 3A $ for the matrix, $A$ above, we can rule out the second option (2) from our question, which is M is idempotent, is false. It is definitely not idempotent.

But wait, are we onto something? If $A_{3 \times 3} \times A_{3 \times 3} = 3 A_{3 \times 3} $, is is true for $A_{4 \times 4}, A_{5 \times 5} \dots A_{n \times n} $ also?

Is $A_{n \times n} \times A_{n \times n} = n A $?

And a $12 \times 12$ matrix whose all are entries are 1 is – you can easily visualize this as I really didn’t want to type such a huge matrix.

Tip: Whenever you see a question which has a huge matrix ($n \geq 6 $), it's definitely a question that's more theoretical and requires generalization than a brute force approach. I mean why would anyone work with a $12 \times 12 $ matrix? Madness. Don't fall into that.

Nilpotent Matrices

Stiching Up Everything

work in progress

Conclusion

lol, u suck

References

This post couldn’t be written without the help of 3b1b’s YouTube Channel, Wikipedia pages on Linear Transformations, Idempotent matrices, nilpotent matrices, vectorplot.js library by ?? add citations, more refs

Footnotes

A matrix can have infinite eigenvectors as you can keep multiplying scalars to $\mathbf{v}$ to get another eigenvector. For example, if $\mathbf{v}$ is a eigenvector, $2\mathbf{v}, 3\mathbf{v}, 4\mathbf{v} \ldots$ are all eigenvectors. Interestingly, there can be only at most $n$ (order of matrix) eigenvalues. Why? Think eigenvalues are roots of the polynomial $\det | A - \lambda \mathbf{I} | = 0$ and how many roots can a $n$-degree polynomial have? ↩
Yes, you can easily prove it by the fundamental theorem of algebra. ↩
You can’t multiply a $n\times m$ matrix by itself – matrix multiplication is illegal as $n \neq m$, it’s only possible if $n = m$. ↩