biVector forum

Eigenvectors from Eigenvalues

There’s a newly published result relating the coordinates of the eigenvectors of a (hermitian or symmetric) linear operator to the eigenvalues of the operator and its minors that has been getting some press lately:

Quanta Magazine: https://www.quantamagazine.org/neutrinos-lead-to-unexpected-discovery-in-basic-math-20191113/

arXiv: https://arxiv.org/abs/1908.03795

Terry Tao gives a nice “geometric” proof on his blog that might be interesting to readers here (see “2 a geometric proof”): https://terrytao.wordpress.com/2019/08/13/eigenvectors-from-eigenvalues/

Working through the details of the proof is a nice exercise in the use of the outermorphism of a linear operator, somewhat similar in flavor to Hestenes and Sobczyk’s proof of the Cayley-Hamilton theorem in Chapter 3 of “Clifford Algebra to Geometric Calculus”. Good practice if you’re interested in this kind of thing.

2 Likes

The origin of this particular saga was a few months ago on reddit: https://reddit.com/ci665j/ https://reddit.com/ckuer9/ https://reddit.com/cq3en0/

That is cool!

Recently I found a new expression for calculating the eigenvectors of a real symmetric matrix from its eigenvalues in Geometric Algebra, using just the outer product! Basically, the only thing you need to do is take the meet of n - 1 columns of a singular matrix. More details can be found in my paper:

http://vixra.org/abs/1911.0127

It works well for small matrices in practice. I am not sure how it is connected to the result mentioned in this thread, but I would like to find that connection.

2 Likes

I would like someone to rewrite Terry Tao’s proof in more detail.

1 Like

Indeed, the two results are connected: the second geometric proof in Tao’s blog post seems to be pretty much the same thing as what you are describing:

Yeah, I think so. We (especially I) should make the connection explicit. Working out the details and providing an explicit derivation would be best, and instructive for all of us.

2 Likes

I have worked out Tao’s geometric proof in detail and rewritten it so that anybody can follow what is going on.

WARNING: I go very slowly in order to be clear and explicit (sometimes pedantic). Skip the parts you feel are too slow.

NOTE 1: Read Tao’s post first (https://terrytao.wordpress.com/2019/08/13/eigenvectors-from-eigenvalues/) and then read this extended version of the proof.

NOTE 2: If you find errors, please point them out (especially with signs). Do the same for anything that is not clear.

First, I think it is much simpler to work with a small example. I am going to use 4 \times 4 matrices, the vector space R^4, and the geometric algebra with signature (4,0,0).

Let A be a full-rank real symmetric 4 \times 4 matrix. Assume A has 4 distinct real eigenvalues, denoted \lambda_i, with 4 corresponding orthonormal eigenvectors, denoted v_i.

The characteristic equation is (A - \lambda_i I) v_i = 0, with \det(A - \lambda_i I) = 0. In geometric algebra terms the determinant of A is the pseudoscalar coefficient of the outer product of the columns a_i of A, i.e. a_1 \wedge a_2 \wedge a_3 \wedge a_4 = \det(A)\, e_1 \wedge e_2 \wedge e_3 \wedge e_4. The characteristic outer product is:

(a_1 - \lambda_i e_1) \wedge (a_2 - \lambda_i e_2) \wedge (a_3 - \lambda_i e_3) \wedge (a_4 - \lambda_i e_4) = 0

The parallel between the characteristic outer product and the characteristic polynomial continues as follows:

(a_1 - \lambda_i e_1) \wedge (a_2 - \lambda_i e_2) \wedge (a_3 - \lambda_i e_3) \wedge (a_4 - \lambda_i e_4) = \det(A - \lambda_i I)\, e_1 \wedge e_2 \wedge e_3 \wedge e_4

which will be used in Tao’s proof.
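If you want to check this numerically, here is a small NumPy sketch (not from Tao’s post; the random 4 \times 4 symmetric matrix is just an example). It computes the pseudoscalar coefficient of the characteristic outer product by a brute-force Leibniz (full antisymmetrization) sum and compares it with \det(A - \lambda_i I), which vanishes when \lambda_i is an eigenvalue:

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2                       # a random real symmetric 4x4 matrix
lam = np.linalg.eigvalsh(A)[0]          # pick one eigenvalue of A

M = A - lam * np.eye(4)                 # its columns are a_j - lam e_j

def wedge4_coeff(cols):
    """Coefficient of e_1^e_2^e_3^e_4 in the wedge of the four columns,
    via the brute-force Leibniz (full antisymmetrization) sum."""
    total = 0.0
    for perm in itertools.permutations(range(4)):
        sign = 1
        for i in range(4):
            for j in range(i + 1, 4):
                if perm[i] > perm[j]:
                    sign = -sign        # permutation parity via inversions
        term = 1.0
        for i in range(4):
            term *= cols[i, perm[i]]
        total += sign * term
    return total

coeff = wedge4_coeff(M)
# coeff equals det(A - lam I), and is ~0 because lam is an eigenvalue
print(coeff, np.linalg.det(M))
```

Both printed numbers agree and are zero up to floating-point error.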

Since the eigenvectors form an orthonormal basis, we can define (n-1)-vectors on the eigenbasis of the form w_i = v_1 \wedge \dots \wedge v_{i-1} \wedge v_{i+1} \wedge \dots \wedge v_n. In our case those are:

w_4 = v_1 \wedge v_2 \wedge v_3
w_3 = v_1 \wedge v_2 \wedge v_4
w_2 = v_1 \wedge v_3 \wedge v_4
w_1 = v_2 \wedge v_3 \wedge v_4

Applying the function (A - \lambda_4 I) to the 3-vector w_4 we get:

(A - \lambda_4 I) w_4 = (A - \lambda_4 I) v_1 \wedge (A - \lambda_4 I) v_2 \wedge (A - \lambda_4 I)v_3
(A - \lambda_4 I) w_4 = (\lambda_1 v_1 - \lambda_4 v_1) \wedge (\lambda_2 v_2 - \lambda_4 v_2) \wedge (\lambda_3 v_3 - \lambda_4 v_3)
(A - \lambda_4 I) w_4 = (\lambda_1 - \lambda_4)v_1 \wedge (\lambda_2 - \lambda_4)v_2 \wedge (\lambda_3 - \lambda_4)v_3
(A - \lambda_4 I) w_4 = \left[\prod_{k=1, k\ne4}^4 (\lambda_k - \lambda_4)\right] v_1 \wedge v_2 \wedge v_3
(A - \lambda_4 I) w_4 = \left[\prod_{k=1, k\ne4}^4 (\lambda_k - \lambda_4)\right] w_4

So w_4 is an eigenblade (by outermorphism) as expected. Now, if we apply the same function (A - \lambda_4 I) to w_1, w_2 or w_3 we get:

(A - \lambda_4 I) w_1 = \left[\prod_{k=1, k\ne1}^4 (\lambda_k - \lambda_4)\right] w_1 = 0
(A - \lambda_4 I) w_2 = \left[\prod_{k=1, k\ne2}^4 (\lambda_k - \lambda_4)\right] w_2 = 0
(A - \lambda_4 I) w_3 = \left[\prod_{k=1, k\ne3}^4 (\lambda_k - \lambda_4)\right] w_3 = 0

That is why Tao said this is “rank one”. So generalizing:

(A - \lambda_i I) w_i = \left[\prod_{k=1, k\ne i}^4 (\lambda_k - \lambda_i)\right] w_i
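This eigenblade relation is easy to verify numerically. A sketch in NumPy (the helper is my own; a 3-vector in R^4 is represented by its four 3 \times 3 minors, and a consistent per-component sign convention drops out of the proportionality check):

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2                       # random real symmetric 4x4 matrix
lam, v = np.linalg.eigh(A)              # v[:, i] is the eigenvector for lam[i]

def wedge3(u1, u2, u3):
    """Coordinates of the 3-vector u1^u2^u3 in R^4: the four 3x3 minors
    of the 4x3 matrix [u1 u2 u3]."""
    cols = np.column_stack([u1, u2, u3])
    return np.array([np.linalg.det(np.delete(cols, j, axis=0))
                     for j in range(4)])

S = A - lam[3] * np.eye(4)              # the map (A - lam_4 I)
w4 = wedge3(v[:, 0], v[:, 1], v[:, 2])  # w_4 = v_1 ^ v_2 ^ v_3
Sw4 = wedge3(S @ v[:, 0], S @ v[:, 1], S @ v[:, 2])  # outermorphism image
scale = np.prod(lam[:3] - lam[3])       # prod_{k != 4} (lam_k - lam_4)
print(np.allclose(Sw4, scale * w4))     # eigenblade relation holds
w1_img = wedge3(S @ v[:, 1], S @ v[:, 2], S @ v[:, 3])
print(np.allclose(w1_img, 0))           # w_1 = v_2 ^ v_3 ^ v_4 is annihilated
```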

The inner product (Euclidean inner product) of any such 3-vector w_i with itself is <w_i, w_i> = 1, by orthonormality of the eigenbasis v_i, so:

<(A - \lambda_i I)w_i, w_i> = \prod_{k=1, k\ne i}^4 (\lambda_k - \lambda_i)

We can use the following expression to agree with Tao’s definition:

<(A - \lambda_i I)w_i, w_i> = \left[\prod_{k=1, k\ne i}^4 (\lambda_k - \lambda_i)\right] |w_i \wedge v_i|^2

since |w_i \wedge v_i|^2 = 1. But the above expression holds even if you replace w_i with any 3-vector w built on the orthonormal eigenbasis v_i.

Now, let's define 3-vectors on the standard basis of R^4:

x_4 = e_1 \wedge e_2 \wedge e_3
x_3 = e_1 \wedge e_2 \wedge e_4
x_2 = e_1 \wedge e_3 \wedge e_4
x_1 = e_2 \wedge e_3 \wedge e_4

And compute x_4 \wedge v_i:

x_4 \wedge v_i = x_4 \wedge (v_{i1} e_1 + v_{i2} e_2 + v_{i3} e_3 + v_{i4} e_4)
x_4 \wedge v_i = v_{i4} e_1 \wedge e_2 \wedge e_3 \wedge e_4
\|x_4 \wedge v_i\| = |v_{i4}|

Generalizing: \|x_j \wedge v_i\| = |v_{ij}|

Now, applying the function (A - \lambda_4 I) to the 3-vector x_4, we get:

(A - \lambda_4 I) x_4 = (A - \lambda_4 I) e_1 \wedge (A - \lambda_4 I) e_2 \wedge (A - \lambda_4 I)e_3
(A - \lambda_4 I) x_4 = (a_1 - \lambda_4 e_1) \wedge (a_2 - \lambda_4 e_2) \wedge (a_3 - \lambda_4 e_3)

Remember that a_i = a_{i1} e_1 + a_{i2} e_2 + a_{i3} e_3 + a_{i4} e_4; since the a_i contain components in the e_4 direction, the above outer product gives:

(A - \lambda_4 I) x_4 = \det(M_4 - \lambda_4 I) e_{123} + \alpha e_{124} + \beta e_{134} + \gamma e_{234}

We still need to prove (and explain) that the coefficient of the first term, corresponding to the 3-vector e_{123}, is \det(M_4 - \lambda_4 I).

To do that, let's define the column vectors \hat a^4_i:

\hat a^4_i = a_{i1} e_1 + a_{i2} e_2 + a_{i3} e_3

So the vector \hat a^4_i is the column vector a_i with the e_4 coordinate suppressed.

So if we define the 3-vector \hat m_4 = \hat a^4_1 \wedge \hat a^4_2 \wedge \hat a^4_3, it corresponds to the outer product of the columns of the matrix M_4, the submatrix of A obtained by eliminating row 4 and column 4 (whose determinant is also known as a minor of A).

If we apply the function (M_4 - \lambda_4 I) to the 3-vector x_4, we get:

(M_4 - \lambda_4 I) x_4 = (M_4 - \lambda_4 I) e_1 \wedge (M_4 - \lambda_4 I) e_2 \wedge (M_4 - \lambda_4 I)e_3
(M_4 - \lambda_4 I) x_4 = (\hat a^4_1 - \lambda_4 e_1) \wedge (\hat a^4_2 - \lambda_4 e_2) \wedge (\hat a^4_3 - \lambda_4 e_3)

Since (\hat a^4_1 - \lambda_4 e_1) \wedge (\hat a^4_2 - \lambda_4 e_2) \wedge (\hat a^4_3 - \lambda_4 e_3) is just the characteristic outer product of M_4 (see discussion about characteristic outer product above) we can simply write:

(M_4 - \lambda_4 I) x_4 = \det(M_4 - \lambda_4I) x_4

Finally, since M_4 is a real symmetric matrix, it also has an orthonormal eigenbasis, which will be denoted \hat v^4_i. Consider the 3-vector \hat w^4 = \hat v^4_1 \wedge \hat v^4_2 \wedge \hat v^4_3:

(M_4 - \lambda_4 I) \hat w^4 = \left[\prod^3_{k=1} (\lambda_k - \lambda_4)\right] \hat w^4

where \lambda_k here are the eigenvalues of M_4. Since x_4 and \hat w^4 are both unit 3-blades spanning the same 3-dimensional subspace, \hat w^4 = \pm x_4, so comparing with the previous result we can equate:

\det(M_4 - \lambda_4I) = \prod^3_{k=1} (\lambda_k - \lambda_4)

Generalizing:

\det(M_i - \lambda_i I) = \prod^{n-1}_{k=1} (\lambda_k - \lambda_i), where \lambda_k are the eigenvalues of M_i.

This concludes Tao’s proof.
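As a numerical sanity check of the resulting identity \det(M_j - \lambda_i I) = v_{ij}^2 \prod_{k \ne i} (\lambda_k - \lambda_i), here is a NumPy sketch (the random 4 \times 4 symmetric test matrix is just for illustration; M_j is obtained by deleting row j and column j of A):

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2                       # random real symmetric 4x4 matrix
lam, v = np.linalg.eigh(A)              # v[j, i] = j-th coordinate of v_i

ok = True
for i in range(4):
    factor = np.prod([lam[k] - lam[i] for k in range(4) if k != i])
    for j in range(4):
        Mj = np.delete(np.delete(A, j, axis=0), j, axis=1)  # submatrix M_j
        lhs = np.linalg.det(Mj - lam[i] * np.eye(3))
        rhs = v[j, i] ** 2 * factor     # v_{ij}^2 prod_{k != i}(lam_k - lam_i)
        ok = ok and np.isclose(lhs, rhs)
print(ok)
```

The check passes for every pair (i, j).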

2 Likes

Relation With My Work

At this point I don’t think that my formulation is the same as the one published in arXiv: https://arxiv.org/abs/1908.03795 but I still see connections.

Null Space

Finding an eigenvector from an eigenvalue is related to finding the null space of a matrix. Let A be a N \times N symmetric matrix of rank N-1. The null space of A is spanned by a 1-vector denoted x, so that A x = 0. The vector x is unique up to a scale factor.

Let a_i be the columns of matrix A. Since the rank of A is N-1, only N-1 columns are linearly independent. Suppose a_N is a column that is linearly dependent on the others, so that:

a_N = \alpha_1 a_1 + ... + \alpha_{N-1} a_{N-1}

So the outer product:

a_1 \wedge ... \wedge a_N = 0

We can take the outer products of every combination of N-1 columns:

X_i = a_1 \wedge ... \wedge a_{i-1} \wedge a_{i+1} \wedge ... \wedge a_N

And we find that

X_N = \det(M_N) e_1\wedge...\wedge e_{N-1} + ...

The first connection that I found is this one: the component of the blade X_N without e_N has the determinant of the N-th minor of A (row N and column N deleted) as its coefficient.

Now, the dual vector of X_N (or polarity of X_N), denoted by x_N = X_N I^{-1}, where I is the pseudoscalar of R^N, is orthogonal to every vector in the blade, hence to a_1, ..., a_{N-1}; and since a_N is a linear combination of these, a_i \cdot x_N = 0 for all columns a_i of A.

So, the 1-vector x_N is actually in the null space of A since, by the symmetry of A:

A^T x_N = A x_N = 0

and the e_N coordinate of x_N is equal to \det(M_N), up to sign.

It is easy to check that all the (N-1)-vectors X_i are equal to X_N up to a scalar factor, and thus all the corresponding dual vectors x_i are equal to x_N up to a scalar factor.

So, in principle, the null space of A can be found by taking the outer product of any N-1 columns of A. However, there might be some combinations of N-1 columns whose outer product is zero. To avoid complications one can simply use all the X_i, as shown in my paper, adding them before taking the dual. That way one always gets some vector in the null space.
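A minimal NumPy sketch of this null-space recipe (the helper names are mine, not from the paper; the dual of a wedge of N-1 vectors is computed as the vector of signed (N-1) \times (N-1) minors):

```python
import numpy as np

def wedge_dual(cols):
    """Dual x = (c_1 ^ ... ^ c_{N-1}) I^{-1} of the wedge of the N-1
    columns of `cols` (an N x (N-1) array), up to overall sign: the j-th
    coordinate is a signed minor, so x . c = det([c | cols]) and x is
    orthogonal to every column."""
    N = cols.shape[0]
    return np.array([(-1) ** j * np.linalg.det(np.delete(cols, j, axis=0))
                     for j in range(N)])

def null_vector(A):
    """Sum the duals of all X_i (wedge of the columns of A with column i
    removed); for a symmetric A of rank N-1 this lies in the null space.
    (The sum could still cancel for unlucky signs, so check the result.)"""
    N = A.shape[0]
    return sum(wedge_dual(np.delete(A, i, axis=1)) for i in range(N))

rng = np.random.default_rng(3)
B = rng.standard_normal((4, 4))
S = (B + B.T) / 2
mu = np.linalg.eigvalsh(S)
A = S - mu[0] * np.eye(4)               # symmetric with rank 3
x = null_vector(A)
print(np.linalg.norm(A @ x) / np.linalg.norm(x))   # tiny: x is a null vector
```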

Eigenvectors

Let A be a N \times N symmetric matrix of rank N. Let \lambda_i be the eigenvalues of A corresponding to eigenvectors v_i, where i goes from 1 to N.

So that A v_i = \lambda_i v_i and the characteristic equation is (A - \lambda_i I) v_i = 0.

We can see that v_i is in the null space of the characteristic matrix (A - \lambda_i I), which is singular of rank N-1 (assuming \lambda_i is a simple eigenvalue).

From our previous discussion on null space, we can find v_i by taking the outer product of N-1 columns of matrix (A - \lambda_i I) and then taking its dual vector.

Let a_N be a column of (A - \lambda_i I) that is linearly dependent on the others; then the unnormalized v_i is

v_i = (a_1 \wedge ... \wedge a_{N-1}) I^{-1}

and the e_N component of v_i has \det(M_N - \lambda_i I) as its coefficient (up to sign), where M_N is the submatrix of A with row N and column N deleted.

Again, we don’t need to know which column is the linearly dependent one; we can use any combination of N-1 column vectors, or all of them as I did in my paper, to get the eigenvector v_i in general.

I think this method can also be used to find the eigenspaces of lower-rank matrices, probably with many applications, since we don’t need fancy numerical algorithms, just outer products.
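To illustrate, here is a self-contained NumPy sketch of the eigenvector recipe (function names are my own invention; as noted above, unlucky sign configurations could still make the sum cancel, so a real implementation should check the result):

```python
import numpy as np

def wedge_dual(cols):
    # j-th coordinate = signed (N-1)x(N-1) minor; the resulting vector is
    # orthogonal to every column of `cols`
    N = cols.shape[0]
    return np.array([(-1) ** j * np.linalg.det(np.delete(cols, j, axis=0))
                     for j in range(N)])

def eigenvector_from_eigenvalue(A, lam_i):
    """Eigenvector of the symmetric matrix A for a known eigenvalue lam_i:
    dual of the summed wedges of N-1 columns of the singular A - lam_i I."""
    N = A.shape[0]
    C = A - lam_i * np.eye(N)
    x = sum(wedge_dual(np.delete(C, i, axis=1)) for i in range(N))
    return x / np.linalg.norm(x)

rng = np.random.default_rng(4)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2
lam, v = np.linalg.eigh(A)              # reference eigendecomposition
u = eigenvector_from_eigenvalue(A, lam[2])
print(np.isclose(abs(u @ v[:, 2]), 1.0))   # matches the reference up to sign
```

Up to sign and normalization, this reproduces the eigenvector returned by a standard eigendecomposition.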

1 Like

It seems outer products are more convenient to use than determinants… :smiley:

1 Like