All right, today we will finally prove the existence of the Jordan form of any operator acting on a finite-dimensional complex vector space.
That is to say, we will prove the following theorem. Let V be a finite-dimensional vector space over the field of complex numbers, and let T be an operator on V. Then T has a Jordan basis; that is to say, there exists a basis β of V such that the matrix of T with respect to β is in Jordan form.
I explained last time what this means. And last time we proved this in the case when T is nilpotent, that is to say, when T^k is the zero operator for some k.
Now we use this result to prove it in general, that is to say, for an arbitrary T. Here is how the proof goes.
We need to reduce the proof to the case when T is nilpotent. For that, we're going to use the theorem which we proved during the last lecture before the break. Previously, we proved the following, under the conditions of this theorem.
That is to say, we deal with a finite-dimensional vector space V over the field of complex numbers, and we have an operator T acting on it.
We proved that V decomposes into a direct sum of generalized eigenspaces.
Here G(λ_i, T) is the generalized eigenspace of T corresponding to λ_i, and λ_1, ..., λ_m are the distinct eigenvalues of T; for example, they can be found as the roots of the minimal polynomial associated to T. The precise definition is that G(λ_i, T) is the subspace of those vectors in V which are annihilated by a sufficiently high power of T - λ_i I.
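In symbols, the statement being used is (with the notation from the lecture):

\[
G(\lambda_i, T) = \{\, v \in V : (T - \lambda_i I)^k v = 0 \ \text{for some } k \ge 1 \,\},
\qquad
V = G(\lambda_1, T) \oplus \cdots \oplus G(\lambda_m, T).
\]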
In particular, it contains the honest eigenspace, which consists of those vectors that satisfy this equation with the power equal to one.
But in general it is bigger than the eigenspace. The essential point here is that this subspace is invariant. If you have a vector which is annihilated by some power of T - λ_i I, then the same power will annihilate Tv, for the simple reason that T commutes with what's on the outside: we can push T through this r-th power, because this r-th power contains nothing but T and the identity operator, and both of them commute with T.
We have discussed multiple times that any power of T commutes with T, because multiplying a power of T by T on the left or on the right will give you again a power of T, with the degree greater by one; it doesn't matter on which side we multiply.
Likewise, the identity commutes with everything. So we pull T out to the front; then v gets hit by the power first, and we get T applied to zero, which is zero.
This shows that for every vector v in the subspace G(λ_i, T), the vector Tv will also be in the subspace. And that's exactly the definition of invariance.
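Written out, the computation just described is:

\[
(T - \lambda_i I)^r v = 0
\ \Longrightarrow\
(T - \lambda_i I)^r (T v) = T\,(T - \lambda_i I)^r v = T(0) = 0,
\]

so Tv lies in G(λ_i, T).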
What this means is that we have broken the vector space into pieces, and the operator acts on each piece independently. Since every vector can be written as a sum of vectors lying in the different subspaces, by linearity we can find the action of T on any vector v from the knowledge of the action of T on each of those subspaces. That is the value of a decomposition of a vector space into a direct sum of invariant subspaces. If those subspaces were not invariant, we would not be able to argue like that; the decomposition would not give us any insight into the operator itself.

Think about the example that I gave you last time, where you have an operator whose matrix is diagonal with entries 2 and 1. Then the standard subspaces, which we depict as the horizontal and the vertical line, are invariant. Therefore we can find Tv for any vector v by decomposing v as a sum v_1 + v_2, where v_1 is a horizontal vector and v_2 is a vertical vector, and using linearity: Tv_1 will again be a horizontal vector, and Tv_2 will again be a vertical vector. Better still, because these are eigenvectors, we know that Tv_1 = 2v_1 and Tv_2 = v_2. That's how we find that v_1 + v_2 goes to 2v_1 + v_2.

But suppose we took a different decomposition, another pair of lines, some generic pair of lines which are not preserved. Notice that the horizontal and the vertical line are the only two lines which are preserved: if you have a vector which does not lie in either of these two lines, then when you write it as v_1 + v_2, both v_1 and v_2 will be nonzero, but the operator acts by multiplying the first by 2 and the second by 1. Therefore v_1 + v_2 does not get multiplied by a single number: neither by 2, nor by 1, nor by anything else. A vector which is not horizontal or vertical is not an eigenvector, so it doesn't get multiplied by a number, and therefore no line other than the horizontal and the vertical line is invariant. These other lines are not invariant, and even though our vector space is a direct sum of the two yellow lines just as it is a direct sum of the two white lines, the decomposition into the yellow lines is irrelevant to the question of finding how T acts; we might as well just look at how it acts on the entire space. You see, that's why this condition of invariance is essential.

Luckily, we do have this condition here. What are we trying to do? We are trying to show that there is a basis of V which has a certain special property, and because of the decomposition we can construct this basis as a union of bases of each of those pieces: if each of the subspaces has a basis that is a Jordan basis, then the union of those bases will be a Jordan basis of the entire V. But to speak about a Jordan basis of each piece, we have to make sure that the piece is invariant. Because if it's not invariant, it is not preserved by the operator. Yes, you can apply the operator to every vector; for instance, we can apply the operator to a vector of the subspace, but it may take it out of the subspace, so you're not going to get an operator acting on the same space. Remember, a general linear map is not a map from V to V; it's a map from V to W. And if you have a map from V to W, there is no notion of an eigenvector, because there is no identity operator: the identity operator can only act from a space to itself. All of this discussion is about an operator acting from a vector space to itself. Therefore, if we want to use such a decomposition, we'd better make sure that each of the subspaces in the decomposition is mapped by the operator to itself.
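To write the diagonal example out in symbols (the matrix with diagonal entries 2 and 1 from last time):

\[
T = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix},
\qquad
T(v_1 + v_2) = Tv_1 + Tv_2 = 2v_1 + v_2,
\]

and if both v_1 and v_2 are nonzero, then 2v_1 + v_2 is not a scalar multiple of v_1 + v_2, so the line spanned by v_1 + v_2 is not invariant.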
Luckily, for our generalized eigenspaces we do have this invariance property, and we will use it. Now consider T restricted to G(λ_i, T); that is to say, we only take inputs from this subspace. Then this is going to be an operator from the subspace to itself. Now, this operator is not nilpotent, but if we take T - λ_i I and restrict it to the subspace, it will be nilpotent. You see, the identity operator preserves every subspace, and likewise any multiple of the identity operator; so if you take T - λ_i I and restrict it to the subspace, you again get a well-defined operator, because subtracting λ_i I doesn't change the property of invariance. But this operator is nicer, because we know that every vector of the subspace is annihilated by a sufficiently high power of it, and we also know that the null spaces of these powers actually stabilize. In fact, the restricted operator is nilpotent, because of the definition of the generalized eigenspace, because of this formula. Therefore we can apply the result we proved last time, because now it fits the hypotheses that we stipulated last time. Applying the theorem from last time, we find that T - λ_i I restricted to G(λ_i, T) has a Jordan basis; let's call it β_i. Now, in this basis, the matrix of this restricted operator relative to β_i is going to look like this: it is a union of blocks, where each block is a Jordan block of some size, but corresponding to eigenvalue zero; for a nilpotent operator the diagonal entries have to be equal to zero. The simplest operator of this kind is just a single Jordan block with zeros on the diagonal. But we're not claiming that this matrix is going to have just one Jordan block; it could have several. In fact, it could be the zero matrix for all we know: it could happen that this subspace is an honest-to-goodness eigenspace, so that every vector in it is annihilated by T - λ_i I itself. What does that mean? It means that the restricted operator T - λ_i I acts on the subspace by zero, and then the associated matrix would be zero. From the perspective of Jordan blocks, I explained that a diagonal matrix (which includes the zero matrix, the degenerate case where every diagonal entry is zero) fits into the paradigm of Jordan forms, because you think of it as a union of blocks of size one by one. But in general it's going to be a union of blocks of various sizes. Let me give you just one example, to illustrate what it could look like: say one block of size three, then another block of size two, another block of size two, and one more block of size one. To simplify my notation, first of all, I'm not going to write the zeros; I'll just leave empty space. Also, I'm not going to draw all the other lines; you have to realize that I'm talking about maybe just a portion of the entire matrix. That's what the matrix looks like; that is an example of the kind of matrix we get. We proved last time that such a basis exists: the existence of a Jordan basis for a nilpotent operator means precisely that, relative to that basis, the matrix of the operator looks like so.
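For instance (assuming the first block drawn is of size three, which matches the null-space dimensions computed later in the lecture), the matrix of the restricted operator T - λ_i I would look like

\[
\begin{pmatrix}
0 & 1 &   &   &   &   &   &   \\
  & 0 & 1 &   &   &   &   &   \\
  &   & 0 &   &   &   &   &   \\
  &   &   & 0 & 1 &   &   &   \\
  &   &   &   & 0 &   &   &   \\
  &   &   &   &   & 0 & 1 &   \\
  &   &   &   &   &   & 0 &   \\
  &   &   &   &   &   &   & 0
\end{pmatrix},
\]

with all blank entries equal to zero: Jordan blocks of sizes 3, 2, 2, 1, all with eigenvalue 0.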
But then you say: okay, wait a minute, this is not the operator T restricted to our subspace; it's T - λ_i I. To which I say: that's fine, let's find out what happens with respect to T itself. To get back to T from this operator, we have to add to it λ_i times the identity. This gives (T - λ_i I) + λ_i I restricted to G(λ_i, T), which is T restricted to G(λ_i, T), and that is what we want. We obtain its matrix from this matrix by adding λ_i times the identity matrix, which simply means inserting λ_i at every diagonal entry. This is not going to change the fact that it is a union of Jordan blocks. It's still a union of Jordan blocks; it's just that before, those Jordan blocks corresponded to eigenvalue zero, and now, since I have put λ_i on every diagonal entry, we have restored λ_i to its previous glory: a union of Jordan blocks for the eigenvalue λ_i. (And I'll label the individual blocks by j, because it should be a different index from i, which is already taken.) Okay? Any questions about this? All right, so that's how we do it. We find that β_i has the requisite property for the operator T restricted to this piece G(λ_i, T). But then, if we take the union of those bases β_i, we get, first of all, a basis of V. That is always the case when you have a direct sum decomposition of any vector space into subspaces, which may have nothing to do with generalized eigenspaces of any operator, just a generic direct sum decomposition: if you take the union of bases of those subspaces, you get a basis of V. This is a basic result, no pun intended. So first of all, it's a basis of V. But second, if you look at the matrix of T relative to this basis, you will see that it has blocks, where each block corresponds to one of the G(λ_i, T): here you will have the block of T restricted to G(λ_1, T) relative to β_1, and so on, until finally you have the block corresponding to T restricted to G(λ_m, T) relative to β_m. You see that there are two levels of blockness, so to speak, or block decomposition. First, you have a decomposition into blocks corresponding to each λ_i (λ_1, λ_2, λ_3, and so on), and then there are zeros everywhere else. We knew that from the beginning, because I already explained this last time, and it's a very important point to realize: if you have a decomposition of a vector space into a direct sum of T-invariant subspaces, where T is your operator, then the matrix of the operator with respect to a basis of this form, that is to say a union of bases of those subspaces, will always have block form. And vice versa: if you choose a basis of each subspace, take the union, and write your matrix relative to this union, then its having block form is exactly equivalent to saying that your vector space is the direct sum of these subspaces and each of them is invariant. That's the block decomposition at level one. But then we zoom in and look at each of these blocks, for example this block for λ_1, and we find a finer decomposition into Jordan blocks. This big block is not a single Jordan block; it's a union of several Jordan blocks. But what they have in common is the same λ_1: there could be several of them, but all have the same number on the diagonal. Likewise for λ_2, and so on, up to λ_m. That is the general structure, and this is what we have now established by simply zooming in on each of those subspaces and using the result from the previous lecture, which concerned nilpotent operators. This proves the theorem.
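Schematically, the general structure just described is (the block labels A_i and J are just notation for this display):

\[
[T]_\beta =
\begin{pmatrix}
A_1 &        &     \\
    & \ddots &     \\
    &        & A_m
\end{pmatrix},
\qquad
A_i = \big[\, T|_{G(\lambda_i, T)} \big]_{\beta_i}
=
\begin{pmatrix}
J_1(\lambda_i) &        &               \\
               & \ddots &               \\
               &        & J_{k_i}(\lambda_i)
\end{pmatrix},
\]

where each J_j(λ_i) is a single Jordan block with λ_i on its diagonal, and all blank entries are zero.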
Any questions? Okay. So now I would like to give you an example of how this works, because it's one thing to prove theorems about it and it's another thing to see how this works in practice. Let's look at the simplest case, which is two by two. Here is the example that I propose. Again, just for the sake of brevity, I will not distinguish between operators and matrices: my operator will act on C^2, and I will represent it by the corresponding matrix relative to the standard basis, namely T = [[0, -1], [1, 2]] (listing the rows). It's not in Jordan form, right? In other words, the standard basis of C^2 (this is an operator acting from C^2 to C^2, given by multiplication by this matrix on the left) is not a Jordan basis. It's close enough, but there is a 1 below the diagonal, and it should be zero. But we know that you can always find such a basis; we just proved it. So let's find it. To find it, we first have to know what the λ's are. What tools do we have to find the λ's? The tool that we have for now is the minimal polynomial. Next week we will introduce the characteristic polynomial as well, and that's also a very powerful tool, which actually works more efficiently for matrices of size greater than two. For size two, it takes just as much work to find the eigenvalues using the minimal polynomial as using the characteristic polynomial, as you will see in a moment and as we have already discussed. So let's find what the minimal polynomial is. We know that the minimal polynomial has degree at most two, so it's going to be a polynomial involving T^2, T, and the identity: there are unique coefficients, of smallest degree, such that the corresponding combination is the zero operator. It could in principle be of degree one, but then T would be a multiple of the identity, hence diagonalizable, and that is not our case: the minimal polynomial is not of degree one, because the only operators whose minimal polynomial has degree one are multiples of the identity, and this is obviously not a multiple of the identity, since it has nonzero off-diagonal entries. So it's going to be of degree two, and we have to find b and c such that T^2 + bT + cI = 0. It's easy to do: T^2 = [[-1, -2], [2, 3]], and from this we get T^2 - 2T + I = 0. The corresponding polynomial is z^2 - 2z + 1, which is (z - 1)^2. The operator is not diagonalizable, because we know that an operator is diagonalizable if and only if its minimal polynomial is a product of distinct linear factors, each appearing exactly once, and here we have a square. That's how we know there must be Jordan blocks. But how many blocks can there be? It's two by two, so there must be one Jordan block of size two. That's fine, but let's find the Jordan basis. By the way, the characteristic polynomial is equal to the same expression: you could calculate the determinant of the matrix T - zI, and you would get the same thing. That's an alternative way to find what the eigenvalues are.
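If you want to double-check these computations by machine, here is a minimal sketch in Python using SymPy (the tool and the code are just an illustration, not something used in the lecture):

```python
import sympy as sp

z = sp.symbols('z')
T = sp.Matrix([[0, -1], [1, 2]])
I2 = sp.eye(2)

# T is not a multiple of the identity, so its minimal polynomial has degree 2.
print(T**2 - 2*T + I2)                     # zero matrix, so p(z) = z^2 - 2z + 1 = (z - 1)^2

# The characteristic polynomial gives the same expression for this 2x2 matrix.
print(sp.factor(T.charpoly(z).as_expr()))  # (z - 1)**2
```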
So the basis should look like this: it should be a chain of length two. Remember, we discussed that there are going to be two vectors, v_1 and v_2, where v_2 goes to v_1 and v_1 goes to zero under T - I. So v_1 is going to be an eigenvector; that is to say, it is annihilated by T - I, or, if you will, T acts on it by multiplication by 1, it preserves it. And v_2 is not an eigenvector, but if you apply T - I twice, you get zero. How do we find them? The first one is easy to find: you just write v_1 = (a_1, a_2), and then you write the equation [[0, -1], [1, 2]] (a_1, a_2) = (a_1, a_2), if you will. That's one way to write the equation for an eigenvector. But we could also write it as (T - I)(a_1, a_2) = 0, where T - I = [[-1, -1], [1, 1]]. It's easier in this form, because it looks like a homogeneous system of equations, and you immediately see what you need to do: a_1 and a_2 have to have opposite signs, and then you get zero. That's what I mean; please ask me if something is not clear in this calculation. So we find that a_1 = -a_2, and we can choose v_1 to be, let's say, (1, -1). Now keep in mind that eigenvectors are never unique, because if you have an eigenvector, you can always multiply it by a nonzero scalar, and this will not violate the equation, which is consistent with the fact that what we have here is a homogeneous system of equations. Obviously, if a solution exists, it is not unique: it can always be multiplied by a nonzero number, so there is at least a one-dimensional space of solutions. Therefore there is already some choice here. You could write, for example, (-1, 1), or somebody else could write (1/2, -1/2); in principle, they are all legitimate. At this point it becomes a matter of convenience to write it in the way that most simplifies your calculation. But what about v_2? Now, this is a very important point that I want to explain. You see, that's our two-dimensional space on which the operator acts, and we have now found this v_1. The line it spans is the eigenspace, and we know that there are no other eigenvectors, because if there were, the minimal polynomial would not be equal to (z - 1)^2; it would be equal to z - 1, or it would be (z - 1)(z - μ) for some μ not equal to 1. That means that no other vectors are eigenvectors, but every other vector, under the action of this operator (and when I say this operator, I mean T - I, because that's what the chain is built with), is mapped into the eigenline. We are trying to find v_2, a vector which is mapped to v_1 by T - I. Let's observe that actually every nonzero vector which lies outside of this line will almost have this property: it may not give us v_1 on the nose, but it will give us v_1 up to a nonzero multiple, because it has nowhere else to go. In fact, the entire space is annihilated by (T - I)^2. How do I know that? Because T^2 - 2T + I = 0, that is, (T - I)^2 = 0: it kills every vector. Let me emphasize this. But now we need to find a particular vector which, when you apply T - I to it, gives, number one, a nonzero vector, and number two, exactly v_1, because we want to get a chain, and then we will get the Jordan block. How to find it? The point is that you can take essentially any vector. Of course, there are two ways to solve this. First, in general you have to be a little careful: the generalized eigenspace could be a two-dimensional subspace, with the space of solutions of (T - I)v = 0 one-dimensional but the space of solutions of (T - I)^2 v = 0 two-dimensional, and you would have to find a second vector which is annihilated by (T - I)^2 but not by T - I. But here we have no other blocks and no other subspaces, so we can take any vector outside the eigenline. Then we can just write the equation: let's say v_2 = (b_1, b_2), and we solve (T - I)(b_1, b_2) = [[-1, -1], [1, 1]] (b_1, b_2) = (1, -1). That's how we can find it. But there is an even simpler way, which is to just take any vector which is not proportional to v_1; solving the equation is the first method.
With the first method you have a system of two equations in two variables; it takes a little time. Here is the shortcut. Take some vector which you know for sure is not proportional to v_1, so it's not an eigenvector; let's call it ṽ_2 = (1, 0), for instance, which is clearly not proportional to v_1, because its second component is zero. Then just apply T - I to it. What do we get? We get (-1, 1); let's write it out: that's the first column of the matrix [[-1, -1], [1, 1]]. That's not quite my v_1, right? This is -v_1. But you know that you can compensate for whatever coefficient you get here: you just divide ṽ_2 by it, and then you will get the vector that you want. So take instead v_2 = -ṽ_2 = (-1, 0), and now (T - I)v_2 is exactly v_1. Therefore we find that the Jordan basis is v_2 = (-1, 0) and v_1 = (1, -1), with v_2 going to v_1 and v_1 going to zero. So now I can erase the tilde; that's the fastest way. Now, if you have a bigger space, say three-dimensional, you first have to identify your two-dimensional subspace (you may have to solve the equation (T - I)^2 v = 0 first, to find what the subspace of generalized eigenvectors for λ = 1 is), and then take a vector in it which is not proportional to the eigenvector and do this calculation. But in the two-dimensional case there are not many choices, so you can do it very fast. So you know that if you call β = (v_1, v_2), then the matrix of T with respect to β is going to be [[1, 1], [0, 1]]. That's the Jordan form, and that's a Jordan basis. However, it's not a unique basis; this is the second point. First of all, I have shown you the shortcut for how to do it. But the second thing to realize is that instead of v_2 = (-1, 0) we could take v_2 plus any multiple of v_1; let's call it v_2' = v_2 + αv_1. We still have the same property, because if you apply T - I to v_2', it's the same as T - I applied to v_2 plus T - I applied to αv_1, and the latter is zero, because v_1 is killed by T - I. You see, the second vector is not uniquely defined: you can always add to it a multiple of the eigenvector (and as you proceed along the chain, you can actually do this at every step), and you will not feel it, because the operator just kills the eigenvector. In fact, that means we could take the vector (-1, 0) + α(1, -1); it would serve the same purpose, and we would get the same picture if we replaced v_2 by this guy. And here is the most important thing: when you do this calculation, you don't work with T itself; you work with T - λ times the identity. T itself is not nilpotent and does not give you a chain, unless it was nilpotent to begin with. That is the trick: on each of the generalized eigenspaces, you should consider T - λ_i I restricted to that subspace, where λ_i is the eigenvalue of that subspace. In this case the whole space is a single generalized eigenspace and λ is equal to 1, so we take T - I. Okay? All right. Now, this sounds a little bit disconcerting, because you start wondering: well, the Jordan basis is not unique, so how do we even know that the sizes of the Jordan blocks are uniquely defined? For now, we don't. We only know that there exists a decomposition into Jordan blocks, but we have not yet established that those blocks will necessarily have the same sizes, sizes which depend only on T and not on any additional choices.
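Here is a quick machine check of both points, that (v_1, v_2) is a Jordan basis for this T and that replacing v_2 by v_2 + αv_1 gives the same Jordan form (a sketch in Python with SymPy, my own illustration rather than part of the lecture):

```python
import sympy as sp

T = sp.Matrix([[0, -1], [1, 2]])
N = T - sp.eye(2)                         # nilpotent here: N**2 == 0

v2 = sp.Matrix([-1, 0])                   # a vector outside the eigenline, rescaled
v1 = N * v2                               # = (1, -1), an eigenvector: N * v1 == 0

P = sp.Matrix.hstack(v1, v2)              # columns are the Jordan basis (v1, v2)
print(P.inv() * T * P)                    # Matrix([[1, 1], [0, 1]]): the Jordan form

# Replacing v2 by v2 + alpha*v1 gives another Jordan basis with the same form.
alpha = sp.symbols('alpha')
P2 = sp.Matrix.hstack(v1, v2 + alpha * v1)
print(sp.simplify(P2.inv() * T * P2))     # again Matrix([[1, 1], [0, 1]])
```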
And now I'm going to demonstrate that this is the case: the sizes are determined by T alone. So, in general... actually, let me keep this on the board just in case; I might need it. But I guess I can erase this part. Let me remind you one more time that we have two levels of block decomposition. First, there is a decomposition into blocks corresponding to each λ_i, the blocks labeled by λ_i; and then, for each λ_i, there is a further decomposition into Jordan blocks. What this shows you is the following: the size of the big block for λ_i is the dimension of the generalized eigenspace corresponding to λ_i, because we said that this block is in fact the matrix of T restricted to G(λ_i, T), so its size is that dimension. But on the other hand, its size is equal to the sum of the sizes of all the Jordan blocks which appear in it. So we find this identity: the dimension of G(λ_i, T) is equal to the sum of the sizes of the Jordan blocks appearing in its block. At the very least, we know that the sum of the sizes is determined by T itself, because the dimensions of those subspaces G(λ_i, T) depend on T only: they are given by certain equations attached to T, and they do not depend on any choices. But now I want to show you that actually all the individual sizes are invariants of T. Maybe using the term invariant here would be too confusing, because we already used the term T-invariant for subspaces; but in mathematics we often use the word invariant to say that something is independent of additional choices, an invariant attached to T, meaning that whatever other choices you make, those numbers stay the same. So let's just say that these are numbers determined by T, without overusing the word invariant. The claim (maybe I'll put it as a lemma) is that for a given T, the sizes of the Jordan blocks are uniquely determined by T: no additional choices are needed, and no choice can change those sizes. The proof: we should prove it for each λ_i separately, because we know that blocks of type λ_i appear only in the decomposition of the matrix corresponding to the operator restricted to G(λ_i, T). Here I want to recall the idea of a chain. The fact that this restricted operator has a matrix which is a union of Jordan blocks is equivalent to the statement that the corresponding basis is a union of chains for T - λ_i I, where the notion of a chain was introduced last time. Let's draw the chains: in this example there is a chain of size three, a chain of size two, another of size two, and one of size one. Let's prove it in this case, and then it will be clear how to prove it in general. Okay. You see, the arrows here are the action of T - λ_i I, and the last vector of each chain goes to zero under this action. Which means what? It means that if you take the null space of T - λ_i I, it is exactly the span of these vectors, the last vectors of the chains, where we jump off the cliff, so to speak. Every other vector is going to be a linear combination of basis vectors from the diagram, with at least one nonzero coefficient on a vector which is not in the last column; when we apply T - λ_i I, those basis vectors get shifted along their chains, and therefore we get something nonzero. In other words, all the basis vectors which are not in the last column, if you apply T - λ_i I to them, give you nonzero vectors, so they do not contribute to the null space. The null space of T - λ_i I is exactly the span of the last column.
Now, what if you take the null space of (T - λ_i I)^2? I claim that it is the span of the last two columns. Because these guys, the last column, are annihilated by T - λ_i I, and therefore by its square as well; these guys, the second-to-last column, are annihilated by (T - λ_i I)^2; but this one, at the head of the chain of length three, is not annihilated by (T - λ_i I)^2, only by (T - λ_i I)^3, so it cannot contribute. Finally (well, in this case it ends here), if you take the null space of the cube, we get the whole space. What does this tell us? It tells us that intrinsically, just from the knowledge of T itself, we can find the number of elements in the last column of this diagram, the number of elements in the last two columns, the last three, and so on. But if you know those numbers, you can reconstruct the diagram, you see. Let me write this down. The dimension of the null space of T - λ_i I is the number of elements in the last column; in our case it's going to be 4. The dimension of the null space of (T - λ_i I)^2 is the number of elements in the last two columns, which in our case is 7. The dimension of the null space of (T - λ_i I)^3 is the number of elements in the last three columns, which in our case is 8. In general this continues, but it always stops somewhere, namely after the total number of columns in the diagram; in our case we stop here. Now, you see, these numbers (the number of elements in the last column of the diagram, in the last two columns, and so on) are completely determined by T itself, and by λ_i of course, because they are dimensions of null spaces, and the dimension of a null space doesn't depend on any choices: given an operator, it has a null space, and that's it. But I claim that once I know those numbers, I can reconstruct the diagram. From the first equation, I know that the diagram will have 4 elements in the last column. From the first two equations, I know that the total of the last two columns is 7; but I already have 4, so the second-to-last column has 3, making the total 7. In other words, you have to look at the differences: the first number is the number of elements in the last column; the difference between the second and the first is the number of elements in the penultimate column; the difference between the third and the second is the number in the preceding column, and so on. In this case it's just one more, because the total is 8. But if I know all the columns, then I know all the rows, because it's the same combinatorial information. So that's how you know that you can reconstruct the diagram from the knowledge of the dimensions of the null spaces of the powers of T - λ_i I, and that's how you know that the diagram, the union of those chains, is completely determined by the operator itself. So even though the Jordan basis could be slightly different, as we saw even in the two-by-two case, the basic structure is going to be the same: the sizes of the Jordan blocks are going to be the same. That's how we see it.
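Here is the same reconstruction done by machine on a made-up example matching the diagram (Python with SymPy; the 8-by-8 matrix is my own illustration, with blocks of sizes 3, 2, 2, 1):

```python
import sympy as sp

def nilpotent_block(k):
    # k x k Jordan block with eigenvalue 0: ones on the superdiagonal, zeros elsewhere.
    return sp.Matrix(k, k, lambda i, j: 1 if j == i + 1 else 0)

# Block-diagonal nilpotent matrix with Jordan blocks of sizes 3, 2, 2, 1.
N = sp.diag(*[nilpotent_block(k) for k in (3, 2, 2, 1)])

# Dimensions of the null spaces of N, N^2, N^3: these depend only on the operator.
dims = [len((N**k).nullspace()) for k in (1, 2, 3)]
print(dims)                               # [4, 7, 8]

# Successive differences recover the column counts of the diagram (4, 3, 1),
# which in turn determine the block sizes 3, 2, 2, 1.
cols = [dims[0]] + [dims[k] - dims[k - 1] for k in (1, 2)]
print(cols)                               # [4, 3, 1]
```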
All right, any questions? So now we have achieved this goal. One thing that remains is how to generalize the analysis that I have given you for two-by-two matrices to, say, three by three or four by four and so on. For matrices of size greater than two, it becomes more and more difficult to find the minimal polynomial. The minimal polynomial is a great theoretical tool for proving things: we were able to prove the decomposition into generalized eigenspaces and so on, all these results, and the Jordan canonical form, by using the minimal polynomial; we never used the notion of the characteristic polynomial, which you studied before. But still, for practical purposes, the characteristic polynomial is much more efficient than the minimal polynomial. That's why I decided to spend some time to actually explain the meaning of the characteristic polynomial. The characteristic polynomial of an operator T is, by definition, a determinant, and there are two possibilities for how to write it: you can take either det(T - zI) or det(zI - T). The advantage of the second is that it will start with z^n, where n is the dimension of V, whereas the first will start with plus or minus: it's going to be (-1)^n z^n. Some people find it annoying to put the minus sign in front of T, but then they're happy that the polynomial starts with z^n and is a monic polynomial for every n; there, the annoyance is in the subtraction. With det(T - zI) the expression looks a little nicer, but the annoying fact is that it's a monic polynomial only if n is even; for odd n it starts with -z^n. Of course, it's not a big deal: you can always rescale by (-1)^n. In the book it's going to be introduced like this, and we will introduce it the same way, with the minus. Earlier in the course, when I mentioned the characteristic polynomial, I wrote it the other way; but you see now that the two definitions only differ by a sign, so it's not a big deal. But to introduce it, we have to know what the determinant is. Traditionally, students are fed an explicit formula and told some properties of it, without any explanation of where this formula came from. We will actually discuss from first principles how the determinant appears and what the meaning of the determinant is, and that will be the subject of the rest of today and next Tuesday. Okay, it's actually not very hard. Here is the idea. So what is the determinant? The determinant is actually a function on operators, say on L(V), or on matrices; there are two possibilities: to think of it as a number associated to an operator acting on a vector space of dimension n, or as a number associated to an n-by-n matrix over your ground field. The first is a bit more conceptual, that is to say, it is defined on operators, where V is a vector space over F, and F could actually be an arbitrary field; in this course we only consider F equal to R or C. So we go back to that generality of F being either R or C: in the previous discussion of the Jordan form, as you remember, we restricted ourselves to F = C, for the simple reason that we didn't want to deal with polynomials which do not factor over your field into linear factors. So V is a finite-dimensional vector space, let's say dim V = n, and then you have the space L(V) of all operators from V to V. The determinant is a function which assigns to every operator T a particular number, called det(T). It has remarkable properties: first of all, the determinant is nonzero if and only if T is invertible; second, if you take a composition of S and T and apply the determinant, then it is the product of the determinants, det(ST) = det(S) det(T). Okay? We need to show that such a function exists. In fact, this function exists and is unique, uniquely defined by these properties. The key idea here is to introduce multilinear functionals. That's the key idea.
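As a preview of the two defining properties just stated (det is nonzero exactly for invertible operators, and det of a composition is the product of the dets), here is a quick numerical check using the determinant you already know from earlier courses (Python with SymPy, my own illustration):

```python
import sympy as sp

S = sp.Matrix([[1, 2], [3, 4]])
T = sp.Matrix([[0, -1], [1, 2]])             # the operator from the 2x2 example above

print(S.det(), T.det())                      # -2, 1: both nonzero, both matrices invertible
print((S * T).det(), S.det() * T.det())      # -2, -2: det(ST) = det(S) * det(T)

print(sp.Matrix([[1, 2], [2, 4]]).det())     # 0: this matrix is not invertible
```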
So now we're going to discuss what multilinear functionals are, and at the end of the discussion we will introduce a specific space of multilinear functionals which will lead us straight to the determinant. What are multilinear functionals? This is a generalization of linear functionals, as the name suggests. Let me remind you that a linear functional is a function φ from V to F, where again F is the ground field over which V is defined, which is linear, meaning that it's a linear map: when we say linear, it always means that a sum goes to the sum and a scalar multiple goes to the scalar multiple. Okay? And we know that the set of all linear functionals on a given vector space is itself a vector space, called the dual space and denoted by V'. Right? Now, you see, this is a function of one argument; the argument is a vector v. When I say one argument, I don't mean the following: V could be two-dimensional or three-dimensional, and if you write the vector in terms of a basis, you could say it's a function of two or three variables. What I mean is that there is a single vector which is the argument of this functional. The notion of a multilinear functional is obtained by allowing more than one argument: we consider a similar thing where you have two, three, or more arguments. For example, you have bilinear functionals. So what is a bilinear functional? A bilinear functional, or bilinear form, is a map from V × V to F which is linear in each argument. Let me remind you of the notation from set theory. There are various operations that we can perform on sets. For instance, if you have two subsets of a given set, we can take their union or intersection. But also, if you have two sets S_1 and S_2, we can form what's called the Cartesian product of these two sets. By definition, its elements are ordered pairs where the first element of the pair is an element of the first set and the second element is an element of the second set, right? For example, if S_1 = R and S_2 = R, then S_1 × S_2 = R^2, because it consists of all ordered pairs of real numbers. Now what we're doing is taking the Cartesian product of our vector space with itself, viewed as a set. For now, every vector space is a set: it starts life as a set, and then it has two operations which satisfy axioms and so on; but first and foremost, every vector space is a set. Therefore we can apply to the vector space any operation that is legitimate in the world of sets. Here S_1 is the set underlying V, S_2 is also the set underlying V, and we take the Cartesian product; β is a function from V × V to F. Now, a function, again, is also a set-theoretic notion: in general, a function is a rule which assigns to every element of your set an element of another set. In general, a function from S_1 × S_2 to some other set, let's call it S_3, is a rule which assigns to every element of S_1 × S_2, which is a pair, some element of S_3, which we denote f(s_1, s_2). You can think of this as a function of two arguments. If you have a function from S_1 to S_3, it's a function of one argument, and that argument is an element of S_1. If it's a function from S_1 × S_2, it means that effectively the argument has this form: a pair where the first something is in S_1 and the second something is in S_2. But from the perspective of calculus, it looks exactly like the notion of a function of two variables; it's just that normally, when we say a function of two variables in calculus, in Math 53, we mean that both variables are in R.
That's the case in this example, but in principle it's a totally general situation: it's a function of two variables, or two arguments more precisely, where the first argument is in the set S_1, the second argument is in the set S_2, and the value is in S_3. Now, in our case S_1 and S_2 are both V, and S_3 is our field F. So effectively, what is this? This β assigns to every ordered pair of vectors (and it is always ordered: we know which one is the first and which is the second, because we write them on a sheet of paper from left to right) a value: the pair (v, w) goes to β(v, w). Okay, so that's a general function. But remember, in the case of linear functionals we also imposed a linearity condition; there was only one argument then, and the condition was on that argument. Now we have two arguments, so it's natural to impose linearity in each argument: we require that β be linear in each of the two arguments. Which means that β(v_1 + v_2, w), with w fixed, is β(v_1, w) + β(v_2, w), and β(cv, w) = c β(v, w); likewise for the second argument: β(v, w_1 + w_2) = β(v, w_1) + β(v, w_2), and β(v, cw) = c β(v, w). That's all: it's a function of two arguments, both arguments now taking values in our vector space V, which is linear in each of them. This is a very natural definition. Let me give you an example to show how it works. Suppose V is R^2. Then we have the standard basis e_1 = (1, 0) and e_2 = (0, 1) of R^2. Every vector can be written in terms of it: every vector v can be written as a_1 e_1 + a_2 e_2, and w as b_1 e_1 + b_2 e_2, right? (In response to a question: yes, we have to read these rules from left to right, not from right to left. It's a rule which says that if my first argument is a scalar multiple of something, then the result will be that multiple of the β in which I substitute that something and w. You're not trying to solve one side in terms of the other; you're saying that this has to be equal to this. In other words, whatever rules of linearity we use, they have to be satisfied for each argument separately. Yes, it's true that this is, in retrospect, β(cv, w); it's one of the consequences of the rules, but it's not a problem.) Okay, now let me calculate β(v, w) using this information, that v and w are linear combinations of our basis; both are linear combinations, but different ones, which is why I wrote a_1, a_2 for the coefficients of v and b_1, b_2 for the coefficients of w. It's going to be β(a_1 e_1 + a_2 e_2, b_1 e_1 + b_2 e_2). Now I use the yellow linearity for the second argument and the white linearity for the first argument to open the brackets, so to speak; effectively, it behaves as if it were just a product of these two expressions. Using the first, I get a_1 β(e_1, b_1 e_1 + b_2 e_2) + a_2 β(e_2, b_1 e_1 + b_2 e_2). Now I use the second linearity to get two terms out of each of these: a_1 b_1 β(e_1, e_1) + a_1 b_2 β(e_1, e_2) + a_2 b_1 β(e_2, e_1) + a_2 b_2 β(e_2, e_2). You see that the only thing we need to know is the four numbers β(e_i, e_j). They're not given, so we have the freedom to choose them any way we like.
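In symbols, the expansion just computed is:

\[
\beta(v, w)
= \beta\!\Big(\sum_{i=1}^{2} a_i e_i,\ \sum_{j=1}^{2} b_j e_j\Big)
= \sum_{i=1}^{2} \sum_{j=1}^{2} a_i\, b_j\, \beta(e_i, e_j).
\]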
This is similar, by the way, to how it was for linear functionals: if φ were a linear functional in the case when V is R^2, I would also write φ(v) = a_1 φ(e_1) + a_2 φ(e_2). In that case, we have to say what φ(e_1) is and what φ(e_2) is, and then we know φ(v) for every v by linearity. Here, by bilinearity, we need to know four numbers. And you can guess that if V is n-dimensional, you need to know n^2 numbers, namely the β(e_i, e_j). It's convenient to arrange them as a matrix; that's called the matrix of the bilinear form relative to a given basis. It's going to have all these entries: β(e_1, e_1), β(e_1, e_2), β(e_2, e_1), β(e_2, e_2). This matrix completely, uniquely, determines β, by bilinearity: because we have specified these numbers, we know the value of β on every pair of vectors v and w according to this formula; you simply multiply each of the β(e_i, e_j) by the corresponding product a_i b_j and take the sum. That's the value. Okay. That's the way to think more concretely about bilinear forms. But there's more, because remember how it went with linear functionals: a single functional is a function on a vector space, a function from the vector space to F, but then we realized that the set of all linear functionals is itself a vector space, because we can add linear functionals. Likewise, we can add bilinear forms. About the set of bilinear forms I'm going to state a result, and I'll leave the proof for you, because it's very simple. Let me just fix the notation: let's call this matrix M(β), relative to our basis. Unfortunately, we also like to use β for a basis; I wasn't thinking about that, so let's call the basis something else, in our case (e_1, e_2), say capital B. I'm going out of my way not to confuse you, and you're making fun of me. Okay. Anyway, yes, I should have used a different notation, but that's what is used in the book, so it's fine. So the next result is a lemma. Actually, before the lemma, let's introduce the set V^(2). The prime is what we use when there is a single argument; the superscript (2) in brackets is the analogue of the prime. V^(2) is the set of bilinear forms on V. Then the lemma is, first, that V^(2) is a vector space with respect to the obvious operations: if you have β_1 and β_2, elements of this set, then we simply take β_1 + β_2, and what is β_1 + β_2? It is the bilinear form whose value on (v, w) is exactly what you expect it to be, namely β_1(v, w) + β_2(v, w). It's the same idea as how we defined the sum of two operators, or the sum of two linear functionals, or of two linear maps: just take the sum of the values, that's it. And c times β in V^(2) is just the form whose value on (v, w) is c times β(v, w). That's it. Part two is that we have a map from V^(2) to F^{n,n} (let me remind you that F^{n,n} stands for the vector space of n-by-n matrices over F) which sends each bilinear form to the corresponding matrix, for some fixed basis B. So for each choice of a basis B, we have a function from V^(2) to the n-by-n matrices which assigns to every bilinear form its matrix, the way I defined it for two by two; it's very obvious how to generalize it to n by n: in general, you take β(e_i, e_j) as the (i, j) entry of M(β), by definition. Okay? So then the claim is that this function is actually a linear map, and in fact an isomorphism of vector spaces, for each choice of a basis. Okay?
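Concretely, the correspondence works like this (a small numerical sketch in Python with SymPy; the matrix entries are chosen arbitrarily for illustration and are not an example from the lecture):

```python
import sympy as sp

# Matrix of a bilinear form beta on F^2 relative to the standard basis:
# M[i, j] = beta(e_i, e_j).
M = sp.Matrix([[1, 2], [3, 4]])

v = sp.Matrix([5, 6])                     # v = 5 e_1 + 6 e_2
w = sp.Matrix([7, 8])                     # w = 7 e_1 + 8 e_2

# Evaluating the form: beta(v, w) = v^T M w.
print((v.T * M * w)[0, 0])                # 433

# Same number via the expansion beta(v, w) = sum_{i,j} a_i b_j beta(e_i, e_j).
print(sum(v[i] * w[j] * M[i, j] for i in range(2) for j in range(2)))   # 433
```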
This isomorphism implies that the dimension of V^(2) is equal to the dimension of F^{n,n}, the vector space of n-by-n matrices, whose dimension is obviously n^2. So the dimension of V^(2) is n^2. In other words, there are exactly n^2 parameters; for example, for n equal to 2 there are four parameters, and those parameters are the entries of this matrix, which uniquely determine a bilinear form. So now we have a nice handle on bilinear forms: they form a vector space which, if you choose a basis, is actually isomorphic to the vector space of n-by-n matrices. Remember, if we choose a basis, the dual space is isomorphic to the n-by-1 matrices, or rather the 1-by-n matrices; we like to write them as rows rather than columns, which is nice because evaluating a linear functional on a vector becomes a matrix product. But here there is something new. Okay? This is where the interesting stuff begins; up to now it may look a little bit pedantic and useless. There is an interesting question. As long as we have just one argument, there is nothing more that we can do. But when we have two arguments, we can ask what happens if we switch them, because you have β(v, w), but you also have β(w, v). This leads to the notion of symmetric bilinear forms and the notion of alternating bilinear forms, and the latter will eventually lead us to the definition of the determinant. Definition: β in V^(2) is called symmetric if β(v, w) = β(w, v) for all v and w. It's called alternating if β(v, w) = -β(w, v) for all v and w. Okay? So the next lemma is that if we denote by V^(2)_sym, inside V^(2), the subset of symmetric bilinear forms, and by V^(2)_alt the subset of alternating bilinear forms, then in fact we get subspaces of V^(2): these are subspaces. By the way, notice what this means at the level of matrices; let's look at the two-by-two case. Both conditions compare β(e_1, e_2) with β(e_2, e_1), that is, the (1, 2) entry with the (2, 1) entry. β is symmetric if and only if these are equal; such matrices are called symmetric matrices. β is alternating if and only if the one equals minus the other; such matrices are called antisymmetric matrices. And you see that any two-by-two matrix can be written as a sum of a symmetric and an antisymmetric one. The next result, probably the last one we'll get to today, is the statement that V^(2) is actually a direct sum. But before I get to this, I want to give you a nice corollary of the definition of alternating forms: an equivalent definition. (How many lemmas have we had? I haven't been numbering the lemmas. Fine, Lemma 4? Okay, so then this is 3. Thank you.) So: V^(2)_alt actually consists of those β in V^(2) which have the property that β(v, v) = 0 for all v. Okay? So we can define alternating forms in two ways: either by saying that you get a sign if you switch the arguments, or by saying that if the two arguments are equal, you get zero. And how to prove it? You just write β(u + v, u + v); let's derive the first property from the second. On the one hand, this is 0. On the other hand, by bilinearity, this is β(u, u), which is zero, plus β(u, v), plus β(v, u), plus β(v, v), which is also zero. So you get that zero is equal to β(u, v) + β(v, u), which is equivalent to saying that β(u, v) = -β(v, u). And if you read it in the opposite direction (setting u equal to v in the alternating condition, so that β(v, v) = -β(v, v), and dividing by 2), you derive the second property from the first. All right, that's an equivalent way to put it. Finally, let me state the result, Lemma 5 (or "lemmma" with three m's, I suppose). Lemma 5 is that V^(2) is the direct sum of V^(2)_sym and V^(2)_alt. The proof is very simple.
I'll just write the crucial formula. It's really easy to show that the intersection is zero, because you cannot satisfy both conditions unless β is actually identically zero, equal to zero on every pair of vectors. What is not obvious is that every bilinear form can be written as a sum of a symmetric one and an alternating one. But here is the same trick that is used to prove that every function on the real line is a sum of an even and an odd function. That is to say, given β, we define α by the following formula.
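A sketch of the standard decomposition, by analogy with the even/odd trick just mentioned (the name ρ for the alternating part is my own labeling, not from the lecture):

\[
\alpha(v, w) = \tfrac{1}{2}\big(\beta(v, w) + \beta(w, v)\big),
\qquad
\rho(v, w) = \tfrac{1}{2}\big(\beta(v, w) - \beta(w, v)\big),
\]

so that α is symmetric, ρ is alternating, and β = α + ρ.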