Disclaimer

You are reminded to wait a couple of minutes every time you open the blog, so that it can typeset all the mathematical notation and symbols.

Sunday 30 August 2015

Further Pure | Chapter 9: Matrices and Linear Spaces

Curriculum Heading

  • recall and use the axioms of a linear (vector) space (restricted to spaces of finite dimension over the field of real numbers only)
  • understand the idea of linear independence, and determine whether a given set of vectors is dependent or independent
  • understand the idea of the subspace spanned by a given set of vectors;
  • recall that a basis for a space is a linearly independent set of vectors that spans the space, and determine a basis in simple cases;
  • recall that the dimension of a space is the number of vectors in a basis;
  • understand the use of matrices to represent linear transformations from ${R^n} \to {R^m}$
  • understand the terms ‘column space’, ‘row space’, ‘range space’ and ‘null space’, and determine the dimensions of, and bases for, these spaces in simple cases;
  • determine the rank of a square matrix, and use (without proof) the relation between the rank, the dimension of the null space and the order of the matrix;
  • use methods associated with matrices and linear spaces in the context of the solution of a set of linear equations;
  • evaluate the determinant of a square matrix and find the inverse of a non-singular matrix (2 × 2 and 3 × 3 matrices only), and recall that the columns (or rows) of a square matrix are independent if and only if the determinant is non-zero
  • understand the terms ‘eigenvalue’ and ‘eigenvector’, as applied to square matrices;
  • find eigenvalues and eigenvectors of 2 × 2 and 3 × 3 matrices (restricted to cases where the eigenvalues are real and distinct);
  • express a matrix in the form $QD{Q^{ - 1}}$ , where D is a diagonal matrix of eigenvalues and Q is a matrix whose columns are eigenvectors, and use this expression, e.g. in calculating powers of matrices.

Our work on vector spaces and eigenvectors requires an understanding of the fundamentals of the algebra of matrices. It is prudent, then, that we begin with it, seeing that you may not have met it in the O/A Level Mathematics course. Those of you who have done some work on matrices of order 2 x 2 in the Additional Mathematics course will also find this introduction to matrices indispensable.


Matrices


The following rectangular array:

$\begin{pmatrix} {a_{11}} & {a_{12}} & {a_{13}} & \cdots & {a_{1n}}\\ {a_{21}} & {a_{22}} & {a_{23}} & \cdots & {a_{2n}}\\ {a_{31}} & {a_{32}} & {a_{33}} & \cdots & {a_{3n}}\\ \vdots & \vdots & \vdots & \ddots & \vdots \\ {a_{m1}} & {a_{m2}} & {a_{m3}} & \cdots & {a_{mn}} \end{pmatrix}$


subject to multiplication and addition to be defined, is called a matrix of order $m \times n$. A matrix where m = n is called a square matrix. For a square matrix, the diagonal that runs from the top left element to the bottom right element is called the principal diagonal.


Upper (Lower) triangular Matrix


A square matrix whose elements below (or above) the principal diagonal are all zero is called an upper (or lower) triangular matrix.

$\left( {\begin{array}{*{20}{c}} {{a_{11}}}&{{a_{12}}}&{{a_{13}}}&{{a_{14}}}\\ 0&{{a_{22}}}&{{a_{23}}}&{{a_{24}}}\\ 0&0&{{a_{33}}}&{{a_{34}}}\\ 0&0&0&{{a_{44}}} \end{array}} \right)$


 is an upper triangular matrix of order 4. It should be noted that ${a_{ii}} \ne 0$; also, some of the elements lying above the principal diagonal may be zero, but not all of them.



Diagonal Matrix


A matrix whose only non-zero elements lie on the principal diagonal is called a diagonal matrix.

$\left( {\begin{array}{*{20}{c}} {{a_{11}}}&0&0\\ 0&{{a_{22}}}&0\\ 0&0&{{a_{33}}} \end{array}} \right)$


where ${a_{ii}} \ne 0$



Scalar Matrix


A diagonal matrix all of whose non-zero elements are equal is called a scalar matrix.

$\left( {\begin{array}{*{20}{c}} a&0&0\\ 0&a&0\\ 0&0&a \end{array}} \right)$

The scalar matrix whose diagonal entries are all 1 is of particular importance. It is called the identity matrix, for it is the multiplicative identity of matrix multiplication (an operation we have yet to define). In particular, for a matrix and its inverse, assuming the inverse exists, the following condition is satisfied:

$A \cdot {A^{ - 1}} = I$

where I is the identity matrix of the same order as A.

Equality of Matrices


For two matrices, A of order m x n and B of order p x q, it is the case that m = p, n = q, and ${a_{ij}} = {b_{ij}}$ for all values of i and j, if and only if:

$A = B$

Sum, Difference, and Scalar Multiplication


For two matrices of the same order, A and B, the sum, written A + B, is the matrix obtained by adding corresponding entries of A and B.


Ex. 

$\left( {\begin{array}{*{20}{c}} 3&2\\ 7&1\\ 8&{{\textstyle{3 \over 2}}} \end{array}} \right) + \left( {\begin{array}{*{20}{c}} 4&1\\ 0&{ - 2}\\ 7&1 \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 7&3\\ 7&{ - 1}\\ {15}&{{\textstyle{5 \over 2}}} \end{array}} \right)$

If we let A be a matrix and c be a scalar, the scalar product cA is the matrix obtained by multiplying every element of A by the scalar c.


$3\left( {\begin{array}{*{20}{c}} 3&1&5&7\\ 2&{{\textstyle{5 \over 3}}}&2&5 \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 9&3&{15}&{21}\\ 6&5&6&{15} \end{array}} \right)$


For the difference of two matrices, A - B, we perform the scalar multiplication (-1)B and find the sum of A and the new matrix obtained.
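Should you wish to check such arithmetic by machine, a short snippet in Python using the NumPy library (not part of your syllabus, and offered purely as a checking aid) reproduces the sum, scalar multiple and difference described above.

```python
# Checking the matrix addition, scalar multiplication and difference examples
# above with NumPy. This is only a verification aid, not examinable material.
import numpy as np

A = np.array([[3, 2], [7, 1], [8, 1.5]])
B = np.array([[4, 1], [0, -2], [7, 1]])

print(A + B)                                          # entry-wise sum, as in the worked example
print(3 * np.array([[3, 1, 5, 7], [2, 5/3, 2, 5]]))   # scalar multiplication
print(A - B)                                          # difference: A + (-1)*B
```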


It follows that the zero matrix is the additive identity, as it is not difficult to see that


$A + 0 = A$


where the zero matrix is of the order of A, and of the form:


$\begin{pmatrix} 0 & 0 & \cdots & 0\\ 0 & 0 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix}$



Matrix Multiplication


The method of multiplication is a little more complicated than you may have surmised from the previous definitions. Let me assure you that the method of matrix multiplication we will adopt is the most expedient and practical one.
For two matrices A and B, of orders m x n and p x q respectively, the product AB is defined only when n = p, and the resultant matrix is of order m x q. The element of the resultant matrix in the ith row and jth column is the dot product of the ith row of A and the jth column of B. Consider the following simple example:

$\left( {\begin{array}{*{20}{c}} 2&1\\ 3&7 \end{array}} \right) \times \left( {\begin{array}{*{20}{c}} 3&4\\ 9&1 \end{array}} \right) = \left( {\begin{array}{*{20}{c}} {2\left( 3 \right) + 1\left( 9 \right)}&{2\left( 4 \right) + 1\left( 1 \right)}\\ {3\left( 3 \right) + 7\left( 9 \right)}&{3\left( 4 \right) + 7\left( 1 \right)} \end{array}} \right) = \left( {\begin{array}{*{20}{c}} {15}&9\\ {72}&{19} \end{array}} \right)$
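A quick machine check of this row-by-column arithmetic, again using NumPy purely as a verification aid outside the syllabus:

```python
# Verifying the 2x2 product above with NumPy's matrix multiplication operator @.
import numpy as np

A = np.array([[2, 1], [3, 7]])
B = np.array([[3, 4], [9, 1]])
print(A @ B)   # [[15  9]
               #  [72 19]]
```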

 Rules Governing Operations on Matrices


Let A, B, and C be matrices, and let a and b be scalars. Assuming that each matrix has the order such that each operation is defined, then:

$\begin{array}{l} 1)\,A + B = B + A\\ 2)\,A + \left( {B + C} \right) = \left( {A + B} \right) + C\\ 3)\,A + 0 = A\\ 4)A + \left( { - A} \right) = 0\\ 5)A\left( {BC} \right) = \left( {AB} \right)C\\ 6)AI = A;IB = B\\ 7)A\left( {B + C} \right) = AB + AC\\ 8)\left( {B + C} \right)A = BA + CA\\ 9)a\left( {B + C} \right) = aB + aC\\ 10)\left( {a + b} \right)C = aC + bC\\ 11)\left( {ab} \right)C = a\left( {bC} \right)\\ 12)1A = A\\ 13)A0 = 0;0B = 0\\ 14)a0 = 0\\ 15)a\left( {AB} \right) = \left( {aA} \right)B = A\left( {aB} \right) \end{array}$
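A brief numerical illustration may help here: rules 2 and 5 can be checked for particular matrices, and one can also see that matrix multiplication is generally not commutative, i.e. AB need not equal BA. The matrices A, B and C below are arbitrary choices of mine, and NumPy is again only a checking aid.

```python
# Rules 2 (associativity of addition) and 5 (associativity of multiplication),
# plus a reminder that AB and BA are usually different matrices.
import numpy as np

A = np.array([[2, 1], [3, 7]])
B = np.array([[3, 4], [9, 1]])
C = np.array([[1, 0], [2, 5]])

print(np.array_equal(A + (B + C), (A + B) + C))   # True  (rule 2)
print(np.array_equal(A @ (B @ C), (A @ B) @ C))   # True  (rule 5)
print(np.array_equal(A @ B, B @ A))               # False for these A and B
```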


Inverses of Matrices


Not all matrices have inverses. Note that if A has the inverse B, i.e. A x B = I, it can be shown that B in turn has the inverse A, i.e. B x A = I. (Recall that matrix multiplication is not commutative, so AB need not equal BA. It has to be proved that if a left inverse and a right inverse both exist, i.e. AC = I and BA = I, then B = C. This is done rather easily, but we refrain from proving it at the moment.) Suppose A has order m x n and B has order p x q. For both of the above products to be defined, we need n = p for A x B = I, and q = m for B x A = I; hence one matrix is of order m x n and the other of order n x m, so that A x B is of order m x m and B x A of order n x n. For both to equal the same identity matrix I, we need m = n. As I must be a square matrix, it follows that only square matrices can have inverses, and that the inverse must be of the same order as the original matrix. Even then, not all square matrices have inverses. Those that do are called non-singular; those that do not are called singular. For example, take:

$A = \left( {\begin{array}{*{20}{c}} 2&{ - 1}\\ 0&1 \end{array}} \right);\;B = \left( {\begin{array}{*{20}{c}} {{\textstyle{1 \over 2}}}&{{\textstyle{1 \over 2}}}\\ 0&1 \end{array}} \right)$

B is the inverse of A, as you may check by multiplying them together.

$\left( {\begin{array}{*{20}{c}} 2&{ - 1}\\ 0&1 \end{array}} \right)\left( {\begin{array}{*{20}{c}} {\frac{1}{2}}&{\frac{1}{2}}\\ 0&1 \end{array}} \right) = \left( {\begin{array}{*{20}{c}} {2\left( {\frac{1}{2}} \right) + \left( { - 1} \right)\left( 0 \right)}&{2\left( {\frac{1}{2}} \right) + \left( { - 1} \right)\left( 1 \right)}\\ {0\left( {\frac{1}{2}} \right) + 1\left( 0 \right)}&{0\left( {\frac{1}{2}} \right) + \left( 1 \right)\left( 1 \right)} \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 1&0\\ 0&1 \end{array}} \right)$

It should also be noted that the inverse of a matrix is unique. Which is to say that in the above example A and B are inverses of each other, and no other matrix, not equal to them, can be an inverse of either. For suppose A has the inverses B and C; then B = BI = B(AC) = (BA)C = IC = C. This argument holds, obviously, for any number of inverses we assume A has.


Inverse of 2 x 2 matrix


$A = \left( {\begin{array}{*{20}{c}} 2&{ - 1}\\ 0&1 \end{array}} \right)$


For its inverse we will require its determinant. I will do an optional chapter on the determinant, detailing properties, applications, and so forth, though you will not need knowledge of determinants beyond that which is provided here.


The determinant of a 2 x 2 matrix $\left( {\begin{array}{*{20}{c}} {{a_1}}&{{a_2}}\\ {{a_3}}&{{a_4}} \end{array}} \right)$ is defined to be ${a_1}{a_4} - {a_2}{a_3}$


The determinant in the above case is $2(1) - ( - 1)(0) = 2$


The initial matrix is adjusted thusly: $\left( {\begin{array}{*{20}{c}} {{a_4}}&{ - {a_2}}\\ { - {a_3}}&{{a_1}} \end{array}} \right)$


In our example we have: $\left( {\begin{array}{*{20}{c}} 1&1\\ 0&2 \end{array}} \right)$


The inverse is then:


${A^{ - 1}} = \frac{1}{{{\mathop{\rm determinant}\nolimits} \;of\;A}}\left( {\begin{array}{*{20}{c}} 1&1\\ 0&2 \end{array}} \right) = \frac{1}{2}\left( {\begin{array}{*{20}{c}} 1&1\\ 0&2 \end{array}} \right) = \left( {\begin{array}{*{20}{c}} {\frac{1}{2}}&{\frac{1}{2}}\\ 0&1 \end{array}} \right)$


You can check by multiplying the initial matrix by the inverse:


$\left( {\begin{array}{*{20}{c}} 2&{ - 1}\\ 0&1 \end{array}} \right)\left( {\begin{array}{*{20}{c}} {{\textstyle{1 \over 2}}}&{{\textstyle{1 \over 2}}}\\ 0&1 \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 1&0\\ 0&1 \end{array}} \right)$


$\left( {\begin{array}{*{20}{c}} {{\textstyle{1 \over 2}}}&{{\textstyle{1 \over 2}}}\\ 0&1 \end{array}} \right)\left( {\begin{array}{*{20}{c}} 2&{ - 1}\\ 0&1 \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 1&0\\ 0&1 \end{array}} \right)$
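The whole procedure for a 2 x 2 matrix (swap the diagonal entries, negate the off-diagonal entries, divide by the determinant) is easily mechanised. The sketch below, with a helper function of my own naming, follows exactly the steps above and compares the result with NumPy's built-in inverse; none of this is examinable.

```python
# 2x2 inverse via the formula above: swap the diagonal, negate the
# off-diagonal, divide by the determinant a1*a4 - a2*a3.
import numpy as np

def inverse_2x2(M):
    a1, a2 = M[0]
    a3, a4 = M[1]
    det = a1 * a4 - a2 * a3
    if det == 0:
        raise ValueError("singular matrix: no inverse exists")
    return np.array([[a4, -a2], [-a3, a1]]) / det

A = np.array([[2, -1], [0, 1]])
print(inverse_2x2(A))       # [[0.5 0.5]
                            #  [0.  1. ]]
print(np.linalg.inv(A))     # same result, as a cross-check
```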


Inverse of 3 x 3


The method outlined in this section will be a rather tedious affair, and the task increases in its difficulty, as we gradually make our way through square matrices of higher order. Luckily, however, we have a marginally easier method at our disposal, which will not be developed until some time later.

The inverse of a 3 x 3 matrix can be calculated by going through four steps.

The steps are detailed using the matrix:

$\left( {\begin{array}{*{20}{c}} 3&0&2\\ 2&0&{ - 2}\\ 0&1&1 \end{array}} \right)$

We first calculate the matrix of minors. This step is onerous indeed. The minor of an element of the matrix is defined to be the determinant of the matrix obtained by removing the row and column wherein the element lies. For example, let us calculate the minor of the 3 in the top left corner of the matrix. This means we remove the row (3, 0, 2) and the column (3, 2, 0), and take the determinant of the matrix left behind, which is:


$\left( {\begin{array}{*{20}{c}} 0&{ - 2}\\ 1&1 \end{array}} \right)$


The minor of the 3 is then 2, as calculated with the definition of the determinant given above. We now do this for every element in the matrix. At the end we replace each element of the matrix by its minor, giving us the so-called matrix of minors. In this case the matrix of minors is:


$\left( {\begin{array}{*{20}{c}} 2&2&2\\ { - 2}&3&3\\ 0&{-10}&0 \end{array}} \right)$


Next we make the matrix of co-factors, which is found by changing the signs of the elements of the matrix of minors in an alternating fashion. Which is to say that we begin at the top left corner of the matrix (2 in our case), leaving it as it is. We move to the element to its right and change its sign (this 2 becomes -2). The next element in that row is left as is. We then jump to the second row. Here we change the sign of the first element (-2 becomes 2). The next element is untouched (3). The sign of the element next to it is changed (3 becomes -3). The third row is treated just as the first row was. This results in the following matrix of co-factors:


$\left( {\begin{array}{*{20}{c}} 2&{ - 2}&2\\ 2&3&{ - 3}\\ 0&{10}&0 \end{array}} \right)$

For an easier way to remember which elements change sign, consider the following matrix.


$\left( {\begin{array}{*{20}{c}} + & - & + \\ - & + & - \\ + & - & + \end{array}} \right)$


The elements which correspond to a plus sign in the above matrix retain their signs; those corresponding to a minus sign have their signs changed.


For the next step we create the adjoint of the matrix. To make this matrix we leave the elements on the principal diagonal of the matrix of co-factors untouched. Each remaining element is interchanged with the element that is its reflection through the principal diagonal. Which is to say that the first element in the second row (2) and the second element in the first row (-2) are interchanged. The first element in the third row (0) and the third element in the first row (2) are interchanged. Also, the second element in the third row (10) and the third element in the second row (-3) are interchanged. This process is called transposition. Another way (rather, I should say the definitive way) to look at it is that we have effectively interchanged every row with the corresponding column: the first row becomes the first column, the second row the second column, and so forth. The adjoint (or adjugate) is then:


$\left( {\begin{array}{*{20}{c}} 2&2&0\\ { - 2}&3&{10}\\ 2&{ - 3}&0 \end{array}} \right)$

Finally we divide the adjoint by the determinant of the initial 3 x 3 matrix. The determinant of the 3 x 3 matrix is defined (as previously iterated in the vectors chapter) as:


$\begin{array}{l} \det \left( {\begin{array}{*{20}{c}} {{a_{11}}}&{{a_{12}}}&{{a_{13}}}\\ {{a_{21}}}&{{a_{22}}}&{{a_{23}}}\\ {{a_{31}}}&{{a_{32}}}&{{a_{33}}} \end{array}} \right) = {a_{11}}\left| {\begin{array}{*{20}{c}} {{a_{22}}}&{{a_{23}}}\\ {{a_{32}}}&{{a_{33}}} \end{array}} \right| - {a_{12}}\left| {\begin{array}{*{20}{c}} {{a_{21}}}&{{a_{23}}}\\ {{a_{31}}}&{{a_{33}}} \end{array}} \right| + {a_{13}}\left| {\begin{array}{*{20}{c}} {{a_{21}}}&{{a_{22}}}\\ {{a_{31}}}&{{a_{32}}} \end{array}} \right|\\ = {a_{11}}\left( {{a_{22}}{a_{33}} - {a_{23}}{a_{32}}} \right) - {a_{12}}\left( {{a_{21}}{a_{33}} - {a_{23}}{a_{31}}} \right) + {a_{13}}\left( {{a_{21}}{a_{32}} - {a_{22}}{a_{31}}} \right) \end{array}$


Or you can understand it as multiplying the elements in the top row by their minors and then adding and subtracting them in a specific manner. For the matrix in consideration:


$\det \left( {\begin{array}{*{20}{c}} 3&0&2\\ 2&0&{ - 2}\\ 0&1&1 \end{array}} \right) = 3\left( {0\left( 1 \right) - \left( { - 2} \right)\left( 1 \right)} \right) - 0\left( {2\left( 1 \right) - \left( { - 2} \right)0} \right) + 2\left( {2\left( 1 \right) - 0\left( 0 \right)} \right) = 6 - 0 + 4 = 10$


Finally with all this information we can say that the inverse of the initial matrix is:


${\left( {\begin{array}{*{20}{c}} 3&0&2\\ 2&0&{ - 2}\\ 0&1&1 \end{array}} \right)^{ - 1}} = \frac{1}{{10}}\left( {\begin{array}{*{20}{c}} 2&2&0\\ { - 2}&3&{10}\\ 2&{ - 3}&0 \end{array}} \right) = \left( {\begin{array}{*{20}{c}} {0.2}&{0.2}&0\\ { - 0.2}&{0.3}&1\\ {0.2}&{ - 0.3}&0 \end{array}} \right)$
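For those who like to see the four steps (minors, co-factors, adjoint, division by the determinant) laid out mechanically, here is a sketch restricted to 3 x 3 matrices. The function name is my own invention, and NumPy is used only for array handling and a final cross-check; this is an illustration of the procedure above, not something you would write in an examination.

```python
# Minors -> cofactors -> adjugate -> inverse, for 3x3 matrices only.
import numpy as np

def inverse_3x3_by_adjugate(A):
    A = np.asarray(A, dtype=float)
    cofactors = np.zeros((3, 3))
    for i in range(3):
        for j in range(3):
            # minor: determinant of the 2x2 matrix left after deleting row i and column j
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            cofactors[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    det = A[0] @ cofactors[0]          # cofactor expansion along the first row
    if abs(det) < 1e-12:
        raise ValueError("singular matrix")
    adjugate = cofactors.T             # transpose of the matrix of cofactors
    return adjugate / det

A = [[3, 0, 2], [2, 0, -2], [0, 1, 1]]
print(inverse_3x3_by_adjugate(A))      # [[ 0.2  0.2  0. ]
                                       #  [-0.2  0.3  1. ]
                                       #  [ 0.2 -0.3  0. ]]
```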




Systems of Linear Equations


We define a linear equation in n variables (${x_1},{x_2},...,{x_n}$) as an equation of the form, or one that can be put in the form:

${a_1}{x_1} + {a_2}{x_2} + ... + {a_n}{x_n} = b$

where b is a constant and at least one of the ${a_i}$ is non-zero.

A system of linear equations, or a linear system, is a finite set of linear equations. A vector of n entries, $\left( {\begin{array} {*{20}{c}} {{s_1}}\\ {{s_2}}\\ \vdots \\ {{s_n}} \end{array}} \right)$, is a solution to a linear system if it satisfies every linear equation in that system.

Ex. The linear system:

$\begin{array}{l} 2x = y - 4z\\ y = 2z \end{array}$

is a system of two equations in three variables. One solution of the system is (-1, 2, 1). Another is (0, 0, 0), or (1/2, -1, -1/2). In fact, every vector of the form (-t, 2t, t) is a solution to this system. You will later see that systems with more variables than equations tend to have infinitely many solutions.

Elementary Row Operations, Augmented matrices and Gaussian Elimination 


It is assumed that you know how to solve such systems by algebraic means. Generally, you are only allowed the following algebraic operations when solving a system of linear equations:


  1. Multiply an equation in the system by a non-zero scalar.
  2. Interchange the positions of two equations in the system
  3. Replace an equation with the sum of itself and a multiple of another equation of the system. 

These can be adapted to matrices, to simplify our algebraic process. You will find the method developed here greatly advantageous when tackling linear systems of more than two equations. The method is introduced by first solving the equations by our conventional algebraic method and then drawing parallels thence.


Ex. $\begin{array}{l} - y + z = 3\\ x - y - z = 0\\ - x - z = - 3 \end{array}$


Sol.


We begin the solution by exchanging the first and second equations:


$\begin{array}{*{20}{l}} {x - y - z = 0}\\ { - y + z = 3}\\ { - x - z = - 3} \end{array}$


We then add the first equation to the last equation:


$\begin{array}{*{20}{l}} {x - y - z = 0}\\ { - y + z = 3}\\ { - y - 2z = - 3} \end{array}$


We multiply the second equation by (-1) and then add it to the other equations yielding:


$\begin{array}{*{20}{l}} {x - 2z = - 3}\\ {y - z = - 3}\\ { - 3z = - 6} \end{array}$


Therefore z = 2. Putting this into the other equations we find:


$\begin{array}{l} x = 1\\ y =  - 1\\ z = 2 \end{array}$

Now to introduce matrices into our study of linear algebra. It is not difficult to see that the linear system in the previous example is equivalent to the following equation:

$\left( {\begin{array}{*{20}{c}} 0&{ - 1}&1\\ 1&{ - 1}&{ - 1}\\ { - 1}&0&{ - 1} \end{array}} \right)\left( {\begin{array}{*{20}{c}} x\\ y\\ z \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 3\\ 0\\ { - 3} \end{array}} \right)$


Using this we may construct an augmented matrix of this system. If we keep in mind that all elements in the first column are coefficients of x, those in the second of y, and those in the third of z, we may effectively ignore the column vector of the variables; it is also conventional to disregard the equals sign and draw a dotted vertical line in its stead. It must be noted that the signs must be retained, and a zero put wherever a variable is missing. The augmented matrix of this system is then:


$\left( {\begin{array}{*{20}{c}} 0&{ - 1}&1\\ 1&{ - 1}&{ - 1}\\ { - 1}&0&{ - 1} \end{array}\begin{array}{*{20}{c}} |\\ |\\ | \end{array}\begin{array}{*{20}{c}} 3\\ 0\\ { - 3} \end{array}} \right)$


One should be cognisant of the fact that these two objects are not equal. Rather, the latter matrix can be understood as a shorthand for the linear system that the matrix equation represents, which makes it easier to perform the algebraic operations we would have performed on the linear system.


We can use the three operations we defined above here as well. However, they will be amended slightly, i.e. the word "equation" will be replaced by "row". Hence these three operations will henceforth be dubbed "Elementary Row Operations". The elementary row operations are:


  1. Multiply a row in the system by a non-zero scalar.
  2. Interchange the positions of two rows in the system
  3. Replace a row with the sum of itself and a multiple of another row of the system. 

Care should be taken that each operation is performed on the entire row, including the element on the other side of the dotted vertical line. These operations are not restricted to augmented matrices; you may go forth and practise them on any matrix. This, too, has applications, as we will see in due time. Let us go through some definitions before I use the elementary row operations to solve this system of equations. Each of these operations is invertible, and its inverse is an elementary row operation in its own right; e.g. scaling a row by 2 is reversed by scaling the resultant row by $\frac{1}{2}$.


To use elementary row operations on a matrix is to reduce it.


Two matrices are said to be row equivalent if one can be obtained from the other by a finite sequence of elementary row operations. $B\mathop  \sim \limits^R A$ is the notation used for this.


A matrix is said to be in echelon form, or to be an echelon matrix, if it has been row reduced to the following structure: the first non-zero element (if any) in every row is 1, and the number of zeros preceding this 1 is greater than the corresponding number of zeros in the preceding row. This 1 is of some importance in linear algebra; hence, to avoid any tedium in referring to it in future, let us designate it the leading one. The following is in echelon form.


$\left( {\begin{array}{*{20}{c}} 0&1&2&3\\ 0&0&1&{ - 1}\\ 0&0&0&0 \end{array}} \right)$


A matrix is said to be in reduced echelon form if it is in echelon form and the leading one of each row is the only non-zero element in its column. The following is in reduced echelon form.


$\left( {\begin{array}{*{20}{c}} 1&3&0&0\\ 0&0&1&0\\ 0&0&0&1 \end{array}} \right)$


Now let us solve the linear system. The procedure is to reduce our augmented matrix to either echelon or reduced echelon form, wherefrom it is easy to work out the answer. This is called Gaussian elimination. We will use the exact same operations we used when solving this system by algebraic means.


$\begin{array}{l} \left( {\begin{array}{*{20}{c}} 0&{ - 1}&1\\ 1&{ - 1}&{ - 1}\\ { - 1}&0&{ - 1} \end{array}\begin{array}{*{20}{c}} |\\ |\\ | \end{array}\begin{array}{*{20}{c}} 3\\ 0\\ { - 3} \end{array}} \right) \Rightarrow \left( {\begin{array}{*{20}{c}} 1&{ - 1}&{ - 1}\\ 0&{ - 1}&1\\ { - 1}&0&{ - 1} \end{array}\begin{array}{*{20}{c}} |\\ |\\ | \end{array}\begin{array}{*{20}{c}} 0\\ 3\\ { - 3} \end{array}} \right)\\ \left( {\begin{array}{*{20}{c}} 1&{ - 1}&{ - 1}\\ 0&{ - 1}&1\\ { - 1 + 1}&{0 + \left( { - 1} \right)}&{ - 1 + \left( { - 1} \right)} \end{array}\begin{array}{*{20}{c}} |\\ |\\ | \end{array}\begin{array}{*{20}{c}} 0\\ 3\\ { - 3 + 0} \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 1&{ - 1}&{ - 1}\\ 0&{ - 1}&1\\ 0&{ - 1}&{ - 2} \end{array}\begin{array}{*{20}{c}} |\\ |\\ | \end{array}\begin{array}{*{20}{c}} 0\\ 3\\ { - 3} \end{array}} \right)\\ \left( {\begin{array}{*{20}{c}} 1&{ - 1}&{ - 1}\\ 0&{ - 1}&1\\ 0&{ - 1}&{ - 2} \end{array}\begin{array}{*{20}{c}} |\\ |\\ | \end{array}\begin{array}{*{20}{c}} 0\\ 3\\ { - 3} \end{array}} \right) \Rightarrow \left( {\begin{array}{*{20}{c}} {1 - 0}&{ - 1 - \left( { - 1} \right)}&{ - 1 - 1}\\ { - 1\left( 0 \right)}&{ - 1\left( { - 1} \right)}&{ - 1\left( 1 \right)}\\ 0&{ - 1 - \left( { - 1} \right)}&{ - 2 - 1} \end{array}\begin{array}{*{20}{c}} |\\ |\\ | \end{array}\begin{array}{*{20}{c}} {0 - 3}\\ { - 3}\\ { - 3 - 3} \end{array}} \right)\\ = \left( {\begin{array}{*{20}{c}} 1&0&{ - 2}\\ 0&1&{ - 1}\\ 0&0&{ - 3} \end{array}\begin{array}{*{20}{c}} |\\ |\\ | \end{array}\begin{array}{*{20}{c}} { - 3}\\ { - 3}\\ { - 6} \end{array}} \right) \end{array}$



Where we have used an arrow, we have applied one of the operations used before. In your examination you will be required to note which operation you have used. For example, in the first operation we interchanged the first and second rows. A shorthand for this is ${R_1} \Leftrightarrow {R_2}$, or ${R_{12}}$; both stand for the same thing.



We can now translate this augmented matrix back into the linear system form. This augmented matrix corresponds to the following linear system:

$\begin{array}{*{20}{l}} {x - 2z = - 3}\\ {y - z = - 3}\\ { - 3z = - 6} \end{array}$


Now you can easily find the solution. Let us do another example:


Ex.  Solve the system:


$\begin{array}{l} {x_1} - {x_2} + 2{x_3} = 0\\ 4{x_1} + {x_2} + 2{x_3} = 1\\ {x_1} + {x_2} + {x_3} = - 1 \end{array}$


We construct the augmented matrix thusly:


$\left( {\begin{array}{*{20}{c}} 1&{ - 1}&2&|&0\\ 4&1&2&|&1\\ 1&1&1&|&{ - 1} \end{array}} \right)$


The following row operations then lead us to the row reduced echelon form:




  1.  add -4 times the 1st row to the 2nd row
  2.  add -1 times the 1st row to the 3rd row
  3.  multiply the 2nd row by 1/5
  4.  add -2 times the 2nd row to the 3rd row
  5.  multiply the 3rd row by 5/7
  6.  add 6/5 times the 3rd row to the 2nd row
  7.  add -2 times the 3rd row to the 1st row
  8.  add 1 times the 2nd row to the 1st row

An equivalent system of equations in echelon form (one of several possible, depending on the order of operations chosen) is:

$\begin{array}{l} {x_1} + {\textstyle{1 \over 4}}{x_2} + {\textstyle{1 \over 2}}{x_3} = {\textstyle{1 \over 4}}\\ \quad \quad \,{\kern 1pt} {x_2} - {\textstyle{6 \over 5}}{x_3} = {\textstyle{1 \over 5}}\\ \quad \;{\kern 1pt} \quad \quad \;{\kern 1pt} {\kern 1pt} {\textstyle{{14} \over {10}}}{x_3} = {\textstyle{{-14} \over {10}}} \end{array}$

Solving these equations, we find that ${x_3} = {x_2} =  - 1;\,{x_1} = 1$
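As an out-of-syllabus check, NumPy's linear solver returns the same answer for this system; agreement with the hand working is all we are after.

```python
# Checking the last example: the solver should return x1 = 1, x2 = x3 = -1.
import numpy as np

A = np.array([[1, -1, 2],
              [4, 1, 2],
              [1, 1, 1]])
b = np.array([0, 1, -1])
print(np.linalg.solve(A, b))   # [ 1. -1. -1.]
```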


An excellent website for you to check your work is this. The row operations for the last question I copied thence, because I accidentally deleted the work I had done myself. However, do not simply have it do all the calculations for you; this rather consuming piece of work is something you should practise plenty. Resort to it to confirm your answers if you have no other avenues available.


In the above example, it was the case that the system had a unique solution. The following examples highlight the working one can be expected to perform should the system have either no solutions or an infinite number of them.


Ex.


Consider the system of equations:


$\begin{array}{l} {x_1} + {x_2} + {x_3} + {x_4} + {x_5} = 3\\ 2{x_1} + {x_2} + {x_3} + {x_4} + 2{x_5} = 4\\ {x_1} - {x_2} - {x_3} + {x_4} + {x_5} = 5\\ {x_1} + {x_4} + {x_5} = 4 \end{array}$


We produce the augmented matrix from the system and row reduce it to find the following matrix:


\[\left( {\begin{array}{*{20}{c}} 1&0&0&0&1&1\\ 0&1&1&0&0&{ - 1}\\ 0&0&0&1&0&3\\ 0&0&0&0&0&0 \end{array}} \right)\]



We find the equivalent system of equations:


$\begin{array}{l} {x_1} + {x_5} = 1\\ {x_2} + {x_3} = - 1\\ {x_4} = 3 \end{array}$ 


Now I aver that there are an infinite number of solutions to this system, an assertion that you may very well contest, for which reason I provide the following argument. We have found $x_4 = 3$, which means that any vector satisfying this system must have 3 as its entry in the fourth row. But what more can we say about a vector that satisfies this system? One thing we know is that the sum of $x_1$ and $x_5$ is 1. Any pair of numbers satisfying this equation can be the respective entries of a solution vector. How do we find such values, would be a natural rejoinder. One possible strategy is to let $x_5$ be any number, say a variable $\zeta $; then $x_1$ is simply the expression $1 - \zeta $. So we can take $\zeta$ to be any number, but then $x_1$ must be $1 - \zeta$. You can see, then, that $\zeta$ is a "free" quantity that acts as a "parameter" for the system, and that our choice of $x_1$ is constrained by it. For this very reason, $\zeta$ is called a free parameter, and $x_1$ is said to be constrained. Similar reasoning with the other equation leads us to ${x_3} = \xi $ and ${x_2} =  - 1 - \xi $.


Having done this, we usually move on to converting this cartesian description back to a parametric vector description. The latter is more helpful seeing that the solution to the system is a vector. To find the parametric vector form we note:


$\left( {\begin{array}{*{20}{c}} {{x_1}}\\ {{x_2}}\\ {{x_3}}\\ {{x_4}}\\ {{x_5}} \end{array}} \right) = \left( {\begin{array}{*{20}{c}} {1 - \zeta }\\ { - 1 - \xi }\\ \xi \\ 3\\ \zeta \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 1\\ { - 1}\\ 0\\ 3\\ 0 \end{array}} \right) + \zeta \left( {\begin{array}{*{20}{c}} { - 1}\\ 0\\ 0\\ 0\\ 1 \end{array}} \right) + \xi \left( {\begin{array}{*{20}{c}} 0\\ { - 1}\\ 1\\ 0\\ 0 \end{array}} \right)$
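If you remain sceptical, the following snippet (NumPy again, purely illustrative) substitutes the parametric vector above back into the original system for a few arbitrary choices of $\zeta$ and $\xi$, and confirms that every such choice yields a genuine solution.

```python
# The parametric solution: particular vector plus any multiples of the two
# direction vectors should satisfy A x = b for every choice of the parameters.
import numpy as np

A = np.array([[1, 1, 1, 1, 1],
              [2, 1, 1, 1, 2],
              [1, -1, -1, 1, 1],
              [1, 0, 0, 1, 1]])
b = np.array([3, 4, 5, 4])

particular = np.array([1, -1, 0, 3, 0])
d_zeta = np.array([-1, 0, 0, 0, 1])
d_xi = np.array([0, -1, 1, 0, 0])

for zeta, xi in [(0, 0), (2.5, -1), (-7, 4)]:
    x = particular + zeta * d_zeta + xi * d_xi
    print(np.allclose(A @ x, b))   # True for every pair (zeta, xi)
```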


Hence, for every pair $\left( {\zeta ,\xi } \right)$ you can find a different solution to the system. As there are infinitely many such pairs, we conclude that there are infinitely many solutions. But perhaps a more general proof, under the guise of formality, will better sate your curiosity. So let a system of equations be Ax = B. Let us further assume that we know at least two distinct solutions of the system, s and t. Hence At = B and As = B. Then let c be a vector such that c = s + $\lambda $(t - s), where lambda is a free parameter. Then:


$Ac = A\left( {s + \lambda \left( {t - s} \right)} \right) = A\left( s \right) + A\left( {\lambda \left( {t - s} \right)} \right) = B + \lambda A\left( {t - s} \right) = B + \lambda \left( {At - As} \right) = B + \lambda \left( {B - B} \right) = B$


Hence c is also a solution of the system. As c contains a free parameter, and t - s is not the zero vector, there exist infinitely many different c's. Hence we conclude that if a system admits more than one solution, it must admit an infinite number of solutions.


Rank of a Matrix:



The rank of a matrix is defined to be the number of leading ones (as defined above) a matrix has in its (reduced) row echelon form.


Ex:


$\begin{pmatrix} 3 & 2 &-17 \\ 6 & 8 &9 \\ 1 & 31 & 4 \end{pmatrix}$ 


The above matrix can be, with minimal effort, brought into the row echelon form:


$\begin{pmatrix} 1 & \frac{2}{3} &\frac{-17}{3} \\ 0 & 1 & \frac{43}{4} \\ 0 & 0 & 1 \end{pmatrix}$ 


Obviously the rank of this matrix is 3. The common notation for the rank is $\rho \left( A \right)$, where A is the matrix.


It is obvious that the rank cannot exceed either the number of rows or the number of columns. This imposes the following restriction on the rank:


$\rho \left( A \right) \le \min \left( {m,n} \right)$ where m is the number of rows and n the number of columns a matrix has, and $\min \left( {{x_1},{x_2},...,{x_n}} \right)$ is the minimum of $x_i$
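For what it is worth, NumPy can report the rank directly, and it agrees with the count of leading ones found by hand above; treat this only as a way of checking your row reduction.

```python
# Rank of the example matrix: 3, matching the three leading ones found by hand.
import numpy as np

A = np.array([[3, 2, -17],
              [6, 8, 9],
              [1, 31, 4]])
print(np.linalg.matrix_rank(A))   # 3
```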



Cramer's Rule (Optional)

 

Cramer's Rule is a rather handy tool when one only requires the value of a specific variable of a linear system and doesn't want to solve the entire system. Because it requires determinants, it becomes a great deal more tedious when used for systems that have more than 3 variables. Also, it only works if you have the same number of equations as variables. Consider the following example that uses Cramer's Rule to solve a linear system.

Ex. Solve:

$\begin{array}{l} 2x + y + z = 3\\ x - y - z = 0\\ x + 2y + z = 0 \end{array}$

The matrix that only contains the coefficients of the variables of the system is (it comprises the part of an augmented matrix that lies on the left side of the dotted line):

$\left( {\begin{array}{*{20}{c}} 2&1&1\\ 1&{ - 1}&{ - 1}\\ 1&2&1 \end{array}} \right)$  

The column made of the constants on the right-hand side of the system is:

$\left( {\begin{array}{*{20}{c}} 3\\ 0\\ 0 \end{array}} \right)$


We let D be the determinant of the coefficient matrix:


$\det \left( {\begin{array}{*{20}{c}} 2&1&1\\ 1&{ - 1}&{ - 1}\\ 1&2&1 \end{array}} \right) = 3$ 


We let ${D_x}$ be the determinant of the matrix made by replacing the x-column of the coefficient matrix with the column of constants.


$\det \left( {\begin{array}{*{20}{c}} 3&1&1\\ 0&{ - 1}&{ - 1}\\ 0&2&1 \end{array}} \right) = 3$ 


${D_y}$ is then the determinant of the matrix constructed by putting the column of constants in for the y-column of the coefficient matrix.


$\det \left( {\begin{array}{*{20}{c}} 2&3&1\\ 1&0&{ - 1}\\ 1&0&1 \end{array}} \right) = -6$  

Similarly the ${D_z}$ is:

$\det \left( {\begin{array}{*{20}{c}} 2&1&3\\ 1&{ - 1}&0\\ 1&2&0 \end{array}} \right) = 9$

Cramer's Rule then concludes that:

 $\begin{array}{l} x = \frac{{{D_x}}}{D} = \frac{3}{3} = 1\\ y = \frac{{{D_y}}}{D} = \frac{{ - 6}}{3} =  - 2\\ z = \frac{{{D_z}}}{D} = \frac{9}{3} = 3 \end{array}$

Cramer's Rule is optional, for the examiner doesn't assume that the candidate has knowledge of it. Also, it isn't important to the other topics that do feature in your syllabus. It is, however, an efficient tool to tally your answers, and one I suggest you keep in your arsenal.
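Since Cramer's Rule is precisely the sort of thing one uses to tally answers, here is a minimal sketch of it in Python. The helper function is my own naming, NumPy supplies the determinants, and the example is the system solved above.

```python
# Cramer's Rule: each unknown is D_i / D, where D_i is the determinant of the
# coefficient matrix with its i-th column replaced by the column of constants.
import numpy as np

def cramer(A, b):
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    D = np.linalg.det(A)
    if abs(D) < 1e-12:
        raise ValueError("Cramer's Rule needs a non-singular coefficient matrix")
    solution = []
    for i in range(A.shape[1]):
        Ai = A.copy()
        Ai[:, i] = b                  # swap the i-th column for the constants
        solution.append(np.linalg.det(Ai) / D)
    return solution

A = [[2, 1, 1], [1, -1, -1], [1, 2, 1]]
b = [3, 0, 0]
print(cramer(A, b))   # approximately [1.0, -2.0, 3.0]
```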

Method of Inverses


We have already seen how a linear system can be shown to be equivalent to a matrix equation of the form Ax = B. Here A is the coefficient matrix and B is a column vector made of the constants on the right-hand sides of the equations. Written thusly, one is given to think of a deceptively simple solution to the system. That is,

${A^{ - 1}}Ax = {A^{ - 1}}B \Rightarrow x = {A^{ - 1}}B$


This method does work, but only in a limited number of cases, the obvious condition being that the inverse of A must exist. Then it is a matter of multiplying B from the left by the inverse of A.


You know how to work out inverses in the cases of 2 x 2 and 3 x 3 matrices. Those methods can be extended to matrices of greater order as well; however, below is developed a new method of finding the inverse. It is applicable to square matrices of all orders, and is a great deal simpler for matrices of order 4 and greater. Consider the following argument.


Ex. Find the inverse of


$A = \left( {\begin{array}{*{20}{c}} 1&1&1\\ 0&2&1\\ 1&0&1 \end{array}} \right)$


The inverse of A should be of the order of A and satisfy the following equation:


$\left( {\begin{array}{*{20}{c}} 1&1&1\\ 0&2&1\\ 1&0&1 \end{array}} \right)\left( {\begin{array}{*{20}{c}} {{x_{11}}}&{{x_{12}}}&{{x_{13}}}\\ {{x_{21}}}&{{x_{22}}}&{{x_{23}}}\\ {{x_{31}}}&{{x_{32}}}&{{x_{33}}} \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 1&0&0\\ 0&1&0\\ 0&0&1 \end{array}} \right)$   
   
Inspection of this equation shows that, according to the rules of matrix multiplication, it can be separated into three simpler systems, each of a form we know how to solve by row reduction. That is:

$\begin{array}{l} \left( {\begin{array}{*{20}{c}} 1&1&1\\ 0&2&1\\ 1&0&1 \end{array}} \right)\left( {\begin{array}{*{20}{c}} {{x_{11}}}&{{x_{12}}}&{{x_{13}}}\\ {{x_{21}}}&{{x_{22}}}&{{x_{23}}}\\ {{x_{31}}}&{{x_{32}}}&{{x_{33}}} \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 1&0&0\\ 0&1&0\\ 0&0&1 \end{array}} \right)\\ \left( {\begin{array}{*{20}{c}} 1&1&1\\ 0&2&1\\ 1&0&1 \end{array}} \right)\left( {\begin{array}{*{20}{c}} {{x_{11}}}\\ {{x_{21}}}\\ {{x_{31}}} \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 1\\ 0\\ 0 \end{array}} \right) \Rightarrow \left( {\begin{array}{*{20}{c}} 1&1&1&|&1\\ 0&2&1&|&0\\ 1&0&1&|&0 \end{array}} \right)\mathop \sim \limits^R \left( {\begin{array}{*{20}{c}} 1&0&0&|&2\\ 0&1&0&|&1\\ 0&0&1&|&{ - 2} \end{array}} \right) \Rightarrow \left( {\begin{array}{*{20}{c}} {{x_{11}}}\\ {{x_{21}}}\\ {{x_{31}}} \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 2\\ 1\\ { - 2} \end{array}} \right)\\ \left( {\begin{array}{*{20}{c}} 1&1&1\\ 0&2&1\\ 1&0&1 \end{array}} \right)\left( {\begin{array}{*{20}{c}} {{x_{12}}}\\ {{x_{22}}}\\ {{x_{32}}} \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 0\\ 1\\ 0 \end{array}} \right) \Rightarrow \left( {\begin{array}{*{20}{c}} 1&1&1&|&0\\ 0&2&1&|&1\\ 1&0&1&|&0 \end{array}} \right)\mathop \sim \limits^R \left( {\begin{array}{*{20}{c}} 1&0&0&|&{ - 1}\\ 0&1&0&|&0\\ 0&0&1&|&1 \end{array}} \right) \Rightarrow \left( {\begin{array}{*{20}{c}} {{x_{12}}}\\ {{x_{22}}}\\ {{x_{32}}} \end{array}} \right) = \left( {\begin{array}{*{20}{c}} { - 1}\\ 0\\ 1 \end{array}} \right)\\ \left( {\begin{array}{*{20}{c}} 1&1&1\\ 0&2&1\\ 1&0&1 \end{array}} \right)\left( {\begin{array}{*{20}{c}} {{x_{13}}}\\ {{x_{23}}}\\ {{x_{33}}} \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 0\\ 0\\ 1 \end{array}} \right) \Rightarrow \left( {\begin{array}{*{20}{c}} 1&1&1&|&0\\ 0&2&1&|&0\\ 1&0&1&|&1 \end{array}} \right)\mathop \sim \limits^R \left( {\begin{array}{*{20}{c}} 1&0&0&|&{ - 1}\\ 0&1&0&|&{ - 1}\\ 0&0&1&|&2 \end{array}} \right) \Rightarrow \left( {\begin{array}{*{20}{c}} {{x_{13}}}\\ {{x_{23}}}\\ {{x_{33}}} \end{array}} \right) = \left( {\begin{array}{*{20}{c}} { - 1}\\ { - 1}\\ 2 \end{array}} \right) \end{array}$

The inverse is then simply:


${\left( {\begin{array}{*{20}{c}} 1&1&1\\ 0&2&1\\ 1&0&1 \end{array}} \right)^{ - 1}} = \left( {\begin{array}{*{20}{c}} 2&{ - 1}&{ - 1}\\ 1&0&{ - 1}\\ { - 2}&1&2 \end{array}} \right)$


This method, however, contains some inefficiencies. As you recall, the Gaussian elimination method requires us to reduce what is on the left-hand side of the dotted line to echelon form, or, as I have done in this case, reduced echelon form. You should also recall that the operations, and the order in which they are applied, depend only on what lies to the left of the dotted line; what lies on the right is immaterial to the order and choice of operations. In the above method we solved three different linear systems, but the order and choice of operations were the same for all three. We can therefore simplify this method of inverting a matrix. We construct the following matrix:


$\left( {\begin{array}{*{20}{c}} 1&1&1&|&1&0&0\\ 0&2&1&|&0&1&0\\ 1&0&1&|&0&0&1 \end{array}} \right)$

  
Our task is then to reduce the matrix to the left of the line to the identity matrix. What we are left with on the right is the inverse. In this case:

$\left( {\begin{array}{*{20}{c}} 1&0&0&|&2&{ - 1}&{ - 1}\\ 0&1&0&|&1&0&{ - 1}\\ 0&0&1&|&{ - 2}&1&2 \end{array}} \right)$


In symbols let me summarise what we have done here: $\left( {A|I} \right)\mathop  \sim \limits^R \left( {I|{A^{ - 1}}} \right)$  
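The summary $\left( {A|I} \right)\mathop  \sim \limits^R \left( {I|{A^{ - 1}}} \right)$ translates very directly into a procedure. Below is a minimal sketch of it, assuming NumPy for the array work and making no claim to numerical sophistication; it simply applies the three elementary row operations until the left block becomes the identity, and it reproduces the inverse found above.

```python
# (A | I) -> (I | A^{-1}) by elementary row operations.
import numpy as np

def inverse_by_row_reduction(A):
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    M = np.hstack([A, np.eye(n)])            # the augmented matrix (A | I)
    for col in range(n):
        pivot = col + np.argmax(np.abs(M[col:, col]))   # pick a usable pivot row
        if abs(M[pivot, col]) < 1e-12:
            raise ValueError("singular matrix")
        M[[col, pivot]] = M[[pivot, col]]    # operation 2: interchange rows
        M[col] /= M[col, col]                # operation 1: scale to get a leading 1
        for row in range(n):
            if row != col:                   # operation 3: clear the rest of the column
                M[row] -= M[row, col] * M[col]
    return M[:, n:]                          # the right-hand block is the inverse

A = [[1, 1, 1], [0, 2, 1], [1, 0, 1]]
print(inverse_by_row_reduction(A))
# [[ 2. -1. -1.]
#  [ 1.  0. -1.]
#  [-2.  1.  2.]]
```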



Introduction to Vector Spaces


Note: A point of contention that this topic tends to engender is that of similarities (or dissimilarities) with the topic of vectors we have already studied, or that which a student meets in his study of O/A Level Physics. The point of note here is that physicists insist upon a geometrical interpretation of vectors, for, among other applications, they need to study the geometry of systems. The mathematician, on the other hand, has little use for analogies drawn from the real world, when the study of vectors can be firmly founded on the edifice of abstraction. This we will see shortly. One can even argue that this is essentially the difference between pure and applied mathematicians, though in reality that would be an oversimplified view of the matter.

I shall begin with the simple admonition that the "vector" in Vector Space doesn't specifically refer to the vectors of the Euclidean spaces that you are already aware of (Chapter 7). However, those vectors, and the Euclidean spaces they exist in, do actually constitute Vector Spaces, as you will come to see, and you will work almost exclusively with them.


A Vector Space over a field is a non-empty set of objects, referred to as vectors, which abide by certain rules or axioms involving two operations defined on them. The elements that belong to the field are called scalars. For your examination the field will be the field of real numbers. You are not required to know what a field is; however, you should appreciate the fact that all the scalars must be real, which is to say, have no imaginary component.


Defined on these vectors are two operations that are named familiarly. The first is vector addition, or simply addition. It takes two vectors in the space, say $v$ and $w$, and assigns to them another vector, commonly written $v + w$. The second is called scalar multiplication, which takes a scalar from the field, $\alpha$, and a vector from the space, $v$, and assigns to them another vector, $\alpha v$. What is of great import is that both $v + w$ and $\alpha v$ must belong to the space in question. This property is called closure, or specifically closure under vector addition and scalar multiplication. One can distinguish between vectors and scalars by the fact that vectors are written either in boldface or as $\bar v$. I will use the second convention henceforth, on account of it being easier to work with.


The axioms are as follows, furnished with examples using vectors the reader is already familiar with, namely vectors in $R^3$, with vector addition and scalar multiplication as defined in chapter 7. Again I caution you that vector spaces deal with abstract objects; I use the familiar vectors because they are easy to work with, and I cannot introduce a new vector space without defining one to begin with.



  1. Associativity of Addition
                 $\bar u + (\bar v + \bar w) = (\bar u + \bar v) + \bar w$
                 eg. $\left( {\begin{array}{*{20}{c}} 2\\ 1\\ 3 \end{array}} \right) + \left( {\left( {\begin{array}{*{20}{c}} 1\\ 1\\ 1 \end{array}} \right) + \left( {\begin{array}{*{20}{c}} 3\\ 2\\ 0 \end{array}} \right)} \right) = \left( {\left( {\begin{array}{*{20}{c}} 2\\ 1\\ 3 \end{array}} \right) + \left( {\begin{array}{*{20}{c}} 1\\ 1\\ 1 \end{array}} \right)} \right) + \left( {\begin{array}{*{20}{c}} 3\\ 2\\ 0 \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 6\\ 4\\ 4 \end{array}} \right)$

The space is closed under this addition, as:


                      $\left( {\begin{array}{*{20}{c}} 6\\ 4\\ 4 \end{array}} \right) \in {R^3}$


     2. Commutativity of Addition


                $\bar u + \bar v = \bar v + \bar u$

                eg $\left( {\begin{array}{*{20}{c}} 3\\ 5\\ 1 \end{array}} \right) + \left( {\begin{array}{*{20}{c}} 2\\ 1\\ 2 \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 2\\ 1\\ 2 \end{array}} \right) + \left( {\begin{array}{*{20}{c}} 3\\ 5\\ 1 \end{array}} \right)$

     3. Existence of a zero vector


               $\bar u + 0 = 0 + \bar u = \bar u$

           
               eg. $\left( {\begin{array}{*{20}{c}} 0\\ 0\\ 0 \end{array}} \right) \in {R^3}$

     4. Existence of additive inverse


              $\bar u + (- \bar u) = (-\bar u) + \bar u = 0$

           
              eg $\left( {\begin{array}{*{20}{c}} 3\\ 2\\ 6 \end{array}} \right) + \left( {\begin{array}{*{20}{c}} { - 3}\\ { - 2}\\ { - 6} \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 0\\ 0\\ 0 \end{array}} \right)$

      5. Distributive Laws

           
              $\alpha \left( {\bar u + \bar v} \right) = \alpha \bar u + \alpha \bar v$
              $\left( {\alpha  + \beta } \right)\bar u = \alpha \bar u + \beta \bar u$
           
              eg. $\begin{array}{l} 5\left( {\left( {\begin{array}{*{20}{c}} 1\\ 2\\ { - 3} \end{array}} \right) + \left( {\begin{array}{*{20}{c}} 2\\ 2\\ 2 \end{array}} \right)} \right) = 5\left( {\begin{array}{*{20}{c}} 1\\ 2\\ { - 3} \end{array}} \right) + 5\left( {\begin{array}{*{20}{c}} 2\\ 2\\ 2 \end{array}} \right) = \left( {\begin{array}{*{20}{c}} {15}\\ {20}\\ { - 5} \end{array}} \right)\\ (2 + 3)\left( {\begin{array}{*{20}{c}} 1\\ 1\\ 1 \end{array}} \right) = 2\left( {\begin{array}{*{20}{c}} 1\\ 1\\ 1 \end{array}} \right) + 3\left( {\begin{array}{*{20}{c}} 1\\ 1\\ 1 \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 5\\ 5\\ 5 \end{array}} \right) \end{array}$

You should also note the closure:


$\left( {\begin{array}{*{20}{c}} {15}\\ {20}\\ { - 5} \end{array}} \right);\left( {\begin{array}{*{20}{c}} 5\\ 5\\ 5 \end{array}} \right) \in {R^3}$  


       6. Associativity of scalar multiplication


             $\alpha \left( {\beta \bar u} \right) = \left( {\alpha \beta } \right)\bar u$


Now that we have established what sets constitute a vector space, allow me to give some more examples that you shall find instructive (you are welcome to prove that these are vector spaces on your own):



  1. Vectors in a plane: examples here are all the quantities one usually denotes by vectors, say force or velocity. If we let V be the set of all vectors in a plane, it is not difficult to see that, equipped with ordinary vector algebra, and letting the real numbers comprise the field, the axioms of a vector space are satisfied.
  2. Every field can be considered as a vector space over itself.
  3. Let V be the set of all numbers of the form $a + b\sqrt 2 $, where a and b are rational numbers (whenever $b \ne 0$, such a number is irrational). The zero vector is $0 + 0\sqrt 2 $ and the additive inverse of $a + b\sqrt 2 $ is $ - a - b\sqrt 2 $. If $r \in Q$, then $r\left( {a + b\sqrt 2 } \right) = ra + rb\sqrt 2 $, which lies in V. Thus V is closed under multiplication by members of Q. It is easily verified that the remaining axioms hold as well.
  4. We retain the reals as our field and let V be the set of all polynomials in $x$ with real coefficients. Two polynomials in V can be added, and a polynomial can always be multiplied by a real number. With these operations defined, V is a vector space.
  5. Let $R^R$ be the set of all functions from the reals into the reals. It is a vector space under ordinary function addition and scalar multiplication:
        $\begin{array}{l} \left( {f + g} \right)\left( x \right) = f(x) + g(x)\\ \left( {af} \right)\left( x \right) = a\left( {f\left( x \right)} \right) \end{array}$

You should note how the above axioms reproduce the algebra of vectors in $R^n$, even for sets of objects that have nothing to do with them. Here are some other properties of the Euclidean spaces that hold in every vector space, by virtue of the axioms that define them:



  1. $a\left( {\bar 0} \right) = \bar 0$
  2. $0(\bar u) = \bar 0$
  3. $\left( { - a} \right)\bar v = a\left( { - \bar v} \right) =  - a\bar v$
  4. $a\left( {\bar u - \bar v} \right) = a\bar u - a\bar v$
These can be easily proved; however, first note how these are read: 1. a times the zero vector equals the zero vector; 2. the zero scalar times u equals the zero vector.

Proof:


1.

$\begin{array}{l} a\left( {\bar 0} \right) = a\left( {\bar 0 + \bar 0} \right) = a\left( {\bar 0} \right) + a\left( {\bar 0} \right)\;from\,axioms\,3\& 5\,if\,we\,let\,u = \bar 0\\ a\left( {\bar 0} \right) + \left( { - a\left( {\bar 0} \right)} \right) = a\left( {\bar 0} \right) + a\left( {\bar 0} \right) + \left( { - a\left( {\bar 0} \right)} \right)\;from\,4\\ \bar 0 = a\left( {\bar 0} \right) \end{array}$

2.

 $\begin{array}{l} 0(\bar u) = \left( {0 + 0} \right)\bar u = 0\left( {\bar u} \right) + 0\left( {\bar u} \right)\;from\,axiom\,5\\ 0\left( {\bar u} \right) + \left( { - 0\left( {\bar u} \right)} \right) = 0\left( {\bar u} \right) + 0\left( {\bar u} \right) + \left( { - 0\left( {\bar u} \right)} \right)\;from\,axiom\,4\\ \bar 0 = 0\left( {\bar u} \right) \end{array}$

3.

$\begin{array}{l} 0 = 0\bar v\;from\,(2)\\ \;\;{\kern 1pt} = (a + ( - a))\bar v = a\bar v + \left( {\left( { - a} \right)\bar v} \right)\\ - a\bar v = \left( { - a} \right)\bar v\\ Also\,we\,have:\\ 0 = a\bar 0 = a\left( {\bar v + \left( { - \bar v} \right)} \right) = a\bar v + a\left( { - \bar v} \right)\\ - a\bar v = a\left( { - \bar v} \right) \end{array}$

4.

$\begin{array}{l} a(\bar u - \bar v) = a(\bar u + ( - \bar v))\\ \quad \quad \quad \,\, = a\bar u + a( - \bar v)\\ \quad \quad \quad \,\, = a\bar u - a\bar v\quad from\,(3) \end{array}$

Of course, hereat, you may argue that this is obvious, to which I provide the rejoinder that while it may be obvious for the conventional vectors of Euclidean spaces, the purpose here was to prove it from the axioms, so that the relationships hold regardless of how abstract a vector space you construct. In such a space the relationships would not have been so apparent.



Linear Combination

Given a non-empty subset, $S$, of a Vector Space, $V$, a linear combination of the elements of S is an expression of the form ${a_1}{\bar v_1} + {a_2}{\bar v_2} + {a_3}{\bar v_3} + ... + {a_n}{\bar v_n} = \sum\limits_{i = 1}^n {{a_i}{{\bar v}_i}} $, where $S = \left[ {{{\bar v}_i}|1 \le i \le n} \right];\;{a_i} \in R$. There is no need for $a_i$ to be real in general, only that it belong to the field the vector space is defined over. As your syllabus doesn't concern itself with other fields, we say that it has to be real.

Ex.


Consider the set S, which you shall find is a subset of $R^3$ 


$S = \left[ {\left( {\begin{array}{*{20}{c}} 1\\ 1\\ 1 \end{array}} \right),\left( {\begin{array}{*{20}{c}} 2\\ 0\\ 3 \end{array}} \right),\left( {\begin{array}{*{20}{c}} 1\\ 2\\ 3 \end{array}} \right)} \right]$


Then all the following are possible linear combinations of our Set, S.


$\begin{array}{l} 1\left( {\begin{array}{*{20}{c}} 1\\ 1\\ 1 \end{array}} \right) + 1\left( {\begin{array}{*{20}{c}} 2\\ 0\\ 3 \end{array}} \right) + 1\left( {\begin{array}{*{20}{c}} 1\\ 2\\ 3 \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 4\\ 3\\ 7 \end{array}} \right)\\ 2\left( {\begin{array}{*{20}{c}} 1\\ 1\\ 1 \end{array}} \right) + 0\left( {\begin{array}{*{20}{c}} 2\\ 0\\ 3 \end{array}} \right) + 3\left( {\begin{array}{*{20}{c}} 1\\ 2\\ 3 \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 5\\ 8\\ {11} \end{array}} \right)\\ 1\left( {\begin{array}{*{20}{c}} 1\\ 1\\ 1 \end{array}} \right) + 0\left( {\begin{array}{*{20}{c}} 2\\ 0\\ 3 \end{array}} \right) + 0\left( {\begin{array}{*{20}{c}} 1\\ 2\\ 3 \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 1\\ 1\\ 1 \end{array}} \right) \end{array}$


Notice, as in the last linear combination, that every element of the set is itself a linear combination of the set.



Linear (In)dependence

Again, we take a set of vectors in a vector space, V. The vectors in this set are said to be linearly independent, if for the following vector equation

${\alpha _1}{v_1} + {\alpha_2}{v_2} + ... + {\alpha _n}{v_n} = 0$

the only solution is ${\alpha _1} = {\alpha _2} = ... = {\alpha _n} = 0$. The set is linearly dependent if the equation also has a solution in which not all the $\alpha_i$ are zero.

This is a fundamental concept of linear algebra, that you should commit to memory.

Now we need to construct a method such that, given a set of vectors, we can determine whether or not they are linearly independent. Consider the following:

${\alpha _1}{v_1} + {\alpha _2}{v_2} + ... + {\alpha _n}{v_n} = \left( {\begin{array}{*{20}{c}} {{v_1}}&{{v_2}}& \ldots &{{v_4}} \end{array}} \right)\left( {\begin{array}{*{20}{c}} {{\alpha _1}}\\ {{\alpha _2}}\\ \vdots \\ {{\alpha _n}} \end{array}} \right)$

A point worthy of note here is that each $v_i$ is a vector, and its entries make up a column of that matrix. Now, one possible solution to this equation (a homogeneous equation) is the zero vector. What we have to determine is whether this is the only solution to the system, i.e. whether it is unique. First we see that the number of columns (which is to say the number of vectors $v_i$) is equal to the number of variables (which is to say the number of $\alpha_i$). Now consider a case wherein the row reduced form of the matrix $\left( {\begin{array}{*{20}{c}} {{v_1}}&{{v_2}}& \ldots &{{v_n}} \end{array}} \right)$ has a leading one in every column. As a general example consider:


$\left( {\begin{array}{*{20}{c}} 1&0& \cdots &0\\ 0&1& \cdots &0\\ \vdots & \vdots & \ddots & \vdots \\ 0&0& \cdots &1\\ 0&0& \cdots &0\\ \vdots & \vdots & \ddots & \vdots \\ 0&0&0&0 \end{array}} \right)$


If this is true, then we see:


$\left( {\begin{array}{*{20}{c}} 1&0& \cdots &0\\ 0&1& \cdots &0\\ \vdots & \vdots & \ddots & \vdots \\ 0&0& \cdots &1\\ 0&0& \cdots &0\\ \vdots & \vdots & \ddots & \vdots \\ 0&0&0&0 \end{array}} \right)\left( {\begin{array}{*{20}{c}} {{\alpha _1}}\\ {{\alpha _2}}\\ \vdots \\ {{\alpha _n}} \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 0\\ 0\\ \vdots \\ 0 \end{array}} \right)$


This leads to a cartesian description that has the unique solution ${\alpha _1} = {\alpha _2} = ... = {\alpha _n} = 0$. This condition, wherein a matrix has leading ones in all columns of its row reduced echelon form, is called having full column rank. It is only under this condition that a homogeneous equation has a unique solution. You shouldn't have a problem seeing that, should the matrix not have full column rank, free parameters would be introduced and hence a multiplicity of solutions, in which case the set of vectors is linearly dependent.


I hope the reader will allow me this informality, for the syllabus does not demand any greater detail, formality or rigour.


Hence, our test for linear independence is the following: given a number of vectors, we create a matrix whose columns are those vectors. We then proceed to reduce the matrix, and if it has full column rank, then the vectors are linearly independent.


Ex.

In $R^2$, are the vectors  $\left( {\begin{array}{*{20}{c}} 1\\ 2 \end{array}} \right);\left( {\begin{array}{*{20}{c}} 1\\ { - 1} \end{array}} \right)$ linearly independent?

We construct the matrix, to reduce:


$\left( {\begin{array}{*{20}{c}} 1&1\\ 2&{ - 1} \end{array}} \right)$

It is easily checked that the matrix can be row reduced to the identity:


$\left( {\begin{array}{*{20}{c}} 1&1\\ 2&{ - 1} \end{array}} \right) \sim \left( {\begin{array}{*{20}{c}} 1&0\\ 0&1 \end{array}} \right)$


As it has full column rank, we note that these are linearly independent.


There is an easier method to see whether two vectors are linearly independent: we simply check whether one vector is a multiple of the other. If it is, they are linearly dependent. To see why, consider the following vectors, and suppose that one is a multiple of the other:


$\left( {\begin{array}{*{20}{c}} \alpha \\ \beta \end{array}} \right);\left( {\begin{array}{*{20}{c}} \gamma \\ \delta \end{array}} \right)$


$\left( {\begin{array}{*{20}{c}} \gamma \\ \delta \end{array}} \right) = \chi \left( {\begin{array}{*{20}{c}} \alpha \\ \beta \end{array}} \right)$


Then:


$\chi \left( {\begin{array}{*{20}{c}} \alpha \\ \beta \end{array}} \right) - \left( {\begin{array}{*{20}{c}} \gamma \\ \delta \end{array}} \right) = 0$


Hence, by definition, two vectors, one of which is a multiple of the other, are linearly dependent.


Ex.


Are the following vectors linearly independent?


$\begin{array}{l} i)\;{{\bar v}_1} = \left( {\begin{array}{*{20}{c}} 1\\ 2\\ 0 \end{array}} \right);\,{{\bar v}_2} = \left( {\begin{array}{*{20}{c}} 0\\ 1\\ 1 \end{array}} \right)\\ ii)\;{{\bar v}_1} = \left( {\begin{array}{*{20}{c}} 1\\ 1 \end{array}} \right);\,{{\bar v}_2} = \left( {\begin{array}{*{20}{c}} 1\\ 2 \end{array}} \right);{{\bar v}_3} = \left( {\begin{array}{*{20}{c}} 2\\ 3 \end{array}} \right) \end{array}$


For the first pair we note that there is no value of the scalar c such that:


$\left( {\begin{array}{*{20}{c}} 1\\ 2\\ 0 \end{array}} \right) = c\left( {\begin{array}{*{20}{c}} 0\\ 1\\ 1 \end{array}} \right)$


For the second one, however, our method of multiples doesn't suffice. We have to ascertain the column rank of the matrix.


$\left( {\begin{array}{*{20}{c}} 1&1&2\\ 1&2&3 \end{array}} \right) \sim \left( {\begin{array}{*{20}{c}} 1&0&1\\ 0&1&1 \end{array}} \right)$


The matrix doesn't have full column rank, and hence the vectors are not linearly independent.
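The full-column-rank test lends itself to a one-line machine check: stack the vectors as columns and compare the rank with the number of columns. The snippet below (NumPy, outside the syllabus) confirms both parts of the example above.

```python
# Linear independence test: the vectors are independent exactly when the
# matrix with those vectors as columns has full column rank.
import numpy as np

def linearly_independent(vectors):
    M = np.column_stack(vectors)
    return np.linalg.matrix_rank(M) == M.shape[1]

print(linearly_independent([[1, 2, 0], [0, 1, 1]]))      # True  (example i)
print(linearly_independent([[1, 1], [1, 2], [2, 3]]))    # False (example ii)
```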


You should also take the following fact into account: if a set of vectors is linearly dependent, then at least one of the vectors can be written as a linear combination of the rest.


Ex.

Show that $\left( {\begin{array}{*{20}{c}} 2\\ 3 \end{array}} \right)$ is a linear combination of  $\left( {\begin{array}{*{20}{c}} 1\\ 1 \end{array}} \right);\,\left( {\begin{array}{*{20}{c}} 1\\ 2 \end{array}} \right)$.

We use the row reduced form of the matrix in the last example to develop the relationship.


From the system we see: $\begin{array}{l} {c_1} + {c_3} = 0\\ {c_2} + {c_3} = 0 \end{array}$


Recall that the values of $c_i$ that are a solution to this, are a solution to the vector equation:


$\sum\limits_{i = 1}^3 {{c_i}{{\bar v}_i}}  = 0$


The system given above of course has infinitely many solutions, of the form $\tau \left( {\begin{array}{*{20}{c}} { - 1}\\ { - 1}\\ 1 \end{array}} \right)$


Let us choose the simplest one, $\tau = -1$; we then have the following relationship:


$\begin{array}{l} 1\left( {\begin{array}{*{20}{c}} 1\\ 1 \end{array}} \right) + 1\left( {\begin{array}{*{20}{c}} 1\\ 2 \end{array}} \right) - 1\left( {\begin{array}{*{20}{c}} 2\\ 3 \end{array}} \right) = 0\\ \left( {\begin{array}{*{20}{c}} 1\\ 1 \end{array}} \right) + \left( {\begin{array}{*{20}{c}} 1\\ 2 \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 2\\ 3 \end{array}} \right) \end{array}$
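Equivalently, the coefficients of such a linear combination can be found by solving the linear system directly. A small sketch follows, assuming numpy; it solves $c_1 \bar v_1 + c_2 \bar v_2 = (2, 3)^T$ numerically and recovers $c_1 = c_2 = 1$ as above.

import numpy as np

# Sketch: solve c1*(1,1) + c2*(1,2) = (2,3) for the coefficients.
M = np.array([[1.0, 1.0],
              [1.0, 2.0]])       # columns are v1 and v2
target = np.array([2.0, 3.0])

c = np.linalg.solve(M, target)   # coefficients of the linear combination
print(c)                         # approximately [1. 1.], i.e. v1 + v2 = (2,3)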



Linear Span

Given a set of vectors, $S = \left\{ {{{\bar v}_1},{{\bar v}_2},...,{{\bar v}_n}} \right\}$, the set of all linear combinations of these vectors is called the linear span of that set, which is to say: $Lin\left( {{{\bar v}_1},{{\bar v}_2},...,{{\bar v}_n}} \right) = \left\{ {\sum\limits_{i = 1}^n {{\alpha _i}{{\bar v}_i}} |{\alpha _i} \in R} \right\}$

Consider the set $S = \left\{ {\left( {\begin{array}{*{20}{c}} 1\\ 1\\ 1 \end{array}} \right)} \right\}$; the linear span of the set is then: $Lin\left( S \right) = \left\{ {\left. {\left( {\begin{array}{*{20}{c}} \sigma \\ \sigma \\ \sigma \end{array}} \right)} \right|\sigma  \in R} \right\}$


I hope the reader will have no difficulty in proving that the linear span of S itself constitutes a vector space. Allow me to prove closure under addition; the other axioms will be left to the reader to verify.


For two vectors in the span of S, we have:


$\begin{array}{l} \left( {\begin{array}{*{20}{c}} \sigma \\ \sigma \\ \sigma \end{array}} \right);\,\,\left( {\begin{array}{*{20}{c}} \lambda \\ \lambda \\ \lambda \end{array}} \right)\\ \left( {\begin{array}{*{20}{c}} \sigma \\ \sigma \\ \sigma \end{array}} \right) + \left( {\begin{array}{*{20}{c}} \lambda \\ \lambda \\ \lambda \end{array}} \right) = \left( {\sigma + \lambda } \right)\left( {\begin{array}{*{20}{c}} 1\\ 1\\ 1 \end{array}} \right) = \alpha \left( {\begin{array}{*{20}{c}} 1\\ 1\\ 1 \end{array}} \right) \in Lin\left( S \right) \end{array}$



The reader hopefully sees that, in the same vein, by manipulating the matrix properties given above, the other axioms can be shown to hold.

This work leads us to another important concept, that of a spanning set. For the vector space that is the linear span of the set S, we call S a spanning set. In general, if a vector space, V, is such that V = Lin(S), then S is called a spanning set of V.


Subspace


A subspace, W, of a vector space V, is a non-empty subset of V, such that W is a vector space itself under the same addition and scalar multiplication operations as V. You should note that every vector space has, trivially, at least two subspaces: the zero subspace, consisting of the zero vector alone, and the vector space itself.


What follows is the basic criterion for deciding whether a subset of a vector space is a subspace or not.

Subspace criterion:

For a given vector space, V, a non-empty subset, W, is a subspace if and only if the following conditions hold:


1. for all $\bar u;\bar v \in W,\,\;\bar u + \bar v \in W$

2. for all $\bar u \in W;\;\alpha  \in R,\,\;\alpha \bar u \in W$

The first condition you will recognise as closure under addition, and the second as closure under scalar multiplication.


Now consider a set of vectors $X = \left\{ {{{\bar v}_1},{{\bar v}_2},...,{{\bar v}_n}} \right\}$, such that all the vectors belong to the vector space V. Lin(X) is then a subspace of V; in fact it is the smallest subspace of V that contains the vectors in X.


Ex. Consider the following set of vectors: $X = \left\{ {\left( {\begin{array}{*{20}{c}} 1\\ 0\\ 0\\ 0 \end{array}} \right);\left( {\begin{array}{*{20}{c}} 0\\ 1\\ 0\\ 0 \end{array}} \right)} \right\}$ Let V be the vector space $R^4$


We know that both the vectors in X belong to the Vector Space, $R^4$. The above theorem then tells me that Lin(X) is a subspace of V. Let us call it S. We can then write down the subspace parametrically, as follows: $S = \left\{ {\left. {\left( {\begin{array}{*{20}{c}} s\\ t\\ 0\\ 0 \end{array}} \right)} \right|s,t \in R} \right\}$





Basis of a Vector Space


A basis of a vector space, V, is defined as a linearly independent spanning set for V. A familiar example is again the Euclidean space $R^n$, and you should already have met what mathematicians call the standard basis for these Euclidean spaces. The standard basis for, say, $R^2$ in particular is:

$E = \left\{ {\left( {\begin{array}{*{20}{c}} 1\\ 0 \end{array}} \right);\left( {\begin{array}{*{20}{c}} 0\\ 1 \end{array}} \right)} \right\}$

You can apply either of our two tests for linear independence to confirm that this set is of course linearly independent. Furthermore you can assure yourself that this spans $R^2$, as:

$\alpha \left( {\begin{array}{*{20}{c}} 1\\ 0 \end{array}} \right) + \beta \left( {\begin{array}{*{20}{c}} 0\\ 1 \end{array}} \right) = \left( {\begin{array}{*{20}{c}} \alpha \\ \beta \end{array}} \right) \in {R^2}$


You see, of course, that the general linear combination of E, is the same as a general point in $R^2$. Now this is not the only possible basis for this space. You will come to see another possible basis for $R^2$ in an example to come. What makes this specific basis special enough to warrant a special name is the fact that it is easy to use in most cases.

Next we derive a method to investigate whether or not a given set of vectors spans an n-dimensional Euclidean space.

Consider this set:

$S = \left\{ {\left( {\begin{array}{*{20}{c}} 1\\ {1} \end{array}} \right),\left( {\begin{array}{*{20}{c}} {-1}\\ 2 \end{array}} \right)} \right\}$

I assert that this set is a basis for $R^2$. Of course, one's assertions are inconsequential in Mathematics unless fortified by proper reasoning, wherefore I give the following proof that the above set does indeed span $R^2$ and is linearly independent (thereby proving it is a basis). To prove that it is linearly independent, we simply note that neither vector is the zero vector, nor is either a multiple of the other. The spanning part of the definition, however, requires a little more working. What I want to prove, you see, is that for any given vector in $R^2$: $\left( {\begin{array}{*{20}{c}} \alpha \\ \beta \end{array}} \right)$

The following linear system is always consistent: $\left( {\begin{array}{*{20}{c}} 1&{ - 1}\\ 1&2 \end{array}} \right)\left( {\begin{array}{*{20}{c}} {{x_1}}\\ {{x_2}} \end{array}} \right) = \left( {\begin{array}{*{20}{c}} \alpha \\ \beta \end{array}} \right)$


For this linear system to always be consistent, the coefficient matrix (the one which has as its columns the vectors of the set S) would have to have full row rank. This term shouldn't be completely alien to the reader, and if he endeavours to draw parallels with full column rank, he will indeed arrive at the correct definition of the term: every row in the (reduced) echelon form of the matrix contains a leading one. If this condition is not upheld, we could, in the equivalent system of equations, end up with an equation of the form $0x = \chi$ where $\chi$ is a non-zero constant, which would mean the system is inconsistent, exactly the circumstance we wish to avoid.


The reader should commit to memory the following: given a set of vectors in $R^n$, one can prove that the vectors span the n-dimensional Euclidean space if the matrix having them as its columns has full row rank. If the set is, in addition, linearly independent, then it is a basis of that Euclidean space.
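The row-rank test can be mechanised in the same way as the column-rank test earlier. A sketch, again assuming sympy, for the set S above:

from sympy import Matrix

# Sketch: the columns of A span R^n exactly when A has full row rank.
A = Matrix([[1, -1],
            [1,  2]])            # columns are the vectors of the set S

spans_R2 = (A.rank() == A.rows)      # full row rank <=> the columns span R^2
independent = (A.rank() == A.cols)   # full column rank <=> independence
print(spans_R2 and independent)      # True, so S is a basis of R^2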


 Your syllabus then goes on to say that you are required to work out the basis in simple cases. This point we will cover by doing past papers.




Dimension


The dimension of any given vector space is the number of vectors in its basis. It is possible for a vector space to have an infinite number of vectors in its basis, and hence be infinite-dimensional.

You may recall that both the bases I presented for $R^2$ had two vectors in them. This is why it is a two-dimensional vector space, or Euclidean space.

I don't have any more to say on dimension as yet, but it features rather heavily later on, by virtue of being in a particularly important theorem. This concludes the notes on Linear Spaces, and we move on to Transformations.



Null Space, Column Space and Row Space of a Matrix


Now we look to three vector spaces associated with matrices, and develop methods to determine their bases.

We begin with defining the Null Space, perhaps the easiest one to work with of the triumvirate, if you will. Given a matrix A, the Null space associated with A, is defined to be the set of solutions, or set of vectors x, that satisfy the homogeneous equation: Ax = 0. 

Given an m x n matrix, A, whose columns are denoted by ${c_1},{c_2},...,{c_n}$, the column space of the matrix A, denoted by CS(A), is the linear span of the columns of A.

 $CS(A) = Lin\left( {{c_1},{c_2},...,{c_n}} \right)$

For the same matrix A, assume that the transposes of the rows are given by ${r_1},{r_2},...,{r_m}$; then the row space of the matrix A is given by:

$RS(A) = Lin\left( {{r_1},{r_2},...,{r_m}} \right)$

A quick note worth making here is that ${c_i}$ is an m x 1 vector in $R^m$, and that ${r_i}$ is an n x 1 vector in $R^n$. It follows, by the theorem given above, that the column space of A is a subspace of $R^m$, and the row space of A is a subspace of $R^n$.

Let us use the following matrix to put these definitions to work:


$A = \left( {\begin{array}{*{20}{c}} 1&2&1&3&4\\ 1&1&0&5&{ - 1}\\ 3&5&2&{11}&7 \end{array}} \right)$


I hope you see that the column and row spaces of A, are as follows:


$\begin{array}{l} CS\left( A \right) = Lin\left[ {\left( {\begin{array}{*{20}{c}} 1\\ 1\\ 3 \end{array}} \right),\left( {\begin{array}{*{20}{c}} 2\\ 1\\ 5 \end{array}} \right),\left( {\begin{array}{*{20}{c}} 1\\ 0\\ 2 \end{array}} \right),\left( {\begin{array}{*{20}{c}} 3\\ 5\\ {11} \end{array}} \right),\left( {\begin{array}{*{20}{c}} 4\\ { - 1}\\ 7 \end{array}} \right)} \right]\\ RS(A) = Lin\left[ {\left( {\begin{array}{*{20}{c}} 1\\ 2\\ 1\\ 3\\ 4 \end{array}} \right),\left( {\begin{array}{*{20}{c}} 1\\ 1\\ 0\\ 5\\ { - 1} \end{array}} \right),\left( {\begin{array}{*{20}{c}} 3\\ 5\\ 2\\ {11}\\ 7 \end{array}} \right)} \right] \end{array}$


For the null space, note that Ax = 0 gives an augmented matrix that can be reduced as follows:


$\left( {\begin{array}{*{20}{c}} 1&2&1&3&4\\ 1&1&0&5&{ - 1}\\ 3&5&2&{11}&7 \end{array}} \right) \sim \left( {\begin{array}{*{20}{c}} 1&0&{ - 1}&7&{ - 6}\\ 0&1&1&{ - 2}&5\\ 0&0&0&0&0 \end{array}} \right)$


Note that the zero column that should be on the right side of the dotted line has been omitted, because regardless of our row operations, the zero column remains the zero column. Further we have:


$\begin{array}{l} {x_1} - {x_3} + 7{x_4} - 6{x_5} = 0\\ {x_2} + {x_3} - 2{x_4} + 5{x_5} = 0\\ Let\;{x_5} = \mu ;\,{x_4} = \lambda ;\,{x_3} = \rho \\ Then:\\ {x_1} = \rho - 7\lambda + 6\mu \\ {x_2} = - \rho + 2\lambda - 5\mu \\ \left( {\begin{array}{*{20}{c}} {{x_1}}\\ {{x_2}}\\ {{x_3}}\\ {{x_4}}\\ {{x_5}} \end{array}} \right) = \left( {\begin{array}{*{20}{c}} {\rho - 7\lambda + 6\mu }\\ { - \rho + 2\lambda - 5\mu }\\ \rho \\ \lambda \\ \mu \end{array}} \right) = \rho \left( {\begin{array}{*{20}{c}} 1\\ { - 1}\\ 1\\ 0\\ 0 \end{array}} \right) + \lambda \left( {\begin{array}{*{20}{c}} { - 7}\\ 2\\ 0\\ 1\\ 0 \end{array}} \right) + \mu \left( {\begin{array}{*{20}{c}} 6\\ { - 5}\\ 0\\ 0\\ 1 \end{array}} \right) \end{array}$


On that basis we can argue that the Null space of A is then given by:


$N\left( A \right) = Lin\left[ {\left( {\begin{array}{*{20}{c}} 1\\ { - 1}\\ 1\\ 0\\ 0 \end{array}} \right),\left( {\begin{array}{*{20}{c}} { - 7}\\ 2\\ 0\\ 1\\ 0 \end{array}} \right),\left( {\begin{array}{*{20}{c}} 6\\ { - 5}\\ 0\\ 0\\ 1 \end{array}} \right)} \right]$


You may also have noted that N(A) is a subspace of $R^5$. This is not a coincidence; it can be shown that the null space of an m x n matrix is a subspace of $R^n$.


Now we wish to work out bases for these spaces for A. Beginning with the null space of A, I argue that:


$Basis\,of\,N\left( A \right) = \left[ {\left( {\begin{array}{*{20}{c}} 1\\ { - 1}\\ 1\\ 0\\ 0 \end{array}} \right),\left( {\begin{array}{*{20}{c}} { - 7}\\ 2\\ 0\\ 1\\ 0 \end{array}} \right),\left( {\begin{array}{*{20}{c}} 6\\ { - 5}\\ 0\\ 0\\ 1 \end{array}} \right)} \right]$


For this set to be a basis, it must be linearly independent and must span the set of solutions of the homogeneous equation. Of course it is a spanning set; every solution is a linear combination of these vectors. Furthermore it is linearly independent, because each vector has a 1 in a position (that of its free parameter) where every other vector has a 0, and so it cannot be a linear combination of the others.


This is a consequence of the construction of the general solution to the system, and holds for all null spaces.


A basis of CS(A) is the set of the columns of A that correspond to the columns containing the leading ones in the row reduced echelon form of A. As shown above, the first two columns contain leading ones in RRE(A), and hence a basis for CS(A) is the set comprising the first two columns of A.


$Basis\,for\,CS\left( A \right) = \left[ {\left( {\begin{array}{*{20}{c}} 1\\ 1\\ 3 \end{array}} \right),\left( {\begin{array}{*{20}{c}} 2\\ 1\\ 5 \end{array}} \right)} \right]$


The argument that proves it is neither complicated, nor that long. I omit it because the post is already far too long, and there is no need to add things that are not required by the syllabus.


A basis of RS(A) is the set of the transposed rows of RRE(A) that contain leading ones. Recall that the first two rows of RRE(A) contain the leading ones. We conclude that:


$Basis\,for\,RS\left( A \right) = \left[ {\left( {\begin{array}{*{20}{c}} 1\\ 0\\ { - 1}\\ 7\\ { - 6} \end{array}} \right),\left( {\begin{array}{*{20}{c}} 0\\ 1\\ 1\\ { - 2}\\ 5 \end{array}} \right)} \right]$


A summary of the methods of finding the bases for the null, column and row spaces is then as follows:


Given an m x n matrix A, we:



  1. Row Reduce the augmented matrix of the system Ax = 0: [A|0]
  2. Use it to derive the general parametric form of the general solution of the system. The vectors whose linear combination forms the general solution of the equation form a basis for the null space of A.
  3. The columns of the matrix A, that correspond to the columns in RRE(A) that contain a leading one form a basis for the Column space of A.
  4. The transposes of the rows of RRE(A) that contain leading ones form a basis of the row space of A.
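The whole procedure can be checked with sympy, which (assuming it is available) has ready-made nullspace, columnspace and rowspace methods. A sketch using the matrix A from the worked example; note that sympy returns the row-space basis as rows rather than as their transposes, and its choice of basis may differ from the hand computation while spanning the same space.

from sympy import Matrix

# Sketch: reproduce the bases found by hand for the worked example above.
A = Matrix([[1, 2, 1,  3,  4],
            [1, 1, 0,  5, -1],
            [3, 5, 2, 11,  7]])

rref, pivots = A.rref()
print(rref)            # the row reduced echelon form of A
print(pivots)          # (0, 1): the columns containing leading ones
print(A.nullspace())   # basis of N(A): three vectors in R^5
print(A.columnspace()) # basis of CS(A): the first two columns of A
print(A.rowspace())    # a basis of RS(A), given as row vectors

# Rank-nullity check for this A: 2 + 3 == 5.
print(A.rank() + len(A.nullspace()) == A.cols)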

Rank-Nullity Theorem


Using the example of A above, consider the fact that the sum of the dimension of the row space and the dimension of the null space is equal to the number of columns of A. This, too, is no coincidence; in fact the reasoning behind it is simple. The number of vectors in the basis of the row space is equal to the number of leading ones in RRE(A), while the number of vectors in the basis of the null space is equal to the number of free parameters, which is in turn equal to the number of columns that do not contain leading ones. It follows that the sum of the dimensions of RS(A) and N(A) is equal to the number of columns in A. Also, the dimension of the row space is equal to the rank of A. Of course this "proof" speaks of vectors in finite-dimensional Euclidean spaces only, but you should be comfortable with the notion that the result holds for all the examples of vector spaces given above. The general proof, though not difficult, would be a digression we cannot afford.

The dimension of the Null Space of the matrix A, is called the nullity of A. 

The rank-nullity theorem is then, for a m x n matrix A:

Rank(A) + Nullity(A) = n

As the rank of A is equal to the dimension of the row space and column space (why?) we have two alternate formulations:

dim(CS(A)) + dim(N(A)) = n
dim(RS(A)) + dim(N(A)) = n


Linear Transformations



Recall that a function from a set X to a set Y is a rule that assigns to every element of X a unique element of Y.

Assume that the Sets X and Y are vector spaces. A function $T:X \to Y$ is called linear if for all vectors $\bar u,\bar v \in X$ and $\alpha  \in R$:

  1. $T\left( {\bar u + \bar v} \right) = T\left( {\bar u} \right) + T\left( {\bar v} \right)$
  2. $T\left( {\alpha \bar u} \right) = \alpha T\left( {\bar u} \right)$
Any such linear function from one space to another, we call a linear transformation. If the transformation is from a space to itself, then it is called a linear operator.

We don't concern ourselves with the general theory of linear transformations, but move directly onto the sort of functions the examiner will expect you to work with.

Let A be a m x n matrix and let $T:{R^n} \to {R^m}$ be the linear function defined by the matrix multiplication: T(x) = Ax. Then T is a linear transformation.

Ex. 

Consider the following transformation, from $R^3$ to $R^2$:

$T\left( {\begin{array}{*{20}{c}} {{x_1}}\\ {{x_2}}\\ {{x_3}} \end{array}} \right) = \left( {\begin{array}{*{20}{c}} {2{x_1} + {x_2} + {x_3}}\\ {{x_1} - {x_2}} \end{array}} \right)$

Then the following matrix reproduces the effect of the transformation:


$A = \left( {\begin{array}{*{20}{c}} 2&1&1\\ 1&{ - 1}&0 \end{array}} \right)$


This is easily proven as, for any general vector in $R^3$:


$T\left( x \right) = Ax = \left( {\begin{array}{*{20}{c}} 2&1&1\\ 1&{ - 1}&0 \end{array}} \right)\left( {\begin{array}{*{20}{c}} {{x_1}}\\ {{x_2}}\\ {{x_3}} \end{array}} \right) = \left( {\begin{array}{*{20}{c}} {2{x_1} + {x_2} + {x_3}}\\ {{x_1} - {x_2}} \end{array}} \right)$


I doubt the examiner will ask you to work out A, but should it happen, the procedure is as follows. For a linear transformation $T: R^n \to R^m$, a matrix A such that T(x) = Ax can be found as follows:


$A = \left( {\begin{array}{*{20}{c}} {T\left( {{e_1}} \right)}&{T\left( {{e_2}} \right)}&{...}&{T\left( {{e_n}} \right)} \end{array}} \right)$, where $e_i$ are the vectors of the standard basis of $R^n$.

In the previous example note that:


$A = \left( {\begin{array}{*{20}{c}} {T\left( {\begin{array}{*{20}{c}} 1\\ 0\\ 0 \end{array}} \right)}&{T\left( {\begin{array}{*{20}{c}} 0\\ 1\\ 0 \end{array}} \right)}&{T\left( {\begin{array}{*{20}{c}} 0\\ 0\\ 1 \end{array}} \right)} \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 2&1&1\\ 1&{ - 1}&0 \end{array}} \right)$
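As a quick illustration, the following sketch (assuming numpy) rebuilds the matrix of T column by column from the images of the standard basis vectors, exactly as in the formula above.

import numpy as np

def T(x):
    # The transformation from the example: R^3 -> R^2.
    x1, x2, x3 = x
    return np.array([2*x1 + x2 + x3, x1 - x2])

basis = np.eye(3)                            # rows are e1, e2, e3
A = np.column_stack([T(e) for e in basis])   # columns are T(e_i)
print(A)                                     # [[ 2.  1.  1.], [ 1. -1.  0.]]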



Range and Kernel of a Linear Transformation


The Range of a Linear Transformation $T:X \to Y$ is the set:

\[R\left( T \right) = \left\{ {\bar y \in Y|\bar y = T\left( {\bar x} \right)\,for\,some\,\bar x \in X} \right\}\]

Which is to say that it is the set of the elements of Y, to which one or more elements of X are mapped.

The Kernel of the same transformation is then the set:

\[Ker\left( T \right) = \left\{ {\bar x \in X|T\left( {\bar x} \right) = 0} \right\}\]

Which is to say, the set of all the vectors of X that the transformation maps onto the zero vector of Y.

For our transformations, we said that T(x) = Ax, where A is a matrix. In this case, I further argue that the range of T is the same space as the column space of A, and that the kernel of T is the same space as the null space of A. It follows that they share the same bases and dimensions.

Now we have covered all the different matrix related spaces that the reader is supposed to know.


Eigenvectors and Eigenvalues



Let us say that we have a transformation such that T(x) = Ax, and we have an $x_i$ such that:

$T\left( {{x_i}} \right) = A{x_i} = \lambda {x_i}$, where $\lambda$ is a scalar and $x_i$ is a non-zero vector; then $x_i$ is called an eigenvector and $\lambda$ the corresponding eigenvalue. The eigenvectors belonging to a given eigenvalue, together with the zero vector, form a vector space called an eigenspace, luckily something you won't be required to have knowledge of.

Ex.


For the transformation $T: R^2 \to R^2$, we have that:


$T\left( {\begin{array}{*{20}{c}} x\\ y \end{array}} \right) = \left( {\begin{array}{*{20}{c}} {x + 3y}\\ { - x + 5y} \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 1&3\\ { - 1}&5 \end{array}} \right)\left( {\begin{array}{*{20}{c}} x\\ y \end{array}} \right)$


Then we have the following eigenvectors and corresponding eigenvalues (4 and 2):


$\begin{array}{l} T\left( {\begin{array}{*{20}{c}} 1\\ 1 \end{array}} \right) = \left( {\begin{array}{*{20}{c}} {1 + 3\left( 1 \right)}\\ { - 1 + 5\left( 1 \right)} \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 1&3\\ { - 1}&5 \end{array}} \right)\left( {\begin{array}{*{20}{c}} 1\\ 1 \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 4\\ 4 \end{array}} \right) = 4\left( {\begin{array}{*{20}{c}} 1\\ 1 \end{array}} \right)\\ T\left( {\begin{array}{*{20}{c}} 3\\ 1 \end{array}} \right) = \left( {\begin{array}{*{20}{c}} {3 + 3\left( 1 \right)}\\ { - 3 + 5\left( 1 \right)} \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 1&3\\ { - 1}&5 \end{array}} \right)\left( {\begin{array}{*{20}{c}} 3\\ 1 \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 6\\ 2 \end{array}} \right) = 2\left( {\begin{array}{*{20}{c}} 3\\ 1 \end{array}} \right) \end{array}$

The logical question that follows is then how to find the eigenvectors and the eigenvalues. For this I present the following line of reasoning.


$\begin{array}{l} Ax = \lambda x\\ Ax - \lambda x = 0\\ \left( {A - \lambda I} \right)x = 0 \end{array}$


Now either x is the zero vector, or the matrix $\left( {A - \lambda I} \right)$ maps the non-zero vector x onto the zero vector. We ignore the case where x is zero: the trivial solution satisfies the equation for every $\lambda$ and so tells us nothing, and in any case eigenvectors were defined to be non-zero. Now if the matrix $\left( {A - \lambda I} \right)$ maps a non-zero vector x onto the zero vector, then this homogeneous equation has infinitely many solutions. If it has infinitely many solutions, then we can conclude, by the main theorem quoted above, that the determinant of $\left( {A - \lambda I} \right)$ is zero.


Hence:


$\left| {A - \lambda I} \right| = 0$


This is called the characteristic equation, and solving it gives us the values of $\lambda$, the eigenvalues. Let us use the example I began the topic with.



$\begin{array}{l} \left| {A - \lambda I} \right| = \left| {\left( {\begin{array}{*{20}{c}} 1&3\\ { - 1}&5 \end{array}} \right) - \lambda \left( {\begin{array}{*{20}{c}} 1&0\\ 0&1 \end{array}} \right)} \right|\\ = \left| {\begin{array}{*{20}{c}} {1 - \lambda }&3\\ { - 1}&{5 - \lambda } \end{array}} \right| = \left( {1 - \lambda } \right)\left( {5 - \lambda } \right) - \left( 3 \right)\left( { - 1} \right)\\ = {\lambda ^2} - 6\lambda + 8 = \left( {\lambda - 4} \right)\left( {\lambda - 2} \right) = 0 \end{array}$


So the eigenvalues are 4 and 2. Now we must find the eigenvectors corresponding to these eigenvalues. For this, consider that, by the logic given already, x must lie in the null space of the matrix $\left( {A - \lambda I} \right)$. Furthermore, any non-zero vector in that null space satisfies $Ax = \lambda x$ for that value of $\lambda$. If we can find a basis for the null space of $\left( {A - \lambda I} \right)$, we can effectively find any number of eigenvectors for that eigenvalue. Consider the following:


For $\lambda = 4$, we have:


$N\left( {A - 4I} \right) = N\left( {\begin{array}{*{20}{c}} { - 3}&3\\ { - 1}&1 \end{array}} \right)$


By our method we find that:


$N\left( {A - 4I} \right) = N\left( {\begin{array}{*{20}{c}} { - 3}&3\\ { - 1}&1 \end{array}} \right) = Lin\left[ {\left( {\begin{array}{*{20}{c}} 1\\ 1 \end{array}} \right)} \right]$


Hence ${\left( {\begin{array}{*{20}{c}} 1\\ 1 \end{array}} \right)}$  or any multiple of it can be the eigenvector for 4.


By similar reasoning you can deduce an eigenvector for the other eigenvalue.
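The eigenvalues and eigenvectors can also be confirmed numerically; a sketch assuming numpy follows. Note that numpy returns eigenvectors scaled to unit length, so they appear as multiples of (1, 1) and (3, 1) rather than those vectors themselves.

import numpy as np

# Sketch: confirm the eigenvalues and eigenvectors of A = [[1,3],[-1,5]].
A = np.array([[ 1.0, 3.0],
              [-1.0, 5.0]])

values, vectors = np.linalg.eig(A)
print(values)    # 4 and 2 (possibly listed in the other order)
print(vectors)   # columns are unit eigenvectors, multiples of (1,1) and (3,1)

v = np.array([1.0, 1.0])
print(np.allclose(A @ v, 4 * v))   # True: (1,1) is an eigenvector for 4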




Diagonalisation


Two matrices, say A and B, are said to be similar if there exists a matrix P, such that the following holds:

\[A = {P^{ - 1}}BP\]

Further, if a diagonal matrix, say D, can be found such that it is similar to a matrix A, which is to say:

\[A = PD{P^{ - 1}},\quad {\rm{equivalently}}\;D = {P^{ - 1}}AP\]

Then we call the process of finding that matrix D, together with the matrix P and its inverse, the process of diagonalisation. It has its purposes; you are expected to know one in particular that I shall get to in due time. It is also interesting to note that similar matrices have the same rank, eigenvalues, characteristic equation, and so forth.

The process of diagonalisation of course begins with the question of whether it is even possible to diagonalise a given matrix. The reader, however, is not expected to answer this; he will be given a matrix and told to diagonalise it. For the interested reader, the concepts of algebraic and geometric multiplicity help decide whether a matrix can be diagonalised or not.

So suppose the examiner has given you a matrix and told you that this particular matrix is to be diagonalised, how would you go about doing it?

You will find the eigenvalues of the matrix and place them along the diagonal of a diagonal matrix: this will be your D. Then you will create a matrix P whose columns are the corresponding eigenvectors, ordered so that the i-th column of P is an eigenvector for the eigenvalue in the i-th column of D.

Consider the last example we talked about; then:

$\begin{array}{l} D = \left( {\begin{array}{*{20}{c}} 4&0\\ 0&2 \end{array}} \right)\\ P = \left( {\begin{array}{*{20}{c}} 1&3\\ 1&1 \end{array}} \right)\\ {P^{ - 1}} = \left( {\begin{array}{*{20}{c}} { - {\textstyle{1 \over 2}}}&{{\textstyle{3 \over 2}}}\\ {{\textstyle{1 \over 2}}}&{ - {\textstyle{1 \over 2}}} \end{array}} \right)\\ PD{P^{ - 1}} = \left( {\begin{array}{*{20}{c}} 1&3\\ 1&1 \end{array}} \right)\left( {\begin{array}{*{20}{c}} 4&0\\ 0&2 \end{array}} \right)\left( {\begin{array}{*{20}{c}} { - {\textstyle{1 \over 2}}}&{{\textstyle{3 \over 2}}}\\ {{\textstyle{1 \over 2}}}&{ - {\textstyle{1 \over 2}}} \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 1&3\\ { - 1}&5 \end{array}} \right) = A \end{array}$

Why this works is not required by the syllabus, but it requires an understanding of change of basis, transition matrices and other things of their ilk.


Now suppose you wanted to compute $A^5$, $A^{10}$, or $A^{256}$. Some algebraic manipulation of the similarity relation gives a far simpler answer, where the traditional method of repeated multiplication is far too tedious to be worth the effort. So we do this:


$\begin{array}{l} {A^5} = {\left( {\begin{array}{*{20}{c}} 1&3\\ { - 1}&5 \end{array}} \right)^5} = {\left( {PD{P^{ - 1}}} \right)^5} = PD{P^{ - 1}}PD{P^{ - 1}}PD{P^{ - 1}}PD{P^{ - 1}}PD{P^{ - 1}}\\ = PD\left( {{P^{ - 1}}P} \right)D\left( {{P^{ - 1}}P} \right)D\left( {{P^{ - 1}}P} \right)D\left( {{P^{ - 1}}P} \right)D{P^{ - 1}}\\ = PDDDDD{P^{ - 1}} = P{D^5}{P^{ - 1}} \end{array}$


Also note that for a diagonal matrix of order m, and a positive integer n, the following holds:


${D^n} = {\left( {\begin{array}{*{20}{c}} {{a_{11}}}&0&0&0\\ 0&{{a_{22}}}&0&0\\ \vdots & \vdots & \ddots & \vdots \\ 0&0&0&{{a_{mm}}} \end{array}} \right)^n} = \left( {\begin{array}{*{20}{c}} {{{\left( {{a_{11}}} \right)}^n}}&0&0&0\\ 0&{{{\left( {{a_{22}}} \right)}^n}}&0&0\\ \vdots & \vdots & \ddots & \vdots \\ 0&0&0&{{{\left( {{a_{mm}}} \right)}^n}} \end{array}} \right)$


The proof of this is a simple application of the Principle of Mathematical Induction, and the reader is advised to prove it on his own. For our A, we have then:



${A^5} = \left( {\begin{array}{*{20}{c}} 1&3\\ 1&1 \end{array}} \right)\left( {\begin{array}{*{20}{c}} {{4^5}}&0\\ 0&{{2^5}} \end{array}} \right)\left( {\begin{array}{*{20}{c}} { - {\textstyle{1 \over 2}}}&{{\textstyle{3 \over 2}}}\\ {{\textstyle{1 \over 2}}}&{ - {\textstyle{1 \over 2}}} \end{array}} \right)$

Hence any power of A can be found thus.
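A short numerical check of this, assuming numpy: compute $A^5$ once via $PD^5P^{-1}$ and once by repeated multiplication, and confirm they agree.

import numpy as np

A = np.array([[ 1.0, 3.0],
              [-1.0, 5.0]])
P = np.array([[1.0, 3.0],
              [1.0, 1.0]])

# A^5 through the diagonalisation A = P D P^{-1}.
A5_diag = P @ np.diag([4.0**5, 2.0**5]) @ np.linalg.inv(P)
# A^5 by repeated multiplication, for comparison.
A5_direct = np.linalg.matrix_power(A, 5)

print(np.allclose(A5_diag, A5_direct))   # True
print(A5_direct)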




Addendum



This is the place where I will drop theorems, rules, patterns and so forth, for which I am unable to find a place in the post above.

1) If A is a (n x n) matrix with the eigenvalues ${\lambda _1},{\lambda _2},...,{\lambda _n}$, then the matrix $\left( {A + \gamma I} \right)$, has the eigenvalues ${\lambda _1} + \gamma ,{\lambda _2} + \gamma ,...,{\lambda _n} + \gamma $ where $\gamma$ is a constant, and I is the identity matrix. The eigenvectors remain the same.

2) If there is one theorem I want you to memorise it would be the following: If A is a n x n matrix then all the following statements are equivalent (which is to say each one implies all the others):

  • $A^{-1}$ exists

  • $Ax = B$ has a unique solution for any B in $R^n$

  • $Ax = 0$ has only the unique solution $x = \bar 0$
  • The reduced row echelon form of A is the identity matrix of the order n
  • Determinant of A is non-zero
  • The rank of A is n


Ex. (Oct/Nov 2014 Paper 11 Q11 Or)

The square matrix A has the eigenvalue $\lambda$ and corresponding eigenvector e. Prove that if A is non-singular then:

1) $\lambda  \ne 0$ 

Sol: $Ae = \lambda e$

The fact that A is non-singular means that it has a non-zero determinant, which means it has an inverse. Recall that if that is true, then Ax = 0 only if x is the zero vector. As e is non-zero (we have in our definition precluded the possibility of zero eigenvectors), Ae $ \ne 0$. Then $\lambda$e $ \ne 0$. Again, as the eigenvector is not zero, the eigenvalue cannot be zero.

Note: the eigenvalues of a diagonal or upper (or lower) triangular matrix are the elements lying on the principal diagonal.

2) Prove that $A^{-1}$ has the eigenvalue ${\lambda ^{ - 1}}$ with e as the corresponding eigenvector.

Sol: Whenever you're given a matrix with one of its eigenvectors and eigenvalues, and are required to find the eigenvalues and eigenvectors of a matrix related to the original one, you can usually find the answer by doing some matrix algebra.

$\begin{array}{l} Ae = \lambda e\\ {A^{ - 1}}Ae = {A^{ - 1}}\lambda e\;(multiplying\,by\,the\,inverse)\\ e = \lambda {A^{ - 1}}e\\ {\lambda ^{ - 1}}e = {A^{ - 1}}e \end{array}$

The above has taken the form of the definition of the eigenvectors and eigenvalues, for the transformation $A^{-1}$.

3) $A = \left( {\begin{array}{*{20}{c}} { - 2}&2&4\\ 0&{ - 1}&5\\ 0&0&3 \end{array}} \right);\;B = {\left( {A + 3I} \right)^{ - 1}}$

Find the matrices P and D, such that: $B = PD{P^{ - 1}}$.

Sol.

We begin by noticing that the matrix A is an upper triangular matrix. By the note above, its eigenvalues are simply the elements lying on the principal diagonal. Hence the eigenvalues of A are: -2, -1, 3.

We will have to find the eigenvectors through the regular process. I can offer no shortcuts there.

$A + 2I = \left( {\begin{array}{*{20}{c}} 0&2&4\\ 0&1&5\\ 0&0&5 \end{array}} \right) \sim \left( {\begin{array}{*{20}{c}} 0&1&5\\ 0&0&{ - 6}\\ 0&0&5 \end{array}} \right) \sim \left( {\begin{array}{*{20}{c}} 0&1&0\\ 0&0&1\\ 0&0&0 \end{array}} \right)$

This tells us that the null space is described by the following equations: ${x_1} = t;\,{x_2} = 0;\,{x_3} = 0$. Any non-zero vector satisfying these equations is an eigenvector; we take the simplest one to work with, that with t = 1.

$For\,\lambda = - 2;\;e = \left( {\begin{array}{*{20}{c}} 1\\ 0\\ 0 \end{array}} \right)$

In similar fashion, one could prove the following:

$\begin{array}{l} For\,\lambda = - 1;\;e = \left( \begin{array}{l} 2\\ 1\\ 0 \end{array} \right)\\ For\,\lambda = 3;\;e = \left( \begin{array}{l} 26\\ 25\\ 20 \end{array} \right) \end{array}$

Let us now work out the eigenvalues and eigenvectors of B. First consider A + 3I: from the first theorem in the addendum, its eigenvalues are 1, 2, 6, while the eigenvectors are unchanged; which is to say, the eigenvector for the eigenvalue -2 of A is now the eigenvector for the eigenvalue 1 of A + 3I, and so forth.

Now we hark back to the second part of the question: we have proven the relationship between the eigenvalues and eigenvectors of a matrix and those of its inverse. B and A + 3I fall under exactly that relationship, so the eigenvalues of B are 1, 1/2, 1/6. The eigenvectors are again retained.

We conclude:

$\begin{array}{l} P = \left( {\begin{array}{*{20}{c}} 1&2&{26}\\ 0&1&{25}\\ 0&0&{20} \end{array}} \right)\\ D = \left( {\begin{array}{*{20}{c}} 1&0&0\\ 0&{{\textstyle{1 \over 2}}}&0\\ 0&0&{{\textstyle{1 \over 6}}} \end{array}} \right) \end{array}$
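As a final check (a sketch assuming numpy, and certainly not something the examiner expects), we can confirm numerically that these P and D do diagonalise $B = (A + 3I)^{-1}$.

import numpy as np

A = np.array([[-2.0,  2.0, 4.0],
              [ 0.0, -1.0, 5.0],
              [ 0.0,  0.0, 3.0]])
B = np.linalg.inv(A + 3 * np.eye(3))

P = np.array([[1.0, 2.0, 26.0],
              [0.0, 1.0, 25.0],
              [0.0, 0.0, 20.0]])
D = np.diag([1.0, 0.5, 1.0 / 6.0])

print(np.allclose(B, P @ D @ np.linalg.inv(P)))   # True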