11. Groups of Matrices
The focus now shifts to group , the general linear group of invertible matrices over a number system (field) . In particular, is crucial to the theory of modular forms to be introduced later.
To get a multiplicative group from a set of matrices, the rules of matrix multiplication argue that only square matrices will do. The identity matrix becomes the neutral element, written for identity. was introduced in the prior section, shown by example to contain on the principal diagonal and elsewhere, where and are the additive and multiplicative identities in the field . To complete the group, a multiplicative inverse of each matrix must be provided. As was shown in the prior chapter, matrix has an inverse when there is a matrix such that . is called the inverse of .
A square matrix is invertible just when its determinant, det, has a multiplicative inverse in , which it will when it is non-zero, which it is when the rows of are linearly independent. Letting , has an inverse when there is .
Consider as an example the group of invertible matrices , of which there are just six:
That each of these six have inverses is verified using the formula for the inverse of a matrix
In the general case, the set of invertible matrices form a group. Matrix multiplication is associative, but not commutative. The proof that is somewhat messy, requiring one to consider individual columns of the matrices. The other group properties are straightforward. To show closure under multiplication, it is sufficient to show that :
The group is called the general linear group because a general determinant is allowed; the determinant is not required to be .
The group consists of matrices whose determinants are either or (the only elements in that have multiplicative inverses in ). Since now matrix has det, . To construct matrices in , pick the top row relatively prime. Then the Euclidean algorithm (see supplement) can provide choices of for matrices of having top row . A heuristic approach constructs two lists, one of multiples of and the other of multiples of , and then picks pairs, one from each row, which differ by . Their multipliers then become and . This approach further demonstrates the existence of an infinite number of such linear combinations, and hence, matrices.
11.1. Supplement: The Extended Euclidean Algorithm (GCD as a Linear Combination)
The Euclidean algorithm is a method for calculating the greatest common divisor of two integers, written , by successive divisions. Observing that for integer quotient and remainder , the algorithm steps to successively reduce the remainders modulo , producing , until is obtained. Then the last non-zero remainder is . Picking the larger integer as :
can also be represented as a linear combination of terms for some integers . There are set-theoretic and induction proofs of this statement, but the entire generality is not needed here. The linear combination is not unique; there are infinitely many combinations satisfying the equality. The extended Euclidean algorithm is used to determine values of and .
In the particular case of and coprime, another way of saying , there are values of and such that per the prior paragraph. This is equivalent to the linear congruence , which can be solved for since and are coprime. Note that is , so the following is also an existence statement and algorithm for . To demonstrate the existence of such an congruence, consider the residues of the sequence of multiples of . Observe that if goes through a complete set of residues , then so does , because implies, when and are coprime, that , a contradiction. Therefore, there is for which . In other words, , for some .
The case at hand, finding values for elements of involves picking coprime and for the first row, and then solving for the values of , which will become the 2nd row values of the matrix. The extended Euclidean algorithm computes these coefficients stepwise during each step of the algorithm. There is an algebraic formula for each step as above, but more feeling for the process can be derived from applying the algorithm to the example matrix: beginning with , find coefficients such that . The first two columns below show the Euclidean algorithm applied to finding . The second two columns show the auxiliary calculations provided by the extended algorithm to compute the coefficients of and . These columns start with a unit vector used to accumulate changes that will result in an integer pair, the two needed coefficients. Each column is labeled with the name of the variable involved:
Thus, and a matrix in with 6, 11 in row 1 is:
11.2. Supplement: Row Equivalence, Determinants, Inverses
Elementary row operations are defined for matrices: row transposition, row multiplication by a scalar, and addition of a row to any other. Two matrices are called row-equivalent if one can be derived from the other by some succession of elementary row operations, as in the Gaussian elimination method. A row operation can be captured as a transformation of the identity matrix, , where E is called an elementary matrix. A sequence of such transformations can be represented by a product of such elementary matrices: . For example, there are only five elementary matrices, representing three types of transformation: shear, 1-dimensional elongation/compression, and reflection. Thus in the example, any one-to-one, homogeneous linear transformation of the plane can be represented as some product of these three types of elementary transformation.
Let be the set of all matrices over number system (most generally a commutative ring with identity, here specialized to a field at no immediate loss of generality). Consider any function which satisfies the following properties, where :
-linear Condition: is linear on each row (or column), when other rows are held unchanged. Write , where is the th row of . Further, since only the th row is being operated on, the others held fixed, the abbreviated form is a convenience. is -linear means . -linearity implies is unchanged by linear combinations of rows. Letting , another row of , then . But since the row appears in both the th and th positions in the rightmost term, that term is , and hence . Thus, the elimination method of elementary row operations, described above for solving a system of linear equations, does not change the determinant of its matrix.
Alternating Condition: if two adjacent columns are equal. Equivalently*, the sign of changes when two adjacent columns (or rows) are exchanged. (*The two statements are logically mutually consequential only over fields where ).
Faithful Condition: , where is the identity matrix and is the unit in .
The object of interest is the image of such , called the determinant of , det. Any function satisfying the above three conditions also satisfies (det, so det is uniquely fixed by these conditions. and det are used interchangeably below.
There are existence proofs for . There are algebraic formulae for det, historically due to Leibniz and Laplace. There are also geometric interpretations of det. In 2-space, consider as the matrix representation of a linear transformation of the unit square into a parallelogram, where the matrix expresses the vertices. Then is the area of the parallelogram, where the absolute value is due to the sign depending on the order in which the vertices are recorded in the matrix. The area of a triangle is expressed similarly, as 1/2 of a parallelogram. In higher dimension spaces, a similar situation holds. For example, in 3-space, consider as a matrix representing a linear transformation of the unit cube into a parallelopiped. Then is the volume of the transformed solid. Also of geometric interest, when a row or column of a matrix is multiplied by a scalar , then its determinant is multiplied by , as one would expect for a volume measure in n-space.
is called an alternating function because it satisfies both parts of the Alternating Condition. Related to this, the definition of involves permutations of degree . Consider this version of the Leibniz formula:
is seen to have sums each with products. Recall is the symmetric group, the permutations of the integers from to . is one element of this group. Recall also that is even or odd depending on whether an even or odd number of transpositions is involved; returns , where is the number of transpositions in . represents one of set of matrix entries with permuted subscripts whose product is formed. For example, when and , then (one transposition) and the product of the matrix entries is . The alternating nature of D is revealed by the alternating sign function, as subscripts (rows) get transposed. Note in the formula that some authors use det rather then , where is the identity matrix permuted by the current . This is another way to specify the sign of each summand.
Given the number of computations required to calculate the determinant of a sizable matrix by the brute force of Leibniz’s formula, it is best to finesse the situation if possible. Three such finesses follow.
By Gaussian elimination, a square invertible matrix can be placed in upper-triangular form, , where has ‘s below the diagonal. The diagonal entries are called the pivots of (and ), and det, the product of the pivots, where is the number of row transpositions needed to produce from .
If an matrix has a block of ‘s isolated in the lower left (for instance, but the usual place), such that it has a block representation:
where is an matrix, is an matrix, is an matrix, and denotes the zero matrix, then detdetdet.
Every entry in , say , has a cofactor, times the determinant of the matrix that is minus the row and column occupied by . The dot product of any row or column with its cofactors is equal to det.
Determinants arise naturally in two common operations: solving a system of linear equations and computing the inverse of a matrix. Discussion of determinants in these contexts provides additional insight into the nature of such operations. But both these types of operation are most efficiently computed using Gaussian elimination rather than determinant evaluation.
Cramer’s rule is a method of using determinants to solve the non-homogeneous equation . Let be the matrix obtained by replacing the th column of by . Then the solution vector is specified by
The inverse of a matrix is expressible in terms of its matrix of cofactors and its determinant. If is the cofactor matrix for , then
Gauss-Jordan elimination is more efficient than the above formula for finding . Gauss-Jordan is an extended version of Gaussian elimination, which applies the machinery of elimination to the block matrix (analogously to the way the extended Euclidean algorithm maintains parallel computations). Operating on and simultaneously takes advantage of the property: Gaussian elimination operations used to reduce , when applied to in the same sequence, give . The reasoning is that is row-equivalent to , so that . Multiplying by , , the stated property. Gauss-Jordan keeps reducing and eliminating until the block matrix results.
Summarizing some attributes of linear independence, the rows of a square matrix over are linearly independent if:
The only solution to is .
The rows form a basis for .
for some set of elementary matrices.
is row-equivalent to .
The rank of is .
; det has an inverse in ; is non-singular.