1.9.4 Matrix to Tensor Generalization

Matrix to Tensor Generalization extends the matrix, a two-index array, to arrays with any number of indices, so that matrix concepts and operations carry over to higher-rank data and maps.

Matrix to Tensor Generalization is the extension of the matrix, a rank-2 array whose two indices represent a linear map between vector spaces, into the broader tensor framework. In that framework the number of indices is no longer fixed at two but may be any non-negative integer, and the two matrix indices, row and column, are recast as instances of the contravariant and covariant index types that tensors of any rank possess. Where the vector-to-tensor step introduced the mechanism of combining single indices into pairs, the matrix-to-tensor step takes the pair itself, already familiar as rows and columns, and shows how it fits inside, and extends beyond, the same general pattern.

A matrix is typically identified with a type (1, 1) tensor: one contravariant index and one covariant index, which together allow it to act as a linear transformation, mapping a vector to another vector. This identification is what allows matrix operations (multiplication, transposition, and the trace) to be re-expressed as special cases of general tensor operations (tensor contraction, index permutation, and full contraction, respectively).

The Matrix as a Two-Index Object

Rows, Columns, and Index Types

A matrix M with components M^i_j acts on a vector v^j to produce a new vector (Mv)^i, summing over the shared index j. The upper index i behaves contravariantly, transforming like a vector, while the lower index j behaves covariantly, transforming like a covector. This split of the two matrix indices into one of each type is precisely what allows a matrix to represent a linear map from a vector space to itself, rather than a bilinear form or some other two-index object.

(M v)^{i} = \sum_{j = 1}^{n} M_{j}^{i} v^{j}
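The sum above can be spelled out index by index. The following is a minimal sketch in plain Python, assuming the convention that `M[i][j]` stores the component M^i_j; the numerical values are arbitrary illustrations.

```python
# Assumed storage convention: M[i][j] holds M^i_j (upper index first, lower index second).
M = [[1, 2],
     [3, 4]]
v = [5, 6]  # components v^j

n = len(v)
# (Mv)^i = sum_j M^i_j v^j: the repeated index j is summed, the free index i survives.
Mv = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
print(Mv)  # [17, 39]
```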

Other Matrix Types

Not every two-index array that is informally called a "matrix" has the mixed (1, 1) structure. A bilinear form, such as an inner product represented in matrix form, is instead a type (0, 2) tensor, with two covariant indices, since it accepts two vectors and returns a scalar without acting as a map between vectors. A matrix built from the outer product of two vectors is a type (2, 0) tensor. The generalization to tensors makes this distinction explicit and mandatory, whereas ordinary matrix notation often leaves it implicit.
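As an illustration of the distinction, the sketch below stores two different kinds of two-index object in the same nested-list shape. The names `g`, `P`, `u`, and `w` and their values are assumptions made for the example only.

```python
u = [1, 2]  # components u^i
w = [3, 4]  # components w^j
n = len(u)

# Type (0, 2): a bilinear form g_ij takes two vectors and returns a scalar.
g = [[2, 0],
     [0, 1]]
g_uw = sum(g[i][j] * u[i] * w[j] for i in range(n) for j in range(n))
print(g_uw)  # 14

# Type (2, 0): the outer product P^ij = u^i w^j carries two upper indices.
P = [[u[i] * w[j] for j in range(n)] for i in range(n)]
print(P)  # [[3, 4], [6, 8]]
```

Both `g` and `P` are 2 × 2 grids of numbers, so the array alone does not record which index type each slot has; that information has to be tracked separately.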

From Two Indices to Arbitrarily Many

Adding a Third Index

Extending a matrix by one further index produces a rank-3 tensor, an object that can no longer be drawn as a simple rectangular grid of numbers but instead requires a three-dimensional array of components, or equivalently a stack of matrices, one for each value of the new index. Such an object arises naturally, for example, when describing how a linear map itself varies as a function of a further parameter or direction, so that each "slice" of the rank-3 tensor along the new index is itself an ordinary matrix.

T_{jk}^{i}, with slices (M_{k_{0}})_{j}^{i} = T_{j k_{0}}^{i} for each fixed k_{0}
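The "stack of matrices" picture can be sketched as follows, assuming `T[i][j][k]` stores T^i_jk. The helper `slice_k` and the direction vector `w` are hypothetical names introduced for this example.

```python
# Assumed storage convention: T[i][j][k] holds T^i_jk, with n = 2 for every index.
T = [[[1, 2], [3, 4]],
     [[5, 6], [7, 8]]]
n = 2

def slice_k(T, k0):
    """Return the matrix with components M^i_j = T^i_{j k0}."""
    return [[T[i][j][k0] for j in range(n)] for i in range(n)]

M0 = slice_k(T, 0)
M1 = slice_k(T, 1)
print(M0)  # [[1, 3], [5, 7]]
print(M1)  # [[2, 4], [6, 8]]

# Contracting the third index with a vector w^k gives a w-dependent matrix:
# M(w)^i_j = sum_k T^i_jk w^k, a weighted combination of the slices.
w = [1, -1]
M_w = [[sum(T[i][j][k] * w[k] for k in range(n)) for j in range(n)] for i in range(n)]
print(M_w)  # [[-1, -1], [-1, -1]]
```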

General Type (p, q) Tensors

Continuing this process for arbitrary numbers of upper and lower indices produces the full family of type (p, q) tensors, of which the matrix, at (1, 1), is one specific, low-rank member. No new construction principle is required beyond what is already visible in the matrix case: each additional index simply contributes one more factor to the transformation law and one more dimension to the array of components.
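The "one more factor per index" statement can be written out explicitly. The block below is a sketch under an assumed convention: the new basis vectors are e'_j = e_i A^i_j, and summation over repeated indices is implied. Other texts may use the inverse convention for A.

```latex
% Assumed convention: e'_j = e_i A^i_j, so vector components transform with A^{-1}
% and covector components transform with A. Repeated indices are summed.
{T'}^{i_1 \dots i_p}_{j_1 \dots j_q}
  = (A^{-1})^{i_1}_{k_1} \cdots (A^{-1})^{i_p}_{k_p} \,
    A^{l_1}_{j_1} \cdots A^{l_q}_{j_q} \,
    T^{k_1 \dots k_p}_{l_1 \dots l_q}

% Matrix case (p, q) = (1, 1): one factor of each kind, i.e. M' = A^{-1} M A.
{M'}^{i}_{j} = (A^{-1})^{i}_{k} \, A^{l}_{j} \, M^{k}_{l}
```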

Matrix Operations as Tensor Operations

Matrix Multiplication as Contraction

Matrix multiplication, in tensor language, is a contraction: the shared inner index of the two matrices being multiplied is summed over, exactly as in the matrix-vector product shown above but with the vector replaced by a second matrix. This shows matrix multiplication is not a special operation unique to matrices but a specific instance of the general tensor contraction operation, applied to two type (1, 1) tensors.

C_{k}^{i} = \sum_{j = 1}^{n} A_{j}^{i} B_{k}^{j}
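Written as an explicit index sum, the contraction looks like this. It is a minimal sketch with arbitrary example matrices, assuming `A[i][j]` stores A^i_j and `B[j][k]` stores B^j_k.

```python
A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 0]]
n = 2

# C^i_k = sum_j A^i_j B^j_k: the shared inner index j is contracted,
# while i and k remain free and label the entries of the result.
C = [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(n)] for i in range(n)]
print(C)  # [[2, 1], [4, 3]]
```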

The Trace as Full Contraction

The trace of a matrix, the sum of its diagonal entries, is the full contraction of its single upper index with its single lower index, producing a scalar. Because contraction always pairs one upper index with one lower index, the trace generalizes directly: a tensor with equal numbers of upper and lower indices can be fully contracted down to a scalar once each upper index has been paired with a lower one, and different pairings generally give different scalars. The matrix trace, where only one pairing is possible, is the simplest nontrivial case.

tr (M) = \sum_{i = 1}^{n} M_{i}^{i}
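The sketch below computes the trace as a contraction and then fully contracts a type (2, 2) tensor in two ways. The tensor `R`, built here as R^{ij}_{kl} = A^i_k B^j_l, is a hypothetical example chosen so that the results can be checked against ordinary matrix traces.

```python
M = [[1, 2],
     [3, 4]]
n = 2

# tr(M) = sum_i M^i_i: the upper index is contracted with the lower index.
trace_M = sum(M[i][i] for i in range(n))
print(trace_M)  # 5

# Hypothetical type (2, 2) tensor stored as R[i][j][k][l] = A^i_k * B^j_l.
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
R = [[[[A[i][k] * B[j][l] for l in range(n)] for k in range(n)]
      for j in range(n)] for i in range(n)]

# Pairing 1: contract i with k and j with l, which gives tr(A) * tr(B).
full_1 = sum(R[i][j][i][j] for i in range(n) for j in range(n))
# Pairing 2: contract i with l and j with k, which gives tr(AB).
full_2 = sum(R[i][j][j][i] for i in range(n) for j in range(n))
print(full_1, full_2)  # 0 5
```

The two results differ, which shows that for more than one index pair the choice of which upper index is contracted with which lower index is part of the definition of the scalar.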

Transposition as Index Permutation

Transposing a matrix swaps its two indices. For general tensors, this operation generalizes to index permutation, in which any two indices of the same variance, both upper or both lower, can be interchanged. A tensor that is unchanged by such a swap is called symmetric in those indices; a tensor that changes sign is called antisymmetric. Two-index tensors of type (0, 2) or (2, 0) can be symmetric or antisymmetric with respect to their single index pair, and this classification extends unchanged to any pair of like-type indices in a higher-rank tensor. For a mixed (1, 1) matrix the two indices differ in variance, so comparing it with its transpose in a basis-independent way requires extra structure, such as a metric to raise or lower an index.
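A swap of two like-type indices can be sketched as follows for a rank-3 tensor stored as `T[i][j][k]` = T^i_jk. The helper name `swap_lower` and the component values are assumptions for illustration.

```python
n = 2
# Hypothetical components: the i = 0 slice is symmetric in (j, k),
# the i = 1 slice is antisymmetric in (j, k).
T = [[[1, 2], [2, 3]],
     [[0, 5], [-5, 0]]]

def swap_lower(T):
    """Return T' with T'^i_jk = T^i_kj (the two lower indices interchanged)."""
    return [[[T[i][k][j] for k in range(n)] for j in range(n)] for i in range(n)]

swapped = swap_lower(T)
print(swapped[0] == T[0])                                # True
print(swapped[1] == [[-x for x in row] for row in T[1]])  # True
```

Applied to a single two-index slice, `swap_lower` reduces to the ordinary matrix transpose.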

Symmetry Structure Beyond the Matrix Case

Symmetric and Antisymmetric Matrices

A symmetric matrix satisfies M_ij = M_ji, and an antisymmetric matrix satisfies M_ij = -M_ji. Every square matrix with two lower indices can be decomposed uniquely into the sum of a symmetric part and an antisymmetric part, M_ij = (M_ij + M_ji)/2 + (M_ij - M_ji)/2, a decomposition that depends only on the fact that there are exactly two indices of the same type to compare.
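The decomposition can be checked numerically. This is a minimal sketch using exact fractions from the Python standard library; the matrix entries are arbitrary.

```python
from fractions import Fraction

M = [[1, 2],
     [4, 3]]
n = 2

# Symmetric part S_ij = (M_ij + M_ji) / 2, antisymmetric part A_ij = (M_ij - M_ji) / 2.
S = [[Fraction(M[i][j] + M[j][i], 2) for j in range(n)] for i in range(n)]
A = [[Fraction(M[i][j] - M[j][i], 2) for j in range(n)] for i in range(n)]

# The two parts add back up to M.
recombined = [[S[i][j] + A[i][j] for j in range(n)] for i in range(n)]
print(recombined == M)  # True
```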

Generalized Symmetrization and Antisymmetrization

For a tensor with more than two indices of the same type, symmetry becomes a richer structure: a tensor may be symmetric or antisymmetric in some pairs of indices without being so in others, and full symmetrization or antisymmetrization over all indices of a given type is computed by averaging over all permutations of those indices, with a sign attached in the antisymmetric case. The two-index symmetric and antisymmetric matrices are simply the smallest nontrivial instances of this generalized symmetry machinery.
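The averaging-over-permutations recipe can be sketched directly. The code below is one possible illustration, assuming components are stored in a dictionary keyed by index tuples; the function names `sign` and `antisymmetrize` and the sample components are hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial

def sign(perm):
    """Sign of a permutation of 0..r-1, computed from its inversion count."""
    r = len(perm)
    inversions = sum(1 for a in range(r) for b in range(a + 1, r) if perm[a] > perm[b])
    return -1 if inversions % 2 else 1

def antisymmetrize(T, n, r):
    """Average T over all index permutations, weighted by the permutation sign.

    T maps index tuples of length r (entries in 0..n-1) to components.
    """
    out = {}
    for idx in product(range(n), repeat=r):
        total = sum(sign(p) * T[tuple(idx[a] for a in p)]
                    for p in permutations(range(r)))
        out[idx] = Fraction(total, factorial(r))
    return out

# Hypothetical rank-3 components with no particular symmetry.
n, r = 3, 3
T = {idx: (idx[0] + 1) * (idx[1] + 2) ** 2 * (idx[2] + 3) ** 3
     for idx in product(range(n), repeat=r)}
A = antisymmetrize(T, n, r)

print(A[(0, 1, 2)] == -A[(1, 0, 2)])  # True: swapping two indices flips the sign
print(A[(0, 0, 1)] == 0)              # True: a repeated index forces a zero
```

Dropping the `sign(p)` factor gives the corresponding full symmetrization, and for r = 2 the function reduces to the antisymmetric part (T_ij - T_ji)/2 of a two-index array.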

Why the Matrix Is a Natural Midpoint

Between Vectors and Higher-Rank Tensors

The matrix occupies a natural midpoint in the tensor hierarchy: it is complex enough to already exhibit two-index phenomena absent from vectors (index type distinctions, contraction, and symmetry), yet simple enough that all of its structure can still be visualized as a two-dimensional grid and manipulated with well-known linear algebra techniques. Generalizing beyond the matrix retains every one of these phenomena while removing the restriction to exactly two indices, which is why matrix intuition transfers so directly to the study of general tensors.

Linear Maps as a Special Case of Multilinear Maps

A matrix, as a (1, 1) tensor, is a linear map from V to V. A general (p, q) tensor is a multilinear map taking q vectors and p covectors as arguments and returning a scalar, or equivalently, when only some of those arguments are supplied, a multilinear map into a tensor of appropriately reduced type. The matrix's linear-map interpretation is recovered exactly when p = q = 1: supplying the single vector argument and leaving the covector slot open yields a vector. This confirms that the entire multilinear framework used for general tensors already contains ordinary linear algebra as its simplest nontrivial case.
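The two readings of a (1, 1) tensor can be compared in a short sketch. The covector `alpha`, vector `v`, and matrix values are arbitrary illustrations, with `M[i][j]` assumed to store M^i_j.

```python
M = [[1, 2],
     [3, 4]]
alpha = [1, -1]  # covector components alpha_i
v = [5, 6]       # vector components v^j
n = 2

# Multilinear-map reading: M(alpha, v) = sum_{i,j} alpha_i M^i_j v^j, a scalar.
scalar = sum(alpha[i] * M[i][j] * v[j] for i in range(n) for j in range(n))

# Linear-map reading: supply only v, leaving the upper index i open, to get a vector.
Mv = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]

# Feeding that vector to alpha reproduces the same scalar.
print(scalar, sum(alpha[i] * Mv[i] for i in range(n)))  # -22 -22
```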