Sparse Matrices


Dense Matrices

A n×n matrix is called dense if it has O(n2) non-zero entries. For example:

A=[1.02.03.04.05.06.07.08.09.0].

To store the matrix, all components are saved in row-major order. For A given above, we would store:

AA=[1.02.03.04.05.06.07.08.09.0].

The dimensions of the matrix are stored separately.

Sparse Matrices

A n×n matrix is called sparse if it has O(n) non-zero entries. For example:

A=[1.0002.003.04.005.006.007.08.09.00010.011.00000012.0].

COO (Coordinate Format) stores arrays of row indices, column indices and the corresponding non-zero data values in any order. This format provides fast methods to construct sparse matrices and convert to different sparse formats. For A the COO format is:

data=[12.09.07.05.01.02.011.03.06.04.08.010.0] row=[422100312123],col=[442303300132]

How to interpret: The first entries of data, row, col are 12.0, 4, 4, respectively, meaning there is a 12.0 at position (4, 4) of the matrix; second entries are 9.0, 2, 4, so there is a 9.0 at (2, 4).

CSR (Compressed Sparse Row) encodes rows offsets, column indices and the corresponding non-zero data values. This format provides fast arithmetic operations between sparse matrices, and fast matrix vector product. The row offsets are defined by the followign recursive relationship (starting with rowptr[0]=0):

rowptr[j]=rowptr[j1]+nnz(rowj1),

where nnz(rowk) is the number of non-zero elements in the kth row. Note that the length of rowptr is nrows+1, where the last element in rowptr is the number of nonzeros in A. For A the CSR format is:

data=[1.02.03.04.05.06.07.08.09.010.011.012.0] col=[030130234234] rowptr=[02591112]

How to interpret: The first two entries of rowptr gives us the elements in the first row. Interval [0, 2) of data and col, corresponding to two (data, column) pairs: (1.0, 0) and (2.0, 3), means the first row has 1.0 at column 0 and 2.0 at column 3. The second and third entries of rowptr tells us [2, 5) of data and col corresponds to the second row. The three pairs (3.0, 0), (4.0, 1), (5.0, 3) means in the second row, there is a 3.0 at column 0, a 4.0 at column 1, and a 5.0 at column 3.

CSR Matrix Vector Product Algorithm

The following code snippet performs CSR matrix vector product for square matrices:

import numpy as np
def csr_mat_vec(A, x):
  Ax = np.zeros_like(x)
  for i in range(x.shape[0]):
    for k in range(A.rowptr[i], A.rowptr[i+1]):
      Ax[i] += A.data[k]*x[A.col[k]]
  return Ax

Review Questions