1
00:00:01,960 --> 00:00:18,840
OK, hi guys. Welcome to CS445 Computational Photography. I'm Ashwin Ramesh, and today I'll be doing a brief prerequisite review of some of the linear algebra concepts that you might need to use in this course.

2
00:00:18,840 --> 00:00:30,260
Note that this should just serve as a guide for what you should review before you embark on the projects, rather than a comprehensive review.

3
00:00:30,260 --> 00:00:33,460
With that, let's get started.

4
00:00:33,460 --> 00:00:38,260
So the goals of this session are as follows:

5
00:00:38,260 --> 00:00:47,140
We're going to talk about systems of linear equations and how to represent them as matrices; go over some basic notations, properties,

6
00:00:47,140 --> 00:00:53,200
setting up systems, and concepts like SVD.

7
00:00:53,200 --> 00:01:01,780
And we're also going to do a demo on Jupyter Notebook of some of the library functions in NumPy

8
00:01:01,780 --> 00:01:05,840
that allow you to do these things.

9
00:01:05,840 --> 00:01:09,100
OK. So, here are some references that you should consult.

10
00:01:09,100 --> 00:01:13,780
The YouTube channel 3Blue1Brown in particular is one that I like a lot.

11
00:01:13,780 --> 00:01:19,680
It will really help you get an intuition for linear algebra concepts.

12
00:01:19,680 --> 00:01:23,820
Really great visualizations. I encourage you to check that out.

13
00:01:23,820 --> 00:01:32,740
The first one, the Stanford review PDF, is a good reference for sure.

14
00:01:32,740 --> 00:01:35,380
OK. Let's move on.

15
00:01:35,380 --> 00:01:44,740
So, this is a system of linear equations. This particular system has two equations and two variables.

16
00:01:44,740 --> 00:01:53,900
So, we know it's linear because the highest power of any variable in the equations is 1.

17
00:01:54,040 --> 00:01:58,180
And we know that these...

18
00:01:58,180 --> 00:02:03,280
So, what else do we know? We also know that these equations are linearly independent.

19
00:02:03,280 --> 00:02:13,080
Why do we know this? You can pause and take a look at it, if you would like, but the reason we know is because...

20
00:02:13,080 --> 00:02:21,000
Because the equations are not linear transformations of each other.

21
00:02:21,000 --> 00:02:30,280
So let's look at how to represent this in matrix form. This system is in the form Ax = b,

22
00:02:30,280 --> 00:02:39,740
where A is the matrix of coefficients of the variables and b is a vector of the right hand side constants.

23
00:02:39,740 --> 00:02:48,060
Something to note in this is that you can interchange the equations above,

24
00:02:48,060 --> 00:02:54,280
and that doesn't really affect the solution values of x1 and x2.

25
00:02:54,280 --> 00:03:02,800
So consequently, you can interchange the rows in A and b, and that won't affect the solution to the system.

26
00:03:02,800 --> 00:03:15,100
And, we actually use this property quite extensively when we're solving systems of linear equations using row reduction, and etc.

27
00:03:15,100 --> 00:03:27,040
Another benefit of converting the system into matrix form is that we can employ a lot of parallelism (thread-based parallelism).

28
00:03:27,040 --> 00:03:33,940
Matrices are easy to represent in computers.

29
00:03:35,240 --> 00:03:39,880
OK, so let's look at some basic notations.

30
00:03:39,880 --> 00:03:50,020
So the i-th element of a vector x (and usually the vectors we deal with in this course are column vectors),

31
00:03:50,020 --> 00:03:56,960
is denoted by xi. So the i-th element is denoted by xi,

32
00:03:56,960 --> 00:04:05,640
a matrix "A" with m rows and n columns is said to be in R m cross n, R^(mxn).

33
00:04:05,640 --> 00:04:14,960
And each element is denoted by a lowercase "a" subscripted with the row index, then the column index.

34
00:04:14,960 --> 00:04:18,860
Let's look at some properties of vectors.

35
00:04:18,860 --> 00:04:31,140
A norm of a vector: As you know, a vector is a quantity with direction and magnitude, and a norm is some measure of the magnitude of a vector.

36
00:04:31,140 --> 00:04:42,040
And, a particular norm that's important to remember is the L2 or Euclidean norm. And, it's denoted like that:

37
00:04:42,040 --> 00:04:52,080
It's basically equal to the square root of the sum of the squares of each element in the vector.

38
00:04:53,940 --> 00:04:57,060
So, some vector-vector relationships:

39
00:04:57,060 --> 00:05:07,020
Given two column vectors x and y of the same length, the quantity x transpose y (x^T y) is called the "inner product" or "dot product" of the vectors.

40
00:05:07,020 --> 00:05:18,580
And, it's computed as shown: You basically take corresponding elements, multiply them, and then sum together all the products,

41
00:05:18,580 --> 00:05:20,500
in each vector.

42
00:05:20,500 --> 00:05:30,780
The dot product is a very important operation in machine learning and image processing, etc.,

43
00:05:30,780 --> 00:05:33,700
but more generally, as well.

44
00:05:33,700 --> 00:05:42,380
There's also something called the "outer product", and this operation doesn't really require the input vectors to be of the same length.

45
00:05:42,380 --> 00:05:55,120
So, given an m-length vector and an n-length vector, the result x y transpose (x y^T) is going to be of dimensions m cross n.

46
00:05:55,120 --> 00:06:05,820
And the entries of this matrix, each row is going to be the corresponding value in the first vector multiplied by the second vector,

47
00:06:05,820 --> 00:06:14,400
and each column is consequently the first whole vector multiplied by the corresponding column in the second vector.

48
00:06:14,400 --> 00:06:18,320
So, you can observe that relationship.

49
00:06:20,080 --> 00:06:25,540
OK. So now, let's talk about some operations on matrices.

50
00:06:25,540 --> 00:06:29,660
The main one is matrix multiplication.

51
00:06:29,660 --> 00:06:37,920
The product of two matrices A (with m rows and n columns) and B (with n rows and p columns) is the matrix C (with m rows and p columns).

52
00:06:37,920 --> 00:06:47,660
So the most important thing to note is the number of columns in the left matrix must be equal to the number of rows in the right matrix.

53
00:06:47,660 --> 00:06:57,060
So, matrix multiplication really depends on the order in which you multiply the matrices A and B. (We'll discuss about that in a second.)

54
00:06:57,060 --> 00:07:09,580
And each element of C, Cij, is calculated as shown below: Cij equals the sum from k=1 to n of Aik Bkj.

55
00:07:09,580 --> 00:07:20,220
The way I like to visualize it is that each row of the left matrix A

56
00:07:20,220 --> 00:07:32,560
gets dotted with each column of the right matrix B to produce one element of the matrix C.

57
00:07:32,560 --> 00:07:36,280
That's the best way to look at it, for me.

58
00:07:36,280 --> 00:07:45,180
So let's look at some properties of matrix multiplication. It is associative, which means that if you have three matrices multiplied together,

59
00:07:45,180 --> 00:07:53,620
you can do any combination first between AB and BC there: (AB)C = A(BC).

60
00:07:53,620 --> 00:08:00,980
Matrix multiplication is also distributive: A(B+C) = AB + AC. But it is in general not commutative.

61
00:08:00,980 --> 00:08:07,360
As I mentioned earlier, it can be that AB is not the same as BA.

62
00:08:07,360 --> 00:08:16,740
Sometimes BA is not even legal. It's not even defined. (The dimensions must work out as described earlier.)

63
00:08:19,600 --> 00:08:24,180
Another operation we can do on a matrix is the transpose.

64
00:08:24,180 --> 00:08:29,680
The transpose is basically just an interchange of rows and columns.

65
00:08:29,680 --> 00:08:33,120
The rows become columns, and all the columns become rows.

66
00:08:33,120 --> 00:08:37,200
And the transpose has the following properties:

67
00:08:37,200 --> 00:08:43,020
The transpose of the transpose of A, (A^T)^T, just gives back the original matrix A.

68
00:08:43,020 --> 00:08:49,680
The transpose of the product of two matrices is interesting: (AB)^T = B^T A^T

69
00:08:49,680 --> 00:08:57,340
The left product, AB, the transpose of that is just the right product, B transpose A transpose.

70
00:08:57,340 --> 00:08:59,720
So that's important to note.

71
00:08:59,720 --> 00:09:08,040
And then A plus B, the whole transpose: (A+B)^T, is just A^T + B^T. So the transpose there is distributive.

72
00:09:10,620 --> 00:09:21,520
Matrices have a property known as "rank", and that's given by the number of linearly independent columns and rows of the matrix.

73
00:09:21,520 --> 00:09:34,260
And if the rank of the matrix is equal to the minimum of its number of rows and number of columns then it's said to have full rank.

74
00:09:34,260 --> 00:09:46,180
So the rank of a matrix can be at most the minimum number of rows or columns, and the way you can think about that is,

75
00:09:49,140 --> 00:09:57,300
what's the most linearly independent rows you can have in an m by n matrix?

76
00:09:57,300 --> 00:10:03,160
And what's the most linearly independent columns you can have in an m by n matrix?

77
00:10:03,160 --> 00:10:09,080
Yeah, so I encourage you to think about that.

78
00:10:09,720 --> 00:10:15,820
So now the inverse of a matrix: The inverse is only defined for

79
00:10:15,820 --> 00:10:20,000
square matrices. That's important to note.

80
00:10:20,000 --> 00:10:23,560
And it's denoted by "A inverse", A^(-1), as shown there.

81
00:10:23,560 --> 00:10:36,400
And it's the unique matrix such that A^(-1) A is equal to A A^(-1) and they're both equal to the identity matrix of size n.

82
00:10:36,400 --> 00:10:46,120
So that's important. The identity matrix of size n is an n by n square matrix with all 0s, except 1s on the major diagonal.

83
00:10:47,000 --> 00:10:54,400
OK. Now let's talk about eigenvalues and eigenvectors. Given a square matrix A in R^(m x n),

84
00:10:54,400 --> 00:11:02,400
lambda is an eigenvalue of A with a corresponding eigenvector of x if and only if Ax = lambda x,

85
00:11:02,400 --> 00:11:07,040
and x is not a zero vector, so it's not a trivial solution.

86
00:11:08,180 --> 00:11:12,240
The determinant of A is equal to the product of its eigenvalues,

87
00:11:12,240 --> 00:11:17,700
so the product of all possible values of lambda is the determinant of the matrix.

88
00:11:17,700 --> 00:11:24,920
If you remember, the determinant of the matrix is proportional to this area of

89
00:11:24,920 --> 00:11:32,460
this n-dimensional parallelepiped, composed of the column vectors in A.

90
00:11:32,780 --> 00:11:44,160
It also happens to be the case that the sum of the eigenvalues is equal to the trace of the matrix A, which is the sum of the values on its major diagonal.

91
00:11:44,160 --> 00:11:51,220
So in the case of the example shown below, A has columns [1, 3], [3, 2].

92
00:11:51,220 --> 00:11:57,220
The trace would be 1+2 = 3. And that would be the sum of the eigenvalues.

93
00:11:57,220 --> 00:12:01,220
OK, now we're going to talk about diagonalizable matrices.

94
00:12:02,320 --> 00:12:12,180
So, for a matrix A in R^(n x n), if A has n linearly independent eigenvectors, then A is said to be diagonalizable.

95
00:12:12,180 --> 00:12:23,460
So in math that looks like being able to decompose A into the form U, D, U inverse.

96
00:12:23,460 --> 00:12:29,720
This is a diagonalization of A. We're going to talk more about this, but

97
00:12:29,720 --> 00:12:35,600
basically what the definition says is that a matrix A is diagonalizable if and only if

98
00:12:35,600 --> 00:12:42,740
it has n linearly independent eigenvectors, and here we'll look at why that's the case.

99
00:12:42,740 --> 00:12:47,880
So on the left here we have essentially the definition of

100
00:12:47,880 --> 00:12:56,320
what it means to be an eigenvector, so u1, u2, all the way to un, are the eigenvectors of A.

101
00:12:56,320 --> 00:13:03,060
and lambda: this should be λ1 (lambda 1), λ2, all the way to λn.

102
00:13:03,060 --> 00:13:07,060
These lambda values are the corresponding eigenvalues of A.

103
00:13:07,720 --> 00:13:21,340
Now imagine kind of blocking out or expressing these as their corresponding matrices here. This is just a diagrammatic representation of what that looks like, these equations.

104
00:13:21,340 --> 00:13:27,700
So here you have A, and then what you can do is stack the column vectors,

105
00:13:27,700 --> 00:13:33,820
(stack the eigenvectors as column vectors,) into this matrix, right here,

106
00:13:33,820 --> 00:13:40,880
this n cross n matrix where each column of this matrix is an eigenvector of A.

107
00:13:41,980 --> 00:13:56,360
And on the right-hand side here you have another n by n matrix where each column is going to be λi ui for i from 1 to n.

108
00:13:56,360 --> 00:14:04,840
Essentially what this is showing us is this entire system of linear equations.

109
00:14:04,840 --> 00:14:19,460
It's telling us that Au=λu for all matching values of λ and u, for 1 through n.

110
00:14:19,460 --> 00:14:23,260
OK. So...

111
00:14:23,760 --> 00:14:42,520
You can take this right-hand side matrix, and you can actually decompose it into just a matrix of eigenvectors (here, that's the U matrix again), and a diagonal matrix

112
00:14:42,520 --> 00:14:45,660
that is one where only the diagonals are nonzero,

113
00:14:45,660 --> 00:14:49,160
(only the elements of the major diagonal are nonzero,)

114
00:14:49,160 --> 00:14:58,300
where in each column the position at the major diagonal is occupied by the corresponding eigenvalue.

115
00:14:58,300 --> 00:15:00,300
So this is what that looks like.

116
00:15:00,300 --> 00:15:08,740
Everything is zero except for the values in the major diagonal, and those values are just λ1 through λn.

117
00:15:08,740 --> 00:15:15,360
You can actually try this. If you multiply this out, you'll get another n by n matrix, right,

118
00:15:15,360 --> 00:15:20,500
where the first column is λ1u1, the second column is λ2u2, etc.

119
00:15:20,500 --> 00:15:26,740
all the way to the last column being λn un. That's essentially this matrix here.

120
00:15:26,920 --> 00:15:37,440
So what you have is, A times this matrix of eigenvectors equals that same matrix of eigenvectors times this diagonal matrix.

121
00:15:37,440 --> 00:15:40,460
We're going to call this diagonal matrix D.

122
00:15:40,460 --> 00:15:50,680
And what we can do here is, just right-multiply by the inverse of U to get this expression for A:

123
00:15:50,680 --> 00:15:57,920
A=UDU^(-1). Notice that this is only possible if U is invertible.

124
00:15:57,920 --> 00:16:10,720
That means that U inverse exists, U is a non-singular matrix, which requires that each eigenvector be linearly independent of all the others.

125
00:16:10,720 --> 00:16:14,200
It requires that A has n linearly independent eigenvectors.

126
00:16:14,200 --> 00:16:18,320
So that's how we get that A can be decomposed into

127
00:16:18,320 --> 00:16:25,820
a matrix of eigenvectors times this diagonal matrix of eigenvalues times the inverse of the matrix of eigenvectors

128
00:16:25,820 --> 00:16:31,500
if and only if it has n linearly independent eigenvectors.

129
00:16:31,500 --> 00:16:38,000
OK, lastly we're going to talk about singular value decomposition (SVD).

130
00:16:38,000 --> 00:16:45,920
SVD, singular value decomposition is a way to factorize a matrix into three matrices.

131
00:16:45,920 --> 00:16:56,620
So given a matrix A, we can write A as a product of three matrices U, Σ, and V transpose.

132
00:16:56,620 --> 00:17:03,980
Here U and V are orthogonal matrices and Σ is a diagonal matrix.

133
00:17:03,980 --> 00:17:15,280
Remember that orthogonal matrices are two matrices that have completely orthogonal column vectors

134
00:17:15,280 --> 00:17:22,220
which means that the dot product of any pair of those column vectors is zero.

135
00:17:22,580 --> 00:17:34,240
It's also true for orthogonal matrices that their transpose is also their inverse. You can do a quick example of this to find out why that's the case.

136
00:17:34,240 --> 00:17:39,640
So let's look at how to calculate singular value decompositions.

137
00:17:39,640 --> 00:17:44,840
So what exactly are these U and V matrices?

138
00:17:44,840 --> 00:17:50,500
Let's assume that you can decompose A into something like U Σ V^T. [Note that Σ=Σ^Τ because Σ is a diagonal matrix.]

139
00:17:50,500 --> 00:17:59,080
So A^T would be VΣU^T, and the matrix A^T A would be this:

140
00:17:59,600 --> 00:18:09,680
And we know that because U is orthogonal, its transpose is also its inverse, so these two (U^T U) actually become the identity matrix.

141
00:18:09,680 --> 00:18:24,660
So you end up with (A^T)A is V(Σ^T)Σ(V^T) and we know that Σ is a diagonal matrix so its transpose is itself.

142
00:18:24,660 --> 00:18:41,200
And, we know that therefore Σ^T Σ is going to be just a diagonal matrix of the diagonal values squared.

143
00:18:41,200 --> 00:18:49,520
So we let D=(Σ^T) Σ and there we have (A^T) A = V D (V^T),

144
00:18:49,520 --> 00:18:54,300
which basically looks like a diagonalization of A^T A.

145
00:18:54,300 --> 00:19:04,040
So, what are V and V^T? They're just the eigenvectors of A^T A.

146
00:19:04,040 --> 00:19:12,620
Similarly, if you compute A(A^T), that's going to give you UΣ(V^T)VΣ(U^T).

147
00:19:12,620 --> 00:19:24,040
And V is also an orthogonal matrix, so with similar arguments you can show that U is actually the matrix of eigenvectors of AA^T.

148
00:19:24,040 --> 00:19:35,000
So, here that is. We're doing SVD on a matrix M, which is an (m x n) matrix,

149
00:19:35,000 --> 00:19:43,400
and we have that U's columns are the eigenvectors of MM^T, right.

150
00:19:43,400 --> 00:19:57,500
The diagonals of S, which was sigma (Σ) in our previous notation, are the square root of the eigenvalues of (M^T)M.

151
00:19:58,680 --> 00:20:12,600
So this is because we saw that Σ^T Σ is the D matrix in the previous example, here,

152
00:20:14,720 --> 00:20:20,740
right, and these are called the eigenvalues of M^T M so,

153
00:20:20,740 --> 00:20:25,480
the diagonals of S are just the square root of those values.

154
00:20:25,480 --> 00:20:28,660
So to compute these diagonal values you would just

155
00:20:28,660 --> 00:20:33,960
compute the eigenvalues of M^T M and take the square root.

156
00:20:33,960 --> 00:20:39,600
These σ1, σ2, (...) are called the singular values of M.

157
00:20:39,600 --> 00:20:45,500
The columns of V are the eigenvectors of M^T M.

158
00:20:45,500 --> 00:20:55,820
So note that in the decomposition you actually have V^T, so the rows of V^T are the eigenvectors of M^T M.

159
00:20:55,820 --> 00:21:00,040
Good, so that was singular value decomposition.

160
00:21:00,040 --> 00:21:10,780
Now what we're going to do is look at a quick demo of all the linear algebra concepts we've learned and how to use them in NumPy.

161
00:21:11,100 --> 00:21:24,960
OK. So here we are on Jupyter Notebook, and we're just going to go over what we can use to implement these linear algebra concepts we looked at in Python.

162
00:21:26,260 --> 00:21:32,320
The way we generally want to define vectors is as NumPy arrays.

163
00:21:32,320 --> 00:21:38,500
So here we go, here we have three vectors x, y, and z. They're just NumPy arrays.

164
00:21:38,500 --> 00:21:52,220
We can use np.zeros to define zero vectors. We can just give it a 4 here and it's going to give us a zero vector of size four.

165
00:21:53,480 --> 00:22:01,540
And some operations of vectors like we discussed, like the dot product or inner product,

166
00:22:01,540 --> 00:22:08,540
is just a function called numpy.dot (here, imported as np.dot).

167
00:22:08,540 --> 00:22:23,980
I would consult the documentation for np.dot because it's a really versatile function and it takes care of a wide range of use cases.

168
00:22:23,980 --> 00:22:30,340
I would look at this documentation. So if both a and b are 1-D arrays, it's an inner product.

169
00:22:30,340 --> 00:22:38,880
It can also do matrix multiplication when both the inputs are 2-D arrays and there are some other use cases as well.

170
00:22:39,500 --> 00:22:48,740
So, yeah. So let's take a look at...

171
00:22:48,740 --> 00:22:54,260
Yeah, here we go. So that's how we use np.dot for two vectors.

172
00:22:54,260 --> 00:23:09,440
Next is the outer product, and as we know the outer product takes two vectors x and y, and they can be of different lengths, and it returns a matrix.

173
00:23:09,440 --> 00:23:17,700
If x is of length m and y is of length n it returns an mxn matrix.

174
00:23:18,500 --> 00:23:23,060
So, I'd encourage you to investigate that as well.

175
00:23:23,060 --> 00:23:36,420
Let's look at an example here. Let x equal an np.ones of size five: n = np.ones(5)

176
00:23:36,420 --> 00:23:48,860
And then we'll do, y = np.arange((6))

177
00:23:48,860 --> 00:23:55,000
And then we'll do, np.outer(x,y). OK.

178
00:23:55,000 --> 00:24:01,620
You can pause the video here and kind of think about what this might give us.

179
00:24:01,620 --> 00:24:09,060
As we expected, it's a 5x6 matrix, and each row is

180
00:24:09,060 --> 00:24:13,900
the corresponding element of x times all of y,

181
00:24:13,900 --> 00:24:19,900
and each column is the corresponding element of y times x.

182
00:24:24,180 --> 00:24:31,380
OK. Let's look at the Hadamard product, or element-wise multiplication.

183
00:24:31,380 --> 00:24:37,980
So, between two similarly-shaped arrays (arrays that can be broadcast together),

184
00:24:37,980 --> 00:24:43,380
the star operation is the element-wise multiplication operation

185
00:24:43,380 --> 00:24:48,280
so you can also achieve this with a function called np.multiply

186
00:24:48,280 --> 00:24:54,440
Let's actually recalculate this because our x has changed. They couldn't be broadcast together.

187
00:24:54,440 --> 00:25:00,560
We'll redefine x here (to be of size 4), and then run this again.

188
00:25:00,560 --> 00:25:10,300
Our x*z is [1, 2, 3, 0]. Now let's see what this gives us (with np.multiply). Yep, these two are the same.

189
00:25:12,120 --> 00:25:21,400
I'd also look at the documentation for numpy.multiply. Here it tells us exactly what it is. It multiplies arguments element-wise.

190
00:25:21,400 --> 00:25:24,240
All right, moving on.

191
00:25:24,240 --> 00:25:34,780
NumPy has a module known as linalg that has multiple very useful abstractions for linear algebra.

192
00:25:34,780 --> 00:25:42,780
One specifically is the norm. let's take a look at "numpy.linalg.norm"...

193
00:25:46,160 --> 00:25:52,460
Yeah, so, you pass in the input as the first argument,

194
00:25:52,460 --> 00:25:55,900
and the second argument is the type of norm you'd like to calculate.

195
00:25:55,900 --> 00:25:59,900
So this is the L1 norm, and this would be the L2 norm.

196
00:25:59,900 --> 00:26:05,340
The L2 or Euclidean norm is the one we discussed in class.

197
00:26:05,340 --> 00:26:08,400
So, this would be that.

198
00:26:08,400 --> 00:26:12,400
All right, now we'll move on to matrix multiplication.

199
00:26:12,400 --> 00:26:16,400
So let's define two matrices here, A and B.

200
00:26:16,400 --> 00:26:27,480
A is of shape 2x3 and B is of shape 3x4 so our result should be of shape...

201
00:26:27,480 --> 00:26:31,480
Uh, well, pause the video, figure it out, but...

202
00:26:31,480 --> 00:26:37,480
we do have the answer here, and the result is of shape 2x4.

203
00:26:37,480 --> 00:26:43,480
Notice that we've used "a.dot" here, and so the output of a.dot...

204
00:26:43,480 --> 00:26:49,480
or otherwise, we could put this in as np.dot(a,b)...

205
00:26:49,480 --> 00:26:54,880
Oops, I forgot to run these.

206
00:26:57,260 --> 00:27:01,820
When they're 2D arrays as the input, it does matrix multiplication.

207
00:27:01,820 --> 00:27:12,040
You can also use the "at" symbol (@) for matrix multiplication. That will give you the same thing. [But, np.dot and @ behave differently for vectors.] https://stackoverflow.com/q/54160155

208
00:27:12,040 --> 00:27:19,020
Yeah, this is a simple way to represent matrix multiplication as long as the shapes are matched.

209
00:27:19,720 --> 00:27:32,520
There won't be any assumptions on the part of "np.dot" if you do it this way, so you'll be guaranteed that you're doing what you're trying to do.

210
00:27:32,520 --> 00:27:46,220
The transpose is a property of the matrix. You can just call ".T". Let's take a "b.T".

211
00:27:46,220 --> 00:27:53,520
Right. And we can do a "(a@b).T"... Yep, so...

212
00:27:53,520 --> 00:27:57,200
It (the transpose) will just interchange the rows and columns.

213
00:27:57,200 --> 00:28:07,620
linalg allows you to calculate the matrix rank as well, so let's calculate the rank of a, the rank of a@b, yeah. "np.linalg.matrix_rank(a@b)"

214
00:28:08,140 --> 00:28:13,060
This is an important set of things to note.

215
00:28:13,060 --> 00:28:23,060
Now we're going to get to the higher-level functions in linalg, namely computing the inverse.

216
00:28:23,580 --> 00:28:29,660
"np.linalg.inv" is going to be the inverse.

217
00:28:29,660 --> 00:28:36,480
And what this is computing the inverse of is b b^T (as "b.dot(b.T)")

218
00:28:36,480 --> 00:28:41,660
The SVD can be computed using np.linalg.svd.

219
00:28:43,520 --> 00:29:13,000
If you remember, the SVD of a matrix b decomposes it into a matrix of the eigenvectors of bb^T times a diagonal matrix of its singular values times a matrix of the eigenvectors of b^T b, so...

220
00:29:13,000 --> 00:29:20,160
That's why, just make sure that you know what u, s, and vh are here. That's important to know.

221
00:29:22,240 --> 00:29:28,040
And I would look up the documentation for this, to look at its various arguments.

222
00:29:28,040 --> 00:29:37,980
Another thing we're going to need to do is solve linear systems using NumPy, so take this linear system, for example.

223
00:29:37,980 --> 00:29:41,340
We looked at this at the beginning of the linear algebra slides.

224
00:29:41,340 --> 00:29:54,720
We'll represent this as an array: [[4, -5], [-2, 3]] for the matrix of coefficients, and then [-13, 9] for the constant vector.

225
00:29:54,720 --> 00:30:08,380
And then we'll just use x = np.linalg.solve, and "solve" takes the matrix of coefficients first and then the constant vectors.

226
00:30:09,880 --> 00:30:14,520
And the last thing to note is about broadcasting,

227
00:30:14,520 --> 00:30:22,180
Broadcasting is just a set of rules that you need to follow when you're doing operations on two arrays.

228
00:30:25,080 --> 00:30:31,000
To do an operation between two NumPy arrays they need to be compatible, broadcastable.

229
00:30:31,000 --> 00:30:36,660
So when operating on two arrays, NumPy compares their shapes

230
00:30:36,660 --> 00:30:41,720
element-wise starting from the end and working its way to the beginning.

231
00:30:41,720 --> 00:30:46,380
Two dimensions are compatible when they're equal or one of them is 1.

232
00:30:46,380 --> 00:30:54,480
Let's check out an example here: A is a 4-dimensional array of ones, of shape (3,1,2,4).

233
00:30:54,480 --> 00:30:58,180
B is a 4-dimensional array of ones of shape (1,2,2,1).

234
00:30:58,180 --> 00:31:05,540
So let's just compare these arrays the way NumPy would starting from the end, (from the right,)

235
00:31:05,540 --> 00:31:10,460
and going to the left, and try to figure out whether or not they're broadcastable.

236
00:31:10,460 --> 00:31:14,340
So, are these two equal: 4 and 1? No.

237
00:31:14,340 --> 00:31:17,760
But one of them is 1, so these are compatible.

238
00:31:17,760 --> 00:31:22,820
Are these two equal: 2 and 2? Yes. And here in the remaining two dimensions, we have 1s.

239
00:31:22,820 --> 00:31:29,460
So A and B are broadcastable, so let's multiply A and B and check the shape.

240
00:31:29,460 --> 00:31:40,620
OK, there we go. So, what happened was NumPy broadcasted these arrays together and the result had shape (3,2,2,4).

241
00:31:40,620 --> 00:31:53,540
For each dimension, the result will take the non-unit shape, so for the first from the right it took 4, then 2, 2, 3.

242
00:31:53,540 --> 00:31:56,460
OK. Moving on...

243
00:31:56,460 --> 00:32:09,360
Here we created two arrays, and they're of different shapes. The first one is of shape 2x2, the second one is of shape 2x1,

244
00:32:10,460 --> 00:32:17,600
What will the resulting shape be? We know from our procedure above that it'll be 2x2.

245
00:32:17,600 --> 00:32:22,940
And the way NumPy does the broadcasting is that it will duplicate

246
00:32:22,940 --> 00:32:28,220
this column vector (of B) along the dimension that needs to be matched for this array (A).

247
00:32:28,220 --> 00:32:37,480
So, you'll actually end up with a 2x2 array of 3s and that will get added to A.

248
00:32:37,480 --> 00:32:48,540
So that's very important to note: how exactly NumPy does the broadcasting and what it fills into the newly-formed arrays when it broadcasts.

249
00:32:48,540 --> 00:32:54,300
OK! Well, thank you for watching this linear algebra tutorial.

250
00:32:54,300 --> 00:32:59,480
If you have any questions, as always, please feel free to ask on Piazza.