einsum is all you need.

To get more context about what einsum notation is, follow the blogs mentioned at the end of the article.

Einsum notation simplifies dot products, outer products, Hadamard products, matrix-matrix multiplication, matrix-vector multiplication, etc. It's difficult to keep track of the shapes every time; we always somehow get stuck on matrix shapes or run into difficulty. Einsum helps mitigate that.

In this article, we will try to see how we can use einsum notation in building deep learning models.

Einsum notation is implemented as numpy.einsum in NumPy, torch.einsum in PyTorch, and tf.einsum in TensorFlow.


A typical call to einsum would look like:

result = einsum("□□,□□□,□□->□□", arg1, arg2, arg3)

where each □ is a placeholder for a character that specifies a dimension,
and arg1, arg2, arg3 are the actual arguments.


After the -> we specify the output shape we want. The internal working is handled by einsum.

Let's look at some basic examples :


1. Matrix transpose 


   

It might be confusing at first, but it's easy once understood. Here, in the matrix transpose example, we are just transposing the matrix whose dimensions are specified as 'ij' and -> transposed to 'ji', and 'a' is the matrix that needs to be transposed.
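As a minimal sketch in NumPy, the transpose example could look like:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# 'ij->ji': the input has dimensions (i, j); asking for (j, i)
# as the output transposes the matrix.
a_t = np.einsum('ij->ji', a)

print(a.shape)    # (2, 3)
print(a_t.shape)  # (3, 2)
```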


2.  Matrix summation 




Here, for matrix summation, we have matrix 'a' with dimensions 'i' and 'j', and the output is the scalar, which is specified with no dimension.
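A quick NumPy sketch of the summation case:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# 'ij->': no output dimensions, so both axes are summed out,
# leaving a single scalar.
total = np.einsum('ij->', a)

print(total)  # 15
```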


3. Matrix-Vector Multiplication 


Here, for matrix-vector multiplication, we specify the dimensions of the matrix 'a' and the vector 'b', and the shape we want for the result.
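One common spelling of this in NumPy (the dimension letters here are my choice, not from the original figure) contracts the shared dimension 'k' and leaves a vector of length 'i':

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # matrix with dimensions (i, k)
b = np.arange(3)                 # vector with dimension (k,)

# 'ik,k->i': the shared dimension k is summed over, leaving a
# vector of length i -- the usual matrix-vector product.
result = np.einsum('ik,k->i', a, b)

print(result)  # same as a @ b
```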



4. Matrix-Matrix multiplication 


It's very simple: you just specify the dimensions. The only requirement is that the letter used for the shared (contracted) dimension must be the same in both inputs.
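A minimal NumPy sketch of matrix-matrix multiplication:

```python
import numpy as np

a = np.random.rand(2, 3)  # dimensions (i, k)
b = np.random.rand(3, 4)  # dimensions (k, j)

# 'ik,kj->ij': k appears in both inputs but not in the output,
# so it is contracted -- standard matrix multiplication.
c = np.einsum('ik,kj->ij', a, b)

print(c.shape)  # (2, 4)
```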


5. Batch matrix multiplication 



Suppose we are creating some deep learning models and we need to perform batch matrix multiplication.

Maybe the tensors have a shape like (batch_size, embedding_size, embedding_dim), and we want to multiply each pair of matrices along the batch dimension. We just specify the input and output dimensions.
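A sketch in NumPy, with example shapes of my own choosing:

```python
import numpy as np

a = np.random.rand(8, 4, 5)  # dimensions (batch, i, j)
b = np.random.rand(8, 5, 6)  # dimensions (batch, j, k)

# 'bij,bjk->bik': b is kept in the output, so each matrix in the
# batch is multiplied independently; j is contracted as usual.
c = np.einsum('bij,bjk->bik', a, b)

print(c.shape)  # (8, 4, 6)
```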

We will look at more examples in the future while building some deep learning models.


Resources:













