 # Lesson 2: Tensors and Automatic Differentiation

Hi Marc,

First I would like to thank you very much for your class, it really helped me approach deep learning. I finally feel like I’m understanding the basic concepts, which can be tricky when you’re not an expert mathematician ^^

Then, I have a question concerning the exercise you gave us at the end of lesson 2, part 2.
I’m following the instructions there, but I’m kind of stuck now.

I don’t understand how we’re supposed to do the backpropagation of the gradient. In the example we see how it is done for the bias: in the `backward` function the gradient is not changed, but where are we supposed to compute it if not in the `add_bias` class?

Here is my code:

```python
import numpy as np
from numpy.random import random

class add_bias(object):
    def __init__(self, b):
        # initialize with a bias b
        self.b = b

    def forward(self, x):
        # return the result of adding the bias
        return x + self.b

    # save the gradient (to update the bias in the step method) and return the gradient backward

    def step(self, learning_rate):
        # update the bias

class dot_weight(object):
    def __init__(self, w):
        self.w = w

    def forward(self, x):
        self.x = x
        return x * self.w

    def step(self, learning_rate):

class exponential(object):
    def forward(self, x):
        self.y_est = np.exp(x)
        return self.y_est

    def step(self, learning_rate):
        pass

class composition(object):
    def __init__(self, layers):
        self.layers = layers

    def forward(self, x):
        y_est = x.copy()
        for layer in self.layers:
            y_est = layer.forward(y_est)
        self.y_est = y_est
        return y_est

    def compute_loss(self, y, y_est):
        subtract = y_est - y
        self.loss = np.sum(subtract ** 2)
        return self.loss

    def backward(self):
        for layer in reversed(self.layers):

    def step(self, learning_rate):
        for layer in self.layers:
            layer.step(learning_rate)

def launch_exo():
    print('LESSON 2 : SGD EXPONENTIAL LINEAR REGRESSION\n')
    w, b = 0.5, 2
    xx = np.arange(0, 1, .01)
    print('xx = {}'.format(xx))
    yy = np.exp(w * xx + b)
    # print('yy = {}'.format(yy))

    estimated_b = 
    estimated_w = 

    learning_rate = 1e-4

    losses = []
    ws = []
    bs = []

    for i in range(10):
        j = np.random.randint(1, len(xx))

        # compute the estimated value of y from xx[j] with the current values of the parameters
        y_est = my_composition.forward(xx[j])

        # compute the loss and save it
        losses.append(my_composition.compute_loss(yy[j], y_est))

        my_composition.backward()
        my_composition.step(learning_rate)
        ws.append(my_composition.layers.w)
        bs.append(my_composition.layers.b)

    print('Estimated : losses={} \nb={} \nw={}'.format(losses, bs, ws))
```

Matt

Found my error: in the `backward` function of the `dot_weight` class, I needed to multiply the gradient by x and return the result:

```python
class dot_weight(object):
    def __init__(self, w):
        self.w = w

    def forward(self, x):
        self.x = x
        return x * self.w

    def backward(self, grad):
        # multiply the incoming gradient by the saved input and return it
        self.grad = grad * self.x
        return self.grad

    def step(self, learning_rate):
        self.w = self.w - learning_rate * self.grad
```
Hi Matt,
thanks for starting the discussion!
Yes, your multiplication is now correct. This should work, congrats!
Marc


Hello,

Thanks for the course!
I have been stuck on something for some time now. I don’t see the error in my code, and it is giving me really bad results, so I suppose I am doing something wrong, but I can’t see what. Can you help me?

Here is my code:

```python
class multiplication_weight(object):
    def __init__(self, w):
        # initialize with a weight w
        self.w = w

    def forward(self, x):
        # return the result of multiplying by weight
        self.x = x
        return self.w * x

    def step(self, learning_rate):
        # update the weight

class my_exp(object):
    # no parameter
    def forward(self, x):
        # return exp(x)
        self.x = x
        return np.exp(x)

    def step(self, learning_rate):
        # any parameter to update?
        # Hint: https://docs.python.org/3/reference/simple_stmts.html#the-pass-statement
        pass

class my_composition(object):
    def __init__(self, layers):
        # initialize with all the operations (called layers here!) in the right order...
        self.weight = layers[0]
        self.bias = layers[1]
        self.expo = layers[2]

    def forward(self, x):
        # apply the forward method of each layer
        self.x = x
        return self.expo.forward(self.bias.forward(self.weight.forward(x)))

    def compute_loss(self, y, y_est):
        # use the L2 loss
        # return the loss and save the gradient of the loss
        return (y - y_est) ** 2

    def backward(self):
        # apply backprop sequentially, starting from the gradient of the loss
        # Hint: https://docs.python.org/3/library/functions.html#reversed

    def step(self, learning_rate):
        # apply the step method of each layer
        self.expo.step(learning_rate)
        self.weight.step(learning_rate)
        self.bias.step(learning_rate)
```

Hi,
What do you mean by really bad results?
I would not have coded like that but your code is running fine…

Marc

Hello,

I launched my code like this:

```python
my_fit = my_composition([multiplication_weight(1), add_bias(1), my_exp()])
learning_rate = 1e-4
Loss = []
estimated_w = []
estimated_b = []

for i in range(5000):
    j = np.random.randint(1, len(xx))
    y_est = my_fit.forward(xx[j])
    loss = my_fit.compute_loss(yy[j], y_est)
    Loss.append(loss)
    my_fit.backward()
    my_fit.step(learning_rate)
    estimated_w.append(my_fit.weight.w)
    estimated_b.append(my_fit.bias.b)

print('')
print(f'Optimal weight: {my_fit.weight.w}')
print(f'Optimal bias: {my_fit.bias.b}')
```

And I am getting w = -0.69 and b = -2.11, which is far from optimal.

OK, now I understand why I did not see your problem. I ran the same code as you, except in the computation of the loss, where I interchanged the arguments. You can try it and see that this works!
Your gradient for the loss is not correct: it should be `2*(y_est-y)`. With this modification, your code should run fine.
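For what it’s worth, here is a minimal sketch of a loss that also saves that gradient, in the spirit of the exercise comment “return the loss and save the gradient of the loss” (the class name `l2_loss` is just for illustration, not from the course):

```python
class l2_loss(object):
    # sketch only: an L2 loss module that stores the gradient it will
    # feed into backward(); the sign convention is 2 * (y_est - y)
    def compute_loss(self, y, y_est):
        # save d(loss)/d(y_est) -- note y_est - y, not y - y_est
        self.grad = 2 * (y_est - y)
        return (y_est - y) ** 2
```

With `y = 3` and `y_est = 5`, the loss is `4` and the saved gradient is `4`; swapping the arguments in the gradient would flip its sign and make the descent go the wrong way.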

Oh, thank you very much!!

Hi Marc,

First of all, thank you for all your lessons and the excellent course.

1. I have read other posts, and I did the same thing: saving the input x value in the forward method, since we need it for computing the gradient in the backward method later. But because your comment in the forward method says “# return the result of multiplying by weight” and not “# save the input x and return the result of multiplying by weight”, I am wondering whether perhaps you have another way to save the x value? If so, could you tell me please?

2. One more question, please: for the backward of the add_bias class, the last line of your code is `return grad`, not `return self.grad` (which is the same thing as the variable grad in this case). For the backward method in my code, is it fine to `return self.grad` instead of `return self.x*grad` in my example? Both work fine, but I just want to know if there is a rule that advises against returning a self attribute from a method?

My code is like this :
```python
class multiplication_weight(object):
    def __init__(self, w):
        # initialize with a weight w
        self.w = w

    def forward(self, x):
        # return the result of multiplying by weight
        self.x = x
        return self.w * x

    def step(self, learning_rate):
        # update the weight
```

Thanks a lot and have a nice day,

Zheyu XIE

Hi

1. No, I am doing it like you!
2. `return self.grad` is simpler and more readable. You can return a `self.something`; I do not see any problem with that…

Hi Marc,

Thank you very much for this course series! I’m catching up a little late; I hope you will still answer my questions.

I don’t understand why the `backward()` method of `multiplication_weight` returns the gradient with respect to w. Shouldn’t it return the gradient with respect to x?
In the current example it does not change anything, since the multiplication module appears only at the beginning of the composition, but I think it may matter if several multiplication blocks are composed together.

My implementation :

```python
def backward(self, grad):
    # return the gradient with respect to the input x, not the parameter w
    return grad * self.w
```

Many thanks;

Hi Severine,
here x is the input and w is a parameter. You want to update the parameter with a gradient descent algorithm; hence, you need to compute the derivative of the loss with respect to the parameters, not the input.
But you are right: if you had another module before the multiplication, then you should also compute the derivative with respect to x, which would no longer be the input. In this case, you need to compute the full gradient, i.e. a vector with both the derivative with respect to x and the derivative with respect to w. You can do this by creating a `self.grad_w` and a `self.grad_x`. In our case we only need to compute `self.grad_w`.
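A sketch of what that could look like for the multiplication module, with both gradients stored as suggested (the `grad_w`/`grad_x` names follow the discussion above; the rest of the interface is my assumption about the exercise’s conventions):

```python
class multiplication_weight(object):
    def __init__(self, w):
        self.w = w

    def forward(self, x):
        self.x = x          # save the input: needed in backward
        return self.w * x

    def backward(self, grad):
        self.grad_w = grad * self.x   # derivative w.r.t. the parameter w, used by step()
        self.grad_x = grad * self.w   # derivative w.r.t. the input x
        return self.grad_x            # passed on to the previous module

    def step(self, learning_rate):
        self.w = self.w - learning_rate * self.grad_w
```

For example, with `w = 2` and input `x = 3`, an upstream gradient of `1` gives `grad_w = 3` (used to update w) and `grad_x = 2` (handed to whatever module came before).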

Best,
Marc

Hi Marc,
Thanks for your answer. Just to be sure, `backward()` will thus:

• compute and store `self.grad_w = grad * self.x`
• compute and return `grad_x = grad * self.w`

Only `self.grad_w` will be used by the parameter update `step()`, but the returned gradient could be used to compute the gradients of the previous modules.

Is that correct?
Thanks!

Yes, that’s it. You can try by adding another module before the multiplication!
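For anyone reading along, here is a quick sketch of that experiment with two multiplications chained together (a stripped-down version of the module; the numbers are only an illustration):

```python
class multiplication_weight(object):
    def __init__(self, w):
        self.w = w

    def forward(self, x):
        self.x = x            # save the input for backward
        return self.w * x

    def backward(self, grad):
        self.grad_w = grad * self.x   # kept for the step() update
        return grad * self.w          # passed back to the previous module

first = multiplication_weight(3.0)
second = multiplication_weight(5.0)

# forward through the chain: y = 5 * (3 * x)
y = second.forward(first.forward(2.0))   # y = 30.0

# backward, starting from an upstream gradient of 1
g = second.backward(1.0)   # 5.0: the gradient reaching the first module
first.backward(g)

# chain rule check: dy/dw1 = 5 * x = 10, dy/dw2 = 3 * x = 6
assert first.grad_w == 10.0
assert second.grad_w == 6.0
```

Without returning `grad * self.w`, the first module would only ever see the raw upstream gradient and `first.grad_w` would come out wrong.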

OK, thanks, I’ll try it!