Please ask any questions relevant to lesson 2 here.
Hi Marc,
First, I would like to thank you very much for your class; it really helped me approach deep learning. I finally feel like I'm understanding the basic concepts, which can be tricky when you're not an expert mathematician ^^
Then, I have a question concerning the exercise you gave us at the end of lesson 2, part 2.
I'm following the instructions there, but I'm kind of stuck now.
I don't understand how we're supposed to do the backpropagation of the gradient. In the example we see how it is done for the bias: in the "backward" function the gradient is not changed. But where are we supposed to compute it, if not in the "add_bias" class?
Here is my code:
import numpy as np
from numpy.random import random

class add_bias(object):
    def __init__(self, b):
        # initialize with a bias b
        self.b = b
    def forward(self, x):
        # return the result of adding the bias
        return x + self.b
    def backward(self, grad):
        # save the gradient (to update the bias in the step method) and return the gradient backward
        self.grad = grad
        return grad
    def step(self, learning_rate):
        # update the bias
        self.b -= learning_rate * self.grad

class dot_weight(object):
    def __init__(self, w):
        self.w = w
    def forward(self, x):
        self.x = x
        return x * self.w
    def backward(self, grad):
        self.grad = grad
        return self.grad * self.x
    def step(self, learning_rate):
        self.w -= learning_rate * self.grad

class exponential(object):
    def forward(self, x):
        self.y_est = np.exp(x)
        return self.y_est
    def backward(self, grad):
        self.grad = grad
        return grad * self.y_est
    def step(self, learning_rate):
        pass

class composition(object):
    def __init__(self, layers):
        self.layers = layers
    def forward(self, x):
        y_est = x.copy()
        for layer in self.layers:
            y_est = layer.forward(y_est)
        self.y_est = y_est
        return y_est
    def compute_loss(self, y, y_est):
        substract = y_est - y
        self.loss = np.sum(substract ** 2)
        self.grad = 2 * substract
        return self.loss
    def backward(self):
        for layer in reversed(self.layers):
            self.grad = layer.backward(self.grad)
    def step(self, learning_rate):
        for layer in self.layers:
            layer.step(learning_rate)

def launch_exo():
    print('LESSON 2 : SGD EXPONENTIAL LINEAR REGRESSION\n')
    w, b = 0.5, 2
    xx = np.arange(0, 1, .01)
    print('xx = {}'.format(xx))
    yy = np.exp(w * xx + b)
    # print('yy = {}'.format(yy))
    estimated_b = [1]
    estimated_w = [1]
    my_composition = composition([dot_weight(estimated_w[0]), add_bias(estimated_b[0]), exponential()])
    learning_rate = 1e-4
    losses = []
    ws = []
    bs = []
    for i in range(10):
        j = np.random.randint(1, len(xx))
        # compute the estimated value of y from xx[j] with the current values of the parameters
        y_est = my_composition.forward(xx[j])
        # compute the loss and save it
        losses.append(my_composition.compute_loss(yy[j], y_est))
        my_composition.backward()
        my_composition.step(learning_rate)
        ws.append(my_composition.layers[0].w)
        bs.append(my_composition.layers[1].b)
    print('Estimated : losses={} \nb={} \nw={}'.format(losses, bs, ws))
Matt
Found my error, in the backward function of the dot_weight class: I needed to save the gradient multiplied by x and return that result:
class dot_weight(object):
    def __init__(self, w):
        self.w = w
    def forward(self, x):
        self.x = x
        return x * self.w
    def backward(self, grad):
        self.grad = grad * self.x
        return self.grad
    def step(self, learning_rate):
        self.w -= learning_rate * self.grad
Hi Matt,
thanks for starting the discussion!
Yes, your multiplication is now correct. This should work, congrats!
Marc
Hello,
Thanks for the course !
I have been stuck on something for some time now. I don't see the error in my code, and it is giving me really bad results, so I suppose I am doing something wrong, but I don't see it. Can you help me?
Here is my code:
class multiplication_weight(object):
    def __init__(self, w):
        # initialize with a weight w
        self.w = w
    def forward(self, x):
        # return the result of multiplying by weight
        self.x = x
        return self.w * x
    def backward(self, grad):
        # save the gradient and return the gradient backward
        self.grad = grad * self.x
        return self.grad
    def step(self, learning_rate):
        # update the weight
        self.w -= learning_rate * self.grad

class my_exp(object):
    # no parameter
    def forward(self, x):
        # return exp(x)
        self.x = x
        return np.exp(x)
    def backward(self, grad):
        # return the gradient backward
        self.grad = self.forward(self.x) * grad
        return self.grad
    def step(self, learning_rate):
        # any parameter to update?
        # Hint https://docs.python.org/3/reference/simple_stmts.html#the-pass-statement
        pass

class my_composition(object):
    def __init__(self, layers):
        # initialize with all the operations (called layers here!) in the right order...
        self.weight = layers[0]
        self.bias = layers[1]
        self.expo = layers[2]
    def forward(self, x):
        # apply the forward method of each layer
        self.x = x
        return self.expo.forward(self.bias.forward(self.weight.forward(x)))
    def compute_loss(self, y, y_est):
        # use the L2 loss
        # return the loss and save the gradient of the loss
        self.loss_grad = 2 * (y - y_est)
        return (y - y_est) ** 2
    def backward(self):
        # apply backprop sequentially, starting from the gradient of the loss
        # Hint: https://docs.python.org/3/library/functions.html#reversed
        self.grad = self.weight.backward(self.bias.backward(self.expo.backward(self.loss_grad)))
    def step(self, learning_rate):
        # apply the step method of each layer
        self.expo.step(learning_rate)
        self.weight.step(learning_rate)
        self.bias.step(learning_rate)
Hi,
What do you mean by really bad results?
I would not have coded it like that, but your code is running fine…
Marc
Hello,
I launched my code like this:
my_fit = my_composition([multiplication_weight(1), add_bias(1), my_exp()])
learning_rate = 1e-4
Loss = []
estimated_w = [1]
estimated_b = [1]
for i in range(5000):
    j = np.random.randint(1, len(xx))
    y_est = my_fit.forward(xx[j])
    loss = my_fit.compute_loss(yy[j], y_est)
    Loss.append(loss)
    my_fit.backward()
    my_fit.step(learning_rate)
    estimated_w.append(my_fit.weight.w)
    estimated_b.append(my_fit.bias.b)
print('')
print(f'Optimal weight:{my_fit.weight.w}')
print(f'Optimal bias:{my_fit.bias.b}')
And I am getting w = 0.69 and b = 2.11 which is far from optimal.
OK, now I understand why I did not see your problem. I ran the same code as you, except in the computation of the loss, where I interchanged the arguments. You can try it and see that this works!
Your gradient for the loss is not correct: it should be 2*(y_est - y). With this modification, your code should run fine.
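A quick way to double-check the sign is to compare the analytic gradient with a finite-difference estimate. Here is a small standalone sketch (the function names are just for illustration, not from the course code):

```python
def l2_loss(y, y_est):
    # L2 loss between the target y and the estimate y_est
    return (y_est - y) ** 2

def l2_grad(y, y_est):
    # analytic gradient of the loss with respect to y_est
    return 2 * (y_est - y)

y, y_est, eps = 3.0, 2.5, 1e-6
# central finite difference: (f(y_est + eps) - f(y_est - eps)) / (2 * eps)
numeric = (l2_loss(y, y_est + eps) - l2_loss(y, y_est - eps)) / (2 * eps)
print(numeric, l2_grad(y, y_est))  # both close to -1.0
```

With the arguments interchanged, the analytic gradient comes out with the opposite sign of the numerical one, and gradient descent walks uphill instead of downhill.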
Oh, thank you very much!!
Hi Marc,
First of all, thank you for all your lessons and the excellent course.
I have two questions, please.

I have read the other posts, and I did the same thing for saving the input x value in the forward method, since we need it later for computing the gradient in the backward method. But because your comment in the forward method says "# return the result of multiplying by weight", and not "# save the input x and return the result of multiplying by weight", I am wondering whether you perhaps have another way to save the x value? If so, could you tell me please?

One more question, please. In the backward of the add_bias class, the last line of your code is "return grad", not "return self.grad" (which is the same thing as the variable grad in this case). For the backward method in my code, is it fine to "return self.grad" instead of "return self.x*grad", as in my example? Both work fine, but I just want to know if there is a rule advising against returning a self property in a method.
My code is like this :
class multiplication_weight(object):
    def __init__(self, w):
        # initialize with a weight w
        self.w = w
    def forward(self, x):
        # return the result of multiplying by weight
        self.x = x
        return self.w * x
    def backward(self, grad):
        # save the gradient and return the gradient backward
        self.grad = self.x * grad
        return self.grad
    def step(self, learning_rate):
        # update the weight
        self.w -= learning_rate * self.grad
Thanks a lot and have a nice day,
Zheyu XIE
Hi,
no, I am doing it like you!
"return self.grad" is simpler and more readable. You can return a "self.something"; I do not see any problem with that…
Hi Marc;
Thank you very much for this course series! I'm catching up a little late; I hope you will still answer my questions.
I don't understand why the backward() method of multiplication_weight returns the gradient with respect to w. Shouldn't it return the gradient with respect to x?
In the current example, it does not change anything, since the multiplication module appears only at the beginning of the composition, but I think it may matter if several multiplication blocks are composed together.
My implementation :
def backward(self, grad):
    # save the gradient and return the gradient backward
    # grad w.r.t. w
    self.grad = self.x * grad
    # grad w.r.t. x
    return self.w * grad
Many thanks;
Hi Severine,
here x is the input and w is a parameter. You want to update the parameter with a gradient descent algorithm; hence, you need to compute the derivative of the loss with respect to the parameters, not the input.
But you are right: if you had another module before the multiplication, then you would also need the derivative with respect to x, which would no longer be the input. In that case, you need to compute the full gradient, i.e. a vector with both the derivative with respect to x and the derivative with respect to w. You can do this by creating a self.grad_w and a self.grad_x. In our case we only need to compute self.grad_w.
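Concretely, the two-gradient version described above could look like the sketch below (my own draft following the self.grad_w / self.grad_x naming, not the course's reference implementation):

```python
class multiplication_weight(object):
    def __init__(self, w):
        self.w = w
    def forward(self, x):
        # save the input; it is needed for the gradient with respect to w
        self.x = x
        return self.w * x
    def backward(self, grad):
        # derivative with respect to the parameter w (used by step)
        self.grad_w = grad * self.x
        # derivative with respect to the input x (sent to the previous module)
        self.grad_x = grad * self.w
        return self.grad_x
    def step(self, learning_rate):
        self.w -= learning_rate * self.grad_w
```

step() only ever touches self.grad_w; the returned self.grad_x exists purely so that a module placed before this one can continue the chain rule.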
Does it answer your question?
Best,
Marc
Hi Marc;
Thanks for your answer. Just to be sure, backward() will thus:
- compute and store self.grad_w = grad * self.x
- compute and return grad_x = grad * self.w
Only self.grad_w will be used by the parameter update step(), but the returned gradient could be used to compute the gradient of previous modules.
Is that correct?
Thanks !
Yes, that's it. You can try it by adding another module before the multiplication!
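For anyone who wants to try that experiment quickly, here is a toy check with two multiplication modules chained together, where each backward returns the input gradient so the chain rule propagates (the class mult is my own shorthand, not from the course):

```python
class mult(object):
    # minimal multiplication module that also returns the input gradient
    def __init__(self, w):
        self.w = w
    def forward(self, x):
        self.x = x
        return self.w * x
    def backward(self, grad):
        self.grad_w = grad * self.x  # used for the parameter update
        return grad * self.w         # passed back to the previous module
    def step(self, learning_rate):
        self.w -= learning_rate * self.grad_w

m1, m2 = mult(2.0), mult(3.0)
y = m2.forward(m1.forward(5.0))    # y = w2 * w1 * x = 30
g = m1.backward(m2.backward(1.0))  # backprop from the output
# dy/dw2 = w1 * x = 10, dy/dw1 = w2 * x = 15, dy/dx = w1 * w2 = 6
print(m2.grad_w, m1.grad_w, g)
```

If backward returned the parameter gradient instead, m1 would receive 10 rather than 3 from m2 and its stored dy/dw1 would be wrong, which is exactly the failure mode discussed above.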
OK thanks, I'll try it!