# 3.4: Linear Regression with Gradient Descent – Intelligence and Learning

## 22 thoughts on “3.4: Linear Regression with Gradient Descent – Intelligence and Learning”

1. Mario Gianota says:

For a more complete and in-depth discussion of Linear Regression with Gradient Descent, check out Professor Andrew Ng of Stanford's series of machine learning videos: https://www.youtube.com/watch?v=PPLop4L2eGk&list=PLLssT5z_DsK-h9vYZkQkYNWcItqhlRJLN

You did the snap even before Thanos did =D

3. Sarang Chouguley says:

Thank you Dan. You really made this topic so easy to understand. Keep up the good work.

4. Pablo Loves You says:

I tried to build a gradient descent algorithm from scratch. Why isn't mine working? Here's my code:

```python
for i in range(4):
    ypred = m * x + b
    error = (ypred - y) ** 2
    m = m - (0.001 * error)
    b = b - (0.001 * error)
    m = m.sum()
    b = b.sum()

# My 'm' and 'b' values decrease infinitely
```
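For comparison, here is one possible fix, sketched with made-up data (the original `x` and `y` arrays weren't shown): the update should use the *signed* error, and the slope's gradient carries a factor of `x`. Squaring the error throws away its sign, which is why both parameters only ever decrease.

```python
import numpy as np

# Made-up data on a small scale; the original arrays weren't shown
x = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
y = 2.0 * x + 1.0  # points on the line y = 2x + 1

m, b = 0.0, 0.0
lr = 0.1  # learning rate

for _ in range(2000):
    ypred = m * x + b
    error = y - ypred                # signed error, not squared
    m = m + lr * (error * x).mean()  # slope step: gradient has a factor of x
    b = b + lr * error.mean()        # intercept step

# m and b approach 2 and 1 instead of shrinking forever
```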

5. sahlool says:

What is the best book for machine learning?

6. Nelson says:

y=mx+c lol.

7. man ragh says:

2:41 The Thanos of blackboard writings.

8. Saiteja psk says:

Awesome, cool… what a teaching style! I really love it. You made my day by helping me understand linear regression through a simple story. Really love you, man.

9. Arzoo Singh says:

I must say I like the way you teach. You're a nice man, God bless.

10. MakeItUrWay says:

The code is not in Python… can you do an explanation with Python code?

11. darek4488 says:

You need separate learning rates for m and b. Set the learning rate for b higher than the one for m, so the line moves up and down faster while the slope changes more gradually.
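A sketch of that idea with made-up numbers (the per-parameter rates below are hand-tuned assumptions, not values from the video). When x is large, m's gradient (error × x) dwarfs b's (error), so m needs a much smaller rate:

```python
import numpy as np

# Made-up unnormalized data; with one shared rate the update for m
# either diverges or forces the whole descent to crawl
x = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
y = 3.0 * x + 7.0

m, b = 0.0, 0.0
lr_m, lr_b = 1e-4, 1e-1  # hand-tuned per-parameter rates (assumption)

for _ in range(20000):
    error = y - (m * x + b)
    m += lr_m * (error * x).mean()  # small rate: gradient scaled by large x
    b += lr_b * error.mean()        # larger rate so the intercept keeps up

# m and b settle near 3 and 7
```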

Nice video… you can also check out linear regression using TensorFlow here: https://www.youtube.com/watch?v=PGm8pLp7T40

Would you have two separate learning rates for m and b? It seems like weighting the slope change higher could be beneficial.

2:39 Thanos is jealous

15. Manoj says:

Really awesome video! Thank you for making machine learning and math so much fun!!

16. Aaron Fang Shenhao says:

Could we have an explanation of why the x, y coordinates were mapped between 0 and 1? Without that, it doesn't work at all unless the learning rate is something really small like 10^-6. But then the b value changes way too slowly, so I had to use two learning rates, with m's learning rate being 10^-6 and b's being 0.05, to achieve the same results. I'm surprised how effective the mapping was at solving this problem. When I tried other similar algorithms that took the average error of all the points (which is meant to be better), they didn't work either unless I did the mapping trick.
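For what it's worth, a minimal sketch of that mapping trick with made-up raw data: min–max normalizing both axes puts the gradients for m and b on a similar scale, so a single ordinary learning rate works.

```python
import numpy as np

# Made-up raw data on a large scale (e.g. pixel coordinates)
x_raw = np.array([40.0, 120.0, 260.0, 310.0, 400.0])
y_raw = 0.5 * x_raw + 30.0

# Map both axes into [0, 1] before running gradient descent
x = (x_raw - x_raw.min()) / (x_raw.max() - x_raw.min())
y = (y_raw - y_raw.min()) / (y_raw.max() - y_raw.min())

m, b, lr = 0.0, 0.0, 0.1  # one ordinary learning rate is now enough
for _ in range(2000):
    error = y - (m * x + b)
    m += lr * (error * x).mean()
    b += lr * error.mean()
```

Here both axes undergo the same linear squash, so the fitted line in normalized space is simply y = x (m near 1, b near 0); the fit can be mapped back to raw coordinates afterwards.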

17. Steve Taylor says:

Er, just use R to do regression!

18. Rueful Rabbit says:

I'm sure this video could be condensed without losing any real info. I don't have the patience to see it through even halfway.

19. Isaac Tawiah says:

That "come back to me" … hahahahaha

20. Hunar says:

Your snap has inspired Thanos 😀

21. Vishwajeet Singh says:

Can someone explain why this is correct?
`m = m + (error * x) * learning_rate;`
I mean, how is it dimensionally correct? Shouldn't the error be divided by x, so that m is added to something with the same units as m?
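One way to see it: gradient descent doesn't match units term by term; the learning rate absorbs them. For the squared error E = (y − ypred)², the chain rule gives dE/dm = −2 · error · x, so the factor of x belongs in the update (the 2 gets folded into the learning rate). A quick numerical check with made-up numbers:

```python
# Check that dE/dm = -2 * error * x for E = (y - (m*x + b))**2
x, y = 0.4, 1.8   # a single made-up data point
m, b = 0.5, 0.1
h = 1e-6

def E(m_):
    return (y - (m_ * x + b)) ** 2

error = y - (m * x + b)
analytic = -2 * error * x                  # chain rule brings in a factor of x
numeric = (E(m + h) - E(m - h)) / (2 * h)  # central-difference estimate
# the two derivatives agree, so stepping m by +lr * error * x moves E downhill
```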

22. amit bansode says:

All of your videos are rocking!