Neural Networks Explained – Machine Learning Tutorial for Beginners



In this video we're going to dive deeper into how neural networks and machine learning actually work behind the scenes. If you're learning machine learning, you've probably come across a diagram that looks like this, and often, if not most of the time, it confuses people, because it's a bunch of circles and arrows and the explanation is usually just as confusing: "Well, it's simple: you take your input circles, move them through your circles, and voilà, neural networks." That's not a very good explanation; at least it wasn't very good for me when I was learning. So hopefully we're going to break this down and make it really simple, because it isn't rocket science. It's not ABC-simple either, but it's not super complex. I guarantee that if you watch this video a few times, it'll give you a big head start on what neural networks are, and maybe you'll even be able to go and code some yourself if you have coding experience.

So here's the picture of the neural network. We're actually going to use this diagram later, and by then you'll understand it. A good way to start is to block off this center layer; we're going to call that the black box. This is what we've been doing if you've been watching my last two videos on machine learning: just stick to the input and output data, the training data. Given this information, here's the outcome: these are our input dimensions, and these are the output dimensions. That black box in between runs over that information over and over and over again, and each time it twists one of the thousand knobs on the box just a little bit, until, voilà, the black box is tuned and we can give it new input information and get the correct predicted outcome. This video dives into the hidden-layer area, that black box: how we configure it and what it does.

No matter which machine learning or neural network library you're using, there are some common configuration options you're going to come across (there's a sketch of them right after this paragraph). One: how many hidden layers do we put in that black box? We know we have an input layer and an output layer; how many layers go in the middle? Usually one is where you want to start, so that's pretty simple and straightforward. Two: how many nodes, how many neurons, go in that hidden layer? There are a few different ways to give yourself an answer to that; there's no exact science to it. A good path: if your input and output dimensions are drastically different, go somewhere in between. If you have seven input values and two output values, then maybe three or four hidden nodes. You also definitely want fewer than two times the input nodes; above that you can run into a situation called overfitting, where you're just not going to get accurate outcomes. Two-thirds of the input nodes plus the output nodes is another good rule of thumb. Run the math on those three heuristics and you'll get a good idea of how many nodes that hidden layer should have. Next, the activation function: this is very important, but I'm not going to go into it just yet. Learning rate and momentum are also important, and I'll come back to those. Lastly, the iterations and the desired error level. Iterations is how many times training goes over all the data in your data set, and the desired error level is how accurate you want this thing to be. Training stops when you reach one or the other: once you've gone through, say, 20,000 iterations, or once your error level drops below, say, 0.001, whatever you set it to.
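As a concrete sketch of those options, here's roughly how they map onto Brain.js, the library this series uses. Treat this as an illustration, not gospel: the option names follow Brain.js's documented training options and defaults, but double-check them against the version you install, and the 0-to-1 scaling of the input values is my own assumption.

    // Hedged sketch: Brain.js-style configuration for the options above.
    // Verify option names and defaults against your brain.js version.
    import { NeuralNetwork } from 'brain.js';

    // the whosit/whatsit data from the video, scaled (by assumption) to 0..1
    const trainingData = [
      { input: { fur: 0.25, weight: 0.15 }, output: { whosit: 1, whatsit: 0 } },
      { input: { fur: 0.15, weight: 0.35 }, output: { whosit: 0, whatsit: 1 } },
    ];

    const net = new NeuralNetwork({
      hiddenLayers: [3],     // one hidden layer with three nodes
      activation: 'sigmoid', // the activation function discussed below
    });

    net.train(trainingData, {
      learningRate: 0.3, // how strongly each step adjusts weights and biases
      momentum: 0.1,     // how much past adjustments carry into the next one
      iterations: 20000, // stop after this many passes over the data...
      errorThresh: 0.005 // ...or once the average error falls below this
    });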
So those are the configuration options, and that's actually not too bad; that's how you tune your black box. With a little bit of knowledge about what the black box is doing, you can usually tune a neural network, move on from there, and get good results.

Let's go to a real example using the circles and arrows we had before. We have a data set that gives us the fur color and weight of animals, and also tells us whether each animal was a whosit or a whatsit. We want to go through all of that data and train our neural network to predict whosits and whatsits from fur color and weight. We've chosen three nodes in our hidden layer based on the heuristics I showed you on the last slide. So let's get our neural network initialized.

To start off, we basically want to create a stupid brain, a brain that knows nothing, so we just randomly create a bunch of weights. You can see (let me get my mouse out here) we've assigned a 0.1 weight and a 0.3 weight, completely random numbers, and these hidden-layer neurons, these nodes, are also each going to get a bias. Consider them as coming in with an opinion already: even though we don't know anything about how fur color and weight relate to whosits and whatsits, we're going to start making guesses based on our bias. Think of them as the liberals, conservatives, and libertarians; we get them all in a room, show them a bunch of real-life data, and hopefully at the end of the thing they'll all agree on what is truth. I'm not going to say anything more; okay, not going there, moving on.

So that's our hidden layer: we initialize the network with a bunch of completely random weights and biases. From there, we start with the first entry in our data set. In entry number one we have an animal with fur color 25 and weight 15, and it's a whosit: whosit is 1, whatsit is 0. Now we start moving these input dimensions through the neural network; it's called a feed-forward neural network for that reason. For each hidden node, we take both of the input dimensions, multiply each by that node's weight, and then add the node's bias. So here we go: node one gets fur times 0.1 plus weight times 0.8 (there are our 0.1 and 0.8 weights), plus the 0.71 bias. That's node number one. We're basically saying: let's randomly guess how these two inputs affect our outcome, and just start with that guess. We do the same for each node, with a completely different set of weights and biases each time; you can see that when we get down here, we're taking fur times 0.57 plus weight times 1, and adding a 0.09 bias. So we've made a random set of guesses, and then we do the magic: we run each of those sums through an activation function, as in the sketch after this paragraph.
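To make the feed-forward step concrete, here's a small standalone sketch. It assumes the raw inputs (fur color 25, weight 15) have been scaled down to a 0-to-1 range, and the variable names are just for illustration; only the weights and biases come from the diagram above.

    // One feed-forward pass for two of the hidden nodes in the example.
    // sigmoid squashes any weighted sum into the range (0, 1).
    function sigmoid(x: number): number {
      return 1 / (1 + Math.exp(-x));
    }

    const fur = 0.25;    // fur color 25, scaled to 0..1 (assumption)
    const weight = 0.15; // weight 15, scaled to 0..1 (assumption)

    // node 1: fur * 0.1 + weight * 0.8, plus its 0.71 bias
    const node1 = sigmoid(fur * 0.1 + weight * 0.8 + 0.71);

    // the lower node: fur * 0.57 + weight * 1, plus its 0.09 bias
    const node3 = sigmoid(fur * 0.57 + weight * 1.0 + 0.09);

    console.log(node1.toFixed(3), node3.toFixed(3));

The output layer then does the same thing again, treating these hidden-node activations as its inputs.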
Here's where the activation function is important, and where the magic of neural networks comes in: we're trying to figure out answers for nonlinear data. Let me take a little tangent over here and show you the difference between linear and nonlinear data. Linear data is a straight line: we can say that as the weight and the fur color go up, there's a greater chance of it being a whosit, and as they go down, a greater chance of it being a whatsit. Or: if the fur color is high, or if the weight is high, it's always going to be a whosit. You could use this with some species: if it's big, it's a dog; if it's small, it's a mouse; if it's smaller than a certain size, there's no way it's going to be a dog. That's linear information. You don't really need a neural network to solve that; you just need a little bit of math, and the data should make it pretty clear how the input correlates with the output. Nonlinear data just turns out differently: it may be high here and high here, or it may follow some unusual curves that are really difficult to figure out on your own. That's basically where your activation functions come in. We're going to use a sigmoid or a tanh (depending on who you are and where you come from), and by running values through one of these functions we introduce non-linearity into our neural network to help it find answers to those questions. So: we've taken the sum, the inputs times the weights plus the bias, and we run that sum through our activation function to get a nonlinear guess; then we keep feeding forward until we reach the outcome. Both functions are shown in the short sketch after this paragraph.
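For reference, here's what the two activation functions named above look like in code. Both squash an unbounded weighted sum into a bounded range, which is exactly the non-linearity being described; the loop just prints a few sample values.

    // sigmoid maps any input into (0, 1); tanh maps into (-1, 1)
    const sigmoid = (x: number): number => 1 / (1 + Math.exp(-x));
    const tanh = (x: number): number => Math.tanh(x);

    for (const x of [-4, -1, 0, 1, 4]) {
      console.log(x, sigmoid(x).toFixed(3), tanh(x).toFixed(3));
    }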
Let's just throw out a number here. Say we ran everything through, ran it through the activation function, came through again here, and ended up with a completely random, drawn-out-of-a-hat guess: 0.351 that it's a whosit and 0.781 that it's a whatsit. Completely wrong, right? Our neural network is stupid; it came in with nothing but biases and random numbers and made a guess that was way off. So now it's backpropagation time. We calculate the error and the delta, which is the difference, and we go backwards through each step and adjust all the weights and the biases some. We're not necessarily going to adjust them all the way, but we're going to adjust them some. How do we know how much to adjust them? We do that through our configuration.

The first option is learning rate, which says how much this step's outcome should affect our weights and biases. You almost want to think of learning rate as personality types. There's the slow, calculating type of person: show them "here's a cat, here's a dog" and they'll say, "Hmm, I have some ideas; show me some more," and only after you show them a whole bunch will they slowly lean into the answer and come up with a very calculated answer. That's a low learning rate. A high learning rate, a ridiculously high learning rate, would be: "Oh, you showed me a cat and a dog; I calculated the differences between the two; I know the difference." Then you show them a really fluffy cat and they think it's a dog, because they had way too high of a learning rate. So that's what learning rate is. Momentum, in turn, says how our past outcomes should affect our weights and biases. Past outcomes could say, "Hey, that initial weight there keeps giving us way too high of an answer, so we're going to keep that weight moving downward no matter what"; at least we're going to give its adjustment a little bit of momentum into the next step.

So each step takes learning rate and momentum into account. We don't want to make snap judgments, but we also don't want to learn too slowly, because if we learn too slowly it takes forever to train our neural network. It's a balance between how fast we want to train and how accurate we want the result to be; that's where your learning rate and momentum come in. Here's an example formula: we take our learning rate, multiply it by the difference and by the current input value, and add our momentum times the past change amount; that gives us the current change amount. In other words: change = learningRate × difference × value + momentum × previousChange. That's one example of how learning rate and momentum determine how much to change a given weight or bias (there's a short code sketch of this formula at the end of this section).

Now we go through the next piece of data. We did it! Well, we went through the first entry in our data set; that's not a full iteration yet. We move to the next entry, where fur color is 15 and weight is 35 and it's a whatsit, and we do the same thing for each piece of data in the data set. That whole pass is considered one iteration, and it comes with an average error rate. Then we can determine whether to go through more iterations: have we hit the iteration count you requested, or reached what you set as an acceptable error rate? If not, we just keep going.

So that's how you train a neural network. You feed inputs through the random weights (which get more accurate over time) and the biases, through the activation function (which is huge), then you calculate your error and backpropagate adjustments to the weights and biases. When you're done, you have a set of weights and a set of biases that are accurate, and you can run any new input through them and get a pretty good estimate of what the output is going to be. That's neural networks in machine learning in a nutshell. I hope this video helps. In the next video we're going to go back to Brain.js and watch all these configurations and all these options in action.
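To tie the example formula to code, here's a minimal sketch of that update step. The function name and parameters are hypothetical, and the sample numbers are made up; delta stands for the back-propagated difference assigned to this weight, and input is the value currently feeding into it.

    // current change = learningRate * delta * input + momentum * previous change
    function updateWeight(
      weight: number,
      delta: number,      // the back-propagated error signal (the "difference")
      input: number,      // the value currently flowing into this weight
      prevChange: number, // the change applied on the previous step
      learningRate = 0.3,
      momentum = 0.1
    ): { weight: number; change: number } {
      const change = learningRate * delta * input + momentum * prevChange;
      return { weight: weight + change, change };
    }

    // hypothetical numbers: the first step has no previous change,
    // and each step's change feeds into the next as momentum
    let step = updateWeight(0.1, 0.05, 0.25, 0);
    step = updateWeight(step.weight, 0.03, 0.25, step.change);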

22 thoughts on “Neural Networks Explained – Machine Learning Tutorial for Beginners”

  1. This crap is a good example of why self-teaching using YouTube is highly dangerous.

    This is just complete bullshit!

    Please stop making videos about stuff you don't understand shit about!

  2. I hope some day they invent a neural network that is able to explain in a simple way what neural networks are. We're obviously not there yet.

  3. Why does every neuron have a different bias? Isn't the bias common to all neurons of a single layer? Or is this a different type of NN than the one explained by 3Blue1Brown?

  4. First, thanks for this valuable series.
    Can neural networks be used to auto-tune PID routines in servo controllers?

  5. Great video! Thanks. I just spotted one mistake at 03:55 – you say "We've chosen three hidden layers." Do you mean "We've chosen three nodes in our hidden layer"?

  6. Does this mean that in the future organizations will be judged by how much deep learning they put into whatever it is they stand for, an 'accuracy score' if you will? So hospitals, for example, would have very high accuracy scores because of the importance of what they do; they need to spend the time and money. The same would go for governments and the legal system. And on the other end, to use a simplistic example, perhaps FOX will say, "this trailer gets approval ratings from 90% of married women," but we will all know FOX doesn't spend a lot of money on deep learning, because, let's face it, who cares if a married woman watches it and doesn't like it. Would we pull Fox's license? Not likely. But all the in-betweens are where life happens. I can't bear to think how it will be used. Eugenics, cultural extermination. Who figures out first that they actually have the capacity to easily take the world now that their AI has told them how to do it? People have agendas.

    And I think here is the key: the more complex it gets, the more difficult it gets for Citizen Group #1 to analyze and control it, and therefore the harder it is to control the government. (Unless we have personal-defense, code-breaking networks in our VECs [voluntary external chip… cell phone, lol].)

    And worse! What about the ability of a nation to understand what another nation is doing with it. This will lead to a rise in old school espionage, because if such a move did manifest for a nation, the rest of the world would be at a loss for identifying the source of X Nation's confidence and victories. It is simply a strategy produced deep in the recesses of a unique network with access to more information than a human can perceive. So… defend with AI, analyzing the other AI, lol, but what do you tell your AI? You have no idea what they know, or even if you did it doesn't matter, their AI accounted for your AI figuring out the plan, but it's too late, the plan employed a time element. HA HA HA I have you now! Lol… Unless we have spies everywhere! Deep learned, AI robot spies, of course.

  7. Beautifully explained! I've never come across such a simple and understandable explanation of neural networks!

  8. Unfortunately, this video just further obfuscates the topic (for beginners). You're far better off explaining neural nets from a SIMPLE algebraic perspective, i.e. explain them first in terms of INDIVIDUAL perceptrons whose role is to simply generate bisectors of N-dimensional space (i.e. for 2D: equation of a line, for 3D: equation of a plane, and so on…)
    Once you've established to the viewer that these perceptrons basically generate BISECTORS of space (and hence define DECISION REGIONS in that space), then a network of such perceptrons can classify input data into specific regions, i.e. it can CLASSIFY input data.
    Lastly, introduce the role/purpose of Activation Functions for introducing nonlinearity to the function approximation process, i.e. inject curvature into the decision boundaries.
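    A minimal sketch of that idea, with made-up weights: a single perceptron in 2D is just w1*x + w2*y + b compared against 0, so the line w1*x + w2*y + b = 0 is the bisector, and which side of it a point falls on is that point's class.

        // one 2D perceptron = one linear decision boundary (a bisector of the plane)
        function perceptron(x: number, y: number, w1: number, w2: number, b: number): 0 | 1 {
          return w1 * x + w2 * y + b >= 0 ? 1 : 0;
        }

        // e.g. the boundary x + y - 1 = 0: points above the line are class 1
        console.log(perceptron(0.9, 0.9, 1, 1, -1)); // 1
        console.log(perceptron(0.1, 0.2, 1, 1, -1)); // 0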

    Check out this video for a good INTRODUCTION to neural networks: https://www.youtube.com/watch?v=BR9h47Jtqyw
