ml5.js: Transfer Learning with Feature Extractor



Hello, welcome to another ml5.js video. I'm really excited about this one. At the time of this recording I've made three videos so far: a sort of intro to ml5 and machine learning, a video about doing image classification, and a video about doing image classification with real-time images coming in from a webcam. Both of those image classification projects use a pre-trained model called MobileNet.

So just to pick up where we last left off, here it is: this is the image classification with MobileNet example. You can see that today, apparently, I am a snorkel. I really feel like a snorkel. I'm also kind of like an oboe. But if I grab this ukulele and put it in the frame, it sees an acoustic guitar pretty quickly. Let me put this down over here.

So here's the thing: we've determined that this model is not good at recognizing certain things. It cannot recognize a train whistle; it thinks it's a syringe or an oboe. It cannot recognize my purple water bottle (sponsor of the Coding Train: water); it thinks it's a power drill or a microphone. So what if I want this example to recognize things that I have here in this room that it doesn't recognize? Could I train it with my own data?

This brings up so many questions, and there are so many different paths we could go down, but the path I want to go down in this video is something called transfer learning. With machine learning, I could take a massive database of images, label all of those images, train a model on that data, and then show it new images, and it would tell me what's inside them based on what it learned. But I'm a person with no massive database of images. So something I can do instead is take somebody else's model that was already trained on a massive database of images and retrain just a small piece of it on top with my own images. It's not a perfect solution, but it's quite a powerful one that allows you to do certain kinds of things very quickly.

In fact, I came over here to write "transfer learning" because there's a project I want to show you called Teachable Machine. Teachable Machine is a project made by a collaboration of many different researchers at Google, led by the Google Creative Lab. What I'm going to do is run through Teachable Machine, introduce the idea of transfer learning, talk about how it works in ml5 with MobileNet, take a break, and then come back and actually write the code example: I'm going to make my own teachable machine.

So let's first just run this and see. I'm going to skip the tutorial, open this up, and zoom in a little bit. I should say that Teachable Machine is using a slightly different algorithm behind the scenes than what I'm ultimately going to implement, and I'll talk about the differences once I'm done with this, but conceptually it's exactly the same thing. Right now I can see there are three categories: green, purple, and orange.

So now I'm going to attempt to train the Teachable Machine with my own images. I'm going to step out of the frame for the time being and hold down "train green," giving it lots of examples of the ukulele from different angles. Of course my arm is in the frame too; that's part of what it's learning.
Then I stop, put the ukulele down awkwardly, grab the train whistle, and hold down "train purple," giving it lots of examples of the train whistle, and then let go. Let's do one more. I really should have made this one purple, since the water bottle is purple, but I'll hit "train orange" and train it with the water bottle a bunch of times; I'm partly in the frame, then out of the frame, and then I let go.

All right, so I've finished training the Teachable Machine. If I step into the picture, it actually thinks I'm the water bottle: orange. But look at this. Let me now show it the ukulele: green, 99% confidence. Let me now show it the train whistle: purple, 99% confidence. And now the water bottle: orange. So you can see this works quite well. If I stand in the frame, it gets kind of confused. You see, I was standing in a lot of the training images with the water bottle, so it really thinks my train whistle is the water bottle; if I'm not in the picture, it knows it's a train whistle.

This is very important to remember: the machine learning system is not learning anything about these particular objects. It's learning about the exact sets of pixels you're showing it. So if I'm standing in the background with the ukulele every time, and then I'm not in the background anymore, it's going to be confused. We see this in some machine learning models: if you hold something up, no matter what it is, the model always says "cell phone," because there are so many training images of people holding cell phones that it assumes whatever you're holding must be a cell phone.

So this is the idea. The wonderful thing about Teachable Machine is that it's really fun, there's a really nice interface, it's designed really well, and I can have it show different GIFs or play different sounds. But I'm limited to these three categories (green, purple, orange), and I want to be able to do something like this in my own interactive experiment. Okay, so let's stop it. How does this even work?

So let's talk about how image classification works. I said there's something called MobileNet. This is the pre-trained model I've talked about: it has 1,000 image classes, and it was trained on ImageNet, a database of something like 15 million images. When we use it, we send our own image in (maybe it's from a webcam, maybe it's a PNG), we send it into MobileNet, and MobileNet gives us back a label and a probability. So maybe it says something like cat 90%, bird 5%, clock 5%. That's what it gives us back; a rough sketch of what that looks like in code is below.
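Just to make that flow concrete, here is a minimal sketch of the webcam image classification setup from the earlier videos, assuming the ml5 0.x API and p5.js; the callback names here are just illustrative.

```javascript
let video;
let classifier;
let label = 'waiting for model...';

function setup() {
  createCanvas(640, 480);
  video = createCapture(VIDEO);
  video.hide();
  // Load the pre-trained MobileNet model and attach the video to it
  classifier = ml5.imageClassifier('MobileNet', video, modelReady);
}

function modelReady() {
  // Start classifying frames (older ml5 versions call this predict())
  classifier.classify(gotResults);
}

function gotResults(error, results) {
  if (error) {
    console.error(error);
    return;
  }
  // results[0] is the top guess, e.g. { label: 'acoustic guitar', confidence: 0.93 }
  label = results[0].label;
  classifier.classify(gotResults); // classify the next frame
}

function draw() {
  image(video, 0, 0);
  fill(255);
  textSize(24);
  text(label, 10, height - 10);
}
```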
So how are we going to retrain this model? Well, here's the thing: there's a lot of stuff going on inside it, and in order to retrain it, we need to kind of peel it open a little bit. The thing we're going to use to peel it open is something built into the ml5 library called a feature extractor.

Internally, the MobileNet model is running a neural network. You might have heard that term before, and at some point in this video series we might get more into it, but a neural network is something that has multiple layers: maybe it has layer 1, layer 2, layer 3. The image data is passed into layer 1, processed, then sent to layer 2, processed again, then sent to layer 3, and processed again. There are different kinds of processes; for example, since we're sending in image data, it's most likely using something called a convolution (you might have heard this term). A convolution is an image process, the same kind of thing that happens in Photoshop or any image processing utility when you open an image and say, hey, let's make it brighter (there's a toy sketch of one below). A neural network is doing that kind of thing to the image over and over and over again, to reduce it: the image has a lot of pixels, so process it down to something smaller, and then something smaller again, many many times over multiple layers, to eventually get to something we can call features.

So let's say the last layer, after all of these processes, is something called features. Those features are then converted, through one more layer, into labels and probabilities. So there's this whole process happening in MobileNet: the image comes in, it's processed through a convolutional layer, maybe another convolutional layer, then some other kind of layer, and so on, ending with all these numbers (a whole lot of numbers), and then those numbers are processed one more time to get probabilities.

What transfer learning says is: hey, let's just delete that last part. Let's go into MobileNet and stop right at the feature extractor. Let's make a version of the MobileNet model where we stop at the features, and then put our own training images in. We say: take this training image of the ukulele, send it all the way through, but don't bother getting the label, don't give me "acoustic guitar," just stop at the features, and say, hey, these features are 100% "ukulele." So we're going to retrain the model to basically map the features to our own labels instead of the labels that previously existed in MobileNet.

All right, so what are the features? In theory, in some crazy theoretical sense, we could eliminate all of these layers and just teach a machine that this set of pixels is a cat and this set of pixels is a bird, and then look at a whole other set of pixels and ask which one it resembles. That's kind of what we're doing, but you have to remember that images have many pixels, and when you compare images pixel by pixel by pixel, there's an enormous amount of data. So this idea of features is about boiling the essence of an image down to a smaller set of manageable numbers. In essence, an image that might have started as, say, 512 by 512 (you do the math: 512 squared pixels) maybe ends up as 100 features, just 100 numbers (I don't actually remember what the number is in MobileNet; we should look it up), and those numbers are typically between zero and one. This is also often referred to as a vector, meaning a list of numbers. It's the numeric essence of the image you just passed in, and the model has learned these features over lots and lots of training with millions and millions of images.
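As promised, here is a toy sketch of a single convolution step in plain JavaScript, just to make "an image process like in Photoshop" concrete. This is not the actual MobileNet code; the image is assumed to be a simple 2D array of brightness values, and the blur kernel is just one hand-picked example of the kind of kernel a convolutional layer learns for itself.

```javascript
// Apply a k x k kernel (k odd) to every pixel of a 2D brightness array.
function convolve(img, kernel) {
  const h = img.length;
  const w = img[0].length;
  const k = kernel.length;
  const half = Math.floor(k / 2);
  const out = [];
  for (let y = 0; y < h; y++) {
    const row = [];
    for (let x = 0; x < w; x++) {
      let sum = 0;
      for (let ky = 0; ky < k; ky++) {
        for (let kx = 0; kx < k; kx++) {
          // Clamp at the edges so the kernel never reads outside the image
          const py = Math.min(h - 1, Math.max(0, y + ky - half));
          const px = Math.min(w - 1, Math.max(0, x + kx - half));
          sum += img[py][px] * kernel[ky][kx];
        }
      }
      row.push(sum);
    }
    out.push(row);
  }
  return out;
}

// A simple 3x3 blur: each output pixel becomes the average of its neighborhood.
const blur = [
  [1 / 9, 1 / 9, 1 / 9],
  [1 / 9, 1 / 9, 1 / 9],
  [1 / 9, 1 / 9, 1 / 9],
];
```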
So what we need to do in ml5, which is a higher-level library, is much simpler: we don't have to do all this manipulation ourselves. We basically just say, hey, instead of making an image classifier with MobileNet, we're going to make a feature extractor with MobileNet, then turn that feature extractor into a classifier and train it with our own images. That's what I'm going to do in the next video: I'm actually going to write the code to do exactly this, and train it with a few sets of images here in the room (a rough preview of that workflow is sketched below). What you'll see, which is interesting, is that you can also get it to work with things like different facial expressions or different gestures. There are a lot of wonderful possibilities there.
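For a preview, here is a rough sketch of that feature extractor workflow, assuming the ml5 0.x API and p5.js. The labels, key bindings, and callback names are just illustrative, and details such as the result format and the numLabels option can vary between ml5 versions.

```javascript
let featureExtractor;
let classifier;
let video;
let label = 'not trained yet';

function setup() {
  createCanvas(640, 480);
  video = createCapture(VIDEO);
  video.hide();
  // Load MobileNet but stop at the feature layer instead of its 1,000 labels.
  // numLabels is an assumption: some ml5 versions need it to match your class count.
  featureExtractor = ml5.featureExtractor('MobileNet', { numLabels: 3 }, modelReady);
  // Turn those features into a new classifier that we train on our own images
  classifier = featureExtractor.classification(video, videoReady);
}

function modelReady() {
  console.log('MobileNet features loaded');
}

function videoReady() {
  console.log('video ready');
}

function keyPressed() {
  // Press a key to add the current video frame as a training example
  if (key === 'u') classifier.addImage('ukulele');
  if (key === 'w') classifier.addImage('train whistle');
  if (key === 'b') classifier.addImage('water bottle');
  // Press 't' to retrain the final layer on the examples collected so far
  if (key === 't') classifier.train(whileTraining);
}

function whileTraining(loss) {
  if (loss === null) {
    // Training is done; start classifying live video
    classifier.classify(gotResults);
  } else {
    console.log('loss: ' + loss);
  }
}

function gotResults(error, result) {
  if (error) return console.error(error);
  // Depending on the ml5 version, result is a plain label string
  // or an array like [{ label, confidence }]
  label = Array.isArray(result) ? result[0].label : result;
  classifier.classify(gotResults);
}

function draw() {
  image(video, 0, 0);
  fill(255);
  textSize(24);
  text(label, 10, height - 10);
}
```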

20 thoughts on “ml5.js: Transfer Learning with Feature Extractor”

  1. In my opinion, if a term like "teaching artist" ever exists, it will be because of you. You are a very gifted teacher, Dan! P.S.: another idea for a video playlist: read this https://medium.com/@devdevcharlie/experimenting-with-brain-computer-interfaces-in-javascript-8d6cb891fda8?fbclid=IwAR37T1VuBStvh8nKv4G0QFI6Ht2nchS-RWy5Ttt4pFz0_o3Cbbb12d_5I7c
    I found it just amazing and so inspiring. Have you ever heard about it? And is there a JS library for it?

  2. Hiya again Dan, what are the chances of you doing a Delaunay triangulation coding challenge? From there you could do the Voronoi diagram.

  3. Hi Teach!
    Thought you might appreciate (perhaps even mention or get involved with) an international school that's being launched. It's a non-profit, hosted in 450+ countries by volunteers all looking to share the wonder of learning. It sounds like your jam, and if you could get involved that would just be a dream come true. Please have a look and hope to see ya there 🙂

    https://youtu.be/8yu8rtXThy8

    I figured you wouldn't mind me posting about it here because I know you're all about sharing the love and helping people learn – I hope I haven't been rude.
    Cheers
    EDIT: PS I can't believe you managed to edit the original video down to about 12 minutes lol!!!! You should begin a video editing class ;p

  4. After I watched like 20 of your videos, I felt like I knew you from somewhere.
    "OHHH!!! He's every IT guy in the DC series." Anybody else feel the same?

  5. You're a legend. I'm a computer science student, and I'm really upset that they didn't teach us about things like this.
    Regards.

  6. Your videos are awesome! I study computer science in college, but all we learn is how to code websites, databases, and boring accounting programs. I know computer science is so much more than that, but I guess this is where the jobs are nowadays :/
