Machine Learning App that could Save Your Life!



Yo guys, so last year IBM contacted me and challenged me to embed AI into a drone, and having done only one rather simplistic hardware project up until that point, it was a really hard challenge, but it was also a lot of fun to try and figure out and learn how to solve the problem for y'all. And guess what: IBM has slid back into my DMs with another challenge, and right off the bat I am really excited. Collaborations with IBM Developer have, for me, become a beacon for unlocking new potential within myself as a developer. And just to be completely transparent with you guys: yes, this video is sponsored by IBM.

So what's this new challenge, you ask? Well, this is the second year IBM and their partner organizations are holding a global developer challenge called Call for Code, in which the idea is to develop some software or hardware solution, or both, that could potentially save lives. So kind of like a game jam or hackathon, but with saving lives as the theme. Being such a huge company, I am very happy to see that they have initiatives like this, putting down massive amounts of money and resources to call on developers like you and me to use our coding superpowers to try and contribute to making the world a better place. And so IBM hit me up wondering about some ways we could promote this initiative together, and I replied: why don't I just try my hand at entering the contest myself? I mean, after all, the first-place project will win $200,000, but also IBM just launched a new initiative called Code and Response, in which the top projects from Call for Code 2019 will get assistance from IBM to take their projects into production. And so, with that preface, let us begin.

Now let's get the first obvious question out of the way: so you mean to tell me that you're going to build an application that can save lives, by yourself, in a couple of weeks? No, not at all, actually. Sure, I may be lazy, but I'm no fool. I know that something like this is a very, very long game; there are so many moving targets to try and hit when trying to solve problems as big as potentially lifesaving ones.

Anywho, the very first step was to try and identify a problem that I wanted to solve that could potentially save lives, and this in itself was a really difficult challenge. It took me many shower thoughts to land on this, but this is the problem I pitched to IBM that I wanted to try and solve. Now bear with me; the way I like to come up with concepts is by shooting for the stars and seeing where I land. So imagine you're in the middle of some crazy storm or something, and in front of you, your path to safety is blocked by an unexpected flood, with what looks to be a not-so-deep, calm-moving body of water. Now, you could just turn around and lose all the progress that you've made, but you're really in the moment and really just want to get to safety, and so you say, ah, what the hell, this water isn't even that bad, and you try to drive across it. But the flood turns out to be more dangerous than you realized, and boom, just like that, you're now stuck on top of your car because you had inaccurate data.

And this is why I'd like to introduce Project Safe Waters. Project Safe Waters is a program that you'll be able to point at any body of water, and it'll be able to predict the depth of the body of water that's centered on screen, along with the traveling speed of the body of water, its direction vector, and its overall threat level. This was my idea for IBM's 2019 Call for Code initiative. Now, how does one expect to create such a program? Well, let's just say we have a lot of work to do.
You see, I plan to train a neural network to learn from the patterns within some data to accurately enough predict a body of water's threat level. Now, if we want this neural network to be able to generalize across many different weather and geographical conditions, to be intelligent enough to actually help save lives, we'll have to gather millions upon millions of images with varying conditions, and every last photo must have a label. I mean, think about it: if we're choosing a single camera as the input for Project Safe Waters, there are so many different variables to consider. Just take weather, for example: there's rainy, there's sunny, there's cloudy, there's snowy, and there's a bunch of different lighting conditions like day, night, dusk, dawn, car lights, street lamps, not to mention the various forms that bodies of water can take, like this, this, this one, that one, these. Again, we really have our work cut out for us.

But I think it's really important to stress that even though I believe highly in my skills as a developer, I'm not even sure if I have the ability to develop this app as I've hyped it up to be, or how possible a solution to this problem even is. Regardless, we're going to give it a shot anyway, because I think if you persistently aim for some grandiose idea, you're guaranteed to eventually fall somewhere in that general direction. And so with that, let's officially start developing.

Now, the very first thing that needs to be done is to figure out some sort of plan of attack, because without one, everything will fall apart, and very quickly, especially when dealing with anything machine learning related. To do that, we first have to ask a lot of questions. Is there any pre-existing labeled flood data out there? Are there a lot of photos of floods on the internet? How long per image would it take for me to label these photos? At what rate could I gather my own original flood data? After a little while of researching, I got some answers to these questions: no; yes; two years if I guess-labeled the photos; and a very, very long time.

So how are we going to go about getting enough data, in a reasonable amount of time, to test whether this idea even has any value? I mean, of course we could go out with a camera and a measuring stick, collecting data on different bodies of water, but this is a really bad idea, because one, I don't even know if this overall idea will work, and investing a month of time to get a couple thousand photos doesn't interest me; two, trying to collect enough varied data to solve this problem alone would take me years, even decades perhaps; and three, I live in Southern California, where it sprinkles. The water that falls from the sky here does not count as rain, and it sprinkles maybe five times a year if we're lucky. So yeah, that approach gets a big no from me.

Now, of course, there are already images of floods on the internet, and I'm sure I could easily collect more than a million of them simply by using web crawlers, but there is a huge problem with this approach as well. As I mentioned, it would take me almost two years to label every single photo just by looking at them and quickly guessing, and over sixteen years to research a better approximate answer for every single photo. Trust me, I've timed this. But an even bigger issue with this approach: I am far from a flood expert, and if it's up to me to look at over a million photos of floods and guesstimate their depth, velocity vector, threat level, etc., then we're doomed. I could end up training a neural network with the opposite effect of what I wanted to achieve.
My inaccurate guess labels could end up putting people in some very dangerous scenarios if Project Safe Waters were ever to be deployed, and that is why it's really important to have incredibly accurately labeled data in the dataset that I'll be using for training. But here's the thing: there is no way around this. This dataset of millions of labeled flood images will have to be created one way or another if I want to use current machine learning approaches to solve this problem, be it by myself taking over sixteen years, or by putting together some crowdsourcing effort; it will all have to be done. But before we commit to such a major initiative, I think it's a good idea to at least know whether this is even worth our time, and to get an answer to that, we can use one simple industry trick.

And what is that trick, you ask? Well, we can create a simple simulation that is realistic enough to let us take in-sim photos, essentially generating a simulated dataset with labels for us, quickly and effortlessly. Then we can feed that dataset into a neural network for it to train on. In fact, while we're at it, let's pause for a second and configure what I like to call a ladder, in which we start where we want to be and work backwards, figuring out the steps to get there; then we simply climb up our ladder to make some progress. Now, I put together a pretty complex ladder for the entirety of Project Safe Waters, but for this video we're just going to try and climb the first ladder. At the top of this ladder we have the goal of training an accurate-enough neural network based on data from a simple simulation. Not too challenging, shouldn't be too hard. But in order to get there, we need convincing enough results from testing; but in order to do that, we need to train a neural network that converges; but before we can train any model, we need to obtain a bunch of data with labels; but in order to do that, we need to create a realistic-enough flood simulation environment. And so that's where we'll start.

Now, I started off pretty foolishly. I thought that, for whatever reason, I needed to simulate every droplet of water, and so I started off using NVIDIA FleX with Unity3D, which is a beautiful, robust liquid and cloth simulation API published by NVIDIA. But after many hours of trying to get the liquid to be less like viscous honey and more like water, I learned that it's just not possible with this API. In fact, I learned that simulating water is actually a really big deal; there are papers still trying to address it, because the processing power required to simulate water is just too high. So I gave up there and tried the liquid simulation in Blender, but quickly stopped looking into that once I realized I wasn't in the mood to learn how to use Python scripts in Blender.

Completely stumped on what to do next, I started over from a completely different approach. Using Unity3D, I grabbed their Windridge City demo environment and thought about how I could flood this city, and that is when a brilliant idea popped into my head: why not just use a simple blue plane? Relax, relax, this isn't actually brilliant at all; it's pretty industry standard, and I'm not going to keep it like this. You see, most water in video games is usually just a plane with an animated texture on it, so to speak. So, to make this training data as real as I can for now, I purchased a pretty cheap water shader on the Unity Asset Store for ten dollars, applied it to our plane, and I'm pretty happy with it.
It looks pretty realistic if you ask me. Right: a realistic-enough flood simulation environment, check.

Next, I needed to create a photographer that can jump around the city, taking pictures of the flood and labeling how deep the water is. I achieved this by simply creating an avatar and attaching a camera to it to simulate a cellphone view, then shooting a raycast down from the sky; if that raycast point is within a certain height range, and if the water the camera is pointing at isn't too deep, then we take a picture of that body of water, label it, and use it as data; then we jump to a new spot and repeat the process. Pretty simple algorithm.

Oh, and you want to know the details of the algorithm that labels the water's depth? Well, don't worry, because I've got you covered. We simply start by removing the layer of water and shooting a raycast straight from the center of the camera to the ground it's pointing at, then we store the Y value from that 3D vector location as y1. Next, because the water is a single plane across the entire map, we simply take the Y value of the water's plane as y2, and now we have the height of the water and the height of the ground below it. So now we just have to do a simple equation, y2 minus y1, and that gives us our depth; otherwise, if the water isn't higher than the ground, this isn't a flood. Oh yeah, and I also had to do a bit of metric normalizing, which is converting our game-world units into real-world units. But yeah, math and logic, fun stuff.

And now that our simple simulation is complete, all I had to do was punch in that I want ten thousand images and sit back and relax with a bag of candy. A few minutes later, I now have 10,000 generated flood images with labels, and I am ready to begin training. Obtain a bunch of data with labels: check.
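If you're curious what that labeling math boils down to outside of Unity, here's a minimal Python sketch. The real project gets the ground height from a raycast inside the engine; the ground-height parameter and the unit-conversion factor below are just assumptions for illustration.

    # Minimal sketch of the depth-labeling math (y2 minus y1) from the simulation.
    # In the real project the ground height comes from a Unity raycast with the
    # water layer removed; here it is simply passed in as a number.

    UNITY_UNITS_TO_INCHES = 39.37  # assumes 1 Unity unit is roughly 1 meter; adjust to your scene


    def water_depth_inches(ground_y, water_plane_y):
        """Depth of the water at the aim point, in inches (0 if the ground is above the water)."""
        depth_units = water_plane_y - ground_y        # y2 - y1
        if depth_units <= 0:
            return 0.0                                # ground pokes above the water: not a flood here
        return depth_units * UNITY_UNITS_TO_INCHES    # game-world units -> real-world inches


    # Example: water plane at y = 2.0, ground at y = 1.6 -> about 15.7 inches of water
    print(water_depth_inches(ground_y=1.6, water_plane_y=2.0))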
Now comes the part I've been dreading for quite some time, I'm talking well before this video, and that's writing my first convolutional neural network. You see, nine out of ten convolutional neural network tutorials that you'll find on the internet use a little dataset called MNIST, which isn't the biggest deal, except that when you download this dataset it's handed to you already as pixel values, which can make it a real pain to figure out how to use your own dataset with Python and its dependencies. Having less than two years of Python experience under my belt, MNIST tutorials aren't the friendliest experience for new users like myself. Another reason I've been dreading this is that I don't have the strongest intuition when it comes to convolutional neural networks; I've done next to no research on these things, and I have a lot to learn. But here goes nothing. I started by downloading a basic MNIST convolutional neural network script from GitHub and began tweaking it to get it to work; I was using Python and Keras, if you're interested. Two days of R&D and whatnot later, I finally got the convolutional neural network working properly and ready for me to feed it some input, and so that I did.

However, the images that the simulation generates are HD resolution, 1920 by 1080, and if you do the math, that's over two million pixels per image, or more than six million values once you count the three color channels. Multiply that by even a dataset of a thousand images and that's over six billion values, and as far as I understand it, we'd have to store all six billion-plus values in the RAM of our GPU, which, I'll tell you now, is just not going to work; by my calculations that's far more memory than my GPU has. So our data needs to be compressed if we want any shot at this. Yes, I know, it's a lot to do, but bear with me.

The first thing we can do to compress our data is to turn our images into grayscale: instead of having three RGB color values per pixel, we can use a single brightness value per pixel, reducing each image from over six million values to about two million, one third the amount of data. A bit better, but still a lot of information for your modern GPU to handle. The next thing we can do is reduce the size of the images: if we shrink both the height and width of each image to 10% of the original, then we reduce the pixel count per image by a factor of a hundred, making it a little over 20,000 pixels per image. And the last thing we can do is watch how many training samples we use so that we don't overload the GPU; I started with 20 samples, just to make sure everything was working right. Let the training begin.

Eventually the training model reached a low cost, or what most people like to call loss (hashtag make optimization cost again), but anywho, this is considered a converged model. So, train a model that converges: check. And it performed... okay. Here are the metrics I'm using to measure the success of the neural network. The first is average error: essentially how far above or below the neural network is when predicting depth. The second is the on-target rate: I decided that I'm okay with the neural network being up to 1.2 inches off with its predictions. I believe that 1.2 inches of water is not a noticeable difference; you can walk in 1.2 inches of water with no problem, so as long as a prediction is within a plus-or-minus 1.2 inch margin of error, it is 100% accurate to me. The last metric is the same concept, but for six inches, which I personally think is a completely unacceptable margin; I mean, just grab a ruler and see how high six inches is. I just think it could help with measuring the improvement of the model, but by no means will I be using it as an official measure of success.

Here is how the 20-sample model did: it had an average error of about 22.7 inches when I tested it on 1,000 unseen images from our simulation, it predicted only about 4.6% of the images on target, and 21% under half a foot. Now, just looking at this data by itself, what do we know? We know that it's performing extremely badly, and that at the moment this is not proving to be a good idea. For all we know, this neural network could be completely randomly guessing and these results are just luck, although that's not completely true, because when we ask it to predict on the data it's been trained on, you can see that it gets close-enough estimates without overfitting. To further test this approach, I then trained a new 100-sample model. That model, tested on the same 1,000 unseen images as the 20-sample model, got an average error of about 19.2 inches, improving the average error by almost 16%; it labeled 5.1% of the data on target, improving the on-target rate by almost 11%; and it labeled 28% of the new images with an error under six inches, improving that by 33%. I expected this model to perform poorly when feeding it twenty and a hundred training samples, but seeing these small improvements correlate with the amount of training samples we feed it was all I needed to see to make me a believer.
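And here's roughly what that compression step and those three metrics look like in Python. The file handling and helper names are stand-ins of my own, not code from the repo.

    import numpy as np
    from PIL import Image


    def compress_image(path, scale=0.10):
        """Grayscale the image and shrink it to `scale` of the original width and height
        (1920x1080 -> 192x108, a little over 20,000 pixels)."""
        img = Image.open(path).convert("L")                 # RGB -> one brightness value per pixel
        w, h = img.size
        img = img.resize((int(w * scale), int(h * scale)))  # 100x fewer pixels at scale=0.10
        return np.asarray(img, dtype=np.float32) / 255.0    # normalize pixel values to [0, 1]


    def depth_metrics(y_true, y_pred):
        """Average error plus the share of predictions within 1.2 inches and within 6 inches."""
        errors = np.abs(np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float))
        return {
            "average_error_inches": float(errors.mean()),
            "on_target_pct": float((errors <= 1.2).mean() * 100),       # +/- 1.2 in counts as correct
            "under_half_foot_pct": float((errors <= 6.0).mean() * 100),
        }


    # Example: depth_metrics([10, 30, 4], [12, 25, 4.5])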
Get convincing enough results from testing? Kinda check; I'll give that a 50%-opacity checkmark. It's like, I'm thrilled at what I see, but I would like more results before I start feeling really confident. And just for consistency, here's how the 100-sample model performed when asked to predict on the data it was trained on: again pretty good, and it didn't overfit. Anywho, these results led me to believe that if we fed it even more samples, the model should ideally see even greater gains. Of course, there are other things we can do to help the model generalize and improve as well, but for now: more data.

I tried to train the neural network on even more samples, but I started to run into GPU memory allocation problems, which at the moment is something I have only a little understanding of. I still haven't quite figured out how to get around this with Keras and TensorFlow, but the idea is that you have a GPU with a fixed amount of memory for allocated data and some massive chunk of data that you need to squeeze into that limited space; I believe this is done by chunking your data and spoon-feeding the GPU chunk by chunk.

Now, I was ready to end this project here for now, but then hope arrived, a glimmer of an angel by the name of Nick. You see, I submitted this very video to IBM, and when they saw I was having troubles with my GPU, they were like, well hold up, Bubba, you were having GPU problems and didn't even tell us? We're IBM, baby, we've got GPUs for days. Besides, using IBM services is a requirement for this competition anyway. That's when I was connected with their machine learning and computer vision expert, Nick, who was actually in the middle of building a machine learning interface for IBM servers. He helped me process all of my 10,000 images and convert my model to work with IBM infrastructure; he even wrote a couple of scripts to save me from some headaches, all of which are included in the GitHub repo linked in the description. Nick, you're a real one, man.

Once this conversion process was over, I finally trained on my 10,000 images, and after about an hour or so, my model was done training. Now, before we look at the results, we need to quickly go over the new output labels that came out of the script conversion. To train using IBM GPUs, we had to use what IBM calls buckets for the output: instead of a single floating-point value, we now have five buckets, which are ankle-deep waters, knee-deep waters, waist-deep waters, floating-deep waters (that just means you float in the water and your feet can't touch the ground), and dangerously deep waters.
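This isn't the actual conversion script Nick wrote, but here's a rough Python sketch of both of the ideas above: spoon-feeding the GPU one chunk of images at a time with a Keras Sequence, and mapping a continuous depth label onto the five buckets. The bucket cutoffs are made-up assumptions, and compress_image is the helper from the earlier sketch.

    import math
    import numpy as np
    from tensorflow.keras.utils import Sequence, to_categorical

    BUCKETS = ["ankle deep", "knee deep", "waist deep", "floating deep", "dangerously deep"]


    def depth_to_bucket(depth_inches):
        """Map a depth in inches onto one of the five buckets (cutoffs are illustrative guesses)."""
        for i, cutoff in enumerate([6, 20, 40, 60]):
            if depth_inches <= cutoff:
                return i
        return 4  # dangerously deep


    class FloodSequence(Sequence):
        """Loads and feeds one small batch of images at a time instead of pushing
        every pixel of the dataset into GPU memory at once."""

        def __init__(self, image_paths, depths_inches, batch_size=32):
            self.image_paths = image_paths
            self.depths = depths_inches
            self.batch_size = batch_size

        def __len__(self):
            return math.ceil(len(self.image_paths) / self.batch_size)

        def __getitem__(self, idx):
            lo, hi = idx * self.batch_size, (idx + 1) * self.batch_size
            # compress_image: the grayscale + resize helper from the earlier sketch
            x = np.stack([compress_image(p) for p in self.image_paths[lo:hi]])
            y = to_categorical([depth_to_bucket(d) for d in self.depths[lo:hi]],
                               num_classes=len(BUCKETS))
            return x[..., np.newaxis], y  # add a channel axis for grayscale input


    # model.fit(FloodSequence(train_paths, train_depths), epochs=10)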
And here's how the model trained on 10,000 samples performed on the same data it was trained on. Just by looking at it, it did really well. Oh, you wanted me to go into detail on how to read this? Don't worry, don't worry, I got you. This is what's called a confusion matrix, and confusion matrices are often used to visualize the performance of machine learning classifiers. On the left we have our ground truth, a.k.a. the actual answers or labels, and on the top we have our model's predictions; this information will be important in a second, so hang tight. We can tell the performance of our classifier by looking at how often the prediction matches the actual label, which, if you take a look at the confusion matrix, should create a backslash-shaped diagonal line down the graph. Now, what do the numbers represent? Simply put, they are percentages, from 0% all the way to 100%, but only each ground-truth row adds up to 100%, because the ground truth is calculated from the exact number of images per label, while the predictions are just the model's attempts to assign images to labels and may not be correct, so the prediction columns aren't guaranteed to account for 100% of the images. In a perfect world, what we want this confusion matrix to show is a strong, dark blue diagonal backslash line, kind of like this, but of course utopia only exists in one's mind.

Now, we could just stop here at this confusion matrix and think about what this visualization says about our model, but I'm going to give you a confusion matrix that's just a bit better and has a bit more information in it. Check this one out. It may seem like there's a lot going on with this new graph, but don't worry, I promise it's the same thing with just a couple of extra bits. Here in the center, or whatever, it's pretty much the same thing we just went over, except instead of showing percentages it now shows the actual counts of images. Oh yeah, and the color map has also been normalized to the highest count among our model's predictions. These two axes here simply add up the number of ground-truth photos and the number of predictions, and this last axis is for precision and recall.

Now, precision asks: if we were to give our model's predictions decision-making power, how accurate would its decisions be? Looking at the predictions along the top, we can follow each one down to the bottom to see how precise our model was with predicting each and every label, and these numbers are percentages up to a hundred. Then finally we have recall, and recall measures a different question: for all the actual labels in our dataset, how accurate was the model at getting those correct? These are also percentages up to a hundred. And if you're confused about the difference between the two, imagine this quick example. You send your robot off to the swap meet and tell it to bring back twenty-six bananas. Well, your robot goes to the swap meet, but it doesn't quite know what bananas are, so it just grabs all the fruit it sees. When it brings you back the fruit, you can do a confusion matrix on it, and you'll find that it has 100% recall, since it grabbed 26 out of 26 bananas, but its precision is quite terrible, because it also grabbed 40 apples, 200 grapes, and 12 watermelons. That's a good example of how you distinguish the two.
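And to make the precision-versus-recall distinction concrete in code, here's a tiny sketch that computes both straight from a confusion matrix laid out the same way (rows are ground truth, columns are predictions), using made-up numbers in the spirit of the banana example rather than the actual matrices from the video.

    import numpy as np

    # Rows = ground truth, columns = predictions. Tiny made-up two-class example:
    # class 0 is "banana", class 1 is "other fruit" (40 apples + 200 grapes + 12 watermelons).
    cm = np.array([
        [26,   0],   # all 26 real bananas predicted "banana"   -> banana recall = 100%
        [252,  0],   # 252 other fruits also predicted "banana" -> banana precision tanks
    ])


    def precision_recall(confusion):
        tp = np.diag(confusion)
        precision = tp / np.maximum(confusion.sum(axis=0), 1)  # per predicted label (columns)
        recall = tp / np.maximum(confusion.sum(axis=1), 1)     # per actual label (rows)
        return precision, recall


    p, r = precision_recall(cm)
    print("banana precision: %.1f%%, banana recall: %.1f%%" % (p[0] * 100, r[0] * 100))
    # -> banana precision: 9.4%, banana recall: 100.0%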
Okay, now that we're on the same page with the confusion matrix, let's start doing some data analysis. First, the colors: if we take a look at the total ground-truth column, we can see that the label "dangerously deep" is definitely underrepresented in this dataset; I think that adding more representation here might improve our model a bit. At the same time, if you look at the total predictions row, the colors are roughly on par with each other, meaning the prediction distribution is pretty close to the actual representation within the dataset. Another color observation: it seems that the shallower the water gets, the less accurate the model's predictions get. Another observation: on this side of the diagonal, the model is overestimating the water's depth, which is what we prefer, and on this side of the diagonal it is underestimating the water's depth, which is what we do not want. I don't know about you, but I'd much rather a model tell me the water is deeper than it actually is than tell me it's shallower than it actually is. And one last observation: it appears that our recall is better than our precision, which I do not mind. I mean, of course I'd rather our model be precise and have good recall, but if I had to choose one, I'd much rather our model catch all instances of deep water instead of being extremely precise. However, given that it's currently leaning towards underestimating water depths, I don't know how helpful high recall is in this instance. But yeah, that's the amazing power of data analysis: from one super simple graph we were able to visualize the data in a bunch of different ways and derive a bunch of analyses from it.

But, but, but, of course, this is only the training data that it's seen many times before. How does it perform on data it's never seen before? Well, here is a sample of five hundred: not bad. And here is a sample of a thousand: yo, not bad at all. And so, get convincing enough results: check, which means that we've also successfully trained a neural network based on data from a simple simulation. I mean, yeah!

Now, a little more than 70 percent is okay by most standards, but there is still so much we can do to improve our neural network. For starters, even though we fed the network 10,000 samples, we can still feed it even more data; the domain space for this problem is actually quite large, after all, and more data would just help it figure out its domain. Not to mention how noisy the data is; for example, this should not be in my training dataset. Why are you here? Let's just delete that. You didn't see anything. Does the input-to-output relationship I'm asking the model to solve even make sense? Is there a better way to encode this relationship? These questions are the reason research and development takes so long; you've got to answer all of them if you want to build something impactful. And so, for now, this project is over. I'm not sure what the future holds for Project Safe Waters, and I don't know if I'll follow this video with a part two, because I have a lot of ideas that I want to explore.

However, don't forget about IBM's Call for Code, and boy, do I have some good news for you. So maybe you want to enter this contest but you're not exactly sure what idea you should work on, and I completely understand; hell, it took maybe a week or two just for me to think of this solution myself. But check it out: you just helped me develop Project Safe Waters in spirit, so do you know what that means? This project is part yours as well. Why not enter the contest using Project Safe Waters? Here's the open-source code, and you now have everything you need to pick up where I left off; there's nothing in the rules that says we can't start from the same GitHub repo. But if you can think of your own unique solution that is potentially lifesaving, that's fantastic; I encourage you to give it a shot and enter. What have you got to lose, huh? And if you need even more convincing, the contest ends July 29th, so you've got some time, and the winner each year gets a pretty sweet deal: the first-place project will win $200,000, but also, IBM just launched a new initiative called Code and Response, in which the top projects from Call for Code 2019 will get assistance from IBM to take their projects into production. Visit the ibm.biz Call for Code link to register and get $200 worth of IBM Cloud credits for six months instead of 30 days. Last year's winner was Project OWL, a solution that aims to keep victims connected to first responders in the aftermath of a natural disaster, and of course they won $200,000, but also got help from IBM to test and deploy Project OWL in Puerto Rico.
But you don't have to take it from me: IBM Developer put up a crazy cool, inspiring documentary on Project OWL, so please check that out in the description for some inspiration.

20 thoughts on “Machine Learning App that could Save Your Life!”

  1. Are street signs at a required height? I initially thought you taught a machine to measure the distance between recognizable street signs and the surface of the water.

  2. I live in India and there's a lot of flooding going on. I am so glad that you posted a solution; I can use your brilliant idea to help the people of my country. A big thank you! Loved your video, always been a fan of your work.

  3. Hey Jabril… you can use clusters. It's where you classify some data into knee-deep water, dangerously deep water, etcetera. Just graph the data on a matplotlib graph, give each cluster a name, and give it new data to classify. Please do a video trying this… I'll be grateful.
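    A rough sketch of what this comment describes, assuming depth labels in inches (the numbers and cluster names below are made up, and whether clustering actually helps is an open question, since the simulation already produces exact depth labels):

        import numpy as np
        import matplotlib.pyplot as plt
        from sklearn.cluster import KMeans

        # Cluster some (made-up) depth labels, in inches, into three named groups.
        depths = np.array([2, 4, 5, 15, 18, 22, 35, 40, 55, 70, 90]).reshape(-1, 1)
        kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(depths)

        # Name each cluster from shallowest to deepest center.
        order = np.argsort(kmeans.cluster_centers_.ravel())
        names = {int(order[0]): "knee deep", int(order[1]): "waist deep", int(order[2]): "dangerously deep"}

        plt.scatter(depths, np.zeros_like(depths), c=kmeans.labels_)
        plt.xlabel("depth (inches)")
        plt.show()

        # Classify new data by nearest cluster center.
        print(names[int(kmeans.predict([[48.0]])[0])])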

  4. Pls help only if u wanna 🙂https://www.gofundme.com/f/twenty-one-pilots-an&rcid=r01-156359971783-48a00df2a67a4dbd&pc=ot_co_campmgmt_m

  5. Could you make a neural network that listens to my finished/unfinished beats and creates beats that sound like I made them?

  6. Can anyone say what software he used in this video? Please at least tell me what software he did his simulation in.

  7. Use a pre-trained model like VGG16, VGG19, ResNet50, or Inception v3; that would give better results.
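    A rough Keras sketch of this suggestion: VGG16 pretrained on ImageNet as a frozen feature extractor with a small head for the five depth buckets. The input size and layer sizes are assumptions, and since ImageNet weights expect 3-channel input, grayscale frames would need to be repeated across three channels first:

        import tensorflow as tf
        from tensorflow.keras import layers, models

        base = tf.keras.applications.VGG16(
            weights="imagenet", include_top=False, input_shape=(224, 224, 3))
        base.trainable = False  # freeze the pretrained convolutional layers

        model = models.Sequential([
            base,
            layers.GlobalAveragePooling2D(),
            layers.Dense(128, activation="relu"),
            layers.Dense(5, activation="softmax"),  # ankle / knee / waist / floating / dangerous
        ])

        model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
        # model.fit(train_images, train_labels, epochs=10, batch_size=32)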

  8. I really wanted to join this kind of event. I'm competent in programming/electronics, but I don't have a team to interact with. I currently live in Mississauga, Ontario. If someone needs a member, let me know…

  9. The Project Owl Pilot: Puerto Rico | Code and Response Deployment Feature

    the documentary mentioned at 28:30
    https://youtu.be/d7aAdk87Yv8

  10. I would think coupling online photos that have date and location data with weather reports would be a good idea for data gathering.

  11. Isn't the data completely random and not useful at all? The shader doesn't simulate the depth of water, so what the surface looks like and how deep it is has no correlation for the network to work with…

  12. you're probably already aware of this but just in case you're not, you can also horizontally flip all your training images to double your training data for free :^)
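    A quick sketch of this tip, assuming the images are stored with width as the last spatial axis: either flip them all up front with NumPy, or let Keras flip randomly on the fly during training:

        import numpy as np
        from tensorflow.keras.preprocessing.image import ImageDataGenerator

        # Double the dataset up front by mirroring every image left-to-right.
        def add_horizontal_flips(images, labels):
            flipped = images[:, :, ::-1]  # flips the width axis for (N, H, W) or (N, H, W, C) arrays
            return np.concatenate([images, flipped]), np.concatenate([labels, labels])

        # Or flip randomly during training instead of storing the extra copies.
        augmenter = ImageDataGenerator(horizontal_flip=True)
        # model.fit(augmenter.flow(train_images, train_labels, batch_size=32), epochs=10)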
