Deep Learning to Solve Challenging Problems (Google I/O'19)



I'm excited to be here today to tell you about how I see deep learning and how it can be used to solve some of the really challenging problems the world is facing. I should point out that I'm presenting the work of many, many different people at Google, so this is a broad perspective on a lot of the research we're doing, not purely my own work.

First, as I'm sure you've all noticed, machine learning is growing in importance: there's a lot more emphasis on machine learning research, and there are a lot more uses of machine learning. This is a graph showing how many arXiv papers (arXiv is a preprint hosting service for all kinds of research) fall into the subcategories related to machine learning, and what you see is that since 2009 the number of papers posted has been growing at a really fast exponential rate, actually faster than the Moore's Law growth rate of computational power that we got so nice and used to for 40 years but that has now slowed down. We've essentially replaced the nice growth in computing performance with growth in people generating ideas, which is nice.

Deep learning is a particular form of machine learning. It's actually a rebranding, in some sense, of a very old set of ideas around artificial neural networks: collections of simple, trainable mathematical units organized in layers, where the higher layers typically build higher levels of abstraction based on things the lower layers are learning, and you can train these things end to end. The algorithms that underlie a lot of the work we're doing today were actually developed 35 or 40 years ago; in fact, my colleague Geoff Hinton just won the Turing Award this year, along with Yann LeCun and Yoshua Bengio, for a lot of the work they did over the past 30 or 40 years. So the ideas are not new. What's changed is that 30 or 40 years ago we got amazing results on kind of toy-ish problems but didn't have the computational resources to make these approaches work on real, large-scale problems. Starting about eight or nine years ago, we started to have enough computation to really make these approaches work well.

So think of a neural net as something that can learn really complicated functions that map from input to output. That sounds kind of abstract; you think of functions as "y equals x squared" or something, but these functions can be very complicated and can learn from very raw forms of data. You can take the pixels of an image and train a neural net to predict what is in the image as a categorical label: "that's a leopard" (that's one of my vacation photos). From audio waveforms you can learn to predict a transcript of what is being said: "how cold is it outside?" You can take input in one language, "hello, how are you", and predict the output being that sentence translated into another language, "bonjour, comment allez-vous". You can even do more complicated things, like take the pixels of an image and create a caption that describes the image: not just a category, but a simple sentence, "a cheetah lying on top of a car", which is kind of unusual, so your prior for that should be pretty low. And in the field of computer vision, we've made great strides thanks to neural nets.
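To make that "pixels in, category out" idea concrete, here is a minimal sketch of such a function in TensorFlow/Keras. The layer sizes, input shape, and class count are illustrative assumptions, not the actual models behind the results in this talk.

```python
import tensorflow as tf

# A small convolutional network that maps raw image pixels to a categorical
# label (e.g. "leopard"). Sizes and class count are illustrative placeholders.
NUM_CLASSES = 1000  # e.g. a thousand object categories

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 3)),           # raw pixels in
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),  # category out
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Training would then look like:
# model.fit(train_images, train_labels, epochs=10)
```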
In 2011, the winning entry in the ImageNet contest, which is held every year, did not use neural nets; that was the last year the winning entry did not use neural nets. It got twenty-six percent error, and that won the contest. This is not a trivial task even for humans: humans themselves have about 5% error, because you have to distinguish among a thousand different categories of things, including, for a picture of a dog, saying which of 40 breeds of dog it is. In 2016, for example, the winning entry got three percent error. So this is just a huge, fundamental leap in computer vision: computers went from basically not being able to see in 2011 to now being able to see pretty darn well, and that has huge ramifications for all kinds of things in the world, not just computer science, but the application of machine learning and computing to perceiving the world around us.

OK, so I'm going to frame the rest of this talk around the following. In 2008, the US National Academy of Engineering published this list of 14 grand engineering challenges for the 21st century. They got together a bunch of experts across lots of different domains, and they collectively came up with this list of 14 things, which I think you can agree are actually pretty challenging problems; if we made progress on all of them, the world would be a healthier place, a safer place, and we'd have more scientific discovery. All of these are important problems. Given the limited time, I'm going to talk about the ones in boldface. We have projects in Google Research focused on all the ones listed in red, but I'm not going to talk about those other ones today. So that's the tour for the rest of the talk; we're just going to dive in, and off we go.

We start with restoring and improving urban infrastructure. The basic structure of cities was designed quite some time ago, but there are some changes we're on the cusp of that are going to really dramatically change how we might want to design cities, and in particular, autonomous vehicles are on the verge of commercial practicality. This is from our Waymo colleagues, part of Alphabet, who have been doing work in this space for almost a decade. The basic problem of an autonomous vehicle is that you have to perceive the world around you from raw sensor inputs, things like lidar, cameras, radar, and other kinds of sensors. You want to build a model of the world and the objects around you, understand what those objects are (is that a pedestrian or a light pole? is that a car that's moving?), be able to predict a short time ahead, like where that car is going to be in one second, and then make a set of decisions about what actions to take to accomplish your goals: get from A to B without having any trouble. It's really thanks to deep learning, vision-based algorithms, and the fusing of all this sensor data that we can build maps of the world like this, build an understanding of the environment around us, and have these things operate in the real world. And this is not some distant, far-off dream: Waymo is actually operating about a hundred cars with passengers in the back seat and no safety drivers in the front seat in the Phoenix, Arizona area. So that's a pretty strong sign that this is pretty close to reality. Now, Arizona is one of the easier self-driving-car environments: it never rains, and it's so hot that there aren't
that many pedestrians, the streets are very wide, and the other drivers are pretty slow. Downtown San Francisco is harder. But this is a sign that it's not that far off.

Obviously, once vision works, it's easier to build robots that can do things in the world. If you can't see, it's really hard to do things, but if you can start to see, you can actually have practical robotics: things that use computer vision to then make decisions about how they should act in the world. This is a video of a bunch of robots practicing picking things up, dropping them, and picking more things up, essentially trying to grasp things. One nice thing about robots is that you can collect the sensor data and pool the experience of many robots, collectively train on their combined experience, get a better model of how to grasp things, push that out to the robots, and the next day they can all practice with a slightly better grasping model. This is unlike human babies you plop on the carpet in your living room; they don't get to pool their experience.

So in 2015, the success rate on a particular grasping task, grasping objects the robot has never seen before, was about 65%. Using this kind of "arm farm" (that's what that thing is called; I wanted to call it the armpit, but I was overruled), basically by collecting a lot of experience, we were able to get a pretty significant boost in grasp success rate, up to 78%. Then, with further work on algorithms and more refinement of the approach, we're now able to get a 96% grasp success rate. So that's pretty good progress: in three years we've gone from failing to pick something up a third of the time, which makes it very hard to string together a whole sequence of steps and actually have robots do things in the real world, to grasping working quite reliably. That's exciting.

We've also been doing a lot of work on how to get robots to learn to do things more easily. Rather than having them practice on their own, maybe we can demonstrate things to them. This is one of our AI Residents; they do fantastic machine learning research, but they also film demonstration videos for these robots. What you see here is a simulated robot trying to emulate, from the raw pixels of the video, what it's seeing. On the right you see a few demonstrations of pouring, and the robot uses those video clips, five or ten seconds of someone pouring something, plus some reinforcement-learning-based trials, to attempt to learn to pour on its own. After 15 trials and about 15 minutes of training, it can pour that well: at the level of a four-year-old, maybe, not an eight-year-old, but getting to that level of success with 15 minutes of effort is a pretty big deal.
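To make the earlier point about pooling experience across a fleet of robots concrete, here is a rough, hypothetical sketch of that loop: merge per-robot logs into one dataset, retrain a shared grasp-success model, and push the improved model back out. The data format, the tiny model, and the helper names are all assumptions for illustration, not the actual system.

```python
import numpy as np
import tensorflow as tf

# Hypothetical pooled experience: each robot logs (image, grasp_success) pairs.
# Shapes and the simple success-prediction model are illustrative assumptions.
def pool_experience(per_robot_logs):
    """Concatenate the logged experience of many robots into one dataset."""
    images = np.concatenate([log["images"] for log in per_robot_logs])
    success = np.concatenate([log["success"] for log in per_robot_logs])
    return (tf.data.Dataset.from_tensor_slices((images, success))
            .shuffle(10_000).batch(64))

def build_grasp_model():
    """Tiny stand-in for a grasp-success predictor trained on pooled data."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(64, 64, 3)),
        tf.keras.layers.Conv2D(16, 3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # P(grasp succeeds)
    ])

# Each "day": train on everyone's pooled experience, then redeploy the
# improved model to every robot so they all practice with it tomorrow.
model = build_grasp_model()
model.compile(optimizer="adam", loss="binary_crossentropy")
# model.fit(pool_experience(todays_logs), epochs=1)
# for robot in fleet: robot.update_policy(model)
```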
OK, one of the other areas in the Grand Challenges was advancing health informatics. I think you saw in the keynote yesterday the work on lung cancer. We've also been doing a lot of work on an eye disease called diabetic retinopathy, which is the fastest-growing cause of blindness in the world. There are 415 million people in the world with diabetes, and each of them ideally would be screened every year for diabetic retinopathy, a degenerative eye disease that is very treatable if you catch it in time; if you don't catch it in time, you can suffer full or partial vision loss. So it's really important that we be able to screen everyone who is at risk with regular screening. This is the kind of image you get to see as an ophthalmologist. In India, for example, there's a shortage of more than a hundred thousand eye doctors relative to what's needed to do the necessary amount of screening for this disease, and so 45 percent of patients suffer vision loss before they're diagnosed, which is tragic, because it's completely preventable if you catch it in time.

Basically, the way an ophthalmologist looks at this is they look at these images and grade them on a five-point scale, one through five, looking for things like the little hemorrhages you see on the right-hand side. It's a little subjective: if you ask two ophthalmologists to grade the same image, they agree on the score sixty percent of the time, and if you ask the same ophthalmologist to grade the same image a few hours later, they agree with themselves sixty-five percent of the time. This is why second opinions are useful in medicine; some of these judgments are actually quite subjective. And it's a big deal, because the difference between a two and a three is the difference between "go away and come back in a year" and "we'd better get you into the clinic next week."

Nonetheless, this is actually a computer vision problem. Instead of classifying a thousand general categories of dogs and leopards, you can have just five categories, the five levels of diabetic retinopathy, and train the model on eye images and an assessment of what the score should be. If you do that, you can get each image labeled by several ophthalmologists, six or seven, so that you reduce the variance you already see between ophthalmologists assessing the same image: if five of them say it's a two and two of them say it's a three, it's probably more like a two than a three. Doing that, you get a model that is on par with, or slightly better than, the average board-certified ophthalmologist at this task, which is great. This is work my colleagues published at the end of 2016 in JAMA, which is a top medical journal.

We wanted to do even better, though. It turns out you can instead get the images labeled by retinal specialists, who have more training in retinal eye disease, and instead of getting independent assessments, you get three retinal specialists in a room for each image and say, "OK, you all have to come up with an adjudicated number; what number do you agree on for this image?" If you do that, you can train on the output of this consensus of three retinal specialists, and you now have a model that is on par with retinal specialists, which is the gold standard of care in this area, rather than the not-as-good model trained on ophthalmologists' opinions. This is something we've seen borne out: when you have really high-quality training data, you can train a model on it and get the expertise of retinal specialists into the model.
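A hedged sketch of how a five-grade retinopathy classifier could be framed as an ordinary image-classification problem, here using a generic pretrained backbone; the actual published work has its own architecture, data, and label-adjudication process, so treat every detail below as an assumption.

```python
import tensorflow as tf

# Five grades of diabetic retinopathy (1-5 in the talk; indices 0-4 here).
NUM_GRADES = 5

# Transfer learning from a generic ImageNet-pretrained backbone; the real
# system's architecture and training data are not reproduced here.
backbone = tf.keras.applications.InceptionV3(
    include_top=False, weights="imagenet", pooling="avg",
    input_shape=(299, 299, 3))

model = tf.keras.Sequential([
    backbone,
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(NUM_GRADES, activation="softmax"),
])

# If each image has several ophthalmologist grades, one simple option is to
# train against the empirical distribution of grades (soft labels) rather
# than a single hard label, e.g. roughly [0, 0.71, 0.29, 0, 0] for
# "five graders say 2, two say 3".
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(fundus_images, soft_label_distributions, epochs=...)
```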
The other neat thing is that you can make completely new discoveries. Someone new joined the ophthalmology research team, and as a sort of warm-up exercise to understand how our tools worked, Lily Peng, who you saw on stage yesterday, said, "Why don't you go see if you can predict age and gender from the retinal image, just to get the machine learning pipeline going?" Ophthalmologists can't predict gender from an eye image; they don't know how to do that. So Lily thought the AUC on this should be no better than flipping a coin, an AUC of 0.5. The person went away and came back saying, "OK, I've got it done; my AUC is 0.7." Lily said, "Hmm, that's weird, go check everything and come back." They came back and said, "OK, I've made a few improvements; it's now 0.8." That got people excited, because all of a sudden we realized you can predict a whole bunch of interesting things from a retinal image. In particular, you can detect someone's self-reported sex, and you can predict a whole bunch of other things: their age, things about their systolic and diastolic blood pressure, their hemoglobin level. It turns out that if you combine those things, you can get a prediction of someone's cardiovascular risk at the same level of accuracy as a much more invasive blood test, where you have to draw blood, send it off to the lab, and wait 24 hours for the results. Now you can do that just with a retinal image. So there's real hope this could become a new kind of screening: you go to the doctor, a picture of your eye gets taken, and we have a longitudinal history of your eye and can learn new things from it. We're pretty excited about that.

A lot of the Grand Challenges were around understanding molecules and chemistry better; one is "engineer better medicines," but the work I'm going to show you might apply to some of the other challenges as well. One of the things quantum chemists want to be able to do is predict properties of molecules: will this thing bind to this other thing, is it toxic, what are its quantum properties? The normal way they do this is with a really computationally expensive simulator: you plug in the molecule configuration, you wait about an hour, and at the end you get the output the simulator computed for you. It's a slow process, so you can't consider as many different molecules as you might like. It turns out you can use the simulator as a teacher for a neural net. Do that, and all of a sudden you have a neural net that can basically learn to do what the simulator does, but way faster: now you have something that is about 300,000 times faster, and you can't distinguish the accuracy of the neural net's output from the simulator's. That's a completely game-changing thing if you're a quantum chemist: all of a sudden your tools have sped up by 300,000 times, and that means you can do a very different kind of science. You can say, "Well, I'm going to lunch, I should probably screen 100 million molecules, and when I come back I'll have a thousand that might be interesting." That's a pretty interesting trend, and I think it's one that will play out in lots of different scientific and engineering fields where you have a really expensive simulator but can learn to approximate it with a much cheaper neural net or machine-learning-based model and get a simulator that's much faster.
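Here is a minimal sketch of the "use the expensive simulator as a teacher" idea: generate (molecule features, simulator output) pairs offline, then fit a fast network to approximate them. The feature encoding, network sizes, and the `expensive_simulator` stand-in are placeholders; my understanding is that the actual research used more structured models over the molecular graph, which are not shown here.

```python
import numpy as np
import tensorflow as tf

def expensive_simulator(molecule_features):
    """Stand-in for an hour-long quantum-chemistry calculation (placeholder)."""
    return np.sum(np.sin(molecule_features), axis=-1, keepdims=True)

# 1) Offline: run the slow simulator on a set of molecules to build training data.
rng = np.random.default_rng(0)
molecules = rng.normal(size=(10_000, 128)).astype("float32")  # toy featurization
targets = expensive_simulator(molecules).astype("float32")

# 2) Fit a fast surrogate network to imitate the simulator.
surrogate = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(128,)),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(1),  # predicted property
])
surrogate.compile(optimizer="adam", loss="mse")
surrogate.fit(molecules, targets, epochs=5, batch_size=256, verbose=0)

# 3) Screening a huge batch of candidates is now one fast forward pass
#    instead of many hour-long simulations.
candidates = rng.normal(size=(100_000, 128)).astype("float32")
predicted = surrogate.predict(candidates, batch_size=4096)
```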
OK, "engineer the tools of scientific discovery." I have a feeling this fourteenth one was just a kind of vague catch-all that the panel of experts that was convened decided to include, but it's pretty clear that if machine learning is going to be a big part of scientific discovery and engineering, we want good tools for expressing machine learning algorithms. That's the motivation for why we created TensorFlow: we wanted tools we could use to express our own machine learning ideas, share them with the rest of the world, have other researchers exchange machine learning ideas, and put machine learning models into practice in products and other environments. We released it at the end of 2015 under an Apache 2.0 license. It has a graph-based computational model that you can optimize with a bunch of fairly traditional compiler optimizations, and the graph can then be mapped onto a variety of different devices, so you can run the same computation on CPUs or GPUs or TPUs, which I'll tell you about in a minute. Eager mode, which is coming in 2.0, makes this graph implicit rather than explicit.

The community seems to have adopted TensorFlow reasonably well, and we've been excited by all the different things we've seen other people do, both contributing to the core TensorFlow system and making use of it to do interesting things. It has some pretty good engagement stats: 50 million downloads for a fairly obscure programming package is a fair number, and seems like a good mark of traction. We've seen people do all kinds of things with it. I mentioned this in the keynote yesterday, and I like this one: a company building fitness sensors for cows, so you can tell which of your hundred dairy cows is behaving a little strangely today. There's a research team at Penn State and the International Institute of Tropical Agriculture in Tanzania building a machine learning model that can run on-device, on a phone in the middle of a cassava field without any network connection, to detect whether a cassava plant has disease and how it should be treated. I think this is a good example of how we want machine learning to run in lots and lots of environments, in lots of places in the world. Sometimes you have connectivity, sometimes you don't, and in a lot of cases you want it to run on-device. That's really going to be the future: you're going to have machine learning models running on tiny microcontrollers, all kinds of things like this.
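A small example of the graph idea mentioned above, using the TensorFlow 2.x style, where eager execution is the default and `tf.function` traces a Python function into a graph that the runtime can optimize and place on CPUs, GPUs, or TPUs. The toy computation itself is just an illustration.

```python
import tensorflow as tf

# Eager mode: operations run immediately, like ordinary Python/NumPy.
x = tf.constant([[1.0, 2.0], [3.0, 4.0]])
print(tf.matmul(x, x))  # executes right away

# tf.function traces the Python function into a graph, which TensorFlow can
# then optimize (constant folding, op fusion, ...) and map onto different
# devices without changing the user code.
@tf.function
def dense_layer(inputs, weights, bias):
    return tf.nn.relu(tf.matmul(inputs, weights) + bias)

w = tf.random.normal([2, 8])
b = tf.zeros([8])
print(dense_layer(x, w, b))        # first call traces and builds the graph
print(dense_layer(x + 1.0, w, b))  # later calls reuse the compiled graph
```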
OK, I'm going to use the remaining time to take you on a tour through some research projects and then sketch how they might fit together in the future. I believe what we want is bigger machine learning models than we have today, but to make that practical, we want models that are sparsely activated. Think of a giant model, maybe with a thousand different pieces, where you activate twenty or thirty of those pieces for any given example rather than the entire set of a thousand. We know this is a property real organisms have in their nervous systems: most of their neural capacity is not active at any given point, which is partly why they're so power-efficient.

Some work we did a couple of years ago is what we call a sparsely gated mixture-of-experts layer. The essential idea is that the pink rectangles here are normal neural-net layers, but between a couple of those layers we insert another collection of small neural nets that we call experts, and we have a gating network that learns to activate just a few of them: it learns which of those experts is most effective for a particular kind of example. Each expert might have a lot of parameters, a pretty large matrix of parameters, and we have a lot of them, so in total we have roughly 8 billion parameters, but we activate just a couple of the experts on any given example. When you learn to route things, you try to learn to use the expert that is most effective for this particular example, and because you send each example to multiple experts, that gives you a signal to train the routing network, the gating network. It can learn, for instance, that the expert on the left-hand side is really good when the language is about innovation and researching things, this center expert is really good at talking about playing a leading role or a central role, and the one on the right is really good at quick, adverb-y phrases; they really do develop very different kinds of expertise. The nice thing is that if you compare this with the bottom row on a translation task, you get a significant improvement in translation accuracy: that's the BLEU score there, and a one-BLEU-point improvement is a pretty significant thing; we really like one-BLEU-point improvements. And because the model has all this extra capacity, we can make the pink layers smaller than they were in the original model, so we can shrink the amount of computation used per word by about a factor of two, 50 percent cheaper inference, and the training time goes way down, because with all this extra capacity it's easier to train a model with a lot of parameters. So we end up with about one-tenth the training cost in terms of GPU-days.
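Here is a toy sketch of what such a sparsely gated layer can look like; the sizes, the top-k value, and the implementation are assumptions for illustration, not the published layer. Note that for clarity this toy version computes every expert and masks the unused ones, whereas a real implementation dispatches each example only to its chosen experts, which is where the compute savings actually come from.

```python
import tensorflow as tf

class SparselyGatedMoE(tf.keras.layers.Layer):
    """Simplified sketch of a sparsely gated mixture-of-experts layer.

    A gating network picks the top-k experts for each example and the layer
    returns a weighted combination of those experts' outputs.
    """

    def __init__(self, num_experts=8, expert_units=64, k=2, **kwargs):
        super().__init__(**kwargs)
        self.k = k
        self.gate = tf.keras.layers.Dense(num_experts)
        self.experts = [tf.keras.layers.Dense(expert_units, activation="relu")
                        for _ in range(num_experts)]

    def call(self, inputs):
        gate_logits = self.gate(inputs)                    # [batch, num_experts]
        top_vals, top_idx = tf.math.top_k(gate_logits, k=self.k)
        top_weights = tf.nn.softmax(top_vals, axis=-1)     # weights over chosen experts
        mask = tf.reduce_sum(
            tf.one_hot(top_idx, depth=len(self.experts)) *
            top_weights[..., None], axis=1)                # [batch, num_experts]
        # Toy version: run all experts, then mask; a real layer routes sparsely.
        expert_outs = tf.stack([e(inputs) for e in self.experts], axis=1)
        return tf.reduce_sum(mask[..., None] * expert_outs, axis=1)

# Usage between two ordinary ("pink") layers:
x = tf.random.normal([4, 32])
hidden = tf.keras.layers.Dense(32, activation="relu")(x)
y = SparselyGatedMoE(num_experts=8, expert_units=64, k=2)(hidden)
print(y.shape)  # (4, 64)
```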
We've also been doing a lot of work on AutoML, which is the idea of automating some of the tasks a machine learning researcher or engineer does. Currently, you can think of solving a machine learning problem like this: you have some data, you have some computation, and you have an ML expert sit down, do a bunch of experiments, stir it all together, burn lots of GPU-days, and hopefully you get a solution. What if we could use more computation to replace some of the experimentation that someone with a lot of machine learning experience would do? One of the decisions a machine learning expert makes is what architecture, what neural network structure, makes sense for a problem: should I use a thirteen-layer model or a nine-layer model, should it have three-by-three or five-by-five filters, should it have skip connections or not? If you're willing to take this up a level and do some meta-learning, you can have a model that generates models and then try those models on the problem you actually care about. The basic iteration of meta-learning here is: we have a model-generating model, we generate ten models, we train each of them, and we see how well each works on the problem we care about. We then use the loss or accuracy of those models as a reinforcement learning signal for the model-generating model, so we can steer away from models that didn't work very well and towards models that worked better, and then we just repeat a lot. When we repeat a lot, we essentially get more and more accurate models over time.

And it works, although it produces models that are a little strange-looking, a little more unstructured than a model a human might have designed. Here we have all these kind of crazy skip connections, but they're analogous to some of the ideas machine learning researchers themselves have come up with; the ResNet architecture, for example, has a more structured style of skip connection. The basic idea is that you want information to be able to flow more directly from the input to the output without going through as many intermediate computational layers, and the system seems to have developed that intuition itself.

The nice thing is these models actually work pretty well. If you look at this graph, accuracy on the ImageNet problem is on the y-axis, and the computational cost of the models, which are represented by dots, is on the x-axis. Generally you see a trend where a more computationally expensive model gets you higher accuracy. Each of the black dots is something that took a significant amount of effort by a bunch of top computer vision or machine learning researchers, was published, and advanced the state of the art at the time. If you apply AutoML to this problem, you actually exceed the frontier of the hand-designed models the community has come up with, both at the high end, where you care most about accuracy and less about computational cost, where you get a model that's slightly more accurate with less computational cost, and at the low end, where you get a model that's significantly more accurate for a very small amount of computational cost. I think that's a pretty interesting result: it says we should really let computers and machine learning researchers work together to develop the best models for these kinds of problems.

We've turned this into a product, Cloud AutoML, and you can try it on your own problem. If you're a company that doesn't have a lot of machine learning researchers or engineers, you can take a bunch of images and the categories you care about, maybe pictures from your assembly line where you want to predict which part each image shows, and get a high-quality model for that. We've extended this to more than just vision: you can do video, language, and translation, and more recently we've introduced something that lets you make predictions over relational data, for example, will this customer buy something given their past orders. We've also obviously continued research in the AutoML field: work on using evolution rather than reinforcement learning for the search, learning the optimization update rule, learning the non-linearity function rather than just assuming we should use ReLU or some other activation function, and work on incorporating both inference latency and accuracy into the search. Say you want a really accurate model that has to run in seven milliseconds: by using a somewhat more complicated reward function, we can find the most accurate model that will run within your time budget.
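A toy sketch of that meta-learning loop, under heavy assumptions: a "model-generating model" (here just a categorical distribution over a few discrete architecture choices) proposes a candidate, the candidate is scored, and the score is used as a REINFORCE reward to nudge the generator toward better choices. In the real system the reward is a trained child model's validation accuracy; the placeholder reward function below is made up purely so the snippet runs.

```python
import numpy as np

CHOICES = {"num_layers": [3, 6, 9, 13], "filter_size": [3, 5], "skip": [0, 1]}
logits = {name: np.zeros(len(opts)) for name, opts in CHOICES.items()}

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def evaluate_architecture(arch):
    """Placeholder reward. Really: build the child model, train it on the
    task, and return its validation accuracy."""
    return (0.5
            + 0.03 * arch["skip"]
            + 0.01 * arch["num_layers"] / 13
            - 0.02 * abs(arch["filter_size"] - 3))

rng = np.random.default_rng(0)
lr, baseline = 0.1, 0.0
for step in range(200):
    # Sample one candidate architecture per step (the talk describes batches of ten).
    arch, picked = {}, {}
    for name, opts in CHOICES.items():
        p = softmax(logits[name])
        idx = rng.choice(len(opts), p=p)
        arch[name], picked[name] = opts[idx], idx
    reward = evaluate_architecture(arch)
    baseline = 0.9 * baseline + 0.1 * reward      # moving-average baseline
    advantage = reward - baseline
    for name in CHOICES:                          # REINFORCE update on the generator
        grad = -softmax(logits[name])
        grad[picked[name]] += 1.0
        logits[name] += lr * advantage * grad

print({name: CHOICES[name][int(np.argmax(logits[name]))] for name in CHOICES})
```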
We can also learn how to augment data, so you can stretch the amount of labeled data you have in interesting ways, more effectively than with hand-written data augmentation, and we can explore architectures that make this whole search process more efficient.

OK, but it's clear that if we're going to try these approaches, we're going to need more computational power. One of the truisms of machine learning over the last decade or so is that more computational power tends to get better results when you have enough data. It's really nice that deep learning is such a broadly useful tool across so many different problem domains, because that means you can start to think about specializing hardware for deep learning and still have it apply to many, many things. There are two properties that deep learning algorithms tend to have. One is that they're very tolerant of reduced precision: if you do calculations to one decimal digit of precision, that's perfectly fine with most of these algorithms; you don't need six or seven digits of precision. The other is that all the algorithms I've shown you are made up of a handful of specific operations, things like matrix multiplies and vector dot products, essentially dense linear algebra. So if you can build computers that are really good at reduced-precision dense linear algebra, you can accelerate lots of these machine learning algorithms quite a lot, compared to general-purpose CPUs that can run all kinds of things, or even GPUs, which tend to be somewhat good at this but have, for example, higher precision than you might want.
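A tiny illustration of those two properties, using bfloat16, the reduced-precision format TPUs use; the matrix sizes and the error measurement are just a demonstration of the idea, not a benchmark.

```python
import tensorflow as tf

# Deep learning workloads are dominated by dense linear algebra (matrix
# multiplies, vector ops) and tolerate reduced precision, which is exactly
# what specialized accelerators exploit.
a = tf.random.normal([1024, 1024])
b = tf.random.normal([1024, 1024])

full = tf.matmul(a, b)                                            # float32 matmul
low = tf.matmul(tf.cast(a, tf.bfloat16), tf.cast(b, tf.bfloat16))  # reduced precision

# The relative error from cutting the precision is tiny compared to the
# noise the training process already tolerates.
rel_err = tf.reduce_mean(tf.abs(full - tf.cast(low, tf.float32)) /
                         (tf.abs(full) + 1e-6))
print(float(rel_err))

# In Keras the same idea is exposed as a mixed-precision policy, e.g.:
# tf.keras.mixed_precision.set_global_policy("mixed_bfloat16")
```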
So we started to think about building specialized hardware. I did this kind of thought exercise in 2012: we were starting to see the initial success of deep neural nets for speech recognition and image recognition, and starting to think about how we would deploy these in some of our products. There was a kind of scary moment when we realized that if speech started to work really well (and at that time we couldn't run it on-device, because the devices didn't have enough computational power), and a hundred million users started talking to their phones for three minutes a day, which is not implausible if speech starts to work a lot better, then running those speech models on CPUs would require doubling the number of computers in Google data centers. That's slightly terrifying, just to launch one feature in one product. So we started to think about building specialized processors for the deep learning algorithms we wanted to run. TPU v1, which has been in production use since 2015, was really the outcome of that thought exercise, and it's in production use on basically every query you do, on every translation, on speech processing and image processing. AlphaGo used a collection of them; those are the actual racks of machines that competed in the AlphaGo match, and you can see the little Go board we commemorated it with on the side.

Then we started to tackle the bigger problem: not just inference, where you already have a trained model and just want to apply it, but doing training in an accelerated way. The second version of TPUs is for training and inference; that's one of the TPU devices, which has four chips on it. This is TPU v3, which also has four chips; it has water cooling, and it's slightly scary to have water in your computers, but we do. We designed these systems to be configured together into larger configurations we call pods. There's a TPU v2 pod, and this is the bigger TPU v3 pod with water cooling; you can actually see one of these racks in the machine learning dome. These things really do provide a lot of computational power: the individual devices with four chips are up to 420 teraflops with a fair amount of memory, and the pods themselves are up to 100 petaflops of compute. That's a pretty substantial amount, and it lets you very quickly try machine learning research experiments and train very large production models on large datasets. These are also now available through our cloud products; as of yesterday, I think we announced they're in beta.

One of the keys to performance here is the network interconnect between the chips in a pod: it's a super-high-speed 2D mesh with wraparound links (that's why it's toroidal), and that means you can essentially program the whole thing as if it's a single computer. The software under the covers takes care of distributing the computation appropriately and can do very fast all-reduce and broadcast operations. For example, you can use a full TPU v2 pod to train an ImageNet model in 7.9 minutes; versus the same problem on eight GPUs, that's 27 times faster training at lower cost. The v3 pod is substantially larger still: you can train an ImageNet model from scratch in less than two minutes, at more than a million images per second, which is essentially the entire ImageNet dataset every second. And you can train very large BERT language models, as I was discussing on stage in the keynote yesterday, in about 76 minutes on a fairly large corpus of data, which would normally take days. That really helps make our researchers and ML production systems more productive, because they can experiment more quickly. If you can run an experiment in two minutes, that's a very different kind of science and engineering than if the experiment would take you a day and a half; you just think about running more experiments and trying more things. And we have lots of models already available.
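A hedged sketch of what "program the pod as if it's a single computer" looks like from the user's side, using the public tf.distribute TPU APIs in TensorFlow 2.x; the TPU name, model, and dataset are placeholders, and details depend on the actual Cloud TPU setup.

```python
import tensorflow as tf

# From the user's perspective a TPU pod (or slice) is addressed as a single
# device; tf.distribute handles sharding the computation and the fast
# all-reduce across chips. "my-tpu" below is a placeholder name.
resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="my-tpu")
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.TPUStrategy(resolver)

with strategy.scope():
    # Model variables are created once and distributed across the chips.
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(224, 224, 3)),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(1000, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# model.fit(train_dataset, epochs=...)   # the same training code, just much faster
```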
OK, so let's take some of the ideas we've talked about and think about how they might fit together. I said we want these really large models, but with sparse activation. I think one of the things we're doing wrong in machine learning is that we tend to train a model to do a single thing, and then when we have a different problem, we train a different model to do that other thing. Really, we should be thinking about how to train models that do many, many things and can leverage the expertise they have from doing all of those things to take on a new task and learn it more quickly, with less data. This is essentially multi-task learning, but multi-task learning in practice today usually means three or four or five tasks, not thousands or millions. I think we should be thinking bigger and bolder: in the limit, one model for all the things we care about. And obviously we're going to try to train a large model like that using fancy ML hardware.

So how might this look? Imagine we've trained a model on a bunch of different tasks, and it has learned these different components, which are sometimes shared across different tasks and sometimes independent, specialized for a particular task. Now a new task comes along. With the kind of AutoML-style reinforcement learning I described, we should be able to use an RL algorithm to find pathways through this model that get us to a pretty good state for the new task, because it hopefully has some commonalities with things we've already learned. Then we might have some way to add capacity to the system, so that for a task where we really care about accuracy we can add a bit of capacity, start to use it for this task, and have that pathway become more specialized for the task, and therefore hopefully more accurate. I think that's an interesting direction: how can we build a system like that, rather than the models we have today, where we tend to fully activate the entire model for every example and tend to use each model for just a single task?

I want to close on how we should be thinking about using machine learning in all the different places we might consider using it. One of the things I'm really proud of as a company is that last year we published a set of principles by which we think about how we're going to use machine learning. When we look at using machine learning in any of our products or settings, we think carefully about how we're actually fulfilling these seven principles. There's more on the principles website that you can go and find, but I think this is really important, and I'll point out that some of these are evolving research areas as well as principles we want to apply. For example, number two, avoid creating or reinforcing unfair bias: bias in machine learning models is a very real problem that can come from a variety of sources. It could be biased training data, or it could be that you're training on real-world data and the world itself is biased in ways we don't want. So there's research we can apply and extend on how to reduce or eliminate bias in machine learning models. This is an example of some of the work we've been doing on bias and fairness: in our own use of ML models, we try to apply the best known practices in actual production use, but also advance the state of the art in understanding bias and fairness and making it better.

So, in conclusion, deep neural nets and machine learning are really tackling some of the world's great challenges. I think we're making real progress in a number of areas, and there are a lot of interesting problems still to work on. They're going to affect not just computer science but many, many aspects of human endeavor, like medicine, science, and other fields. So I think we have a great responsibility to make sure we do these things right, and to continue to push the state of the art and apply it to great things. Thank you very much.

32 thoughts on “Deep Learning to Solve Challenging Problems (Google I/O'19)”

  1. Great talk. One thing I hear being said too much, though, is that humans don't get to pool their experiences whereas robots do. I'm sure the efficiency and integrity of robots sharing knowledge is much higher than with humans, but shared knowledge amongst humans is the basis of civilization. There would be no Google if every person ever born had to learn from scratch. Rant over.

  2. With reference to scientific learning: When you have a lot of data, but no data at the particular point in parameter hyperspace that you are interested in, what do you do? Extrapolating the model will result in bias and loss of accuracy. Experiments on real-world systems seem unavoidable, and each experimental data point is often very expensive. The interaction between machine learning modeling and the planning and execution of experiments seems to be a new and very interesting research area.

  3. I was all happy until the very end, when I heard them talk about bias. I'm sorry, too many legitimate channels have been brought down for supposed "bias". Until you can show a 99.9999999999999% chance that a computer is unbiased, please just stick to image recognition, because Google, so far you have not been good regarding bias.

  4. Regarding autoML, after time there would seem to be an ever increasing corpus of models. Humans, being the limited creatures that tend to have the same problems, might not actually need to have a ‘fresh’ model trained every time their brain perceives a problem that needs solving. That solution probably already exists and has been solved. Rather, it might be faster (and much less energy intensive) to simply archive these models with a set of useful metadata so that a google search can find the model that solves the problem. And metadata selection and assignment to individual models can be automated after they are designed by autoML. The metadata can be considered as the ‘label’ for the model. This metadata can also be used to ‘explain’ to a user ‘why’ the machine selected a particular model/algorithm. In addition, the machine would be able to engage the user in a ‘conversation’ – as it ‘asks for metadata’. The user would perceive this discourse as questions about the dataset/problem that he/she has; meanwhile the machine is building an information tree to sift/sort from its vast library of models. This also addresses the human problem where the user often starts by choosing the wrong approach to solving the problem. Or just as often uses the ‘cooked spaghetti’ approach to model selection – throw them all against the wall of the problem and see what sticks.

  5. Regarding automobiles, we built the auto interface for humans. We leveraged our built-in sensors (eyes and ears) and designed a bipartite system – vehicle and road. But why are we now trying to shoehorn AI into that human-centric system? If we were designing a system from scratch for AI and machines, would we build it the same way? Would it not make sense to build telemetry into the road – make the road more intelligent and let it direct vehicles more directly? Do we need vehicles that can go where there are no roads? This would reduce the cost of complex and hackable vehicle-based systems.

  6. This is good for robotics and computer vision applications.
    Are these slides available anywhere?
    Expect more developers to share trained AI models on Model Play.

  7. Excellent talk. 

    This is a great example of how a true expert talks about innovation and deep learning with simple but accurate words, without overhyping or bombarding the audience with buzzwords.

  8. Please explain how developing artificial intelligence solutions for subsurface data analysis in oil and gas exploration and production is 'socially beneficial'.

  9. Jeff Dean! My idol! Yes! Recently I began to try something new, and my team and I have made an app for the Google Coral Dev Board Edge TPU, and it's going well. My app: model.gravitylink.com

  10. I always had this idea that AI should be smart enough to determine what models should be tried when given a data file. It should be able to run an initial analysis to classify the nature of the file and predict the intended usage. Based on this analysis, it should be able to find the best one from the existing models. And if it could not find one, it should be able to create a new one. I guess Google has already put my idea into practice.

  11. The slide at 4:45 with the grand engineering challenges for the 21st century helped a lot. I often get overwhelmed or confused by all the projects and applications coming out of the tech world. Many of them don't make sense. This slide gave me a good framework for making sense of what these technologies are trying to solve.

    Good presentation. 😊

  12. I wonder how someone can trust decisions they do not understand. Models built by A.I. must be clear to humans before use.
    I also wonder how someone can say that image recognition made a big leap when a child needs only a few examples and a computer needs a huge library of images. Something is wrong with this "nutcracker" approach, isn't it obvious?

  13. Outline:
    Restore & improve urban infrastructure (combining vision and robotics for grasping tasks, self-supervised imitation learning)
    Advance health informatics (predicting properties of molecules)
    Engineer the tools of scientific discovery (TensorFlow and its applications)
    Some pieces of work and how they fit together / bigger models, but sparsely activated (sparsely gated mixture-of-experts layer, MoE)
    AutoML, automated machine learning / "learning to learn" (Cloud AutoML)
    Special computation properties of deep learning (reduced precision, a handful of specific operations)
    More at 36:49
