“Deep Learning was great, what’s next?” – Yoshua Bengio (2/4)


So I often hear journalists, and all kinds of people, ask me: okay, so deep learning was great, what's the next thing? My answer usually is that I believe the next thing, just as normally happens in science, is going to grow on top of the concepts we've already built in our scientific toolbox. We have found a lot of interesting things: the idea of distributed representations that I mentioned, the importance of depth, of having multiple levels of representation, the optimization methods we use, which work a lot better than we thought they should, the idea of end-to-end training, for example. So there are a lot of important concepts we already know in this set of principles that we think are necessary for intelligence, and what I think is going to happen is that we're going to put more things into this toolbox, things that let us get to tasks and kinds of intelligence that we don't yet master with deep learning. So what is it that we're missing? That's what I want to focus the discussion on today.

One of the most practical aspects (it's not about how it's done but about the outcome) is something I mentioned already: sample complexity, the ability to generalize from a few examples when you're learning a new task. Another thing I didn't mention, but which is maybe even more important, is that humans are able to generalize in a way that's more powerful than current machine learning, in the sense that we can generalize to what we call out-of-distribution data, in other words data that doesn't look like your training data. When I tell you a hypothetical story, like a science-fiction story, you can easily understand it; if I tell you the beginning, you can imagine how it's going to end, even though it's a nonsensical story that's never going to happen in the real world. Over your whole life it's an impossible thing, right? But humans have no problem with that. We can do what mathematicians call counterfactuals: we can imagine, if such and such a thing were true, what would the consequences be?

This doesn't fit well with the usual framework of machine learning, where we assume there is one data distribution, we collect some random part of it to be our training set, and we hope that the machine we train will be applied to data from that same distribution. But if I show you an example, like a story, that comes from a different distribution, one that doesn't even overlap with your training distribution, in other words one that has zero probability under your training distribution, how could the machine possibly generalize? Yet humans can do it, and they can because somehow that science-fiction story I told you, or that you imagined I told you, is told using concepts that you know. It's recombining pieces of knowledge you already have in ways that are unusual, but you can still figure out what the meaning is.
This is something that linguists call systematic generalization, which we know humans are able to do and which seems to be hard for current machine learning systems. We have recent papers studying this, and we can see that these systems do not generalize well: there is a severe drop in performance when you test them in these kinds of ways. And this has practical importance. Say you consider an industrial deployment of machine learning: it's trained on data that was collected under some circumstances, maybe in the lab, and then it's deployed under other circumstances. Maybe it's trained on data from one country and deployed in another country. Or think about very rare cases. One example I like: people are trying to build autonomous vehicles, and it's going to be very difficult to collect a lot of data about very rare and dangerous situations like accidents.

So how do humans manage to drive even though they haven't been exposed to much personal data on these rare, dangerous situations? I actually had only one serious accident in my life, when I was 18, or 20, and before that accident of course I had zero examples of really dangerous situations; for the rest of my life I drove with a single example. And I guess it worked, because I haven't had any more accidents. Anyway, the point is that we can generalize. So how did I do it? Well, even before the accident I could imagine what could happen. I could imagine what would happen if I braked in the middle of the highway, pushing hard on the brakes, and the car behind me hit me. I can create fake situations in my mind. This kind of thing we don't have right now in our models, but we can imagine those things, and we can design models that would have these properties.
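The i.i.d. assumption described above can be illustrated with a minimal sketch (not from the talk; the quadratic target function and the interval choices are illustrative assumptions): a linear model fit on inputs from one range does fine in distribution, but fails badly on inputs that have zero probability under the training distribution.

```python
# Minimal sketch of the i.i.d. assumption breaking down.
# We fit a linear model to y = x^2 on x in [0, 1] (the "training
# distribution") and then evaluate it on x in [3, 4]: inputs with
# zero probability under the training distribution.

def fit_linear(xs, ys):
    """Ordinary least squares for y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

def mse(xs, a, b):
    """Mean squared error of the linear fit against the true y = x^2."""
    return sum((a * x + b - x ** 2) ** 2 for x in xs) / len(xs)

x_train = [i / 49 for i in range(50)]      # samples from [0, 1]
a, b = fit_linear(x_train, [x ** 2 for x in x_train])

x_ood = [3 + i / 49 for i in range(50)]    # samples from [3, 4]
print(f"in-distribution MSE:     {mse(x_train, a, b):.4f}")  # small
print(f"out-of-distribution MSE: {mse(x_ood, a, b):.2f}")    # large
```

The model interpolates well where it was trained, but the linear extrapolation has no way to recover the quadratic behavior outside that range; it is the simplest version of the distribution-shift failure the talk describes.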
