Data Scientist vs. Machine Learning Engineer. Who Has a Cooler Job?



...who has a cooler job? I think, obviously, in a data-driven manner, I can conclude that a data scientist's job is cooler, because...

My name is Nikunj Bajaj. I'm a machine learning engineer at Facebook.

I'm Sreeta Gorripaty. I'm a data scientist at Uber.

I think we know each other through friends. We were really good friends. I think Nikunj was my first friend at Berkeley when we started our graduate studies together. We started in 2013. We were in different programs, but we had a common friend, and then we started hanging out a lot, just talking and going on walks: philosophical discussions, feminism, you know, Berkeley stuff, basically. And then it just evolved, and eventually we got married.

Yeah, we are married right now.

So let's rock-paper-scissors to decide who goes first. Rock. I go first.

Go for it.

So, Nikunj, what do you think about the future of AI?

Hmm, that's a big one. Okay, so I'm a machine learning engineer, and I'm very positive about the future of AI, because I think we haven't even scratched the surface of how we can leverage AI to improve our lives. It is already affecting our lives: the way we commute, the way we shop, the way we eat food. I believe it's going to go a lot further, and I have a strong feeling that in the future AI will become such a nice counterpart to human beings that these two smart minds can interoperate with each other and make the world a better place to live in.

Yeah, I totally agree with you. I'm also quite excited, because I feel like we're in a situation where we're just scratching the surface, and it's creating those opportunities. I feel like we're all starting out on this journey of AI, and we can be the leaders who actually make the important decisions about privacy and safety, making sure that AI for good is happening in that direction. So I'm actually excited about not just the
future, but also the roles that we can play in taking it forward.

Absolutely.

I'm ready.

All right, the big one: how did you become a data scientist?

So I finished my undergraduate degree at IIT Bombay in civil engineering. I was focusing on transportation engineering, and that's where I really fell in love with the field, because it was extremely diverse. You could be doing anything: you could focus on people's behavior in transportation choices, how aircraft fly, how we manage airports, how we model traffic on the road. It was a super diverse field where you could really focus on any one part. So I thought, I need to spend more time studying and understanding this area, and I did a master's at Berkeley. I got a funded master's, so it was pretty easy for me to come here and study. It was a no-brainer; I had to do it.

Then, during my master's, by pure accident, I got introduced to a course that was an introduction to machine learning. I thought, the name sounds fancy, let's just go see what this is like. And I went, and I was just so absorbed. I was like, oh my God, you could actually measure this, you could predict so well, you could actually understand where the rain will fall so accurately. That's something I was doing in my class: predicting rainfall in the Sierra Nevada. We had these super cool course projects, and then I was absolutely sure that I had to study more and apply my education to some of the applied problems in transportation. So I did my PhD, where I was focusing on air traffic management, on improving it and providing better tools. Then I did an internship at Apple Maps to understand what I wanted to do, industry or academia, and I just loved the fact that in
industry the ideation-to-impact cycle is really short.

That's true.

You think of something, you prototype it, you experiment with it, and then it's live in a couple of weeks sometimes, or even faster. I just really loved that, and I wanted to pursue data science. I graduated, looked for jobs, got into Uber, and here I am.

Perfect match: transportation and data science. Super cool.

So I think this time we'll answer slightly more technical questions on machine learning. Let me start by asking you a question: what's the difference between a data scientist and an ML engineer?

I think the terms "data scientist" and "ML engineer" are used pretty loosely across the industry. Different companies have different roles and different requirements for a data scientist and an ML engineer, and sometimes people switch hats as well, so it's difficult to give an answer that applies to all kinds of scenarios. However, in general, when we think about an ML engineer versus a data scientist, there is still some difference that we almost fundamentally have in our heads, so we could talk about that.

When I think about an ML engineer, there are essentially two main parts to my role. One, I'm doing machine learning: I'm trying to build models to solve a particular product use case. Two, I'm doing engineering, as the name suggests. So not only am I building the model, but one of my main objectives is to take that model and deliver it to the end users. How can I build an engineering system that is robust and solid enough to deliver that model to the users, and then take care of aspects like: what is the runtime complexity going to be, and which part of this model would run offline versus which part would run online? I think about all those elements of solving the problem. In fact, I would like to hear what the
data scientist's job is from you, instead of trying to answer it myself.

Yeah, that's actually a good point. It is very fuzzy across industries, and there's a lot of locality, I think. It's not just about different industries; the size of the company matters too. If you're in a very big, established company like Facebook or Uber, you have more specified roles, so it's clearer what the differences are. But if you're working in a startup or a slightly smaller company, I feel like the roles really start merging and blending a lot, because you need to be a little more full-stack.

In big-company situations, where the differences are clearer, my focus as a data scientist is: how do I solve the problems that I'm seeing for my users? As an example, I alluded to traffic before, so continuing on that theme: I need to find the time taken for an Uber to go from point A to point B, the travel time. That's a real-world problem. As the data scientist, I come in and ask: how do I formulate this problem as a mathematical question? There's an origin, there's a destination, and I need to find the time to go from point A to point B.

Then I start thinking about what data is needed for that. There's map data, there's traffic information, and there's some information on the signals and surroundings: is it cloudy, is there congestion, is it rush hour, and so on. So I first need to collect data, and there's a lot of work on data wrangling, cleaning, querying in SQL, and all that. Once that's done and I have the inputs, I need to solve the problem. That's where I ask: what's the right model for
this question? What assumptions does this model make? Are they valid for this particular scenario? Do they make sense? So then I start to think about model training and model evaluation.

Once I have something that works, the next question is: how do I know this is better than what I'm doing right now? It's one thing to build cool models for the coolness, but the model has to serve the purpose of the problem, which is predicting the travel time from point A to point B better than the options I have right now. That's where evaluation and metrics become really important. I need to make sure I understand what good metrics are for this problem. What are the numbers that impact my users the most? Which ETAs are most important for the drivers, the riders, and the consumers of the app?

Once I have these metrics in place and I can say this is a good algorithm, the next step is: how do I launch this? How do I close the loop and measure the impact on my users when it's live? So it's about going through the whole cycle, from a very abstract problem to actually quantifying its impact, end to end. I think that's the data scientist's role. And the difference, from what I'm hearing from you, which was really well put, is that I'm thinking more about the problem solving, what to solve, how to solve it, and how to measure it, and you're thinking about how to implement it. So that's kind of the theme.

Yep.

Do you need a CS degree to do your job well?

The short answer is no, because I'm employed in data science and I don't have a CS degree, so I think I'm living proof of that answer. But on a more serious note, a CS degree is useful at the resume level. The resume level is where you have to concisely put the information together to say, hey, I have good skills that are relevant for data
science, for machine learning engineering, and for this general area of work. There, a degree comes in handy because it's standardized. Everyone takes certain courses to finish a degree, so the person reading the resume says, hey, I know you've finished all these requirements. But that's not the only way to put the information there. If you've done boot camps, taken online courses, done research projects, or done Kaggle competitions, those are other ways to show your expertise in data science and show that you're a self-learner.

Once you get beyond the resume screening stage, I think what matters most is what you know, not the degree you have. I've seen both cases. I've seen people with amazing degrees and beautiful resumes who come on site, or get on a phone call, and I'm just disappointed: there's really no understanding of how to solve problems. And I've seen the other case, where people have a thin resume without a lot of standard degrees, but they just know how to solve a problem. That's what really matters.

We've conducted a bunch of interviews, and we understand what matters for getting the job versus getting screened for the job.

Makes sense, yeah.

So the question is: can a data scientist become an ML engineer, and vice versa?

I think so. I think people make the transition from data scientist to ML engineer, and vice versa, pretty often, honestly. The reason I believe that transition is not a big jump is that, in their day-to-day jobs, a data scientist and an ML engineer have to work very much hand in hand to deliver a solution anyway. Let's say a data scientist is building a model to solve a problem; one of the examples you mentioned is the ETA problem. Now,
that ETA problem can potentially be solved using 20 different models. However, looking at the engineering side within Uber, potentially not all of those models are feasible, either because of some limitation in the data or because of the engineering system that is powering the app. So the data scientist has to understand some of the engineering aspects to actually make the right model choice, and for that they need ML engineering know-how in their toolkit.

Similarly, an ML engineer cannot really design a system until they understand what goes on inside the model. Some models can be training-time heavy, and some models can be prediction-time heavy, so they need to make decisions on what goes offline, what goes online, how to make database choices, and even how to implement the right algorithm. For all of that, an ML engineer needs to be pretty well versed in data science models. So clearly both of these people understand each other's jobs, and that definitely helps when they're making the transition.

When they are really making the transition, they need to go knee-deep into each other's roles. For example, if you are a data scientist and you understand engineering, you need to be able to go knee-deep so that you can start actually making engineering decisions, and you need to do some study or practice for that. And vice versa: as an ML engineer, I understand some of the models, but can I actually make modeling decisions myself? Maybe I need some coaching for that. You need to understand the other role a little more deeply, but you can make the transition.

Yeah, that's a totally fair point. I actually have a couple of friends who've made this transition recently. They came from a startup as data scientists, and, as we said before, the
roles can be quite different in startups versus big companies. They came in and realized that what makes them happy is more the machine learning engineering part of the role than the data science part. They started out doing the data science activities I was talking about before, built up more and more technical strength and coding skills, and made the transition.

Yeah, and I'm sure it helped that they were coming from a startup, where they were probably doing a bunch of different things, so they were inclined more towards engineering.

Yes. But I also really like the point you made, which is that as a data scientist I do have engineering know-how, but I'm not making the decisions, I'm not leading the decisions, and vice versa. To make that transition, I need to be in a position where I'm trained enough to know the pros and cons of the different options I have, which takes some amount of work.

Ready for the next question?

Okay. What skills do you need to become a data scientist? And the same question for an ML engineer.

For machine learning engineering, I think it's pretty obvious; the name says it all. You need two different skills: one, you need to be good at machine learning, and two, you need to be good at engineering. When I say you need to be good at machine learning, I mean you need to understand the models, not just how to use them. A lot of the time, people use these popular libraries and almost think that they know machine learning. But I really think that to be a good ML engineer you need to zoom in and understand what's happening inside the model: what is the math behind it, and in which scenarios can which model properly be used? You need to have that understanding, and that's a very important
skill to build. So yes, understanding the theory and the mathematics is important.

Secondly, in terms of engineering, you need to be a phenomenal coder, hands down. That's probably one of the most important skills. But beyond that, you also need to understand general engineering fundamentals so that you can make proper design choices when you're developing a system. That comes from both reading and actual experience: you read and learn from what other people have done, but you really become an expert by designing more and more systems like that. So that's more of an acquired skill.

I think for data science it's three-pronged. The first part, and I cannot emphasize this enough, is problem-solving skills. You need to be able to go from really abstract problems to mathematical formulations effectively, and that includes having a good amount of business acumen. You need to be able to go from a business problem, like "my users are churning," to a more mathematical problem. So problem formulation and communication are really important there.

The second part is the technical part. You obviously need to be good at data science, and really good at two aspects of it. The first is theoretical data science. Someone who is a DS should know which models are relevant at what point, what assumptions those models make, whether they are valid for the problem you're looking at, whether the data violates those assumptions, and whether the model will provide the output of choice, such as probability scores versus continuous variables versus classifications. So you need good know-how of which model applies where, and what its limitations and assumptions are. The second aspect is coding. I think this is very often
overlooked for data scientists. Coding is really important. You need to be good at prototyping whatever your ideas are. To be a really effective data scientist, you need to go from ideation to execution and iterate really effectively. So programming skills, like working in Python or R, are super important.

The third aspect is that once you have something really fascinating and impactful ready, you need to communicate. You need to work with stakeholders to make sure that it can be productionized and put out there for impact. You need to work with your engineering managers and your product managers, and you need to communicate with leadership. There's a lot of storytelling in data science: why is this problem important, how do I solve it, what is the impact of doing it? So I think these three aspects together really form an effective data scientist.

Last question. I'm smiling: who has a cooler job?

I should start. I think, obviously, in a data-driven manner, I can conclude that a data scientist's job is cooler, because at the end of the day, no matter what happens, the data scientist has to sign off on the project. The data scientist has to say, "This is making sense, it's a go," or "This is not making sense, it's a no-go." So whatever happens...

Well, I disagree, and I disagree with all my heart, because a machine learning engineer not only gets to solve the really cool problems and build the awesomest models and everything; the most important part is that they actually get to write code that is shipped to users. How cool is that? You actually make an impact on users, a direct coding impact, and that part is super cool to me. I make changes, and that's affecting users' lives right away.

I guess we can agree to disagree on this, since we want to have peaceful lives as husband and wife.

Oh yeah. I don't want to say too much. I just wanted to share one general thing
that I have noticed quite a bit from being in the industry. Machine learning engineers often make this mistake, or not necessarily a mistake, but they get into the loop of building the awesomest model ever and increasing the accuracy metric, or the precision or recall, from 95 to 95.2 to 95.5 to 96 percent, often without realizing that the product impact of that is not as large, while it takes a lot of time. We know that taking a model's accuracy from 20 to 80 percent is a lot easier than taking it from, say, 95 to 96 percent. The last bit keeps becoming exponentially harder. What's really important is that you build something that works, something basic, deliver the system end to end, and then keep iterating from there. You never know: by the time you have increased the model's accuracy from 95 to 96 percent, your product requirements might have changed. You actually want to launch things fast to the user and then go ahead and iterate. So that's just a little tip from what I have noticed.

If you're listening to us, then I'm sure you're interested in data science. I think now is a great time, and this is a great place to be, if you are interested in data science. It can be hard in the beginning, when you're just figuring out how to get into this area, but you're not limited by not having a certain degree or a computer science background. You can always pick it up with sufficient help online. So definitely keep going at it, and good luck.

Yep, good luck, everyone.

Hey, I'm Eliza from Springboard. We're an online school that gets you hired. All of our courses come with a job guarantee, one-on-one mentorship, and real-world projects. We teach UX design, data science, and machine learning engineering. To learn more, check out the links below. Happy learning!

14 thoughts on “Data Scientist vs. Machine Learning Engineer. Who Has a Cooler Job?”

  1. 1:22 What does the future of AI look like?
    2:48 Becoming a data scientist @Uber
    5:15 What's the difference between data scientist and ML engineer?
    10:13 Do you need a CS degree?
    12:04 Can a data scientist become a ML engineer? Vice versa?
    12:42 What skills do you need to become a data scientist/ML engineer?
    19:35 Who has a cooler job?
    20:54 Tips and advice
    22:33 Kickstart your data science/ML engineering career

  2. Amazing video! It has really helped me to decide what I want to do and where I am! Where I can give more justice.
    Thank you so much! Keep on giving such amazing content!

  3. Why would you work for a slave plantation like Uber? To help code the app to take half of the driver’s hard earned $?
