Machine Learning Tutorial Python – 5: Save Model Using Joblib And Pickle



We will look into two different approaches to saving a trained model to a file, which you can use later on to load the model from the file into memory and use it to make actual predictions. Solving a problem using machine learning typically consists of two steps. The first step is training a model using your training data set, and the second step is to ask your questions to the trained model, which sort of looks like a human brain, and it will give you the answers.

Often the size of the training data set is pretty huge, because as the size increases your model generally becomes more accurate. It is like football training: the more you train, the better you become at your game. When your training data set is that huge, often gigabytes in size, the training step becomes more time-consuming. If you save the trained model to a file, you can later on use that same model to make the actual predictions, so you don't need to train it every time you want to ask these questions. Once you have it saved to a file, there is no training step and I can directly ask the questions. That's what we are going to look into today: we will write Python code to save a model to a file.

Here I have a Jupyter notebook which I used in my first tutorial, on linear regression for predicting home prices. The code here is pretty straightforward: I am loading home prices from my CSV file and then using linear regression to make the prediction, and here it is saying that a 5000 square feet home is going to cost me about $859,000.

So let's use Python's pickle module. You might be aware of the pickle module already; it allows you to serialize your Python object into a file. I first say with open model_pickle, and I'm going to write binary data, hence I'm using wb mode for the file. So first I am opening a file, and then I say pickle.dump to dump my model into this file. When I run this, what actually happens is that in my working directory it creates this model_pickle file, which looks like this if I open it in Notepad. This is some gibberish, and it is expected to be gibberish because it's a binary file. You don't need to care about the content here; what you need to know is that your model is now saved into a file.

Now you can use the same model here. I can say pickle.load on the file and assign the result; I will just call it mp. Again I have to open the file; it's the same file, but this time I am opening it in read mode, and it's a binary file, hence I have supplied rb here. So now I have my model loaded from a file into memory, and mp is the object. If I use that mp object to make the prediction, asking what the price of my 5000 square feet home is, you can see that it gives me the same answer as I got up here. This is beautiful, because now I can supply this model file to a friend of mine and say, okay, here is my trained model, or trained brain, go use it for your actual problem. You can ask your questions to this model and it will give the answers.

There is a second approach to saving a model to a file, which is using sklearn's joblib. If you google sklearn model persistence, you will find this link, where sklearn's documentation shows how you can use joblib to do essentially the same thing. If it is doing the same thing, then what's the difference between pickle and joblib? As per the documentation, if your model consists of large NumPy arrays, then using joblib might be more efficient. I have not done any profiling myself, but you can go ahead and do it on your own and figure out which one you want to use. Usually people say that when you have a lot of NumPy arrays, joblib tends to be more efficient, but essentially it gives you the same functionality.

So I will first import joblib. In a Jupyter notebook you can hit Tab and it will show you the autocomplete, so here there is an externals module, and from that I will import joblib. The difference between the joblib API and the pickle API is that joblib can take the file name directly. I have my model and I want to save that model to a file, so I will say joblib.dump with the model and the file name model_joblib. When I execute this, it saves the model to this particular file, and when I go to my working directory I will find this file; it was updated just now, 6:31 is the timestamp. When I open that file in Notepad I again see some gibberish, because this is also a binary file. Again, you don't care about the content here; what you need to know is that your model is successfully saved. You can load that model using joblib.load: give it the file name, and in return you get your model object back. That model object you can use to make the actual prediction, and it gives you the same answer here. What it is saving inside that binary file is different things, for example the coefficient: if you look at the coefficient, it is the same as what I got here. So it is saving all the essential pieces of your model.

Okay, that's all I have for this tutorial. I don't have any exercise today, but you can go ahead and save your model to a file using joblib and pickle. I have gone through linear regression models today, but you can save pretty much any other kind of machine learning model using these two awesome modules.
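
The two approaches described in the transcript can be sketched roughly as follows. This is a minimal illustration, not the notebook from the video. The area and price figures are an illustrative stand-in for the home-prices CSV, and the files are written to a temporary directory rather than the working directory. It also assumes a recent scikit-learn, where joblib is imported as its own package (the sklearn.externals import mentioned in the video was removed in later releases) and where predict expects a 2D array such as [[5000]] rather than a bare 5000.

```python
import os
import pickle
import tempfile

import joblib  # assumption: standalone joblib package, not sklearn.externals
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative stand-in for the home-prices CSV (assumed values).
area = np.array([[2600], [3000], [3200], [3600], [4000]])
price = np.array([550000, 565000, 610000, 680000, 725000])

# Step 1: train once.
model = LinearRegression().fit(area, price)

# predict expects a 2D array: one row per home, one column per feature.
expected = model.predict([[5000]])

with tempfile.TemporaryDirectory() as workdir:
    pickle_path = os.path.join(workdir, "model_pickle")
    joblib_path = os.path.join(workdir, "model_joblib")

    # Approach 1: pickle works on an open file object in binary mode.
    with open(pickle_path, "wb") as f:
        pickle.dump(model, f)
    with open(pickle_path, "rb") as f:
        mp = pickle.load(f)

    # Approach 2: joblib takes the file name directly.
    joblib.dump(model, joblib_path)
    mj = joblib.load(joblib_path)

# Step 2: ask questions of the loaded models, with no retraining.
pickle_pred = mp.predict([[5000]])
joblib_pred = mj.predict([[5000]])
print(pickle_pred, joblib_pred)

# The fitted coefficient is restored along with the model.
print(mp.coef_, mj.coef_)
```

To keep the saved files around, as in the video, replace the temporary directory with a path of your own choosing.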

15 thoughts on “Machine Learning Tutorial Python – 5: Save Model Using Joblib And Pickle”

  1. Can you please also add some videos on neural networks using Python? It would be very helpful, and I would be very thankful if you could upload them.

  2. Dear Sir:
    When I type model.predict(5000), it shows this error: "Expected 2D array, got scalar array instead:

    array=5000"

  3. Thank you for the nice and clear explanation. Now, if I want to take that trained and saved model and use it on a different data file (with the same labels, of course), how do I do that?
