Machine Learning Tutorial Python – 10 Support Vector Machine (SVM)



Support vector machine is a very popular classification algorithm, and that's what we are going to cover today. We'll start with some theory first, and then we will solve a classification problem for iris flowers using SVM. At the end we'll have an interesting exercise for you to solve, so it's going to be a lot of fun today; please stay till the end.

In the picture you are seeing an iris flower, which has four features: sepal length and width, and petal length and width. Based on these four features you can determine the species of the iris flower; there are three different species. This dataset is available in sklearn's datasets module, so you can easily import it.

On this scatter plot I have two features, petal length and petal width, just to keep things simple, and based on them you can determine whether the species is setosa or versicolor. Now, when you draw a classification boundary to separate these two groups, you will notice that there are many possible ways of drawing it; all three of these are valid boundaries. So how do you decide which boundary is best for your classification problem? One way of looking at it is to take the nearby data points and measure the distance from the line to those points. Here you can see the distance is smaller; here it is larger. This distance is called the margin. So which line is better, the one with the lower margin or the one with the higher margin? If you think carefully, you will realize that the line with the higher margin is better, because it separates the two groups more reliably. For example, if a new data point falls in between these two lines, this line will probably misclassify it, whereas this line will classify it correctly. That's what a support vector machine tries to do: it maximizes the margin between the nearby data points and the line itself. These nearby data points are called support vectors, hence the name support vector machine.

In the case of a 2D space, where you have two features, the boundary is a line; in the case of 3D, the boundary is a plane. What will it be if you have an n-dimensional space, since usually you have n features? Just pause the video for a moment and try to visualize what that boundary looks like. You'll realize it's kind of impossible to visualize, but mathematically it is still possible, and that boundary is called a hyperplane. A hyperplane is a plane in n-dimensional space that tries to separate out the different classification groups, and that's what the support vector machine algorithm tries to find.

We also need to familiarize ourselves with a couple of technical terms: gamma and regularization. On this graph you can see that the decision boundary only considers the data points that are very near to it; I have excluded the far-away data points when deciding the boundary. The other way of looking at the same problem is to consider the far-away data points as well. On the left-hand side I have high gamma and on the right-hand side low gamma, and both approaches are valid; it's just that with low gamma you might sometimes have a problem with accuracy, but that might be okay, because it can be computationally more efficient. It depends on your individual situation.

The other example, on a separate dataset, is where I try to draw my boundary very carefully to avoid any classification error. You can see that this is almost overfitting the model: if you have a very complex dataset, this line might become very zigzag and wiggly as it tries to fit every point. On the other hand, I can tolerate some errors; here there is a classification error, which might be okay, and my line looks smoother. So on the left-hand side
what I have is high regularization, and on the right-hand side low regularization. With low regularization you might get some errors, but that might be okay; it might even be better for your model. When we use the sklearn library to train our model, you'll see a parameter called C, and C is this regularization parameter.

Now, you might have a complex dataset like this, where it's not easy to draw the boundary. One approach is to create a third dimension: I have x and y here, so what if you create a z dimension, and the way you do it is z = x² + y². You are applying a transformation to your basic features and creating this new feature, and with that you will be able to draw the decision boundary. The xy-plane is perpendicular to your monitor right now, which is why you cannot see it; just try to visualize the situation, it's not very hard. Once you have the boundary, you can project it back onto the plane formed by the x and y axes, and you will get a boundary like this. The z here is created by a kernel: by kernel we mean a transformation of your existing features so that the decision boundary becomes easy to draw.

As usual, I have my Jupyter notebook open, and I have imported the iris dataset from the sklearn library. When you look at the iris dataset's raw features, you will see it has these values, and when I print feature_names you see the four features that I showed you in the first picture. What I'm going to do is create a DataFrame out of this, because it's easier to explore this dataset with a DataFrame. So I will do df = pd.DataFrame(...); my data is in iris.data, and my columns are the feature names. When you do df.head() you will see a nice DataFrame ready. I also have the target variable in the target attribute, so I will append one more column called target to my data frame.
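The DataFrame steps above can be sketched like this (a minimal sketch; the column names come straight from `iris.feature_names`):

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()

# Build a DataFrame from the raw feature matrix, with the feature
# names as column headers.
df = pd.DataFrame(iris.data, columns=iris.feature_names)

# Append the numeric labels (0, 1, 2) as one more column.
df['target'] = iris.target

print(df.head())
print(df.shape)  # (150, 5): 150 samples, 4 features + target
```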
So now I have my target column. The possible values of target are 0, 1 and 2. What do 0, 1 and 2 mean? For that, if you do iris.target_names: index 0 means setosa, 1 means versicolor, and 2 means virginica. So we have three types of iris flowers, and the type is determined by those four features: sepal length and width, and petal length and width.

Now I want to do some exploration and see which of my data points have 1 as the target. So I will filter df[df.target == 1] to see which rows of my data frame match (I made a spelling mistake there at first). You can see that from row number 50 onwards my target value is 1, which means versicolor. Similarly, if I filter on 2, you'll notice that it starts at 100, and my total number of data points is 150. So rows 0 to 50 are setosa, 50 to 100 are versicolor, and 100 to 150 are virginica.

Let me add one more column called flower_name so that this is clearer. From the target column we want to generate another column, and the way you do that in pandas is with the apply function; the lambda here is just a small function, a transformation applied to each value of the target column to generate the new flower_name column. For each value x in my target column, it returns the element at that index in iris.target_names. For example, if x is 2, then the element at index 2 in the array is virginica, so 'virginica' is placed in the flower_name column. And if you check, you can see that I now have a new column called flower_name. If you export this to a CSV file it will be even clearer how this dataset looks, so you can visualize it better.

Now let's do some data visualization. You know that for that you can use matplotlib; there are also a couple of other visualization libraries available, such as Bokeh.
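Continuing the sketch above, the flower_name column and the row-range checks look roughly like this:

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['target'] = iris.target

# Map each numeric target to its species name via apply + lambda.
df['flower_name'] = df.target.apply(lambda x: iris.target_names[x])

# The dataset is laid out in blocks of 50 rows per species.
print(df[df.target == 1].index[0])    # 50  -> versicolor starts here
print(df[df.target == 2].index[0])    # 100 -> virginica starts here
print(df.flower_name.unique())        # ['setosa' 'versicolor' 'virginica']
```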
From matplotlib we import pyplot as plt, plus the %matplotlib inline magic, which is specific to Jupyter notebooks. Now let's first create three data frames: I want to separate the three species into three separate data frames. How do you do that? df[df.target == 0] is the first data frame, the second one is df[df.target == 1], and the third is df[df.target == 2]. If you look at those data frames, you'll see the first one is setosa, the second versicolor, and the third virginica.

Now let's draw a scatter plot. You call plt.scatter and specify your x and y. What are x and y? I'm going to plot the first two flowers, and since it's a 2D plot I will only use two features: sepal length and sepal width. You can also specify a color and a marker, so let's say my marker is '+', and my scatter plot looks like this. (Actually, I made a mistake; I should have used df0 here, since df0 is setosa.) Then I do the same for df1, say in blue, and you can see there is a clear separation; if I use my SVM algorithm here, it will probably draw a boundary like this. Just to make the plot clearer, you can also specify x and y labels: my x label is sepal length and my y label is sepal width. Now the chart is very easy to read.

Let's plot petal length and petal width as well. I'll just copy-paste this; all you need to do is replace sepal with petal. This plots the two remaining features, and you can see the distinction is even clearer, so it looks like our SVM is going to perform really well, because it will be able to draw a very clean boundary between these data points. Here I have plotted only two features on each scatter plot; when we actually train our algorithm, we are going to use all four features, and the classification will be across all three species.
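The two scatter plots can be reproduced with a sketch like this (the marker and color choices are arbitrary):

```python
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['target'] = iris.target

# One DataFrame per species.
df0 = df[df.target == 0]   # setosa
df1 = df[df.target == 1]   # versicolor

# Sepal plot: setosa and versicolor separate cleanly.
plt.xlabel('sepal length (cm)')
plt.ylabel('sepal width (cm)')
plt.scatter(df0['sepal length (cm)'], df0['sepal width (cm)'],
            color='green', marker='+')
plt.scatter(df1['sepal length (cm)'], df1['sepal width (cm)'],
            color='blue', marker='.')
plt.show()
```

Swapping sepal for petal in the column names gives the second plot, where the separation is even clearer.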
Now let's train our model using sklearn. The first step, as usual, is train_test_split, to split your dataset into a training set and a test set; remember, you don't want to test your model on the same data you trained it on, because the evaluation would be biased. Our data frame also contains the target columns, so I want to remove those first. The way you remove them is to drop certain columns from the data frame. Which columns do you want to drop? Let's look at the data frame: the first four columns are the features you want to use for training, and the last two are the target columns. So I drop target and flower_name, and my y will be just df.target; if you look at y, it's the usual values 0, 1 and 2.

Now let's call train_test_split(X, y) with a test size of 0.2: I want to use 20% of my samples for testing and 80% for training. The output of the train_test_split method is X_train, X_test, y_train, y_test, and if you check the lengths of X_train and X_test, you'll see that X_test is indeed 20% of the 150 samples. Looks good.

Next, from sklearn.svm import SVC; the classifier is SVC, and this is how you create an SVM classifier. As usual, you call the fit method to train your model, using X_train and y_train. The model is trained, and you'll notice some parameters here, such as C, which, if you remember from the presentation earlier, is the regularization parameter; you also have gamma and kernel. Now let's look at the accuracy of the model: you call the score method, this time with X_test and y_test. So the model is trained, and I am measuring its accuracy by supplying X_test and y_test.
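Putting the training steps together (a minimal sketch; the exact score varies with the random split):

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['target'] = iris.target

# X: the four feature columns; y: the numeric labels.
X = df.drop(['target'], axis='columns')
y = df.target

# Hold out 20% of the samples for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

model = SVC()                 # defaults: C=1.0, kernel='rbf'
model.fit(X_train, y_train)
print(model.score(X_test, y_test))  # typically around 0.96, varies per split
```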
What the score method does is take X_test, predict the values (call them y_predicted), and compare y_predicted against y_test to measure how accurate the model is. You can see that it's 0.96. If you execute this again, your X_train and X_test change, so now it's trained on a different set of samples; again it's around 0.96.

You can now experiment with the individual parameters. For example, by default C is 1.0; what if I set C=10? Training on the same dataset, increasing C is actually decreasing my score, so you can use these parameters to tune your model; normally you would do this with cross-validation techniques to figure out which parameters are best suited for your given problem. You can use gamma as well: with gamma=1 I get the same score; gamma=10, again the same score, so it's not making much difference here; gamma=100 makes the model performance worse. You can also change the kernel: by default the kernel is rbf, so let's try linear. How do you know which kernels you can pass? If you press Shift+Tab, it shows the help from the sklearn API, and you can see the possible kernels available. With linear you also get a very high accuracy score. We've seen that it never goes beyond 0.96, so 0.96 looks like the optimum score we can achieve here.

Alright, let's move on to the most interesting part, which is the exercise. Remember, learning programming is all about practicing; it's like taking swimming lessons. If you just watch someone giving swimming instructions on video, you're probably not going to learn it; you have to jump into the pool. So open your Jupyter notebook.
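The parameter experiments above can be repeated on one fixed split so the scores are comparable (random_state is an arbitrary choice here, just to pin the split):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=1)

# Vary C, gamma and the kernel one at a time and compare the scores.
for params in [dict(), dict(C=10), dict(gamma=10), dict(kernel='linear')]:
    model = SVC(**params)
    model.fit(X_train, y_train)
    print(params, model.score(X_test, y_test))
```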
For this exercise you are going to use sklearn's digits dataset. The digits dataset looks like this: handwritten digits, and you have to classify each one as a number from 0 to 9. You're going to use different kernels and different regularization and gamma parameters to tune the model, and you have to tell me which parameters give you the best accuracy for this classification. Also, use 80% of the samples for training. If you look at the video description below, the Jupyter notebook used in this tutorial is available in a GitHub repository; open that notebook, go towards the end, and you will find the exercise section, where I have listed all the criteria for the exercise. Do it and post your answers in the comment section below.

One last thing: I didn't cover the mathematics behind the SVM model and the different kernels, because that would probably make this tutorial very long; it's something we'll cover in a future video. Alright, thank you very much for watching. Bye!
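As a starting point for the exercise, a sketch with the digits dataset might look like this (the actual tuning is left to you):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

digits = load_digits()

# 80% of the samples for training, as the exercise asks.
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, train_size=0.8, random_state=10)

# Try a couple of kernels; extend this with different C and gamma values.
for kernel in ['rbf', 'linear']:
    model = SVC(kernel=kernel)
    model.fit(X_train, y_train)
    print(kernel, model.score(X_test, y_test))
```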

25 thoughts on “Machine Learning Tutorial Python – 10 Support Vector Machine (SVM)”

  1. Hey guys. Anyone can please show me how I can do the following but with list comprehension:
    df['name'] = df.target.apply(lambda x: iris.target_names[x])

  2. In linear kernel score is 96.9 percent and in rbf kernel score is 40 percent…
    With gamma value the score is 0.06… And with the regularization value the score is around 45.83 percent

  3. Sir please make videos on neural networks, anomaly detection and unsupervised learning….. I am eagerly waiting….. Your last video I have seen is random forest… Please upload more

  4. hello great videos, loved this series. Can you please do a video on imbalanced data sets in classification problems? Maybe just add onto a previous example you have but with a case where there are very few "1" or "true" values compared to "0" or "false". thanks for your consideration!


  6. Thank you for the amazing Tutorial.

    My Observation on the Exercise:
    1) The accuracy drops a lot if I change the value of other parameters like gamma and C without changing the kernel. (0.087 = 8.7%)
    2) In 'rbf' kernel mode the accuracy was around 42.222%, and changing the kernel to 'linear' the accuracy was about 97.5%
    3) The accuracy remains constant if I change the value of other parameters like gamma and C after setting the kernel to linear. (97.5%)

    Thank you again. Hope the next video is going to be releasing soon……

  7. sir please do not take so much time in uploading your ML videos. I used to follow your videos earlier when you were uploading a video daily, but then with this huge gap I lost interest.
