ML.NET C# machine learning tutorial for beginners



Good morning, good afternoon, good evening, and welcome to Questpond's YouTube channel. Today we will talk about a very important and hot topic: machine learning with ML.NET and C#.

For machine learning, data science, and artificial intelligence, Python is the forerunner. It has awesome frameworks like NumPy, pandas, and SciPy, which make artificial intelligence and data science much easier to learn and use. For those topics, check questpond.com. In this video we will focus on machine learning using C# and .NET technologies. ML.NET is a great open source framework built by Microsoft specifically to do machine learning with C# and .NET.

This is the first lab of 45 minutes, covering 14 chapters. It is a step-by-step series for people who are very new to artificial intelligence and machine learning. We will start with the basic definitions of artificial intelligence, machine learning, labels, features, model, and so on. Finally we will create a predictive feedback system, wherein we put in a feedback and it tells us whether this feedback is good or bad.

Let us get started with the definitions of artificial intelligence and machine learning. Once we know the basic definitions we can start the hands-on. One of the goals and fantasies of humans has been to create intelligent machines which can think, work, and react like humans. This area of study is termed artificial intelligence.

One of the capabilities of humans is learning: understanding things and, depending on that learning, making better and more accurate decisions. Machine learning is a kind of extension to artificial intelligence where the system or algorithm automatically matures and improves from the training data provided. We have an algorithm and we provide it training data. Every time we provide training data, the algorithm matures by itself. We do not have to make code changes; the algorithm makes more intelligent decisions by learning from the training data.
Let us try to understand the different things involved in machine learning: what is needed to ensure machine learning works properly, and what the different phases and steps are.

First we need an algorithm. Depending on the situation, the inputs, and the outputs, we need to make a choice of algorithm. To this algorithm we will provide training data. So the first step in machine learning is to have training data and an algorithm which will get trained by that training data.

While training the algorithm with training data we need to be very clear about the expectations: what are the inputs to the algorithm, and what is the expected or predicted output from the algorithm? While training, we need to mention which of the fields are inputs, that is, features, and which of the fields are labels, that is, the output we want from the algorithm.

Say we have simple training data for feedback. We give a feedback to the algorithm and the algorithm should give back the rating. The feedback becomes the feature and the rating becomes the label. Features are inputs to the algorithm and labels are the outputs which will be predicted by the algorithm.

Once the training data has trained the algorithm with proper features and labels, we get a very important output called the model. This model is the heart of machine learning. A model is an experienced algorithm: an algorithm which has gained experience because of the training data. The model is what we fire any query at, expecting the predicted output. In production, in the live environment, we only question the model.

It is very much possible that sometimes the model is not up to the mark, so we test the accuracy of the model by providing test data. If it is not up to the mark, we can refine the training data and retrain the algorithm. This is a continuous process of training, testing, and checking whether the model is up to the mark or not. To repeat: a model is an algorithm which has experience. It is the final, important artifact of the machine learning process.
In this lab we will build a predictive feedback model which takes a feedback text and, on the basis of this text, decides whether it is a good feedback or a bad feedback. If we put in a feedback saying "This is nice", it will categorize this feedback as good. If we put in "That's bad", it will categorize it as bad. We will build a very simple feedback model which analyses the feedback text and says whether the sentiment is good or not good.

For doing ML.NET we need Visual Studio 2017 or above. We have the 2017 Community edition. The Community edition is free and has almost all the Visual Studio features a C# developer needs. At least 2017 is needed; with anything below that, the examples will not work.

For all these examples we will be using .NET Core for demonstration. On the future roadmap of Microsoft, everything is now .NET Core. We have taken the 2017 Community edition and created a simple .NET Core project: File -> New Project, and create a simple console app .NET Core project. In case you are new to .NET, watch the video which is flashing on the screen; it explains what .NET Core is, what .NET Framework is, and so on. In this console application we will try to build the predictive feedback system.

We need the ML.NET framework, and we will get it using NuGet. In case you are new to NuGet, watch the video which explains what NuGet is used for. NuGet helps to get third-party frameworks inside a project. Go to Solution Explorer -> right-click -> Manage NuGet Packages. There are several tabs at the top: Browse, Installed, and Updates. The Installed tab shows which third-party frameworks have been installed in the project. We want to get ML.NET, so go to the Browse tab and search for ML.NET.

We are using version 0.8. This is a great chance to learn it now, so that later on we are ready to use machine learning with C#. This whole tutorial is targeted at version 0.8.
Click on Install to get all the necessary dependencies of ML.NET. It will pop up some screens with an I Accept button; accept them. It is now referencing the necessary components. This ensures that the libraries are now referenced in the project. We have the reference to ML.NET in the C# project and are all set to get started.

As we said, the algorithm will be provided training data. Let us define the structure of the feedback system's training data. We will create a simple class which defines the feedback training data. First we need the feedback text. The FeedBackText property will hold data such as "it's nice", "it's good", "it's cool", "it is bad", and so on. The second thing we need is what we want to predict: we also want to train the algorithm by saying whether this feedback is good or not. So we have created a class called FeedBackTrainingData which defines the input data structure for the algorithm. It has a FeedBackText property and an IsGood property which says whether the FeedBackText is good or not.

Next, create a list of FeedBackTrainingData. We will be filling the training data into this list. To create a generic list we need to import System.Collections.Generic. Then create a function called LoadTrainingData which loads some training data into the training data list. In LoadTrainingData we load some dummy data which says whether each FeedBackText is good or not. This training data is what we will give to the algorithm.

At this moment we have loaded two values. Let us load at least 10 to 15 values; copy and paste to add more dummy training data. This training data could be loaded from a file, but for simplicity we have added all of it manually.

The only thing we need to remember about training data is that it needs to be proper. Whenever the algorithm finds the word good, we want it to think this is a good feedback.
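The class and the list-loading function described above might look roughly like the following. This is a minimal sketch, not the exact code from the video: the sample feedback rows are made up for illustration, and the column attributes that the video's ML.NET 0.8 code puts on the properties are left out so that the snippet compiles without any ML.NET reference.

```csharp
using System;
using System.Collections.Generic;

// Input structure for training: the feedback text (feature)
// and whether that feedback is good (label).
public class FeedBackTrainingData
{
    public string FeedBackText { get; set; }
    public bool IsGood { get; set; }
}

public static class TrainingDataLoader
{
    // Fills a generic list with hand-written dummy training data.
    // The rows are illustrative; note that "good" and "bad" each occur several times.
    public static List<FeedBackTrainingData> LoadTrainingData()
    {
        return new List<FeedBackTrainingData>
        {
            new FeedBackTrainingData { FeedBackText = "this is good", IsGood = true },
            new FeedBackTrainingData { FeedBackText = "this is bad", IsGood = false },
            new FeedBackTrainingData { FeedBackText = "good job", IsGood = true },
            new FeedBackTrainingData { FeedBackText = "bad experience", IsGood = false },
            new FeedBackTrainingData { FeedBackText = "very good", IsGood = true },
            new FeedBackTrainingData { FeedBackText = "very bad", IsGood = false },
            new FeedBackTrainingData { FeedBackText = "it is nice", IsGood = true },
            new FeedBackTrainingData { FeedBackText = "it is horrible", IsGood = false },
            new FeedBackTrainingData { FeedBackText = "good and cool", IsGood = true },
            new FeedBackTrainingData { FeedBackText = "bad and terrible", IsGood = false },
        };
    }
}

public static class Program
{
    public static void Main()
    {
        List<FeedBackTrainingData> trainingData = TrainingDataLoader.LoadTrainingData();
        Console.WriteLine($"Loaded {trainingData.Count} training rows");
    }
}
```

In a real project the same list could be filled from a file instead of being typed in by hand.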
A single occurrence of the word good in the training data is not enough; we need multiple occurrences of it. So somewhere further down in the data we have added more feedback containing good. The training data should have multiple occurrences of a word, or else the algorithm will not get trained properly.

In static void Main, before anything else, we load the training data. The first step in machine learning with ML.NET is to load the training data.

ML.NET has lots of features: it can load training data, it offers algorithms, it can do predictions, it has a way to test the model, and so on. All of these classes and APIs are exposed via a central point called MLContext. Whenever we want to do any kind of ML.NET operation we need to create an instance of MLContext. Create an instance of MLContext so that we can access the features of machine learning.

Training data can enter ML.NET in various formats. At this moment we are providing the training data as a generic list, but someone else could load it from an ADO.NET DataSet, from an Entity Framework collection, from an array, and so on. Rather than ML.NET dealing with all these different formats, everyone gives the data in ML.NET's own format, which is IDataView. The data view in ML.NET is a very generic concept. Whichever type of data source we load the data from, we finally need to give it to MLContext, that is, to ML.NET, in the format of IDataView.

In order to use IDataView we need to import Microsoft.ML.Runtime.Data; the method we call needs a couple of imports as well. Whatever we want, we get from MLContext, so MLContext creates the data view from the training data collection. The structure of this data view follows the FeedBackTrainingData class.

Step 3: training data comes in very unorganized, unstructured, textual, and non-uniform formats.
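As a rough sketch of this loading step: the video uses ML.NET 0.8, where, according to the reader comments on this post, the call was mlContext.CreateStreamingDataView(). The snippet below instead assumes ML.NET 1.x (the Microsoft.ML NuGet package), where the equivalent call is mlContext.Data.LoadFromEnumerable() and IDataView lives in the Microsoft.ML namespace. The DataLoading helper class and the two sample rows are hypothetical names and data chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;

public class FeedBackTrainingData
{
    public string FeedBackText { get; set; }
    public bool IsGood { get; set; }
}

// Hypothetical helper for this sketch; not part of ML.NET.
public static class DataLoading
{
    // Wraps an in-memory collection in ML.NET's generic IDataView format.
    public static IDataView ToDataView(MLContext mlContext, IEnumerable<FeedBackTrainingData> rows)
    {
        return mlContext.Data.LoadFromEnumerable(rows);
    }
}

public static class Program
{
    public static void Main()
    {
        var trainingData = new List<FeedBackTrainingData>
        {
            new FeedBackTrainingData { FeedBackText = "this is good", IsGood = true },
            new FeedBackTrainingData { FeedBackText = "this is bad", IsGood = false },
        };

        // MLContext is the central entry point for every ML.NET operation.
        var mlContext = new MLContext();
        IDataView dataView = DataLoading.ToDataView(mlContext, trainingData);

        // The data view's columns follow the properties of FeedBackTrainingData.
        foreach (var column in dataView.Schema)
        {
            Console.WriteLine(column.Name);
        }
    }
}
```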
Learning from unorganized data of this kind is very difficult. Machines and algorithms want the data converted into specific formats so that they can learn from it, and mostly these formats are numbers. If someone gives the text "dark violet", a machine cannot understand it. But if we transform it into RGB 148, 0, 211, where the first value stands for red, the second for green, and the last for blue, machines can learn better and more precisely.

So we need to transform the inputs. The features need to be transformed into an array of numbers, or, to use the official term, a vector. This whole process is termed featurization of the text, and its result is a feature vector: the input features are transformed into numbers the machine can understand, that is, vectors.

We need to do the following tasks. First we convert the data to IDataView. Then this data needs to be transformed from unstructured data to numeric vectors. This numeric, vectorized data is then fed to the algorithm to train it. So we have a workflow which involves transforming the text to vectors and then training the algorithm. In ML.NET we define this complete workflow in something called a pipeline. A pipeline is a series of workflow steps which get executed to build the final model.

Let us create a pipeline. Everything we want to access in ML.NET, whether it is a transformation, a data view, or an algorithm, goes through MLContext. On mlContext we say that we want to transform the feedback text, which is the feature, into a vector. This is the first step in the pipeline.

The next step is to feed this vectorized feature, the feature vector, to an algorithm. Choosing the algorithm is the most important step in the machine learning process. Our goal in this lab is to see what the ML.NET API looks like, how to code with ML.NET, and what the different steps in machine learning are. In this first lab we will not go in depth into algorithms and the different types of algorithms; we will do that in the coming labs.
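To make the idea of turning text into a vector concrete, here is a toy word-count featurizer in plain C#. It is only an illustration of the concept under simple assumptions (lower-casing, splitting on spaces and basic punctuation, one vector position per known word); it is not what ML.NET's text featurization does internally, and the ToyFeaturizer class is a hypothetical name used only for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical toy featurizer; it only illustrates text -> numeric vector.
public static class ToyFeaturizer
{
    // Builds a vocabulary: every distinct lower-cased word gets one vector position.
    public static List<string> BuildVocabulary(IEnumerable<string> texts)
    {
        return texts
            .SelectMany(text => Tokenize(text))
            .Distinct()
            .OrderBy(word => word, StringComparer.Ordinal)
            .ToList();
    }

    // Converts one text into a vector of word counts over the vocabulary.
    // Words that are not in the vocabulary are ignored.
    public static float[] ToVector(string text, IReadOnlyList<string> vocabulary)
    {
        var vector = new float[vocabulary.Count];
        foreach (string word in Tokenize(text))
        {
            for (int i = 0; i < vocabulary.Count; i++)
            {
                if (vocabulary[i] == word)
                {
                    vector[i] += 1f;
                }
            }
        }
        return vector;
    }

    private static IEnumerable<string> Tokenize(string text)
    {
        return text.ToLowerInvariant()
            .Split(new[] { ' ', ',', '.', '!', '?' }, StringSplitOptions.RemoveEmptyEntries);
    }
}

public static class Program
{
    public static void Main()
    {
        // Vocabulary in ordinal order: bad, good, is, this
        var vocabulary = ToyFeaturizer.BuildVocabulary(new[] { "this is good", "this is bad" });
        float[] vector = ToyFeaturizer.ToVector("Good good this", vocabulary);
        Console.WriteLine(string.Join(", ", vector)); // prints "0, 2, 0, 1"
    }
}
```

Each position in the array counts one vocabulary word, so two texts that share words end up with similar vectors, which is what lets an algorithm learn from them.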
At this moment we have to make a choice of algorithm. The requirement is that we have a FeedBackText, and this FeedBackText has to be categorized, or classified, into two sections: is this a good feedback or is this a bad feedback? If somebody sends a feedback text saying "Hey, this is nice", it should go into the category IsGood = true. If somebody says "Hey, this is bad", it should go into the category IsGood = false.

So here we will use a classification algorithm, and a binary one at that, meaning we have only two categories: true or false, male or female, yes or no. We do not have a lot of categories, just two.

In the pipeline we append one more step, in which we say we want to give the featurized text to a binary classification algorithm. The first step in the pipeline was a transformation; the second step is a trainer. As always, everything is accessed through MLContext: use BinaryClassification and, from its trainers, FastTree. This specifies the algorithm as a binary classifier, and this binary classifier is internally based on decision trees.

We also need to specify how many trees and how many leaves we want. We want 50 leaves and 50 trees. Trees are where the decisions are made, and the leaves are the final output. At the leaf level we want just one output, one data point, true or false; that is why minDatapointsInLeaves is 1. Later on we will understand what a decision tree is, what binary classification is, what MART is, and so on.

We have now created a pipeline in which two steps happen. One is the transformation, and the other gives the transformed output to the algorithm for training. Once we have defined the pipeline, we next execute it and create a model: take the pipeline, give it the data view, and get the model out of it.
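Put together, the pipeline described above could be sketched as follows. The video targets ML.NET 0.8; this sketch instead assumes ML.NET 1.x with the Microsoft.ML and Microsoft.ML.FastTree NuGet packages installed, and uses the renamed calls and parameters listed in the reader comments on this post (LoadFromEnumerable, FeaturizeText with the output column first, numberOfLeaves, numberOfTrees, minimumExampleCountPerLeaf). Marking IsGood with [ColumnName("Label")] so that the trainer finds its label column, the FeedbackTrainer helper class, and the sample rows are assumptions made for this illustration.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

public class FeedBackTrainingData
{
    public string FeedBackText { get; set; }

    // Assumption: expose IsGood under the trainer's default label column name.
    [ColumnName("Label")]
    public bool IsGood { get; set; }
}

// Hypothetical helper for this sketch; not part of ML.NET.
public static class FeedbackTrainer
{
    public static ITransformer Train(MLContext mlContext, IEnumerable<FeedBackTrainingData> rows)
    {
        IDataView dataView = mlContext.Data.LoadFromEnumerable(rows);

        // Step 1 of the pipeline: featurize the text into a numeric vector ("Features").
        // Step 2 of the pipeline: feed that vector to the FastTree binary classifier.
        var pipeline = mlContext.Transforms.Text
            .FeaturizeText("Features", "FeedBackText")
            .Append(mlContext.BinaryClassification.Trainers.FastTree(
                numberOfLeaves: 50,
                numberOfTrees: 50,
                minimumExampleCountPerLeaf: 1));

        // Executing the pipeline against the data view produces the model.
        return pipeline.Fit(dataView);
    }
}

public static class Program
{
    public static void Main()
    {
        // Illustrative dummy data with repeated occurrences of "good" and "bad".
        var trainingData = new List<FeedBackTrainingData>
        {
            new FeedBackTrainingData { FeedBackText = "this is good", IsGood = true },
            new FeedBackTrainingData { FeedBackText = "this is bad", IsGood = false },
            new FeedBackTrainingData { FeedBackText = "good job", IsGood = true },
            new FeedBackTrainingData { FeedBackText = "bad experience", IsGood = false },
            new FeedBackTrainingData { FeedBackText = "very good", IsGood = true },
            new FeedBackTrainingData { FeedBackText = "very bad", IsGood = false },
            new FeedBackTrainingData { FeedBackText = "it is nice", IsGood = true },
            new FeedBackTrainingData { FeedBackText = "it is horrible", IsGood = false },
            new FeedBackTrainingData { FeedBackText = "good and cool", IsGood = true },
            new FeedBackTrainingData { FeedBackText = "bad and terrible", IsGood = false },
        };

        var mlContext = new MLContext();
        ITransformer model = FeedbackTrainer.Train(mlContext, trainingData);
        Console.WriteLine("Model trained: " + (model != null));
    }
}
```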
This line executes the series of workflow steps we have created, in which we decided how the featurization takes place and which kind of algorithm comes in. Into that workflow we pass the data view which we created from the generic list, and it creates a model. In step 5, then, we are training the algorithm with the training data.

This model is the heart of the machine learning output. It is the final artifact we are interested in. Before using this model for prediction, make sure to test it. We create some test data, and using this test data we check whether the model is good or not. Let us create one more function here which loads some test data. In this test data, wherever we find the word good, it means the feedback is good.
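The testing idea above, running labelled test data through the model and checking how often the prediction matches, can be sketched in plain C#. To keep the snippet free of any ML.NET dependency, the trained model is represented by a stand-in Func<string, bool> that simply looks for the word good; in the real lab this delegate would call the trained model instead. The ModelTester class and the test rows are hypothetical and used only for illustration.

```csharp
using System;
using System.Collections.Generic;

public class FeedBackTrainingData
{
    public string FeedBackText { get; set; }
    public bool IsGood { get; set; }
}

// Hypothetical helper for this sketch; not part of ML.NET.
public static class ModelTester
{
    // Loads a few labelled rows that were not part of the training data.
    public static List<FeedBackTrainingData> LoadTestData()
    {
        return new List<FeedBackTrainingData>
        {
            new FeedBackTrainingData { FeedBackText = "good", IsGood = true },
            new FeedBackTrainingData { FeedBackText = "really good", IsGood = true },
            new FeedBackTrainingData { FeedBackText = "bad", IsGood = false },
            new FeedBackTrainingData { FeedBackText = "horrible", IsGood = false },
        };
    }

    // Returns the fraction of test rows whose predicted label matches the expected label.
    public static double Accuracy(
        IReadOnlyList<FeedBackTrainingData> testData,
        Func<string, bool> predict)
    {
        if (testData.Count == 0)
        {
            throw new ArgumentException("Test data must not be empty.", nameof(testData));
        }

        int correct = 0;
        foreach (FeedBackTrainingData row in testData)
        {
            if (predict(row.FeedBackText) == row.IsGood)
            {
                correct++;
            }
        }
        return (double)correct / testData.Count;
    }
}

public static class Program
{
    public static void Main()
    {
        var testData = ModelTester.LoadTestData();

        // Stand-in for the trained model: predicts "good" when the text contains "good".
        Func<string, bool> standInModel = text => text.Contains("good");

        double accuracy = ModelTester.Accuracy(testData, standInModel);
        Console.WriteLine($"Accuracy: {accuracy:P0}");
    }
}
```

If the accuracy is not up to the mark, the training data can be refined and the algorithm retrained, as described earlier.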

23 thoughts on “ML.NET C# machine learning tutorial for beginners”

  1. Does this still work on v1.2? There's no error, but with 1 input it gives several outputs and it is always false. Does anyone have a copy of this on GitHub?

  2. I'm getting an error T-T please help
    firstly i had to change [Column] to [Columname], as @Lukas said.
    But after doing this, the keyword "ordinal" shows an error: "the best overload for ColumNameAttribute does not have a parameter called ordinal" . There is nothing on the internet. Please help!

  3. Walked through the entire video, it is really great. Congratulations. I would like to ask a question.

    For the above project output, we could use a simple Contains() method in C# or the LIKE operator in SQL to check whether the data exists (then true, else false). So where should we choose an ML.NET project? (I mean, in which kinds of projects can we use ML.NET?)

  4. 22:24 you need to add the Microsoft.ML.FastTree NuGet package in order to get FastTree for ML.NET version 1.1.0, as per this example: https://docs.microsoft.com/en-us/dotnet/api/microsoft.ml.treeextensions.fasttree?view=ml-dotnet-1.1.0

  5. This is a bit outdated with ML.NET version 1.0.. Here are the changes:

    – Instead of mlContext.CreateStreamingDataView() you have to use mlContext.Data.LoadFromEnumerable()
    – mlContext.Transforms.Text.FeaturizeText has a different order for the parameters, so correct for this example would be mlContext.Transforms.Text.FeaturizeText("Features", "FeedbackText")
    – To get mlContext.BinaryClassification.Trainers.FastTree() you also have to use NuGet to install Microsoft.ML.FastTree
    – mlContext.BinaryClassification.Trainers.FastTree() parameter names have changed: numLeaves = numberOfLeaves, numTrees = numberOfTrees, minDataPointsInLeaves = minimumExampleCountPerLeaf
    – The [Column] Property is now [ColumnName] in Microsoft.ML.Data
    – Instead of model.MakePredictionFunction() you now have to use mlContext.Model.CreatePredictionEngine<FeedbackTrainingData, FeedbackPrediction>(model)

    Otherwise, nice tutorial!

  6. You could just say: "Hello" and this would take care of the "worrisome" incertitude of the viewers' Time of Day – won't it ? 🙂

  7. 7:45 I have always heard Nuget pronounced like New Get, instead of Nugget.

    Great video. Since I am going to be replaced by a C# robot, I better I learn how to build C# robots.

  8. What if there are more than just two properties in the training data class? Say there were more than one "Features"? How do you get the model to take the extra properties into consideration? Do you just Append more "Feature" categories to the pipeline?

  9. The way you explain each step is really good, you are one of the great mentors to learn and understand any technology.

  10. Feel like you're very happy when talking about shit and shitty man 😀

    Nice video about ML.net.

    IMO, this library is not well designed, so its methods are messed up.
