From 3818f807a1899849b4e3079f82d76d7b8a76127f Mon Sep 17 00:00:00 2001 From: Genevieve Warren <24882762+gewarren@users.noreply.github.com> Date: Mon, 26 Jun 2023 18:11:40 -0700 Subject: [PATCH] clean up metadata --- .../ScalableMLModelOnAzureFunction/README.md | 17 +++----- .../AnomalyDetection_PhoneCalls/README.md | 13 ++++-- .../README.md | 15 +++---- .../README.md | 43 +++++++++---------- .../Forecasting_BikeSharingDemand/README.md | 6 +-- .../csharp/getting-started/MLNET2/README.md | 10 ++--- .../README.md | 34 +++++++-------- .../README.md | 3 -- .../README.md | 4 -- .../ObjectDetection_StopSigns/README.md | 3 -- .../StopSignDetection/README.md | 1 - .../README.md | 28 ++++++------ 12 files changed, 75 insertions(+), 102 deletions(-) diff --git a/samples/csharp/end-to-end-apps/ScalableMLModelOnAzureFunction/README.md b/samples/csharp/end-to-end-apps/ScalableMLModelOnAzureFunction/README.md index 664550b98..4f50ef1fa 100644 --- a/samples/csharp/end-to-end-apps/ScalableMLModelOnAzureFunction/README.md +++ b/samples/csharp/end-to-end-apps/ScalableMLModelOnAzureFunction/README.md @@ -7,17 +7,14 @@ languages: - csharp products: - dotnet -- dotnet-core -- vs - azure - azure-functions - mlnet --- -# Azure Functions Sentiment Analysis Sample - -This sample highlights dependency injection in conjunction with the **.NET Core Integration Package** to build a scalable, serverless Azure Functions application. +# Azure Functions Sentiment Analysis Sample +This sample highlights dependency injection in conjunction with the **.NET Core Integration Package** to build a scalable, serverless Azure Functions application. | ML.NET version | Status | App Type | Data type | Scenario | ML Task | Algorithms | |----------------|-------------------------------|-------------|-----------|---------------------|---------------------------|-----------------------------| @@ -25,15 +22,15 @@ This sample highlights dependency injection in conjunction with the **.NET Core For a detailed explanation of how to build this application, see the accompanying [how-to guide](https://docs.microsoft.com/en-us/dotnet/machine-learning/how-to-guides/serve-model-serverless-azure-functions-ml-net) on the Microsoft Docs site. -# Goal +## Goal The goal is to be able to predict sentiment using an HTTP triggered Azure Functions serverless application. -# Problem +## Problem The problem with running/scoring an ML.NET model in multi-threaded applications comes when you want to do single predictions with the PredictionEngine object and you want to cache that object (i.e. as Singleton) so it is being reused by multiple Http requests (therefore it would be accessed by multiple threads). This is a problem because **the Prediction Engine is not thread-safe** ([ML.NET issue, Nov 2018](https://github.com/dotnet/machinelearning/issues/1718)) -# Solution +## Solution This is an Azure Functions application optimized for scalability and performance when running/scoring an ML.NET model. It uses dependency injection and the .NET Core Integration Package. @@ -74,7 +71,7 @@ Basically, with this component, you register the `PredictionEnginePool` in a sin .FromFile(modelName: "SentimentAnalysisModel", filePath:"MLModels/sentiment_model.zip", watchForChanges: true); ``` -In the example above, by setting the `watchForChanges` parameter to `true`, the `PredictionEnginePool` starts a `FileSystemWatcher` that listens to the file system change notifications and raises events when there is a change to the file. This prompts the `PredictionEnginePool` to automatically reload the model without having to redeploy the application. The model is also given a name using the `modelName` parameter. In the event you have multiple models hosted in your application, this is a way of referencing them. +In the example above, by setting the `watchForChanges` parameter to `true`, the `PredictionEnginePool` starts a `FileSystemWatcher` that listens to the file system change notifications and raises events when there is a change to the file. This prompts the `PredictionEnginePool` to automatically reload the model without having to redeploy the application. The model is also given a name using the `modelName` parameter. In the event you have multiple models hosted in your application, this is a way of referencing them. Then you just need to need to inject the `PredictionEnginePool` inside the respective Azure Function constructor: @@ -97,7 +94,7 @@ For a much more detailed explanation of a PredictionEngine object pool comparabl [How to optimize and run ML.NET models on scalable ASP.NET Core WebAPIs or web apps](https://devblogs.microsoft.com/cesardelatorre/how-to-optimize-and-run-ml-net-models-on-scalable-asp-net-core-webapis-or-web-apps/) -**NOTE:** You don't need to make the implementation explained in the blog post. Precisely that functionality is implemented for you in the .NET Integration Package. +**NOTE:** You don't need to make the implementation explained in the blog post. Precisely that functionality is implemented for you in the .NET Integration Package. ## Test the application locally diff --git a/samples/csharp/getting-started/AnomalyDetection_PhoneCalls/README.md b/samples/csharp/getting-started/AnomalyDetection_PhoneCalls/README.md index e83f6627f..c68ba12fb 100644 --- a/samples/csharp/getting-started/AnomalyDetection_PhoneCalls/README.md +++ b/samples/csharp/getting-started/AnomalyDetection_PhoneCalls/README.md @@ -7,8 +7,6 @@ languages: - csharp products: - dotnet -- dotnet-core -- vs - mlnet --- @@ -21,15 +19,18 @@ products: In this introductory sample, you'll see how to use [ML.NET](https://www.microsoft.com/net/learn/apps/machine-learning-and-ai/ml-dotnet) to detect **anomalies** in a series of number of calls data. In the world of machine learning, this type of task is called TimeSeries Anomaly Detection. ## Problem + We are having data on number of calls over 10 weeks with daily granularity. The data itself has a periodical pattern as the volumn of calls is large is weekdays and small in weekends. We want to find those points that fall out of the regular pattern of the series. In the world of machine learning, this type of task is called Time-Series anomaly detection. To solve this problem, we will build an ML model that takes as inputs: + * Date * Number of calls. and outputs the anomalies in the number of calls. ## Dataset + We have created sample dataset for number of calls. The dataset `phone_calls.csv` can be found [here](./SrCnnEntireDetection/Data/phone_calls.csv) Format of **Phone Calls DataSet** looks like below. @@ -47,9 +48,11 @@ Format of **Phone Calls DataSet** looks like below. The data in Phone Calls dataset is collected in real world transactions with normalization and rescale transformation. ## ML task - Time Series Anomaly Detection + Anomaly detection is the process of detecting outliers in the data. Anomaly detection in time-series refers to detecting time stamps, or points on a given input time-series, at which the time-series behaves differently from what was expected. These deviations are typically indicative of some events of interest in the problem domain: a cyber-attack on user accounts, power outage, bursting RPS on a server, memory leak, etc. ## Solution + To solve this problem, first, we should determine the period of the series. Second, we can extract the periodical component of the series and apply anomaly detection on the residual part of the series. In ML.net, we could use the detect seasonality function to find the period of a given series. Given the period, the STL algorithm decompose the time-series into three components as `Y = T + S + R`, where `Y` is the original series, `T` is the trend component, `S` is the seasonal componnent and `R` is the residual component of the series(Refer to [this](http://www.nniiem.ru/file/news/2016/stl-statistical-model.pdf) paper for more details on this algorithm). Then, SR-CNN detector is applied to detect anomaly on `R` to capture the anomalies(Refer to [this](https://arxiv.org/pdf/1906.03821.pdf) paper for more details on this algorithm). ![Detect-Anomaly-Pipeline](docs/images/detect-anomaly-pipeline.png) @@ -67,6 +70,7 @@ int period = mlContext.AnomalyDetection.DetectSeasonality(dataView, inputColumnN ### 2. Detect Anomaly First, we need to specify the parameters used for SrCnnEntire detector(Please refer to [here](https://docs.microsoft.com/en-us/dotnet/api/microsoft.ml.timeseriescatalog.detectentireanomalybysrcnn?view=ml-dotnet#Microsoft_ML_TimeSeriesCatalog_DetectEntireAnomalyBySrCnn_Microsoft_ML_AnomalyDetectionCatalog_Microsoft_ML_IDataView_System_String_System_String_System_Double_System_Int32_System_Double_Microsoft_ML_TimeSeries_SrCnnDetectMode_) for the details on the parameters). Then, we invoke the detector and obtain a view of the output data. + ```CSharp var options = new SrCnnEntireAnomalyDetectorOptions() { @@ -79,7 +83,8 @@ var outputDataView = mlContext.AnomalyDetection.DetectEntireAnomalyBySrCnn(dataV ``` ### 3. Consume results -The result can be retrived by simply enumerate the result. `Anomaly`, `ExpectedValue`, `UpperBoundary` and `LowerBoundary` are some of the useful output columns. + +The result can be retrieved by simply enumerate the result. `Anomaly`, `ExpectedValue`, `UpperBoundary` and `LowerBoundary` are some of the useful output columns. ```CSharp //STEP 5: Get the detection results as an IEnumerable @@ -135,5 +140,5 @@ foreach (var p in predictions) //25,0,0,0.018746201354033914,29.381125690882463,32.92296779138513,33.681408258162854,25.080843123602072 //26,0,0,0.0141022037992637,5.261543539820418,32.92296779138513,9.561826107100808,0.9612609725400283 //27,0,0,0.013396001938040617,5.4873712582971805,32.92296779138513,9.787653825577571,1.1870886910167897 -//28,1,0.4971326063712256,0.3521692757832201,36.504694001629254,32.92296779138513,40.804976568909645,32.20441143434886 < --alert is on, detecte anomaly +//28,1,0.4971326063712256,0.3521692757832201,36.504694001629254,32.92296779138513,40.804976568909645,32.20441143434886 < --alert is on, detected anomaly ``` diff --git a/samples/csharp/getting-started/DeepLearning_ImageClassification_Binary/README.md b/samples/csharp/getting-started/DeepLearning_ImageClassification_Binary/README.md index 1a811369c..d7a8b5eb8 100644 --- a/samples/csharp/getting-started/DeepLearning_ImageClassification_Binary/README.md +++ b/samples/csharp/getting-started/DeepLearning_ImageClassification_Binary/README.md @@ -7,8 +7,6 @@ languages: - csharp products: - dotnet -- dotnet-core -- vs - mlnet --- @@ -68,7 +66,7 @@ class ImageData class ModelInput { public byte[] Image { get; set; } - + public UInt32 LabelAsKey { get; set; } public string ImagePath { get; set; } @@ -92,7 +90,7 @@ class ModelOutput ## Load the data -1. Before loading the data, it needs to be formatted into a list of `ImageInput` objects. To do so, create a data loading utility method `LoadImagesFromDirectory`. +1. Before loading the data, it needs to be formatted into a list of `ImageInput` objects. To do so, create a data loading utility method `LoadImagesFromDirectory`. ```csharp public static IEnumerable LoadImagesFromDirectory(string folder, bool useFolderNameAsLabel = true) @@ -198,13 +196,14 @@ var trainingPipeline = mlContext.MulticlassClassification.Trainers.ImageClassifi ## Train the model -Apply the data to the training pipeline. +Apply the data to the training pipeline. ``` ITransformer trainedModel = trainingPipeline.Fit(trainSet); ``` ## Use the model + 1. Create a utility method to display predictions. ```csharp @@ -217,7 +216,7 @@ private static void OutputPrediction(ModelOutput prediction) ### Classify a single image -1. Make predictions on the test set using the trained model. Create a utility method called `ClassifySingleImage`. +1. Make predictions on the test set using the trained model. Create a utility method called `ClassifySingleImage`. ```csharp public static void ClassifySingleImage(MLContext mlContext, IDataView data, ITransformer trainedModel) @@ -241,7 +240,7 @@ ClassifySingleImage(mlContext, testSet, trainedModel); ### Classify multiple images -1. Make predictions on the test set using the trained model. Create a utility method called `ClassifyImages`. +1. Make predictions on the test set using the trained model. Create a utility method called `ClassifyImages`. ```csharp public static void ClassifyImages(MLContext mlContext, IDataView data, ITransformer trainedModel) @@ -302,7 +301,7 @@ Image: 7001-77.jpg | Actual Value: UD | Predicted Value: UD ## Improve the model -- More Data: The more examples a model learns from, the better it performs. Download the full [SDNET2018 dataset](https://digitalcommons.usu.edu/cgi/viewcontent.cgi?filename=2&article=1047&context=all_datasets&type=additional) and use it to train. +- More Data: The more examples a model learns from, the better it performs. Download the full [SDNET2018 dataset](https://digitalcommons.usu.edu/cgi/viewcontent.cgi?filename=2&article=1047&context=all_datasets&type=additional) and use it to train. - Augment the data: A common technique to add variety to the data is to augment the data by taking an image and applying different transforms (rotate, flip, shift, crop). This adds more varied examples for the model to learn from. - Train for a longer time: The longer you train, the more tuned the model will be. Increasing the number of epochs may improve the performance of your model. - Experiment with the hyper-parameters: In addition to the parameters used in this tutorial, other parameters can be tuned to potentially improve performance. Changing the learning rate, which determines the magnitude of updates made to the model after each epoch may improve performance. diff --git a/samples/csharp/getting-started/DeepLearning_ObjectDetection_Onnx/README.md b/samples/csharp/getting-started/DeepLearning_ObjectDetection_Onnx/README.md index 05061c30a..7f9167d89 100644 --- a/samples/csharp/getting-started/DeepLearning_ObjectDetection_Onnx/README.md +++ b/samples/csharp/getting-started/DeepLearning_ObjectDetection_Onnx/README.md @@ -7,8 +7,6 @@ languages: - csharp products: - dotnet -- dotnet-core -- vs - mlnet --- @@ -18,34 +16,35 @@ products: |----------------|-------------------|-------------------------------|-------------|-----------|---------------------|---------------------------|-----------------------------| | v1.6.0 | Dynamic API | Up-to-date | Console app | image files | Object Detection | Deep Learning | Tiny Yolo2 ONNX model | +For a detailed explanation of how to build this application, see the accompanying [tutorial](https://docs.microsoft.com/en-us/dotnet/machine-learning/tutorials/object-detection-onnx) on the Microsoft Docs site. -For a detailed explanation of how to build this application, see the accompanying [tutorial](https://docs.microsoft.com/en-us/dotnet/machine-learning/tutorials/object-detection-onnx) on the Microsoft Docs site. +## Problem -## Problem -Object detection is one of the classical problems in computer vision: Recognize what the objects are inside a given image and also where they are in the image. For these cases, you can either use pre-trained models or train your own model to classify images specific to your custom domain. +Object detection is one of the classical problems in computer vision: Recognize what the objects are inside a given image and also where they are in the image. For these cases, you can either use pre-trained models or train your own model to classify images specific to your custom domain. - ## DataSet + The dataset contains images which are located in the [assets](./ObjectDetectionConsoleApp/assets/images) folder. These images are taken from [wikimedia commons site](https://commons.wikimedia.org/wiki/Main_Page). Go to [Wikimediacommon.md](./ObjectDetectionConsoleApp/assets/images/wikimedia.md) to refer to the image urls and their licenses. ## Pre-trained model + There are multiple models which are pre-trained for identifying multiple objects in the images. here we are using the pretrained model, **Tiny Yolo2** in **ONNX** format. This model is a real-time neural network for object detection that detects 20 different classes. It is made up of 9 convolutional layers and 6 max-pooling layers and is a smaller version of the more complex full [YOLOv2](https://pjreddie.com/darknet/yolov2/) network. The Open Neural Network Exchange i.e [ONNX](http://onnx.ai/) is an open format to represent deep learning models. With ONNX, developers can move models between state-of-the-art tools and choose the combination that is best for them. ONNX is developed and supported by a community of partners. The model is downloaded from the [ONNX Model Zoo](https://github.com/onnx/models) which is a is a collection of pre-trained, state-of-the-art models in the ONNX format. -The Tiny YOLO2 model was trained on the [Pascal VOC](http://host.robots.ox.ac.uk/pascal/VOC/) dataset. Below are the model's prerequisites. +The Tiny YOLO2 model was trained on the [Pascal VOC](http://host.robots.ox.ac.uk/pascal/VOC/) dataset. Below are the model's prerequisites. **Model input and output** **Input** -Input image of the shape (3x416x416) +Input image of the shape (3x416x416) **Output** -Output is a (1x125x13x13) array +Output is a (1x125x13x13) array **Pre-processing steps** @@ -55,25 +54,25 @@ Resize the input image to a (3x416x416) array of type float32. The output is a (125x13x13) tensor where 13x13 is the number of grid cells that the image gets divided into. Each grid cell corresponds to 125 channels, made up of the 5 bounding boxes predicted by the grid cell and the 25 data elements that describe each bounding box (5x25=125). For more information on how to derive the final bounding boxes and their corresponding confidence scores, refer to this [post](http://machinethink.net/blog/object-detection-with-yolo/). - ## Solution -The console application project `ObjectDetection` can be used to to identify objects in the sample images based on the **Tiny Yolo2 ONNX** model. -Again, note that this sample only uses/consumes a pre-trained ONNX model with ML.NET API. Therefore, it does **not** train any ML.NET model. Currently, ML.NET supports only for scoring/detecting with existing ONNX trained models. +The console application project `ObjectDetection` can be used to to identify objects in the sample images based on the **Tiny Yolo2 ONNX** model. + +Again, note that this sample only uses/consumes a pre-trained ONNX model with ML.NET API. Therefore, it does **not** train any ML.NET model. Currently, ML.NET supports only for scoring/detecting with existing ONNX trained models. You need to follow next steps in order to execute the classification test: 1) **Set VS default startup project:** Set `ObjectDetection` as starting project in Visual Studio. -2) **Run the training model console app:** Hit F5 in Visual Studio. At the end of the execution, the output will be similar to this screenshot: +2) **Run the training model console app:** Hit F5 in Visual Studio. At the end of the execution, the output will be similar to this screenshot: ![image](./docs/Output/Console_output.png) +## Code Walkthrough -## Code Walkthrough There is a single project in the solution named `ObjectDetection`, which is responsible for loading the model in Tiny Yolo2 ONNX format and then detects objects in the images. ### ML.NET: Model Scoring -Define the schema of data in a class type and refer that type while loading data using TextLoader. Here the class type is **ImageNetData**. +Define the schema of data in a class type and refer that type while loading data using TextLoader. Here the class type is **ImageNetData**. ```csharp public class ImageNetData @@ -102,9 +101,9 @@ The first step is to create an empty dataview as we just need schema of data whi var data = mlContext.Data.LoadFromTextFile(imagesLocation, hasHeader: true); ``` -The image file used to load images has two columns: the first one is defined as `ImagePath` and the second one is the `Label` corresponding to the image. +The image file used to load images has two columns: the first one is defined as `ImagePath` and the second one is the `Label` corresponding to the image. -It is important to highlight that the `Label` in the `ImageNetData` class is not really used when scoring with the Tiny Yolo2 Onnx model. It is used when to print the labels on the console. +It is important to highlight that the `Label` in the `ImageNetData` class is not really used when scoring with the Tiny Yolo2 Onnx model. It is used when to print the labels on the console. The second step is to define the estimator pipeline. Usually, when dealing with deep neural networks, you must adapt the images to the format expected by the network. This is the reason images are resized and then transformed (mainly, pixel values are normalized across all R,G,B channels). @@ -115,7 +114,7 @@ var pipeline = mlContext.Transforms.LoadImages(outputColumnName: "image", imageF .Append(mlContext.Transforms.ApplyOnnxModel(modelFile: modelLocation, outputColumnNames: new[] { TinyYoloModelSettings.ModelOutput }, inputColumnNames: new[] { TinyYoloModelSettings.ModelInput })); ``` -You also need to check the neural network, and check the names of the input / output nodes. In order to inspect the model, you can use tools like [Netron](https://github.com/lutzroeder/netron), which is automatically installed with [Visual Studio Tools for AI](https://visualstudio.microsoft.com/downloads/ai-tools-vs/). +You also need to check the neural network, and check the names of the input / output nodes. In order to inspect the model, you can use tools like [Netron](https://github.com/lutzroeder/netron), which is automatically installed with [Visual Studio Tools for AI](https://visualstudio.microsoft.com/downloads/ai-tools-vs/). These names are used later in the definition of the estimation pipe: in the case of the inception network, the input tensor is named 'image' and the output is named 'grid' Define the **input** and **output** parameters of the Tiny Yolo2 Onnx Model. @@ -124,7 +123,7 @@ Define the **input** and **output** parameters of the Tiny Yolo2 Onnx Model. public struct TinyYoloModelSettings { // for checking TIny yolo2 Model input and output parameter names, - //you can use tools like Netron, + //you can use tools like Netron, // which is installed by Visual Studio AI Tools // input tensor name @@ -137,7 +136,7 @@ public struct TinyYoloModelSettings ![inspecting neural network with netron](./docs/Netron/netron.PNG) -Finally, we return the trained model after *fitting* the estimator pipeline. +Finally, we return the trained model after *fitting* the estimator pipeline. ```csharp var model = pipeline.Fit(data); @@ -147,7 +146,7 @@ When obtaining the prediction, we get an array of floats in the property `Predic # Detect objects in the image: -After the model is configured, we need to pass the image to the model to detect objects. When obtaining the prediction, we get an array of floats in the property `PredictedLabels`. The array is a float array of size **21125**. This is the output of model i,e 125x13x13 as discussed earlier. This output is interpreted by `YoloOutputParser` class and returns a number of bounding boxes for each image. Again these boxes are filtered so that we retrieve only 5 bounding boxes which have better confidence(how much certain that a box contains the obejct) for each object of the image. +After the model is configured, we need to pass the image to the model to detect objects. When obtaining the prediction, we get an array of floats in the property `PredictedLabels`. The array is a float array of size **21125**. This is the output of model i,e 125x13x13 as discussed earlier. This output is interpreted by `YoloOutputParser` class and returns a number of bounding boxes for each image. Again these boxes are filtered so that we retrieve only 5 bounding boxes which have better confidence(how much certain that a box contains the obejct) for each object of the image. ```csharp IEnumerable probabilities = modelScorer.Score(imageDataView); @@ -161,5 +160,3 @@ var boundingBoxes = ``` **Note** The Tiny Yolo2 model is not having much accuracy compare to full YOLO2 model. As this is a sample program we are using Tiny version of Yolo model i.e Tiny_Yolo2 - - diff --git a/samples/csharp/getting-started/Forecasting_BikeSharingDemand/README.md b/samples/csharp/getting-started/Forecasting_BikeSharingDemand/README.md index 251d36437..15c7f5912 100644 --- a/samples/csharp/getting-started/Forecasting_BikeSharingDemand/README.md +++ b/samples/csharp/getting-started/Forecasting_BikeSharingDemand/README.md @@ -7,8 +7,6 @@ languages: - csharp products: - dotnet -- dotnet-core -- vs - mlnet --- @@ -18,7 +16,7 @@ products: |----------------|-------------------|-------------------------------|-------------|-----------|---------------------|---------------------------|-----------------------------| | v1.6.0 | Dynamic API | Up-to-date | Console app | SQL Server | Demand prediction | Forecasting | Single Spectrum Analysis | -In this sample, you can see how to load data from a relational database using the Database Loader to train a forecasting model that predicts bike rental demand. +In this sample, you can see how to load data from a relational database using the Database Loader to train a forecasting model that predicts bike rental demand. For a detailed explanation of how to build this application, see the accompanying [tutorial](https://docs.microsoft.com/dotnet/machine-learning/tutorials/time-series-demand-forecasting) on the Microsoft Docs site. @@ -31,7 +29,7 @@ Bike Sharing Demand competition from Kaggle](https://www.kaggle.com/c/bike-shari The data used in this sample comes from the [UCI Bike Sharing Dataset](https://archive.ics.uci.edu/ml/datasets/bike+sharing+dataset). Fanaee-T, Hadi, and Gama, Joao, 'Event labeling combining ensemble detectors and background knowledge', Progress in Artificial Intelligence (2013): pp. 1-15, Springer Berlin Heidelberg, [Web Link](https://link.springer.com/article/10.1007%2Fs13748-013-0040-3). -The original dataset contains several columns corresponding to seasonality and weather. For brevity and because the technique used in this sample only requires the values from a single numerical column, the original dataset has been enhanced to include only the following columns: +The original dataset contains several columns corresponding to seasonality and weather. For brevity and because the technique used in this sample only requires the values from a single numerical column, the original dataset has been enhanced to include only the following columns: - **dteday**: The date of the observation. - **year**: The encoded year of the observation (0=2011, 1=2012). diff --git a/samples/csharp/getting-started/MLNET2/README.md b/samples/csharp/getting-started/MLNET2/README.md index d72d859fd..184bd30aa 100644 --- a/samples/csharp/getting-started/MLNET2/README.md +++ b/samples/csharp/getting-started/MLNET2/README.md @@ -7,16 +7,12 @@ languages: - csharp products: - dotnet -- dotnet-core - aspnet-core -- vs -- vs-ide - mlnet --- # ML.NET 2.0 Samples - | ML.NET version | Status | App Type | Data type | Scenario | ML Task | Algorithms | |----------------|-------------------------------|-------------|-----------|---------------------|---------------------------|-----------------------------| | v2.0.0 | Up-to-date | Console App | .csv file | AutoML, Text Classification, Sentence Similarity | Regression,Text Classification,Sentence Similarity | Sdca, NAS-BERT | @@ -35,11 +31,11 @@ The samples in this directory use the following datasets: To use these samples, download the datasets above and place them in the *Data* directory. -In Visual Studio, set any of the projects as the [Startup project and run the application](https://learn.microsoft.com/visualstudio/get-started/csharp/run-program?view=vs-2022). +In Visual Studio, set any of the projects as the [Startup project and run the application](https://learn.microsoft.com/visualstudio/get-started/csharp/run-program?view=vs-2022). **dotnet CLI** -You may have to update the `dataPath` in the console apps. Then, in the terminal, navigate to the project directory and enter `dotnet run`. +You may have to update the `dataPath` in the console apps. Then, in the terminal, navigate to the project directory and enter `dotnet run`. ## Samples @@ -61,4 +57,4 @@ You may have to update the `dataPath` in the console apps. Then, in the terminal - [**TextClassification**](TextClassification/Program.cs) - C# console app that shows how to use the [Text Classification API](https://devblogs.microsoft.com/dotnet/introducing-the-ml-dotnet-text-classification-api-preview/) for inferencing using code generated by Model Builder. The model is trained using [Model Builder](https://dotnet.microsoft.com/apps/machinelearning-ai/ml-dotnet/model-builder). - [**TextClassification_Sentiment_Razor**](../../../modelbuilder/TextClassification_Sentiment_Razor/README.md) - ASP.NET Core Razor Pages application for sentiment analysis. Code sample for [Analyze sentiment of website comments in a web application using ML.NET Model Builder tutorial](https://learn.microsoft.com/en-us/dotnet/machine-learning/tutorials/sentiment-analysis-model-builder). Model is trained using Model Builder. -- [**SentenceSimilarity**](SentenceSimilarity/Program.cs) - C# console app that shows how to use the Sentence Similarity API. Like the Text Classification API, the Sentence Similarity API uses a NAS-BERT transformer-based deep learning model built with [TorchSharp](https://github.com/dotnet/torchsharp) to compare how similar two pieces of text are. \ No newline at end of file +- [**SentenceSimilarity**](SentenceSimilarity/Program.cs) - C# console app that shows how to use the Sentence Similarity API. Like the Text Classification API, the Sentence Similarity API uses a NAS-BERT transformer-based deep learning model built with [TorchSharp](https://github.com/dotnet/torchsharp) to compare how similar two pieces of text are. diff --git a/samples/modelbuilder/BinaryClassification_Sentiment_Razor/README.md b/samples/modelbuilder/BinaryClassification_Sentiment_Razor/README.md index f2c2ff712..1df1f1647 100644 --- a/samples/modelbuilder/BinaryClassification_Sentiment_Razor/README.md +++ b/samples/modelbuilder/BinaryClassification_Sentiment_Razor/README.md @@ -7,16 +7,12 @@ languages: - csharp products: - dotnet -- dotnet-core - aspnet-core -- vs -- vs-ide - mlnet --- # Sentiment Analysis: Razor Pages sample optimized for scalability and performance when running/scoring an ML.NET model built with Model Builder (Using the new '.NET Core Integration Package for ML.NET') - | ML.NET version | Status | App Type | Data type | Scenario | ML Task | Algorithms | |----------------|-------------------------------|-------------|-----------|---------------------|---------------------------|-----------------------------| | v1.3.1 | Up-to-date | Razor Pages | Single data sample | Sentiment Analysis | Binary classification | Linear Classification | @@ -29,7 +25,7 @@ Create a *Razor Pages* web application that hosts an ML.NET binary classificatio ## Application -- SentimentRazor: A .NET Core Razor Pages web application that uses a binary classification model to analyze sentiment from comments made on the website. +- SentimentRazor: A .NET Core Razor Pages web application that uses a binary classification model to analyze sentiment from comments made on the website. - SentimentRazorML.ConsoleApp: A .NET Core Console application that contains the model training and test prediction code. - SentimentRazorML.Model: A .NET Standard class library containing the data models that define the schema of input and output model data as well as the persisted version of the best performing model during training. @@ -53,19 +49,19 @@ Model Builder uses automated machine learning (AutoML) to explore different mach You don't need machine learning expertise to use Model Builder. All you need is some data, and a problem to solve. Model Builder generates the code to add the model to your .NET application. -In this solution, both the *SentimentRazorML.ConsoleApp* and *SentimentRazorML.Model* projects are autogenerated by Model Builder. +In this solution, both the *SentimentRazorML.ConsoleApp* and *SentimentRazorML.Model* projects are autogenerated by Model Builder. ### The web application -Users interact with the application through a Razor Pages website. In a text box on the main page of the application, a user enters a comment which triggers a handler on the page's model to use the input to predict the sentiment of the comment using the trained model. +Users interact with the application through a Razor Pages website. In a text box on the main page of the application, a user enters a comment which triggers a handler on the page's model to use the input to predict the sentiment of the comment using the trained model. #### Challenges -A challenge when making a single prediction with an ML.NET model in multi-threaded applications is that the PredictionEngine is not thread-safe. +A challenge when making a single prediction with an ML.NET model in multi-threaded applications is that the PredictionEngine is not thread-safe. #### Solutions -For improved performance and thread safety, use the `PredictionEnginePool` service, which creates an `ObjectPool` of `PredictionEngine` objects for application use. To use it within your application, add the `Microsoft.Extensions.ML` NuGet package to your project and register the `PredictionEnginPool` as you would any other dependency inside the `Startup` class of the *SentimentRazorML* project. +For improved performance and thread safety, use the `PredictionEnginePool` service, which creates an `ObjectPool` of `PredictionEngine` objects for application use. To use it within your application, add the `Microsoft.Extensions.ML` NuGet package to your project and register the `PredictionEnginPool` as you would any other dependency inside the `Startup` class of the *SentimentRazorML* project. ```csharp services.AddPredictionEnginePool() @@ -80,31 +76,31 @@ var prediction = _predictionEnginePool.Predict(input); ## Try a different dataset -If you want to try out the application with a dataset that produces better results such as the UCI Sentiment Labeled Sentences dataset, you can make the following adjustments. +If you want to try out the application with a dataset that produces better results such as the UCI Sentiment Labeled Sentences dataset, you can make the following adjustments. ### Get the data -1. Download [UCI Sentiment Labeled Sentences dataset ZIP file](https://archive.ics.uci.edu/ml/machine-learning-databases/00331/sentiment%20labelled%20sentences.zip) anywhere on your computer, and unzip it. -1. Open PowerShell and navigate to the unzipped folder in the previous step -1. By default, the file does not have column names. To add column names to the training data, use the following PowerShell commands: +1. Download [UCI Sentiment Labeled Sentences dataset ZIP file](https://archive.ics.uci.edu/ml/machine-learning-databases/00331/sentiment%20labelled%20sentences.zip) anywhere on your computer, and unzip it. +1. Open PowerShell and navigate to the unzipped folder in the previous step. +1. By default, the file does not have column names. To add column names to the training data, use the following PowerShell commands: - ```powershell - echo "Comment`tSentiment" | sc yelp_labelled_columns.tsv; cat yelp_labelled.tsv | sc yelp_labelled_columns.tsv + ```powershell + echo "Comment`tSentiment" | sc yelp_labelled_columns.tsv; cat yelp_labelled.tsv | sc yelp_labelled_columns.tsv ``` -The output generated by the previous commands is a new file called *yelp_labelled_columns.tsv* containing the original data with the respective column names. +The output generated by the previous commands is a new file called *yelp_labelled_columns.tsv* containing the original data with the respective column names. Each row in the *yelp_labelled_columns.tsv* dataset represents a different restaurant review left by a user on Yelp. The first column represents the comment left by the user, and the second column represents the sentiment of the text (0 is negative, 1 positive). The columns are separated by tabs. The data looks like the following: | Comment | Sentiment | | :---: | :---: | -Wow... Loved this place.| 1 -Crust is not good. | 0 +Wow... Loved this place.| 1 +Crust is not good. | 0 Not tasty and the texture was just nasty. | 0 ### Train and use the model -1. Use model builder to train a binary classification model using the new dataset. +1. Use model builder to train a binary classification model using the new dataset. 2. Update the `OnGetAnalyzeSentiment` handler in the *Index.cshtml.cs* file. ```csharp diff --git a/samples/modelbuilder/ImageClassification_Azure_LandUse/README.md b/samples/modelbuilder/ImageClassification_Azure_LandUse/README.md index 58df263e0..e04b991bf 100644 --- a/samples/modelbuilder/ImageClassification_Azure_LandUse/README.md +++ b/samples/modelbuilder/ImageClassification_Azure_LandUse/README.md @@ -7,10 +7,7 @@ languages: - csharp products: - dotnet -- dotnet-core - aspnet-core -- vs -- vs-ide - mlnet --- diff --git a/samples/modelbuilder/MulticlassClassification_RestaurantViolations/README.md b/samples/modelbuilder/MulticlassClassification_RestaurantViolations/README.md index 3fdc47189..23e650f28 100644 --- a/samples/modelbuilder/MulticlassClassification_RestaurantViolations/README.md +++ b/samples/modelbuilder/MulticlassClassification_RestaurantViolations/README.md @@ -7,16 +7,12 @@ languages: - csharp products: - dotnet -- dotnet-core - aspnet-core -- vs -- vs-ide - mlnet --- # Restaurant Violation Inspections - | ML.NET version | Status | App Type | Data type | Scenario | ML Task | Algorithms | |----------------|-------------------------------|-------------|-----------|---------------------|---------------------------|-----------------------------| | v1.6.0 | Up-to-date | Console App | Single data sample | Issue Classification | Multiclass classification | Linear Classification | diff --git a/samples/modelbuilder/ObjectDetection_StopSigns/README.md b/samples/modelbuilder/ObjectDetection_StopSigns/README.md index 09c2826bb..e3ed069ce 100644 --- a/samples/modelbuilder/ObjectDetection_StopSigns/README.md +++ b/samples/modelbuilder/ObjectDetection_StopSigns/README.md @@ -7,10 +7,7 @@ languages: - csharp products: - dotnet -- dotnet-core - aspnet-core -- vs -- vs-ide - mlnet --- diff --git a/samples/modelbuilder/ObjectDetection_StopSigns/StopSignDetection/README.md b/samples/modelbuilder/ObjectDetection_StopSigns/StopSignDetection/README.md index 283e34178..cdde9b3fc 100644 --- a/samples/modelbuilder/ObjectDetection_StopSigns/StopSignDetection/README.md +++ b/samples/modelbuilder/ObjectDetection_StopSigns/StopSignDetection/README.md @@ -7,7 +7,6 @@ languages: - csharp products: - dotnet -- dotnet-core - vs --- diff --git a/samples/modelbuilder/TextClassification_Sentiment_Razor/README.md b/samples/modelbuilder/TextClassification_Sentiment_Razor/README.md index 62bc5f0e7..38af4e250 100644 --- a/samples/modelbuilder/TextClassification_Sentiment_Razor/README.md +++ b/samples/modelbuilder/TextClassification_Sentiment_Razor/README.md @@ -7,16 +7,12 @@ languages: - csharp products: - dotnet -- dotnet-core - aspnet-core -- vs -- vs-ide - mlnet --- # Sentiment Analysis: Razor Pages sample optimized for scalability and performance when running/scoring an ML.NET model built with Model Builder (Using the new Text Classification API) - | ML.NET version | Status | App Type | Data type | Scenario | ML Task | Algorithms | |----------------|-------------------------------|-------------|-----------|---------------------|---------------------------|-----------------------------| | v2.0.0 | Up-to-date | Razor Pages | Single data sample | Text classification | Text Classification | NAS-BERT | @@ -27,7 +23,7 @@ Create a *Razor Pages* web application that hosts an ML.NET deep learning text c ## Application -- SentimentRazor: A .NET Core Razor Pages web application that uses a deep learning text classification model to analyze sentiment from comments made on the website. +- SentimentRazor: A .NET Core Razor Pages web application that uses a deep learning text classification model to analyze sentiment from comments made on the website. ### The data @@ -59,35 +55,35 @@ In this solution, both the *SentimentAnalysis.training.cs* and *SentimentAnalysi ### The web application -Users interact with the application through a Razor Pages website. In a text box on the main page of the application, a user enters a comment which triggers a handler on the page's model to use the input to predict the sentiment of the comment using the trained model. +Users interact with the application through a Razor Pages website. In a text box on the main page of the application, a user enters a comment which triggers a handler on the page's model to use the input to predict the sentiment of the comment using the trained model. ## Try a different dataset -If you want to try out the application with a dataset that produces better results such as the UCI Sentiment Labeled Sentences dataset, you can make the following adjustments. +If you want to try out the application with a dataset that produces better results such as the UCI Sentiment Labeled Sentences dataset, you can make the following adjustments. ### Get the data -1. Download [UCI Sentiment Labeled Sentences dataset ZIP file](https://archive.ics.uci.edu/ml/machine-learning-databases/00331/sentiment%20labelled%20sentences.zip) anywhere on your computer, and unzip it. -1. Open PowerShell and navigate to the unzipped folder in the previous step -1. By default, the file does not have column names. To add column names to the training data, use the following PowerShell commands: +1. Download [UCI Sentiment Labeled Sentences dataset ZIP file](https://archive.ics.uci.edu/ml/machine-learning-databases/00331/sentiment%20labelled%20sentences.zip) anywhere on your computer, and unzip it. +1. Open PowerShell and navigate to the unzipped folder in the previous step. +1. By default, the file does not have column names. To add column names to the training data, use the following PowerShell commands: - ```powershell - echo "Comment`tSentiment" | sc yelp_labelled_columns.tsv; cat yelp_labelled.tsv | sc yelp_labelled_columns.tsv + ```powershell + echo "Comment`tSentiment" | sc yelp_labelled_columns.tsv; cat yelp_labelled.tsv | sc yelp_labelled_columns.tsv ``` -The output generated by the previous commands is a new file called *yelp_labelled_columns.tsv* containing the original data with the respective column names. +The output generated by the previous commands is a new file called *yelp_labelled_columns.tsv* containing the original data with the respective column names. Each row in the *yelp_labelled_columns.tsv* dataset represents a different restaurant review left by a user on Yelp. The first column represents the comment left by the user, and the second column represents the sentiment of the text (0 is negative, 1 positive). The columns are separated by tabs. The data looks like the following: | Comment | Sentiment | | :---: | :---: | -Wow... Loved this place.| 1 -Crust is not good. | 0 +Wow... Loved this place.| 1 +Crust is not good. | 0 Not tasty and the texture was just nasty. | 0 ### Train and use the model -1. Use model builder to train a binary classification model using the new dataset. +1. Use model builder to train a binary classification model using the new dataset. 2. Update the `OnGetAnalyzeSentiment` handler in the *Index.cshtml.cs* file. ```csharp