This repository demonstrates features of Azure Machine Learning (AML), including how to build and deploy low-code and pro-code machine learning models for diabetes prediction. It provides a step-by-step guide through the machine learning lifecycle: workspace creation, model training, evaluation, and deployment, all with MLOps considerations in mind.
Azure Machine Learning is a cloud-based service that enables data scientists and ML engineers to accelerate the end-to-end machine learning lifecycle. It provides a comprehensive set of tools for:
- Building and training models
- Deploying and monitoring models in production
- Managing the ML lifecycle with MLOps practices
- Scaling ML workloads efficiently
- Applying responsible AI principles
This repository uses the Pima Indians Diabetes Dataset, a standard toy dataset in machine learning. The dataset was chosen for several reasons:
- Simplicity: The dataset is small and straightforward, allowing us to focus on Azure Machine Learning concepts rather than complex data contexts.
- Well-documented: As a widely used dataset, its characteristics are well understood.
- Real-world relevance: Despite being a toy dataset, it represents a genuine healthcare use case.
- Practical size: The small size enables fast training iterations and low compute requirements, making it ideal for learning purposes.
To run the examples in this repository, you'll need:
- An active Azure subscription
- Sufficient permissions to create resources in your subscription
- Python 3.10 or later
- Azure Machine Learning SDK v2 for Python
Important: This repo uses Azure Machine Learning SDK v2, not the older v1. Many v1 examples and documentation pages are still available online, but v2 is recommended for new projects and offers an improved, more consistent API design.
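Before starting, it can help to confirm the local prerequisites. The following is a minimal sketch, not part of the repository: it checks the Python version and whether the SDK v2 distribution (`azure-ai-ml` on PyPI) is installed. The helper names `installed_version` and `check_prerequisites` are hypothetical and exist only for this illustration.

```python
import sys
from importlib import metadata
from typing import List, Optional

MIN_PYTHON = (3, 10)            # minimum version listed in the prerequisites
SDK_V2_PACKAGE = "azure-ai-ml"  # PyPI distribution name of SDK v2


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_prerequisites() -> List[str]:
    """Collect readable problems; an empty list means every check passed."""
    problems = []
    if sys.version_info < MIN_PYTHON:
        found = ".".join(str(part) for part in sys.version_info[:3])
        problems.append(f"Python 3.10 or later is required, found {found}")
    if installed_version(SDK_V2_PACKAGE) is None:
        problems.append(
            f"{SDK_V2_PACKAGE} is not installed (try: pip install {SDK_V2_PACKAGE})"
        )
    return problems


if __name__ == "__main__":
    for message in check_prerequisites() or ["Local prerequisites look fine."]:
        print(message)
```

This only covers the local environment; the Azure subscription and its permissions still need to be checked in the Azure portal.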
🧮 01-create-aml-workspace: Step-by-step instructions for creating an AML workspace in the Azure portal.
🧮 02-model-catalog: Intro to the model catalog in AML.
🧮 03-connections: Brief intro to connections in AML.
🧮 04-promptflow: Brief intro to prompt flow.
🧮 05-automated-ml: A walkthrough of creating an automated machine learning job for diabetes classification using AML.
🧮 06-create-aml-compute: Guide for setting up compute resources in AML.
🧮 07-git-integration: Instructions for integrating Git repositories with AML.
🧮 08-create-the-dataset: Guide to working with data in AML, including registering the diabetes dataset.
🧮 09-exploratory-data-analysis: Notebook demonstrating exploratory data analysis on the diabetes dataset.
🧮 10-register-model-environment: Instructions for registering model environments in Azure Machine Learning.
🧮 11-train-model: Guide to training a diabetes prediction model using Azure Machine Learning, with two different approaches (job and pipeline).
🧮 12-deploy-model: Instructions for deploying the trained model as an online endpoint.
🧮 13-inference: Example inference request to the deployed model to make a diabetes prediction from patient diagnostics.
🧮 14-components: Intro to components in AML.
🧮 15-mlops-considerations: Some MLOps considerations for machine learning projects.
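To give a feel for what the 13-inference step involves, here is a rough sketch of building a scoring request for an online endpoint using only the Python standard library. Several things here are assumptions rather than facts about this repository: the column names follow the commonly distributed CSV version of the dataset, the `input_data` payload shape is the one typically accepted by MLflow-based deployments (a custom scoring script may expect something else), and the URL, key, and patient values are placeholders. `build_scoring_request` is a hypothetical helper.

```python
import json
import urllib.request

# Assumed feature names (commonly distributed CSV version of the dataset).
FEATURE_COLUMNS = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age",
]


def build_scoring_request(scoring_uri, api_key, patients):
    """Build (but do not send) a POST request carrying one row per patient."""
    for row in patients:
        if len(row) != len(FEATURE_COLUMNS):
            raise ValueError(f"expected {len(FEATURE_COLUMNS)} values, got {len(row)}")
    # Assumed payload shape; adjust it to whatever the deployed model expects.
    payload = {
        "input_data": {
            "columns": FEATURE_COLUMNS,
            "index": list(range(len(patients))),
            "data": patients,
        }
    }
    return urllib.request.Request(
        scoring_uri,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # key-based endpoint auth
        },
        method="POST",
    )


# Placeholder endpoint, key, and purely illustrative patient values.
request = build_scoring_request(
    "https://example.invalid/score",
    "<your-endpoint-key>",
    [[2, 120, 70, 25, 80, 28.5, 0.45, 35]],
)
# To send it against a real endpoint:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Separating request construction from sending keeps the payload easy to inspect before any call reaches a live endpoint. The real scoring URI and key are shown on the endpoint's details page in Azure Machine Learning studio.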
- Clone this repository to your local machine or Azure Machine Learning compute instance
- Follow the examples in order, starting with 01-create-aml-workspace
- Each example includes detailed instructions and explanations