In this lab, you will:
- Set up a CI/CD pipeline for the Iris dataset with GitHub Actions
- Train a machine learning model using scikit-learn
- Version and deploy the model using Continuous Machine Learning
- Deploy to GitHub Pages from the automated workflow
Estimated completion time: 25 minutes
In this task, you will create a GitHub repository and set up the basic project folder structure for this lab.
- Open a web browser, log in to GitHub (if necessary), and create a new public repository (e.g., iris-html-ci-cd). Add a description and initialize the repository with a README.
- Open Visual Studio Code, open a new terminal window, and clone the repository locally using this command:
git clone https://github.com/your-username/iris-html-ci-cd.git
- Using Windows Explorer, copy the following files from the C:\MLOps\Lab-Files folder into the new iris-html-ci-cd project folder (C:\Users\student\iris-html-ci-cd): ci.yml, generate_html.py, requirements.txt, and train_model.py.
- In VS Code, click File > Open Folder and browse to the new project folder (C:\Users\student\iris-html-ci-cd). The folder appears in the left pane of Visual Studio Code. Then click Terminal > New Terminal to open a terminal.
- Create a new virtual environment for this task by running the following command in the terminal window.
virtualenv venv
- Run the following command to activate the virtual environment.
.\venv\Scripts\activate
- Now, run the following command to confirm that the virtual environment is activated.
python --version
- The virtual environment is now set up for this project. To install the required libraries in it, run the following command in the terminal window.
pip install scikit-learn pandas matplotlib
- Create or verify the folder structure shown below (create the directories and move the files as required, in either Windows Explorer or VS Code).
iris-html-ci-cd/
├── Data/
│   └── iris.data
├── train_model.py
├── generate_html.py
├── requirements.txt
├── .github/
│   └── workflows/
│       └── ci.yml
├── index.html          # This file will be auto-generated.
└── README.md
- Copy the iris.data file from C:\MLOps\Data-Files into the Data/ folder of the project directory. Alternatively, run the command below in the terminal window to download the dataset into the Data folder.
wget https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data -P Data/
The goal of this task is to process the dataset, train a model, and write a script that automatically generates an HTML file in the project folder, which is used later to show the results.
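Before training, it can help to confirm that the dataset file arrived intact. The following is a minimal sketch, not part of the lab files: the function name summarize_iris and the constant EXPECTED_COLUMNS are hypothetical, and it assumes the file is a headerless CSV with four measurements followed by a species label. The standard Iris file should report 150 rows, 50 per species.

```python
import csv
from collections import Counter
from pathlib import Path

EXPECTED_COLUMNS = 5  # assumption: four measurements plus the species label


def summarize_iris(path):
    """Return (row_count, Counter of species) for a headerless Iris CSV file."""
    species_counts = Counter()
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if not row:
                # Blank lines (for example at the end of the file) are skipped.
                continue
            if len(row) != EXPECTED_COLUMNS:
                raise ValueError(
                    f"Expected {EXPECTED_COLUMNS} columns, got {len(row)}: {row}"
                )
            species_counts[row[-1]] += 1
    return sum(species_counts.values()), species_counts


if __name__ == "__main__":
    data_file = Path("Data") / "iris.data"
    if data_file.exists():
        total, counts = summarize_iris(data_file)
        print(f"{total} rows")
        for name, count in counts.items():
            print(f"  {name}: {count}")
    else:
        print(f"{data_file} not found; copy or download it first.")
```

Run it from the project folder. A ValueError here usually points to a truncated or partially downloaded file.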
- Click the train_model.py file in the project folder and verify that it contains the following code.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
import json
# Load the Iris dataset
df = pd.read_csv('Data/iris.data', header=None)
df.columns = ['sepal_length', 'sepal_width', 'petal_length',
              'petal_width', 'species']
# Split the data
X = df.iloc[:, :-1]
y = df.iloc[:, -1]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)
# Train a Random Forest model
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
# Evaluate the model
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Model Accuracy: {accuracy:.2f}")
# Save results to a JSON file
results = {
    "accuracy": accuracy,
    "feature_importances": list(model.feature_importances_)
}
with open("results.json", "w") as f:
    json.dump(results, f)
print("Model training completed. Results saved to results.json.")
- Run the script by executing the following command in the terminal.
python train_model.py
- Verify that results.json has been generated and contains the model accuracy and feature importances.
- Now click the generate_html.py file in the project folder and verify that its contents match the code below. This script reads the results.json file and generates an HTML file (index.html) for the app.
import json
# Load results from JSON
with open("results.json", "r") as f:
results = json.load(f)
# Generate HTML
html_content = f"""
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Iris Model Results</title>
<style>
body {{ font-family: Arial, sans-serif; margin: 20px; }}
h1 {{ color: #4CAF50; }}
.results {{ margin-top: 20px; }}
</style>
</head>
<body>
<h1>Iris Model Results</h1>
<div class="results">
<p><strong>Accuracy:</strong> {results['accuracy']:.2f}</p>
<p><strong>Feature Importances:</strong></p>
<ul>
<li>Sepal Length: {results['feature_importances'][0]:.2f}</li>
<li>Sepal Width: {results['feature_importances'][1]:.2f}</li>
<li>Petal Length: {results['feature_importances'][2]:.2f}</li>
<li>Petal Width: {results['feature_importances'][3]:.2f}</li>
</ul>
</div>
</body>
</html>
"""
# Save HTML to a file
with open("index.html", "w") as f:
    f.write(html_content)
print("HTML file generated: index.html")
- Run the above Python script by executing the following command in the terminal window.
python generate_html.py
- Verify that index.html has been generated with the model results (open the file from Windows Explorer to view it in a browser window).
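The check on results.json described above can also be scripted. The following is a minimal sketch, not part of the lab files: the function name check_results is hypothetical, and it assumes the two keys written by train_model.py (accuracy and feature_importances). It relies on the fact that scikit-learn normalizes random forest feature importances so that they sum to 1, except in the degenerate case where every tree is a single node.

```python
import json
import math
from pathlib import Path


def check_results(results):
    """Return a list of problems found in a results dict; an empty list means none."""
    problems = []

    accuracy = results.get("accuracy")
    if not isinstance(accuracy, (int, float)) or not 0.0 <= accuracy <= 1.0:
        problems.append(f"accuracy should be a number in [0, 1], got {accuracy!r}")

    importances = results.get("feature_importances")
    if (
        not isinstance(importances, list)
        or len(importances) != 4
        or not all(isinstance(v, (int, float)) for v in importances)
    ):
        problems.append("feature_importances should be a list of four numbers")
    elif not math.isclose(sum(importances), 1.0, abs_tol=1e-6):
        # Assumption: importances are normalized, as scikit-learn does by default.
        problems.append(
            f"feature_importances should sum to about 1, got {sum(importances):.4f}"
        )
    return problems


if __name__ == "__main__":
    results_file = Path("results.json")
    if results_file.exists():
        found = check_results(json.loads(results_file.read_text()))
        print("results.json looks fine." if not found else "\n".join(found))
    else:
        print("results.json not found; run train_model.py first.")
```

Returning a list of problems rather than raising on the first one lets a single run report everything that is wrong with the file.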
In this task, you will set up a CI/CD pipeline, push the files to GitHub, and deploy the site on GitHub Pages.
- Click the ci.yml file, which should now be located in the .github/workflows folder, and confirm that its content is as shown below.
name: CI/CD for Hosting Iris HTML App
on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v3
        with:
          python-version: 3.9
      - name: Install dependencies
        run: |
          python -m venv venv
          source venv/bin/activate
          pip install -r requirements.txt
      - name: Train the model
        run: |
          source venv/bin/activate
          python train_model.py
      - name: Generate HTML app
        run: |
          source venv/bin/activate
          python generate_html.py
      - name: Deploy to GitHub Pages
        uses: peaceiris/actions-gh-pages@v4
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: ./
- Click the requirements.txt file and confirm that the following dependencies are listed in it.
pandas
scikit-learn
matplotlib
- To commit and push changes to the GitHub repository, execute the following commands in the terminal window.
git add .
git commit -m "Set up CI/CD pipeline for GitHub Pages"
git push origin main
- To enable GitHub Pages, go to the GitHub repository settings, select Pages, and set the source to Deploy from a branch. Set the branch to main and the folder to / (root), then click Save.
- You should now see a message in the GitHub Pages section that the site is live. Click the Visit site button to view your site in a browser window, or go to the URL directly, e.g., https://yourusername.github.io/iris-html-ci-cd/.
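Before pushing later changes, the training and HTML generation steps of the workflow can be rehearsed locally. The following is a minimal sketch, not part of the lab files: the names run_steps and PIPELINE are hypothetical, and it assumes the commands are run from the project folder with the virtual environment activated. It runs the same two scripts the workflow calls and stops at the first failure, much as a CI job would.

```python
import subprocess
import sys


def run_steps(steps):
    """Run each (name, command) pair in order; stop at the first non-zero exit code."""
    for name, command in steps:
        print(f"==> {name}")
        completed = subprocess.run(command)
        if completed.returncode != 0:
            print(f"Step failed: {name} (exit code {completed.returncode})")
            return False
    return True


# The two scripts called by the "Train the model" and "Generate HTML app"
# workflow steps, run here with the current Python interpreter.
PIPELINE = [
    ("Train the model", [sys.executable, "train_model.py"]),
    ("Generate HTML app", [sys.executable, "generate_html.py"]),
]
```

Calling run_steps(PIPELINE) from the project folder returns True only if both scripts succeed. This rehearses the scripts only; the deployment step still runs on GitHub.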
- Which command is used to push the data to GitHub?
A. git add
B. git push origin main
C. github push repo main
D. git commit -m "Push to GitHub"
STOP
You have successfully completed this lab.