- Duke Spring 2024 IDS721 Final Project
- Group Members: Daniel, Emily, Yilin, Hiep
- video demo: https://youtu.be/VpOEaIy_U44
In this project, we operationalize machine learning by serving an open-source model through a web service developed in Rust. This involves containerizing the service for deployment on Kubernetes and automating the workflow with a CI/CD pipeline. Monitoring, metrics collection, and thorough documentation are essential components of the project, alongside a clear and concise demonstration of the application through a YouTube video.
This project demonstrates the integration of a machine learning model using Rust with Actix Web to serve model predictions over a web service. Specifically, it uses the bloom-560m-q5_1-ggjt.bin model from Rustformers (based on the Hugging Face models) to perform text generation tasks. The application is designed to be deployed on AWS Elastic Kubernetes Service (EKS) for scalable and robust operation. Be sure to download the model and put the file in the src folder along with main.rs before running the project.
- Obtain Open Source ML Model: We select and acquire an open-source machine learning model suitable for serving. The model is capable of providing inferences based on input data it receives.
- Create Rust Web Service for Model Inferences: We develop a web service in Rust that can serve the ML model's inferences. The service is robust, efficiently handling requests and providing accurate responses from the model.
- Containerize Service and Deploy to Kubernetes: We containerize the Rust web service using Docker, ensuring it is well-prepared for deployment. We then deploy the containerized service to a Kubernetes cluster, configuring it for scalability and reliability.
- Implement CI/CD Pipeline: We establish a Continuous Integration and Continuous Deployment (CI/CD) pipeline to automate the testing, building, and deployment processes of the application. This pipeline also supports rapid iteration and deployment of changes to the service.
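As a concrete illustration, such a pipeline could be sketched as a GitHub Actions workflow. The file layout, job names, and `DOCKERHUB_*` secret names below are assumptions for the sketch, not the project's actual configuration:

```yaml
# Hypothetical .github/workflows/ci.yml sketch
name: CI/CD
on:
  push:
    branches: [main]
jobs:
  build-test-deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Build and test the Rust service
      - run: cargo build --release
      - run: cargo test
      # Build, tag, and push the backend image (DOCKERHUB_* are assumed secrets)
      - run: docker build -t cr7forever666/artix_backend:latest .
      - run: echo "${{ secrets.DOCKERHUB_TOKEN }}" | docker login -u "${{ secrets.DOCKERHUB_USER }}" --password-stdin
      - run: docker push cr7forever666/artix_backend:latest
```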
- main.rs: Sets up the Actix Web server, routes, and CORS configuration.
- core.rs: Contains the logic for loading the model, performing inference, and handling errors.
- /src: Contains the model binaries and tokenizer configurations.
- Cargo.toml: Manages Rust dependencies and project settings.
- bloom-560m-q5_1-ggjt.bin: Rustformers model file.
- Download the bloom-560m-q5_1-ggjt.bin file from https://huggingface.co/rustformers/bloom-ggml/blob/main/bloom-560m-q5_1-ggjt.bin, and put it under the src directory before running the project.
- Run backend: go to the artix_backend folder and run:

  ```
  cargo run
  ```

- To test the backend locally:

  ```
  curl -X POST http://127.0.0.1:8080/ -H "Content-Type: application/json" -d '{"message": "Hello, A"}'
  ```

  The result should be JSON with response as the key: {"response":"Hello, Aunt Sally."}
- Create Docker image:

  ```
  docker build -t rs-model-frontend .
  ```

- Deploy frontend on Docker:

  ```
  docker run -p 3000:3000 rs-model-frontend
  ```
Demo deployment of backend and frontend:

- Build the Docker image:

  ```
  docker build -t artix-backend .
  ```

- Tag the Docker image for Docker Hub:

  ```
  docker tag artix-backend:latest cr7forever666/artix_backend:latest
  ```

- Run the Docker image locally:

  ```
  docker run -p 5000:5000 artix-backend:latest
  ```

- Push the Docker image to Docker Hub:

  ```
  docker push cr7forever666/artix_backend:latest
  ```

Link to Docker image: https://hub.docker.com/r/cr7forever666/artix_backend

- Prepare Your AWS Environment (Install AWS CLI, kubectl, eksctl, etc.)
- Create an EKS Cluster:

  ```
  eksctl create cluster --name my-cluster --region us-west-2 --nodegroup-name my-nodes --node-type t3.medium --nodes 3 --nodes-min 1 --nodes-max 4 --managed
  ```

- Configure kubectl to Connect to Your EKS Cluster (substitute your own region and cluster name, e.g. us-west-2 and my-cluster from the command above):

  ```
  aws eks --region region-code update-kubeconfig --name cluster-name
  ```

- Create a Kubernetes Deployment (deployment.yaml):
  ```
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: artix-backend-deployment
  spec:
    replicas: 2
    selector:
      matchLabels:
        app: artix-backend
    template:
      metadata:
        labels:
          app: artix-backend
      spec:
        containers:
          - name: artix-backend
            image: yourusername/artix_backend:latest
            ports:
              - containerPort: 8000
  ```

- Create a Kubernetes Service (service.yaml):
  ```
  apiVersion: v1
  kind: Service
  metadata:
    name: artix-backend-service
  spec:
    type: LoadBalancer
    ports:
      - port: 80
        targetPort: 8000
        protocol: TCP
    selector:
      app: artix-backend
  ```

- Apply the Deployment and Service:
  ```
  kubectl apply -f deployment.yaml
  kubectl apply -f service.yaml
  ```

- Access the Service:

  ```
  kubectl get services
  ```

