# BeanSense

BeanSense is a machine learning system that classifies coffee samples from sensor readings. It combines deep learning architectures for feature extraction with a range of classifiers to achieve accurate coffee bean identification and classification.
## Table of Contents

- Features
- System Architecture
- Technology Stack
- Installation
- Project Structure
- Usage
- Models
- Datasets
- Performance Metrics
## Features

- Multiple Classification Models: Implements 7 different machine learning models for flexibility and performance
- Deep Feature Extraction: Uses pre-trained CNN architectures (MobileNet, ResNet) to extract meaningful features from raw sensor data
- Client-Server Architecture: Provides a distributed system for remote prediction and training
- Autoencoder Dimensionality Reduction: Employs autoencoders for feature compression
- Feature Selection: Incorporates ICCS (Improved Cuckoo Search) for optimal feature selection
- Cross-Validation: Implements stratified k-fold cross-validation for model evaluation
- Hyperparameter Tuning: Uses GridSearchCV for SVM model optimization
- Real-time Classification: Supports immediate prediction of coffee samples
- Memory & Performance Tracking: Measures execution time and memory usage for each operation
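The stratified k-fold step above can be sketched with scikit-learn. Synthetic data stands in for real sensor readings here; this is illustrative, not BeanSense's actual code:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in for sensor readings: 60 samples, 4 sensors, 3 classes
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
y = np.repeat([0, 1, 2], 20)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, test_idx) in enumerate(skf.split(X, y)):
    # Stratification keeps the 1:1:1 class ratio in every test fold
    counts = np.bincount(y[test_idx])
    print(f"fold {fold}: test class counts = {counts.tolist()}")  # → [4, 4, 4]
```

Because 20 samples per class divide evenly across 5 folds, every test fold holds exactly 4 samples of each class, which is what makes the per-fold metrics comparable.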
## System Architecture

BeanSense employs a client-server architecture:
```
┌────────────────┐      ┌────────────────┐      ┌────────────────┐
│                │      │                │      │                │
│ User Interface │<────>│     Server     │<────>│     Client     │
│     (CLI)      │      │     (HTTP)     │      │  (ML Models)   │
│                │      │                │      │                │
└────────────────┘      └────────────────┘      └────────────────┘
```
- Server: Handles requests, stores datasets, and delivers commands to the client
- Client: Executes the machine learning models, performs training and prediction
- Wrapper: Provides a unified interface to all classification models
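A minimal sketch of the request/response flow, using only Python's standard library. The `/status` endpoint and the JSON payload are hypothetical, not BeanSense's actual protocol:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

class StatusHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Hypothetical endpoint: reports whether a model is loaded
        if self.path == "/status":
            body = json.dumps({"status": "idle", "model": None}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):
        pass  # silence per-request logging

server = HTTPServer(("127.0.0.1", 0), StatusHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

with urlopen(f"http://127.0.0.1:{server.server_port}/status") as resp:
    print(json.load(resp))  # → {'status': 'idle', 'model': None}

server.shutdown()
```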
## Technology Stack

- Python 3.x: Primary programming language
- PyTorch: Deep learning framework for feature extraction
- scikit-learn: Machine learning tools for model evaluation and preprocessing
- LightGBM, CatBoost: Gradient boosting frameworks
- NumPy, Pandas: Data manipulation and analysis
- HTTP Server: Simple HTTP interface for client-server communication
## Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/yourusername/BeanSense.git
   cd BeanSense
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```
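The contents of `requirements.txt` are not reproduced here; a plausible minimal set, inferred from the technology stack listed above (entries and unpinned versions are a guess, not the project's actual file):

```text
torch
torchvision
scikit-learn
lightgbm
catboost
numpy
pandas
```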
## Project Structure

```
BeanSense/
├── .backup/                              # Backup files directory
├── cache/                                # Cache directory for temporary files
├── catboost_info/                        # Information generated by CatBoost models
├── datasets/                             # Dataset files
│   ├── origin/                           # Original dataset files
│   ├── dataset4.csv                      # 4-sensor dataset
│   ├── dataset6.csv                      # 6-sensor dataset
│   └── dataset8.csv                      # 8-sensor dataset
├── model/                                # Trained model files
│   ├── default_model/                    # Default model files
│   ├── adaboost_resnet_model_4.pkl       # AdaBoost + ResNet model (4 sensors)
│   ├── adaboost_resnet_model_6.pkl       # AdaBoost + ResNet model (6 sensors)
│   ├── adaboost_resnet_model_8.pkl       # AdaBoost + ResNet model (8 sensors)
│   ├── autoencoder_lightgbm_model_4.pkl  # Autoencoder + LightGBM model (4 sensors)
│   ├── autoencoder_lightgbm_model_6.pkl  # Autoencoder + LightGBM model (6 sensors)
│   ├── autoencoder_lightgbm_model_8.pkl  # Autoencoder + LightGBM model (8 sensors)
│   ├── catboost_resnet_model_4.pkl       # CatBoost + ResNet model (4 sensors)
│   ├── catboost_resnet_model_6.pkl       # CatBoost + ResNet model (6 sensors)
│   ├── catboost_resnet_model_8.pkl       # CatBoost + ResNet model (8 sensors)
│   ├── lightgbm_resnet_model_4.pkl       # LightGBM + ResNet model (4 sensors)
│   ├── lightgbm_resnet_model_6.pkl       # LightGBM + ResNet model (6 sensors)
│   ├── lightgbm_resnet_model_8.pkl       # LightGBM + ResNet model (8 sensors)
│   ├── mobilenet_iccs_lightgbm_model_4.pkl  # MobileNet + ICCS + LightGBM model (4 sensors)
│   ├── mobilenet_iccs_lightgbm_model_6.pkl  # MobileNet + ICCS + LightGBM model (6 sensors)
│   ├── mobilenet_iccs_lightgbm_model_8.pkl  # MobileNet + ICCS + LightGBM model (8 sensors)
│   ├── mobilenet_lightgbm_model_4.pkl    # MobileNet + LightGBM model (4 sensors)
│   ├── mobilenet_lightgbm_model_6.pkl    # MobileNet + LightGBM model (6 sensors)
│   ├── mobilenet_lightgbm_model_8.pkl    # MobileNet + LightGBM model (8 sensors)
│   ├── rbf_svm_gs_model_4.pkl            # RBF SVM + GridSearch model (4 sensors)
│   ├── rbf_svm_gs_model_6.pkl            # RBF SVM + GridSearch model (6 sensors)
│   └── rbf_svm_gs_model_8.pkl            # RBF SVM + GridSearch model (8 sensors)
├── utils/                                # Utility scripts and helpers
├── venv/                                 # Python virtual environment
├── __pycache__/                          # Python cache files
├── .gitignore                            # Git ignore file
├── AdaBoostClassifier.py                 # AdaBoost classifier implementation
├── AutoencoderLightGBM.py                # Autoencoder + LightGBM implementation
├── CatBoostClassifier.py                 # CatBoost classifier implementation
├── ClassifierWrapper.py                  # Unified interface for all classifiers
├── CoffeeClassifierClient.py             # Client implementation
├── CoffeeClassifierServer.py             # Server implementation
├── LightGBMMobileNet.py                  # LightGBM + MobileNet implementation
├── LightGBMResNet.py                     # LightGBM + ResNet implementation
├── MobileNetICCSLightGBM.py              # MobileNet + ICCS + LightGBM implementation
├── RBFSVMGridSearch.py                   # RBF SVM + GridSearch implementation
├── README.md                             # Project documentation
└── requirements.txt                      # Python dependencies
```
## Usage

1. Start the server:

   ```bash
   python CoffeeClassifierServer.py
   ```

2. Start the client (in a separate terminal):

   ```bash
   python CoffeeClassifierClient.py
   ```

3. Select operations from the interactive menu:
   - Train model
   - Make prediction
   - Check status
   - View latest result
### Training a model

1. Select "Train model" on the server
2. Choose a dataset (4, 6, or 8 sensors)
3. Select a model type
4. Monitor progress on the client

### Making predictions

1. Select "Make prediction" on the server
2. Choose the dataset and model type
3. Enter sensor values (comma-separated)
4. View the results
You can also use the models directly in your own code:

```python
from LightGBMMobileNet import MobileNetLightGBMModel

# Initialize model
model = MobileNetLightGBMModel()

# Load data
X, y = model.load_data("datasets/dataset4.csv")

# Train model
metrics = model.train(X, y)

# Make prediction
result = model.predict_single([123, 456, 789, 101])
print(f"Prediction: {result}")

# Save model
model.save_model("model/my_model.pkl")
```

## Models

BeanSense includes seven distinct classification models:
- LightGBM + MobileNet: Combines MobileNetV2 feature extraction with LightGBM classification
- LightGBM + ResNet: Uses ResNet18 for feature extraction with LightGBM classification
- MobileNet + ICCS + LightGBM: Adds feature selection using Improved Cuckoo Search algorithm
- AdaBoost + ResNet: Combines ResNet18 features with AdaBoost ensemble learning
- Autoencoder + LightGBM: Uses an autoencoder for dimensionality reduction before classification
- CatBoost + ResNet: Utilizes CatBoost classifier with ResNet features
- RBF SVM + GridSearch: Implements Support Vector Machines with grid search for hyperparameter tuning
Each model employs different strategies for feature extraction, selection, and classification to provide comprehensive analysis options.
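All seven models share the same two-stage shape: a feature-extraction step feeding a classifier. A minimal sketch of that pattern with scikit-learn, using PCA as a stand-in for the CNN extractor and gradient boosting as a stand-in for LightGBM/CatBoost (everything here is illustrative, not BeanSense's actual API):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.pipeline import Pipeline

# Synthetic stand-in for 8-sensor readings, 3 coffee classes
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=c, size=(30, 8)) for c in range(3)])
y = np.repeat([0, 1, 2], 30)

# Stage 1 (feature extraction) feeds stage 2 (classification)
model = Pipeline([
    ("extract", PCA(n_components=4)),
    ("classify", GradientBoostingClassifier(random_state=0)),
])
model.fit(X, y)

sample = rng.normal(loc=2, size=(1, 8))  # drawn near the class-2 cluster
print(model.predict(sample))
```

Swapping the `extract` step for a CNN embedding (or an autoencoder, or ICCS-selected features) and the `classify` step for LightGBM, CatBoost, AdaBoost, or an SVM reproduces the seven combinations listed above.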
## Datasets

The system works with three types of datasets based on the number of sensors:
- dataset4.csv: 4 sensors (MQ135, MQ2, MQ3, MQ6)
- dataset6.csv: 6 sensors (MQ135, MQ2, MQ3, MQ6, MQ138, MQ7)
- dataset8.csv: 8 sensors (MQ135, MQ2, MQ3, MQ6, MQ138, MQ7, MQ136, MQ5)
The datasets classify coffee samples with labels like:
- aKaw-D, aKaw-M, aKaw-L (Arabica Kawisari - Dark/Medium/Light roast)
- aSem-D, aSem-M, aSem-L (Arabica Semeru - Dark/Medium/Light roast)
- rGed-D, rGed-M, rGed-L (Robusta Gedung - Dark/Medium/Light roast)
- rTir-D, rTir-M, rTir-L (Robusta Tirtoyudo - Dark/Medium/Light roast)
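The label scheme above is compact but decodable; a small helper that unpacks it (the function and its name are illustrative, not part of BeanSense):

```python
# Mappings taken directly from the label scheme described above
SPECIES = {"a": "Arabica", "r": "Robusta"}
ORIGINS = {"Kaw": "Kawisari", "Sem": "Semeru", "Ged": "Gedung", "Tir": "Tirtoyudo"}
ROASTS = {"D": "Dark", "M": "Medium", "L": "Light"}

def decode_label(label: str) -> str:
    """Expand a compact label like 'aKaw-D' into a readable description."""
    prefix, roast = label.split("-")
    species, origin = prefix[0], prefix[1:]
    return f"{SPECIES[species]} {ORIGINS[origin]} ({ROASTS[roast]} roast)"

print(decode_label("aKaw-D"))  # → Arabica Kawisari (Dark roast)
print(decode_label("rTir-M"))  # → Robusta Tirtoyudo (Medium roast)
```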
## Performance Metrics

The system evaluates models using multiple metrics:
- Accuracy: Overall prediction accuracy
- F1-Score: Harmonic mean of precision and recall
- AUC: Area Under the ROC Curve
- Memory Usage: Peak memory consumption during operation
- Execution Time: Time taken for training and prediction
Each model outputs these metrics during training, allowing for direct comparison.
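These metrics can be reproduced with scikit-learn plus the standard library; a sketch on toy predictions (the labels and probabilities below are made up for illustration):

```python
import time
import tracemalloc
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 0, 1, 2, 2, 2])
# Per-class probability scores; each row sums to 1 (toy values)
y_proba = np.array([
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.7, 0.2],
    [0.2, 0.3, 0.5],
    [0.1, 0.2, 0.7],
    [0.1, 0.1, 0.8],
])

tracemalloc.start()
t0 = time.perf_counter()
acc = accuracy_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred, average="macro")
auc = roc_auc_score(y_true, y_proba, multi_class="ovr")
elapsed = time.perf_counter() - t0
_, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"accuracy={acc:.3f} f1={f1:.3f} auc={auc:.3f}")  # → accuracy=0.833 f1=0.822 auc=1.000
print(f"time={elapsed:.4f}s peak_memory={peak} bytes")
```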
## Credits

Created by:
- Iwan Dwi: iwan.dwp@gmail.com
- Ahmad Zainul: ahmadzainularifin6@gmail.com