Experiment with feature choices and compare how they affect the performance of different machine learning algorithms.

BhattAyush17/Feature_Engineering_SandBox

Feature_Engineering_SandBox

A hands-on Streamlit application for quick evaluation, implementation, and analysis of feature engineering and preprocessing strategies to improve machine learning model performance.

🔗 Live Application: https://feature-evaluation-lab.streamlit.app/


Purpose

Feature_Engineering_SandBox is designed as a practical experimentation workspace where data preprocessing, feature extraction, and evaluation decisions can be tested quickly and observed directly through model behavior and metrics.

The focus is not on building a single optimized model, but on enabling:

  • Rapid validation of preprocessing and feature choices
  • Direct comparison of feature pipelines
  • Clear understanding of how features affect model performance
  • Faster iteration during data preparation and experimentation
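
A direct comparison of feature pipelines can be sketched as follows. This is a minimal illustration with scikit-learn, not the app's actual code: the helper `build_pipeline`, the column names, and the synthetic dataset are all assumptions made for the example. Both pipelines share the same model and the same cross-validation folds, so any score difference comes from the preprocessing choice alone.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def build_pipeline(numeric_cols, categorical_cols, scale=True):
    """Hypothetical helper: imputation + encoding (+ optional scaling) + model."""
    numeric_steps = [("impute", SimpleImputer(strategy="median"))]
    if scale:
        numeric_steps.append(("scale", StandardScaler()))
    categorical_steps = [
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ]
    preprocess = ColumnTransformer([
        ("num", Pipeline(numeric_steps), numeric_cols),
        ("cat", Pipeline(categorical_steps), categorical_cols),
    ])
    return Pipeline([
        ("preprocess", preprocess),
        ("model", LogisticRegression(max_iter=1000)),
    ])


# Synthetic stand-in for a tabular dataset (all columns are made up).
rng = np.random.default_rng(0)
n = 200
income = rng.normal(50_000, 15_000, n)
city = rng.choice(["A", "B", "C"], n)
y = (income + 5_000 * (city == "A") > 50_000).astype(int)

income_observed = income.copy()
income_observed[rng.random(n) < 0.1] = np.nan  # inject missing values
df = pd.DataFrame({
    "age": rng.normal(40, 10, n),
    "income": income_observed,
    "city": city,
})

# Same model, same folds; only the scaling step differs.
scores = {}
for name, scale in [("scaled", True), ("unscaled", False)]:
    pipe = build_pipeline(["age", "income"], ["city"], scale=scale)
    scores[name] = float(cross_val_score(pipe, df, y, cv=5).mean())

for name, score in scores.items():
    print(f"{name}: mean accuracy = {score:.3f}")
```

Keeping the preprocessing inside the pipeline means imputation and scaling statistics are fitted on each training fold only, which avoids leaking information from the validation fold into the comparison.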

What This App Helps With

  • Evaluating preprocessing strategies before full model training
  • Understanding feature relevance and redundancy
  • Testing feature extraction ideas in a controlled environment
  • Identifying stable versus sensitive features
  • Making data-driven decisions for better downstream models
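
One common way to surface redundant features is to look for pairs with a very high absolute correlation. The sketch below assumes pandas and NumPy only; the function `redundant_pairs`, the 0.9 threshold, and the example columns are illustrative choices, not something taken from the app.

```python
import numpy as np
import pandas as pd


def redundant_pairs(df, threshold=0.9):
    """Return (col_a, col_b, |r|) for numeric column pairs with |r| > threshold."""
    corr = df.select_dtypes("number").corr().abs()
    cols = list(corr.columns)
    found = []
    # Walk the upper triangle so each pair is reported once.
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            r = corr.iloc[i, j]
            if r > threshold:
                found.append((cols[i], cols[j], float(r)))
    return found


# Made-up data: the same measurement in two units, plus an unrelated column.
rng = np.random.default_rng(0)
height_cm = rng.normal(170, 10, 200)
df = pd.DataFrame({
    "height_cm": height_cm,
    "height_in": height_cm / 2.54,
    "noise": rng.normal(size=200),
})

pairs = redundant_pairs(df)
for a, b, r in pairs:
    print(f"{a} and {b} are highly correlated (|r| = {r:.3f})")
```

Pearson correlation only captures linear relationships, so a pair that passes this check can still be redundant in a nonlinear way.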

Core Capabilities

  • Interactive preprocessing configuration (imputation, encoding, scaling)
  • Feature extraction and transformation analysis
  • Feature correlation and distribution inspection
  • Model evaluation under identical feature pipelines
  • Feature ablation and sensitivity checks
  • Clear visual feedback for faster iteration
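
Feature ablation can be approximated with a leave-one-feature-out loop: score a model on all features, then re-score it with each feature removed and record the drop. The sketch below is an assumption about how such a check might look, not the app's implementation. `ablation_deltas`, the synthetic data, and the feature names are hypothetical; the three model families are the ones named under Future Extensions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def ablation_deltas(model, X, y, feature_names, cv=3):
    """Return (baseline accuracy, {feature: accuracy drop when it is removed})."""
    baseline = cross_val_score(model, X, y, cv=cv).mean()
    deltas = {}
    for i, name in enumerate(feature_names):
        X_without = np.delete(X, i, axis=1)  # drop one column, keep the rest
        score = cross_val_score(model, X_without, y, cv=cv).mean()
        deltas[name] = float(baseline - score)
    return float(baseline), deltas


# Synthetic data: 3 informative features and 2 pure-noise features.
X, y = make_classification(
    n_samples=300, n_features=5, n_informative=3, n_redundant=0, random_state=0
)
feature_names = [f"f{i}" for i in range(X.shape[1])]

# Every model sees exactly the same features and the same folds.
models = {
    "random_forest": RandomForestClassifier(n_estimators=25, random_state=0),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
}

results = {
    name: ablation_deltas(model, X, y, feature_names)
    for name, model in models.items()
}
for name, (baseline, deltas) in results.items():
    most_sensitive = max(deltas, key=deltas.get)
    print(f"{name}: baseline={baseline:.3f}, largest drop when removing {most_sensitive}")
```

A feature whose removal hurts every model is a reasonable candidate for "stable"; one whose effect changes sign or size between models is more likely model-specific and worth a closer look.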

Screenshots

Screenshots illustrating the workflow and evaluation views:

  • Preprocessing & Feature Setup (preprocessing configuration)
  • Feature Analysis
  • Model Evaluation
  • Feature Ablation & Sensitivity

Project Structure

Feature_Engineering_SandBox/
│
├── app.py                  # Streamlit entry point
├── requirements.txt
├── README.md
├── .gitignore
│
├── src/
│   ├── data/               # Data loading & validation
│   ├── preprocessing/      # Cleaning & transformations
│   ├── features/           # Feature extraction & analysis
│   ├── models/             # Model training & checks
│   ├── evaluation/         # Ablation & sensitivity analysis
│   └── visualization/      # Plot generation
│
├── scripts/                # Debug & experiments
└── docs/
    └── screenshots/        # UI screenshots

Tech Stack

  • Python
  • Streamlit
  • Pandas, NumPy
  • Scikit-learn
  • Matplotlib / Plotly

Running Locally

git clone https://github.com/BhattAyush17/Feature_Engineering_SandBox.git
cd Feature_Engineering_SandBox
pip install -r requirements.txt
streamlit run app.py

Design Approach

  • Fast feedback over long training cycles
  • Explicit feature and preprocessing control
  • Minimal abstraction to keep behavior visible
  • Reusable logic for real ML pipelines
  • Designed for iteration, not one-off results

Future Extensions

  • Support for models beyond Random Forest, Decision Tree, and Logistic Regression
  • Enhanced feature importance comparison across supported models
  • Exportable summaries for feature and preprocessing evaluations
  • More configurable feature extraction pipelines for tabular datasets

Feature_Engineering_SandBox helps developers see how features behave, evaluate preprocessing and extraction strategies, and make data-driven decisions that improve model performance.
