🩺 Breast Cancer Diagnosis ML Web Application

📋 Project Overview

An end-to-end machine learning application for breast cancer diagnosis that predicts whether a breast mass is benign or malignant based on cytology lab measurements. The project includes both model training and an interactive web interface.

🚀 Features

1. Machine Learning Pipeline

Data preprocessing and cleaning from the Wisconsin Breast Cancer Dataset
Feature scaling using StandardScaler
Logistic Regression classification model
Model evaluation with accuracy metrics and classification reports
Serialized model and scaler for production use

2. Interactive Web Application (Streamlit)

Real-time interactive sliders for the 30 cell nuclei measurements
Dynamic radar chart visualization comparing:
- Mean values
- Standard error values
- Worst-case values
Instant prediction results with probability scores
Responsive two-column layout design

3. Key Functionalities

Data Cleaning: Automatic handling of missing values and column mapping
Feature Scaling: Min-max scaling for the radar chart visualization; StandardScaler for model input
Model Prediction: Real-time inference with probability outputs
Visual Analytics: Plotly-based radar charts for multi-dimensional data visualization
User-Friendly Interface: Intuitive sidebar controls and clear result displays
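
Min-max scaling for the visualization maps each measurement onto a common 0 to 1 range so that features with very different units can share one radar chart. A minimal sketch, assuming the per-feature minimum and maximum are taken from the dataset; the function name `min_max_scale` and the numbers below are illustrative, not taken from main.py.

```python
def min_max_scale(values, col_min, col_max):
    """Scale each value to [0, 1] using per-feature minimum and maximum.

    All three arguments are dicts keyed by feature name.
    """
    scaled = {}
    for key, value in values.items():
        span = col_max[key] - col_min[key]
        # Guard against a constant column, where max == min.
        scaled[key] = (value - col_min[key]) / span if span else 0.0
    return scaled


# Made-up bounds and inputs, for illustration only.
col_min = {"radius_mean": 5.0, "texture_mean": 10.0}
col_max = {"radius_mean": 25.0, "texture_mean": 40.0}
slider_values = {"radius_mean": 15.0, "texture_mean": 25.0}

scaled = min_max_scale(slider_values, col_min, col_max)
print(scaled)  # {'radius_mean': 0.5, 'texture_mean': 0.5}
```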

📁 Project Structure

  ├── main.py # Streamlit web application
  ├── model_training.py # ML model training script
  ├── model.pkl # Trained logistic regression model
  ├── scaler.pkl # Fitted StandardScaler object
  ├── dataset/
  │ └── cdata.csv # Breast cancer dataset
  ├── requirements.txt # Python dependencies
  └── README.md # This file

🔧 Installation & Setup

Prerequisites

Python 3.8+
pip package manager

Installation Steps

Clone the repository:

git clone https://github.com/yourusername/breast-cancer-prediction.git
cd breast-cancer-prediction

Install dependencies:
```
pip install -r requirements.txt
```
Run the web application:
```
streamlit run main.py
```

Dependencies (requirements.txt)

streamlit==1.28.0
pandas==2.0.3
numpy==1.24.3
scikit-learn==1.3.0
plotly==5.17.0

🧪 Model Training

To retrain the model:

 python model_training.py

This will:

Load and clean the dataset
Split data into training and testing sets
Train a logistic regression model
Evaluate model performance
Save the model and scaler as .pkl files
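
The steps above can be sketched as follows. This is not the contents of model_training.py: it assumes scikit-learn's bundled copy of the dataset as a stand-in for dataset/cdata.csv, and the function name `train_and_save`, the 80/20 split, and the `random_state` value are illustrative choices.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def train_and_save(output_dir, test_size=0.2, random_state=42):
    """Train a logistic regression model and pickle it with its scaler."""
    # Assumption: scikit-learn's bundled copy stands in for dataset/cdata.csv.
    data = load_breast_cancer()
    X = data.data
    y = 1 - data.target  # bundled copy uses 0 = malignant; flip so 1 = malignant

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state, stratify=y
    )

    # Fit the scaler on training data only, then reuse it for the test set.
    scaler = StandardScaler()
    X_train_scaled = scaler.fit_transform(X_train)
    X_test_scaled = scaler.transform(X_test)

    model = LogisticRegression(max_iter=1000)
    model.fit(X_train_scaled, y_train)

    y_pred = model.predict(X_test_scaled)
    accuracy = accuracy_score(y_test, y_pred)
    print(f"Accuracy: {accuracy:.3f}")
    print(classification_report(y_test, y_pred, target_names=["Benign", "Malignant"]))

    output_dir = Path(output_dir)
    with open(output_dir / "model.pkl", "wb") as f:
        pickle.dump(model, f)
    with open(output_dir / "scaler.pkl", "wb") as f:
        pickle.dump(scaler, f)
    return accuracy


# Write to a temporary directory so the example does not overwrite real files.
with tempfile.TemporaryDirectory() as tmp:
    accuracy = train_and_save(tmp)
    saved_files = sorted(p.name for p in Path(tmp).iterdir())
```

Fitting the scaler on the training split only keeps test-set statistics out of training, and saving the scaler next to the model lets the web app apply the same transformation at prediction time.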

🎮 Using the Application

Adjust Measurements: Use the sidebar sliders to input cell nuclei measurements
View Visualization: Observe the radar chart showing three measurement categories
Get Predictions: See the prediction (Benign/Malignant) with probability scores
Medical Disclaimer: Always consult healthcare professionals for actual diagnoses
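
Behind the prediction step, the app presumably scales the slider values with the saved scaler and asks the model for a label and class probabilities. A minimal sketch under that assumption: `predict_diagnosis` is a hypothetical helper, the 0 = benign / 1 = malignant encoding is assumed, and the model and scaler are fitted inline on scikit-learn's bundled copy of the dataset as stand-ins for model.pkl and scaler.pkl.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler


def predict_diagnosis(model, scaler, measurements):
    """Return (label, p_benign, p_malignant) for one set of 30 measurements.

    Assumes the model was trained with class 0 = benign and 1 = malignant.
    """
    features = np.asarray(measurements, dtype=float).reshape(1, -1)
    scaled = scaler.transform(features)
    prediction = model.predict(scaled)[0]
    # predict_proba columns follow model.classes_, here [0, 1].
    p_benign, p_malignant = model.predict_proba(scaled)[0]
    label = "Malignant" if prediction == 1 else "Benign"
    return label, float(p_benign), float(p_malignant)


# Stand-ins for model.pkl and scaler.pkl so the example runs on its own.
# In the app these would be loaded with pickle instead of being fitted here.
data = load_breast_cancer()
y = 1 - data.target  # bundled copy uses 0 = malignant; flip so 1 = malignant
scaler = StandardScaler().fit(data.data)
model = LogisticRegression(max_iter=1000).fit(scaler.transform(data.data), y)

label, p_benign, p_malignant = predict_diagnosis(model, scaler, data.data[0])
print(f"{label}: benign={p_benign:.3f}, malignant={p_malignant:.3f}")
```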

📊 Dataset Information

The application uses the Wisconsin Breast Cancer Dataset containing:
569 instances with 30 features each
Features include mean, standard error, and worst values of:
- Radius, Texture, Perimeter, Area
- Smoothness, Compactness, Concavity
- Concave Points, Symmetry, Fractal Dimension
Binary target variable: Malignant (M) or Benign (B)
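
A cleaning step for data of this shape might look like the sketch below. The column names `id`, `diagnosis`, `radius_mean`, and `texture_mean`, the empty trailing column, and the helper name `clean_data` are assumptions about the layout of cdata.csv, and the three rows are made up for illustration.

```python
import io

import pandas as pd

# Tiny made-up sample; the trailing commas produce an empty, unnamed column.
SAMPLE_CSV = """id,diagnosis,radius_mean,texture_mean,
1001,M,17.99,10.38,
1002,B,13.54,14.36,
1003,M,20.57,17.77,
"""


def clean_data(df):
    """Drop unusable columns and encode the diagnosis as 1 (M) or 0 (B)."""
    # Remove columns that contain no values at all.
    df = df.dropna(axis=1, how="all")
    # The identifier column carries no predictive signal.
    df = df.drop(columns=["id"], errors="ignore")
    df = df.copy()
    df["diagnosis"] = df["diagnosis"].map({"M": 1, "B": 0})
    return df


raw = pd.read_csv(io.StringIO(SAMPLE_CSV))
cleaned = clean_data(raw)
print(cleaned)
```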

🔍 Model Performance

The logistic regression model is evaluated on the held-out test set:
Accuracy on the test data
A classification report with per-class precision, recall, and F1-score
Probability outputs alongside each prediction, so borderline cases are visible

⚠️ Important Disclaimer

This application is designed to assist medical professionals and should NOT be used as a substitute for professional medical diagnosis, advice, or treatment. Always consult qualified healthcare providers for medical decisions.

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

University of Wisconsin for the Breast Cancer Dataset
Streamlit for the amazing web app framework
Scikit-learn for machine learning tools
Plotly for visualization capabilities

📞 Contact

For questions or feedback, please open an issue in the GitHub repository.

Key Files to Upload to GitHub:

main.py - Streamlit web application
model_training.py - Model training script
model.pkl - Trained model
scaler.pkl - Scaler object
dataset/cdata.csv - Dataset file
requirements.txt - Dependencies
README.md - Documentation
.gitignore - To exclude unnecessary files

Quick Start Commands:

# Create requirements.txt
pip freeze > requirements.txt

# Initialize git repo
git init
git add .
git commit -m "Initial commit: Breast Cancer Diagnosis ML App"
git branch -M main
git remote add origin https://github.com/yourusername/repo-name.git
git push -u origin main
