A Python project for analyzing and modeling car insurance claim data. Demonstrates data cleaning, exploratory data analysis (EDA), visualization, and predictive modeling.
This project analyzes a real-world car insurance dataset to uncover insights, visualize trends, and build a predictive model for insurance outcomes. The workflow covers data cleaning, EDA, correlation analysis, and logistic regression, ending with actionable business recommendations.
- Data Cleaning: Handles missing values, duplicates, and data type conversions
- Exploratory Data Analysis: Visualizes distributions, relationships, and correlations
- Pivot Tables: Summarizes outcomes by demographic features
- Predictive Modeling: Logistic regression for outcome prediction
- Business Insights: Outputs key findings and recommendations
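
The cleaning and pivot-table features above can be illustrated with a minimal sketch. It uses a small in-memory table rather than the real data file, and every column name (`age_group`, `gender`, `annual_mileage`, `outcome`) is a hypothetical stand-in, as is the median imputation strategy; the actual script may do this differently.

```python
import pandas as pd

# Toy stand-in for Car_Insurance_Claim.csv; all column names are hypothetical.
raw = pd.DataFrame({
    "age_group": ["16-25", "26-39", "26-39", "40-64", "16-25", "16-25"],
    "gender": ["female", "male", "male", "female", "male", "male"],
    "annual_mileage": ["12000", "9000", "9000", None, "15000", "11000"],
    "outcome": [1, 0, 0, 0, 1, 0],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicate rows, convert types, and fill missing values."""
    out = df.drop_duplicates().copy()
    # Data type conversion: mileage arrives as text in this toy example.
    out["annual_mileage"] = pd.to_numeric(out["annual_mileage"], errors="coerce")
    # Missing values: impute with the column median (one common choice, assumed here).
    median_mileage = out["annual_mileage"].median()
    out["annual_mileage"] = out["annual_mileage"].fillna(median_mileage)
    return out


cleaned = clean(raw)

# Pivot table: mean outcome (claim rate) per demographic group.
claim_rate = cleaned.pivot_table(index="age_group", values="outcome", aggfunc="mean")
print(claim_rate)
```

Here the duplicated row is dropped before imputation, so the median is computed on distinct records only.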
- Python 3
- pandas, numpy
- matplotlib, seaborn
- scikit-learn
- Python 3.7+
- Clone the repository
- Install dependencies:
pip install -r requirements.txt
- Place your data file (Car_Insurance_Claim.csv) in the project directory
- Run the script:
python Car_Insurance.py
- Review the printed outputs and generated plots
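
Because the data file is not shipped with the repository, a loading step that fails with a clear message can save some confusion. The sketch below is one possible way to do that; `load_claims` is a hypothetical helper and is not taken from `Car_Insurance.py`.

```python
from pathlib import Path

import pandas as pd

DATA_FILE = "Car_Insurance_Claim.csv"  # file name from the setup steps


def load_claims(directory: str = ".") -> pd.DataFrame:
    """Hypothetical helper: read the CSV from `directory` or fail with a clear error."""
    path = Path(directory) / DATA_FILE
    if not path.is_file():
        raise FileNotFoundError(f"Expected {DATA_FILE} in {path.parent.resolve()}")
    return pd.read_csv(path)
```

Calling `load_claims()` with no argument looks in the current working directory, which matches running the script from the project directory.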
python-data-analysi/
├── Car_Insurance.py # Main analysis script
├── Car_Insurance_Claim.csv # Data file (not included in repo)
├── requirements.txt # Python dependencies
├── README.md # This file
- Data cleaning and preprocessing
- Exploratory data analysis (EDA)
- Data visualization
- Predictive modeling (logistic regression)
- Business analytics and reporting
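
As a rough illustration of the predictive modeling step, the sketch below fits a scikit-learn logistic regression on synthetic data. The two features (`credit_score`, `past_accidents`), the data-generating coefficients, and the 75/25 split are all assumptions made for the example; they do not describe the real dataset or the script's actual settings.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400

# Hypothetical features: a standardized credit score and a past-accident count.
credit_score = rng.normal(size=n)
past_accidents = rng.poisson(1.0, size=n)
X = np.column_stack([credit_score, past_accidents])

# Synthetic binary outcome: lower credit score and more accidents raise claim odds.
logit = -1.5 * credit_score + 0.8 * past_accidents - 0.5
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# Hold out a quarter of the rows to estimate out-of-sample accuracy.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = LogisticRegression()
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Held-out accuracy: {accuracy:.2f}")
print("Coefficients:", model.coef_[0])
```

The sign of each fitted coefficient indicates whether a feature pushes the predicted outcome probability up or down, which is the kind of reading that can feed business recommendations.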
The dataset used in this project is for educational and demonstration purposes only. Do not use real customer data without proper authorization and compliance with data privacy laws.
This project is open source and available under the MIT License.
- Author: Azhar Mehmood
- Language: Python
- Category: Data Science, Analytics, Machine Learning