This repository was archived by the owner on May 5, 2025. It is now read-only.

nishatrhythm/Data-Science-Lab

Data Science Lab

A collection of four data science assignments, ranging from exploratory data analysis and preprocessing to ensemble machine learning methods.


Overview

This repository contains four assignments completed as part of a Data Science Laboratory course. Each assignment focuses on different aspects of the data science workflow, from data preprocessing and visualization to implementing various machine learning algorithms.


Assignments

Assignment 1: Titanic EDA and Data Preprocessing

  • Focus: Exploratory Data Analysis and Data Preprocessing
  • Dataset: Titanic passenger data
  • Techniques:
    • Data cleaning and handling missing values
    • Feature engineering
    • Visualization with Seaborn and Matplotlib
    • Data preprocessing (encoding, scaling)
    • Logistic Regression model implementation
  • Key Outcomes: Developed a predictive model for Titanic survival with comprehensive data preparation steps
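
The techniques listed for Assignment 1 can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the small DataFrame is made-up sample data with Titanic-style column names, and the `FamilySize` feature, the median/mode imputation choices, and the pipeline layout are all assumptions made for the example.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Made-up rows with Titanic-style columns (NOT the real dataset).
df = pd.DataFrame({
    "Pclass": [3, 1, 3, 1, 3, 2, 3, 1],
    "Sex": ["male", "female", "female", "female",
            "male", "male", "female", "male"],
    "Age": [22.0, 38.0, 26.0, 35.0, None, 54.0, 2.0, None],
    "SibSp": [1, 1, 0, 1, 0, 0, 3, 0],
    "Parch": [0, 0, 0, 0, 0, 0, 1, 0],
    "Fare": [7.25, 71.28, 7.92, 53.10, 8.05, 51.86, 21.08, 30.0],
    "Embarked": ["S", "C", "S", "S", None, "S", "S", "C"],
    "Survived": [0, 1, 1, 1, 0, 0, 0, 1],
})

# Data cleaning: fill missing ages with the median and missing ports
# with the most frequent value (one common choice, assumed here).
df["Age"] = df["Age"].fillna(df["Age"].median())
df["Embarked"] = df["Embarked"].fillna(df["Embarked"].mode()[0])

# Feature engineering: a hypothetical family-size feature.
df["FamilySize"] = df["SibSp"] + df["Parch"] + 1

numeric = ["Pclass", "Age", "Fare", "FamilySize"]
categorical = ["Sex", "Embarked"]

# Preprocessing: scale numeric columns, one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

# Logistic Regression on top of the preprocessing step.
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])

X = df[numeric + categorical]
y = df["Survived"]
model.fit(X, y)
predictions = model.predict(X)
print(predictions)
```

Wrapping the scaler and encoder in a `Pipeline` keeps the preprocessing tied to the model, so the same transformations are applied at prediction time. On the real dataset a train/test split would be needed before reporting any accuracy.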

Assignment 2: Data Science Lab 2

  • Content: Advanced data analysis techniques
  • Visualizations: Multiple figures demonstrating data relationships and model performance

Assignment 3: Ensemble Learning Techniques

  • Focus: Implementation of various ensemble learning methods
  • Datasets: Iris dataset (classification) and synthetic data (regression)
  • Techniques:
    • Bagging (Bagging Classifier and Regressor)
    • Boosting (AdaBoost, Gradient Boosting, XGBoost)
    • Stacking (Stacking Classifier and Regressor)
  • Key Outcomes: Comparative analysis of different ensemble methods for both classification and regression tasks
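
A comparison of this kind might look like the sketch below, which fits bagging, boosting, and stacking classifiers on the Iris dataset with Scikit-learn. The hyperparameters, the 70/30 split, and the choice of base learners for stacking are assumptions for illustration, not values taken from the notebook. XGBoost is left out so the example depends only on Scikit-learn; the regression counterparts (`BaggingRegressor`, `StackingRegressor`, and so on) follow the same fit/predict pattern.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import (
    AdaBoostClassifier,
    BaggingClassifier,
    GradientBoostingClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

models = {
    # Bagging: many decision trees (the default base learner), each fit
    # on a bootstrap sample of the training set.
    "Bagging": BaggingClassifier(n_estimators=50, random_state=42),
    # Boosting: learners are added sequentially, each focusing on the
    # errors of the ensemble built so far.
    "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=42),
    "GradientBoosting": GradientBoostingClassifier(random_state=42),
    # Stacking: a meta-learner combines the predictions of base learners.
    "Stacking": StackingClassifier(
        estimators=[
            ("tree", DecisionTreeClassifier(random_state=42)),
            ("knn", KNeighborsClassifier()),
        ],
        final_estimator=LogisticRegression(max_iter=1000),
    ),
}

# Fit each ensemble and record its accuracy on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```

Iris is small and easy to separate, so the methods tend to score close to one another; differences between ensembles are usually clearer on larger or noisier data.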

Assignment 4: Data Science Lab 4

  • Content: Further advanced data science techniques
  • Focus: Advanced modeling and evaluation

Repository Structure

Data-Science-Lab/
├── Assignment 1/
│   ├── Titanic_EDA_and_Data_Preprocessing.ipynb
│   ├── Assignment 1.docx
│   ├── Code Images/
│   └── Output Images/
├── Assignment 2/
│   ├── data_science_lab_2.ipynb
│   ├── Assignment 2.docx
│   └── [Various visualization images]
├── Assignment 3/
│   ├── data_science_lab_3.ipynb
│   ├── Assignment 3.docx
│   └── [Various visualization images]
├── Assignment 4/
│   ├── data_science_lab_4.ipynb
│   ├── Assignment 4.docx
│   └── [Various visualization images]
├── Assignment 1.pdf
├── Assignment 2.pdf
├── Assignment 3.pdf
├── Assignment 4.pdf
└── LICENSE

Technologies Used

  • Python Libraries:
    • Pandas for data manipulation
    • NumPy for numerical operations
    • Matplotlib and Seaborn for data visualization
    • Scikit-learn for machine learning implementations
    • XGBoost for gradient boosting

Getting Started

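  1. Clone this repository
  2. Install Python and the libraries listed under Technologies Used (Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn, XGBoost)
  3. Navigate to the folder of the assignment you want to view
  4. Open the .ipynb notebook in that folder with Jupyter to view the code and analysis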
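
Before opening a notebook, a quick check like the one below can confirm that the libraries from Technologies Used are importable. It is a small convenience sketch, not part of the repository: the `REQUIRED` mapping and the `missing_packages` helper are names introduced here for illustration.

```python
from importlib.util import find_spec

# Import name -> pip package name for the libraries this README lists.
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "sklearn": "scikit-learn",
    "xgboost": "xgboost",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages that cannot be found."""
    return [pip_name for module, pip_name in required.items()
            if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Missing packages, install with: pip install " + " ".join(missing))
else:
    print("All required libraries are available.")
```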

License

This project is licensed under the terms of the LICENSE file in the repository root.

About

A collection of data analysis, visualization, and machine learning projects using Python, Pandas, NumPy, Matplotlib, and Scikit-learn. Includes real-world datasets and hands-on experiments for data-driven insights.
