
msdsai/sales_analysis


Data_Analysis_python

In this project, we use Python with pandas and Matplotlib to analyze 12 months of sales data and answer business questions about it. The data contains hundreds of thousands of electronics store purchases, broken down by month, product type, cost, purchase address, and other fields.

We start by cleaning our data. Tasks in this section include:

  • Dropping NaN values from the DataFrame
  • Removing rows based on a condition
  • Changing the type of columns (to_numeric, to_datetime, astype)
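
The cleaning tasks above can be sketched as follows. This is a minimal illustration, not the project's actual notebook code: the column names (`Order ID`, `Quantity Ordered`, `Price Each`, `Order Date`), the date format, and the sample rows are all assumptions made up for the example.

```python
import pandas as pd

# Hypothetical raw rows; column names and values are assumptions for illustration.
raw = pd.DataFrame({
    "Order ID": ["1001", "1002", None, "Order ID", "1003"],
    "Product": ["USB-C Cable", "Monitor", None, "Product", "Headphones"],
    "Quantity Ordered": ["2", "1", None, "Quantity Ordered", "1"],
    "Price Each": ["11.95", "149.99", None, "Price Each", "99.99"],
    "Order Date": ["04/19/19 08:46", "04/07/19 22:30", None,
                   "Order Date", "12/30/19 00:01"],
})

# 1. Drop rows in which every value is missing.
df = raw.dropna(how="all")

# 2. Remove rows based on a condition: here, header rows repeated inside the data.
df = df[df["Order ID"] != "Order ID"].copy()

# 3. Change column types with to_numeric, astype, and to_datetime.
df["Quantity Ordered"] = pd.to_numeric(df["Quantity Ordered"])
df["Price Each"] = df["Price Each"].astype(float)
df["Order Date"] = pd.to_datetime(df["Order Date"], format="%m/%d/%y %H:%M")

print(df.dtypes)
```

Converting types after filtering matters here: a stray non-numeric row would make `pd.to_numeric` raise unless `errors="coerce"` is passed.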

Once we have cleaned up our data a bit, we move on to data exploration, where we examine 5 high-level business questions:

  • What was the best month for sales? How much was earned that month?
  • What city sold the most product?
  • What time should we display advertisements to maximize the likelihood of customers buying products?
  • What products are most often sold together?
  • What product sold the most? Why do you think it sold the most?
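
As one possible approach to the "sold together" question, the sketch below counts product pairs that share an order ID. It assumes that items bought together share an `Order ID` value and that the product name is in a `Product` column; both column names and the sample orders are hypothetical.

```python
from collections import Counter
from itertools import combinations

import pandas as pd

# Hypothetical order lines; one row per product in an order.
orders = pd.DataFrame({
    "Order ID": [1, 1, 2, 3, 3, 4],
    "Product": ["Phone", "Charging Cable", "Monitor",
                "Phone", "Charging Cable", "Phone"],
})

# Collect the products in each order, keeping only multi-item orders.
baskets = orders.groupby("Order ID")["Product"].apply(list)
baskets = baskets[baskets.map(len) > 1]

pair_counts = Counter()
for products in baskets:
    # Sort so that (A, B) and (B, A) are counted as the same pair.
    pair_counts.update(combinations(sorted(set(products)), 2))

top_pairs = pair_counts.most_common(3)
print(top_pairs)
```

Changing the `2` in `combinations` to `3` would count triples instead of pairs.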

To answer these questions we walk through many different pandas & matplotlib methods. They include:

  • Concatenating multiple CSV files together to create a new DataFrame (pd.concat)
  • Adding columns
  • Parsing cells as strings to make new columns (.str)
  • Using the .apply() method
  • Using groupby to perform aggregate analysis
  • Plotting bar charts and line graphs to visualize our results
  • Labeling our graphs
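
The methods listed above can be combined roughly as in the sketch below, which totals sales per city. The two small DataFrames stand in for monthly CSV files, and the column names, the `"street, city, state zip"` address layout, and the helper names `city_with_state` and `plot_sales_by_city` are assumptions introduced for illustration.

```python
import pandas as pd

# Two small frames standing in for monthly CSV files (hypothetical data).
# With real files this could be: pd.concat([pd.read_csv(p) for p in paths])
jan = pd.DataFrame({
    "Quantity Ordered": [2, 1],
    "Price Each": [10.0, 100.0],
    "Purchase Address": ["917 1st St, Dallas, TX 75001",
                         "682 Chestnut St, Boston, MA 02215"],
})
feb = pd.DataFrame({
    "Quantity Ordered": [1, 3],
    "Price Each": [50.0, 20.0],
    "Purchase Address": ["669 Spruce St, Los Angeles, CA 90001",
                         "12 Elm St, Dallas, TX 75001"],
})

# Concatenate the monthly frames into one DataFrame.
all_data = pd.concat([jan, feb], ignore_index=True)

# Add a column computed from existing columns.
all_data["Sales"] = all_data["Quantity Ordered"] * all_data["Price Each"]

# Parse cells as strings to make a new column (.str).
all_data["City"] = all_data["Purchase Address"].str.split(", ").str[1]


def city_with_state(address):
    """Return 'City (ST)' from an address laid out as 'street, city, ST zip'."""
    _, city, state_zip = address.split(", ")
    return f"{city} ({state_zip.split(' ')[0]})"


# Use .apply() when the logic is easier to express as a function.
all_data["City (State)"] = all_data["Purchase Address"].apply(city_with_state)

# Aggregate with groupby.
sales_by_city = all_data.groupby("City (State)")["Sales"].sum()
print(sales_by_city)


def plot_sales_by_city(totals):
    """Draw a labeled bar chart of the totals and return the Axes."""
    # Imported here so the aggregation above runs without Matplotlib installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.bar(totals.index, totals.values)
    ax.set_xlabel("City")
    ax.set_ylabel("Sales")
    ax.set_title("Sales by city")
    ax.tick_params(axis="x", rotation=45)
    return ax
```

Calling `plot_sales_by_city(sales_by_city)` followed by `matplotlib.pyplot.show()` displays the chart. Including the state in the group key keeps cities that share a name in different states from being merged.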
