E-Commerce Sales Data Analysis

Project Overview

This project focuses on analyzing sales data from an e-commerce platform. The goal is to gain insights into sales patterns, identify trends, and provide actionable recommendations based on the data. The analysis includes examining product categories, sales performance, and customer behavior.

Features

Data Cleaning: Handling missing values, correcting data types, and removing outliers.
Exploratory Data Analysis (EDA): Visualizing key metrics such as sales distribution across categories, time-based trends, and product performance.
Descriptive Statistics: Summary statistics for product categories and their sales performance.
Visualizations: Graphs and plots to communicate findings, including bar charts and sales distributions.
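
The cleaning steps listed above could look roughly like the following. This is a minimal sketch, not the notebook's actual code: the `clean_sales` helper, the drop-incomplete-rows policy, the 1.5 × IQR outlier rule, and the two extra rows in `raw` (a non-numeric value and an extreme value) are all assumptions made for illustration.

```python
import pandas as pd


def clean_sales(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of a sales table (hypothetical helper)."""
    cleaned = df.copy()

    # Correct data types; values that cannot be parsed become NaN / NaT.
    cleaned["sales"] = pd.to_numeric(cleaned["sales"], errors="coerce")
    cleaned["order_date"] = pd.to_datetime(cleaned["order_date"], errors="coerce")

    # Handle missing values by dropping incomplete rows (assumed policy).
    cleaned = cleaned.dropna(subset=["product_category", "sales", "order_date"])

    # Remove outliers with the 1.5 * IQR rule (assumed rule).
    q1, q3 = cleaned["sales"].quantile([0.25, 0.75])
    iqr = q3 - q1
    in_range = cleaned["sales"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return cleaned[in_range].reset_index(drop=True)


# Rows 1-5 mirror the dataset preview; the last two rows are invented
# to exercise the missing-value and outlier handling.
raw = pd.DataFrame({
    "product_category": ["Clothing", "Electronics", "Home Goods", "Clothing",
                         "Electronics", "Clothing", "Home Goods"],
    "sales": ["1200", "1500", "700", "850", "1300", "n/a", "250000"],
    "order_date": ["2023-06-01", "2023-06-02", "2023-06-03", "2023-06-04",
                   "2023-06-05", "2023-06-06", "2023-06-07"],
})

cleaned = clean_sales(raw)
print(cleaned)
```

Whether to drop or impute missing values, and which outlier rule to use, depends on the data; the choices here are only one reasonable default.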

Technologies Used

Python: Programming language used for data manipulation and analysis.
Pandas: For data wrangling and manipulation.
Matplotlib & Seaborn: For generating visualizations.
Jupyter Notebook: For interactive data analysis.

Dataset Preview

The dataset includes fields such as order_id, product_category, sales, and order_date. The first few rows look like this:

order_id	product_category	sales	order_date
1	Clothing	1200	2023-06-01
2	Electronics	1500	2023-06-02
3	Home Goods	700	2023-06-03
4	Clothing	850	2023-06-04
5	Electronics	1300	2023-06-05
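
Because the preview includes an order_date column, time-based trends can be computed by parsing the dates and aggregating sales per period. The sketch below rebuilds the five preview rows by hand (the variable names `preview`, `daily_sales`, and `monthly_sales` are illustrative, not taken from the notebook) and sums sales per day and per calendar month.

```python
import pandas as pd

# The five rows shown in the preview above.
preview = pd.DataFrame({
    "order_id": [1, 2, 3, 4, 5],
    "product_category": ["Clothing", "Electronics", "Home Goods",
                         "Clothing", "Electronics"],
    "sales": [1200, 1500, 700, 850, 1300],
    "order_date": ["2023-06-01", "2023-06-02", "2023-06-03",
                   "2023-06-04", "2023-06-05"],
})

# Parse the date strings so pandas can do date arithmetic on them.
preview["order_date"] = pd.to_datetime(preview["order_date"])

# Total sales per calendar day.
daily_sales = preview.groupby("order_date")["sales"].sum()

# Total sales per calendar month.
monthly_sales = preview.groupby(preview["order_date"].dt.to_period("M"))["sales"].sum()

print(daily_sales)
print(monthly_sales)
```

With the full dataset, the same grouping would produce one row per month, which is a convenient input for a line plot of sales over time.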

Key Findings

Product Categories:
- The data is grouped by product category, such as Clothing, Electronics, and Home Goods.
- The Electronics category exhibits higher variance in sales than the other categories.
Sales Performance:
- Mean and median sales values were calculated per category.
- Clothing and Electronics have similar average sales, but Electronics sales vary more.
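
The per-category mean, median, and variance behind these findings can be computed with a single grouped aggregation. The sketch below uses only the five preview rows, which are far too few to confirm or refute the findings above; it is meant to show the shape of the calculation, and the name `category_stats` is illustrative.

```python
import pandas as pd

# The five preview rows; the full dataset would be loaded from CSV instead.
preview = pd.DataFrame({
    "product_category": ["Clothing", "Electronics", "Home Goods",
                         "Clothing", "Electronics"],
    "sales": [1200, 1500, 700, 850, 1300],
})

# One row per category, one column per statistic.
# "var" is the sample variance (ddof=1), so a category with a single
# observation gets NaN.
category_stats = (
    preview.groupby("product_category")["sales"]
    .agg(["mean", "median", "var", "count"])
)
print(category_stats)
```

Including the count next to each statistic makes it easier to judge how much weight a category's mean or variance deserves.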

Code Examples

Here are some code snippets from the project:

1. Data Loading and Preview

```python
import pandas as pd

# Load the dataset
sales_data = pd.read_csv('data/ecommerce_sales.csv')

# Preview the first few rows
print(sales_data.head())
```

2. Grouping Data by Product Category

```python
# Group by product category and calculate descriptive statistics for sales
category_summary = sales_data.groupby('product_category')['sales'].describe()
print(category_summary)
```

3. Visualizing Sales by Product Category

```python
import seaborn as sns
import matplotlib.pyplot as plt

# Plot sales distribution for each product category
plt.figure(figsize=(10, 6))
sns.boxplot(x='product_category', y='sales', data=sales_data)
plt.title('Sales Distribution by Product Category')
plt.show()
```
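
The feature list also mentions bar charts. The following is a sketch of how total sales per category might be plotted with Matplotlib alone; it is not taken from the notebook, and it uses the five preview rows (as `sample`) instead of the full dataset.

```python
import matplotlib.pyplot as plt
import pandas as pd

# The five preview rows stand in for the full dataset here.
sample = pd.DataFrame({
    "product_category": ["Clothing", "Electronics", "Home Goods",
                         "Clothing", "Electronics"],
    "sales": [1200, 1500, 700, 850, 1300],
})

# Sum sales per category and sort so the tallest bar comes first.
totals = (
    sample.groupby("product_category")["sales"]
    .sum()
    .sort_values(ascending=False)
)

fig, ax = plt.subplots(figsize=(8, 5))
ax.bar(totals.index, totals.values)
ax.set_xlabel("Product category")
ax.set_ylabel("Total sales")
ax.set_title("Total Sales by Product Category")
fig.tight_layout()
```

In a Jupyter notebook the figure renders when the cell finishes; in a standalone script, add `plt.show()` at the end.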

Getting Started

Running in Visual Studio Code

To run the notebook in Visual Studio Code, follow these steps:

Install Visual Studio Code: Download and install VS Code from the official Visual Studio Code website.
Install Python Extension: Open VS Code and install the "Python" extension from the extensions marketplace.
Install Jupyter Extension: You will also need the "Jupyter" extension to open .ipynb files.

Clone the Repository:

```
git clone https://github.com/gappeah/Analysis-of-E-Commerce-Sales-Data.git
```

Open the Project Folder: In VS Code, open the folder containing the project files.
Open and Run the Notebook: Open the Analysis_of_E-Commerce_Sales_Data.ipynb file in VS Code and run the cells interactively using the Jupyter environment.

Running in Google Colab

If you prefer running the notebook in Google Colab, follow these steps:

Upload the Notebook to Colab:
- Go to Google Colab.
- Click on File > Upload Notebook and select the Analysis_of_E-Commerce_Sales_Data.ipynb file from your local machine.
Install Necessary Libraries: If the notebook requires Python libraries that are not pre-installed on Colab, install them by running the following in a notebook cell (this assumes a requirements.txt file has been uploaded to the Colab session):
```
!pip install -r requirements.txt
```
Run the Notebook: Once all the dependencies are installed, run the cells by clicking the play icon next to each code block.
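
Before installing anything, it can help to check which libraries are already available in the current environment. The sketch below uses only the standard library; the `REQUIRED` list is an assumption based on the technologies named in this README, and `missing_packages` is a hypothetical helper, so adjust both to match the notebook's actual imports.

```python
import importlib.util

# Libraries named in this README (assumed list; adjust as needed).
REQUIRED = ["pandas", "matplotlib", "seaborn"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```

Any package reported as missing can then be installed from a notebook cell with `!pip install <package>`.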

Files

Analysis_of_E-Commerce_Sales_Data.ipynb: The main notebook containing the analysis.
customer_data.csv, product_data.csv, retail_sales_data.csv: The datasets used for the analysis.
README.md: This file.
