morizin/Associate-Rule-Learning

Association Rule Learning

An implementation of Association Rule Learning for Market Basket Analysis, built on the Apriori algorithm.

Overview

Association Rule Learning is a machine learning technique used to discover interesting relationships between variables in large databases. This project applies the Apriori algorithm to identify patterns in customer purchase behavior.

What is Association Rule Learning?

Association rules help answer questions like:

  • "If a customer buys X, what else are they likely to buy?"
  • "Which products are frequently purchased together?"
  • "What combinations of products appear in transactions?"

Common applications include:

  • Market Basket Analysis - Understanding product associations in retail
  • Recommendation Systems - Suggesting related products
  • Inventory Management - Optimizing product placement and stock

Algorithm: Apriori

The Apriori algorithm identifies frequent itemsets and generates association rules based on:

  • Support - The fraction of transactions that contain all items in an itemset
  • Confidence - For a rule X→Y, the fraction of transactions containing X that also contain Y
  • Lift - How much more likely items are purchased together vs independently

Repository Contents

├── apriori.py                          # Main Apriori implementation
├── apyori.py                           # Apriori algorithm library
├── Market_Basket_Optimisation.csv      # Sample transaction dataset
└── README.md                           # This file

Dataset

Market_Basket_Optimisation.csv contains transaction data where each row represents a customer's shopping basket with items purchased together.
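
A minimal sketch of turning rows like these into a list of baskets. It assumes the file is comma-separated, has no header row, and pads shorter baskets with empty cells; none of this is confirmed by the README, so check the file first. The helper name `read_transactions` is hypothetical and uses the standard `csv` module rather than pandas to stay dependency-free.

```python
import csv
import io


def read_transactions(lines):
    """Parse CSV rows into a list of baskets (one list of item names per row)."""
    transactions = []
    for row in csv.reader(lines):
        # Drop empty cells, which appear when rows have different lengths.
        items = [cell.strip() for cell in row if cell.strip()]
        if items:
            transactions.append(items)
    return transactions


# In-memory stand-in for the real file (contents invented for illustration).
sample = io.StringIO("milk,bread,eggs\nbread\nmilk,bread,,\n")
baskets = read_transactions(sample)
print(baskets)

# With the real dataset (assuming the layout described above):
# with open("Market_Basket_Optimisation.csv", newline="") as f:
#     baskets = read_transactions(f)
```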

Getting Started

Prerequisites

Python 3.x
pandas
numpy

Installation

# Clone the repository
git clone https://github.com/morizin/Associate-Rule-Learning.git
cd Associate-Rule-Learning

# Install required packages
pip install pandas numpy

Usage

# Run the Apriori algorithm
python apriori.py

The script will:

  1. Load transaction data from the CSV file
  2. Apply the Apriori algorithm
  3. Generate association rules
  4. Display frequent itemsets and rules with support, confidence, and lift metrics

Key Concepts

Support

Indicates how frequently an itemset appears in the dataset.

Support(A) = (Transactions containing A) / (Total Transactions)
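
The formula can be checked by hand on a few baskets. The sketch below is illustrative only: the `support` helper and the toy baskets are assumptions, not code or data from `apriori.py`.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for basket in transactions if itemset <= set(basket))
    return hits / len(transactions)


# Toy baskets invented for illustration (not taken from the dataset).
baskets = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"bread"},
    {"eggs"},
]

print(support({"bread"}, baskets))          # 0.75 (3 of 4 baskets)
print(support({"milk", "bread"}, baskets))  # 0.5  (2 of 4 baskets)
```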

Confidence

Measures how often items in Y appear in transactions containing X.

Confidence(X→Y) = Support(X,Y) / Support(X)
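
Confidence depends on direction, which a small sketch makes visible. The helper names and toy baskets below are assumptions made for illustration, not part of `apriori.py`.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for basket in transactions if itemset <= set(basket)) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Confidence(X -> Y) = Support(X, Y) / Support(X)."""
    base = support(antecedent, transactions)
    if base == 0:
        raise ValueError("antecedent never occurs, so confidence is undefined")
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / base


# Toy baskets invented for illustration (not taken from the dataset).
baskets = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"bread"},
    {"eggs"},
]

# Every milk basket contains bread, but not every bread basket contains milk.
print(confidence({"milk"}, {"bread"}, baskets))            # 1.0
print(round(confidence({"bread"}, {"milk"}, baskets), 3))  # 0.667
```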

Lift

Shows how much more likely Y is to be purchased when X is purchased, relative to how often Y is purchased overall.

Lift(X→Y) = Support(X,Y) / (Support(X) × Support(Y))
  • Lift > 1: Items are positively correlated
  • Lift = 1: Items are independent
  • Lift < 1: Items are negatively correlated
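
The three cases can be reproduced on a small example. This is a sketch under assumptions: the `lift` helper and the toy baskets are invented for illustration and do not come from `apriori.py`.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for basket in transactions if itemset <= set(basket)) / len(transactions)


def lift(x, y, transactions):
    """Lift(X -> Y) = Support(X, Y) / (Support(X) * Support(Y))."""
    joint = support(set(x) | set(y), transactions)
    return joint / (support(x, transactions) * support(y, transactions))


# Toy baskets invented for illustration (not taken from the dataset).
baskets = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"bread"},
    {"eggs"},
]

print(round(lift({"milk"}, {"bread"}, baskets), 3))  # 1.333 -> positively correlated
print(lift({"milk"}, {"eggs"}, baskets))             # 1.0   -> independent
print(round(lift({"bread"}, {"eggs"}, baskets), 3))  # 0.667 -> negatively correlated
```

Unlike confidence, lift is symmetric: swapping X and Y gives the same value.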

Example Output

Rule: {milk} → {bread}
Support: 0.25 (appears in 25% of transactions)
Confidence: 0.80 (80% of milk buyers also buy bread)
Lift: 2.5 (2.5x more likely to buy bread when buying milk)

Applications

This implementation can be used for:

  • Retail product placement optimization
  • Cross-selling strategies
  • Bundle pricing recommendations
  • Promotional planning
  • Customer behavior analysis

Customization

Modify parameters in apriori.py:

min_support = 0.003      # Minimum support threshold
min_confidence = 0.2     # Minimum confidence threshold
min_lift = 3             # Minimum lift threshold
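
To see how the three thresholds interact, here is a hedged sketch of the filtering step. The `Rule` type and `filter_rules` function are hypothetical, and treating each threshold as an inclusive lower bound is an assumption; `apriori.py` may apply them differently. The first rule reuses the numbers from Example Output; the other two are invented.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    antecedent: frozenset
    consequent: frozenset
    support: float
    confidence: float
    lift: float


def filter_rules(rules, min_support, min_confidence, min_lift):
    """Keep rules that meet all three thresholds (inclusive bounds assumed)."""
    return [
        r for r in rules
        if r.support >= min_support
        and r.confidence >= min_confidence
        and r.lift >= min_lift
    ]


candidates = [
    Rule(frozenset({"milk"}), frozenset({"bread"}), 0.25, 0.80, 2.5),
    Rule(frozenset({"pasta"}), frozenset({"olive oil"}), 0.004, 0.30, 3.4),  # invented
    Rule(frozenset({"tea"}), frozenset({"honey"}), 0.002, 0.50, 4.0),        # invented
]

kept = filter_rules(candidates, min_support=0.003, min_confidence=0.2, min_lift=3)
for rule in kept:
    print(set(rule.antecedent), "->", set(rule.consequent))
```

With these thresholds only the pasta rule survives: milk→bread fails on lift (2.5 < 3) and tea→honey fails on support (0.002 < 0.003). Raising `min_lift` trims common but weakly associated rules, while raising `min_support` trims rare ones.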

Algorithm Workflow

  1. Data Preprocessing - Convert transactions into proper format
  2. Frequent Itemset Generation - Find all itemsets meeting minimum support
  3. Rule Generation - Create rules from frequent itemsets
  4. Rule Filtering - Apply confidence and lift thresholds
  5. Results - Display and analyze discovered patterns
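
Steps 2 to 4 can be sketched in plain Python as follows. This is a minimal, unoptimized illustration of the level-wise Apriori idea, not the code in `apriori.py` or `apyori.py`; the function names and toy baskets are assumptions.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: single items.
    singles = {frozenset([item]) for t in transactions for item in t}
    current = {s: support(s) for s in singles}
    current = {s: v for s, v in current.items() if v >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a, b in combinations(current, 2) if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {c: support(c) for c in candidates}
        current = {c: v for c, v in current.items() if v >= min_support}
        frequent.update(current)
        k += 1
    return frequent


def generate_rules(frequent, min_confidence, min_lift):
    """Split each frequent itemset into X -> Y and keep rules passing both thresholds."""
    rules = []
    for itemset, supp in frequent.items():
        for size in range(1, len(itemset)):
            for left in combinations(sorted(itemset), size):
                x = frozenset(left)
                y = itemset - x
                conf = supp / frequent[x]  # subsets of a frequent itemset are frequent
                lift = conf / frequent[y]
                if conf >= min_confidence and lift >= min_lift:
                    rules.append((x, y, supp, conf, lift))
    return rules


# Toy baskets invented for illustration (not taken from the dataset).
baskets = [{"milk", "bread"}, {"milk", "bread", "eggs"}, {"bread"}, {"eggs"}]
itemsets = frequent_itemsets(baskets, min_support=0.5)
rules = generate_rules(itemsets, min_confidence=0.8, min_lift=1.0)
for x, y, supp, conf, lift in rules:
    print(set(x), "->", set(y), round(supp, 2), round(conf, 2), round(lift, 2))
```

The prune step is what distinguishes Apriori from brute-force enumeration: a candidate is counted against the data only if all of its smaller subsets were already found to be frequent.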

Limitations

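  • Runtime and memory use grow quickly with the number of distinct items and with lower support thresholds (the number of candidate itemsets is exponential in the worst case)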
  • Requires careful threshold tuning
  • May generate many rules requiring manual interpretation
  • Assumes all items are equally important

Future Improvements

  • Add visualization of association rules
  • Implement FP-Growth algorithm for better performance
  • Add interactive parameter tuning
  • Create rule filtering and ranking system
  • Add statistical significance testing

⭐ Star this repository if you found it helpful for learning about Association Rule Learning!
