Implementation of Association Rule Learning for Market Basket Analysis using the Apriori algorithm.
Association Rule Learning is a machine learning technique used to discover interesting relationships between variables in large databases. This project applies the Apriori algorithm to identify patterns in customer purchase behavior.
Association rules help answer questions like:
- "If a customer buys X, what else are they likely to buy?"
- "Which products are frequently purchased together?"
- "What combinations of products appear in transactions?"
Common applications include:
- Market Basket Analysis - Understanding product associations in retail
- Recommendation Systems - Suggesting related products
- Inventory Management - Optimizing product placement and stock
The Apriori algorithm identifies frequent itemsets and generates association rules based on:
- Support - How often items appear together
- Confidence - How often the rule is correct
- Lift - How much more likely items are purchased together vs independently
```
├── apriori.py                       # Main Apriori implementation
├── apyori.py                        # Apriori algorithm library
├── Market_Basket_Optimisation.csv   # Sample transaction dataset
└── README.md                        # This file
```
Market_Basket_Optimisation.csv contains transaction data where each row represents a customer's shopping basket with items purchased together.
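Loading basket data like this takes one small trick, because rows have different lengths. A minimal sketch (the inline sample stands in for the real CSV, and assumes the same layout: one basket per row, no header, items in separate columns):

```python
import io

import pandas as pd

# Simulated rows of the basket file; apriori.py would read
# "Market_Basket_Optimisation.csv" here instead.
sample_csv = "milk,bread,butter,jam\nbread,butter\nmilk,bread,eggs\n"

# With header=None, pandas pads shorter rows with NaN so every row
# has the same number of columns.
df = pd.read_csv(io.StringIO(sample_csv), header=None)

# Convert each row into a clean list of items, dropping the NaN padding.
transactions = [
    [str(item) for item in row if pd.notna(item)]
    for row in df.values
]

print(transactions)
```

Dropping the NaN padding per row is what turns the rectangular DataFrame back into ragged baskets that an Apriori implementation can consume.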
- Python 3.x
- pandas
- numpy

```bash
# Clone the repository
git clone https://github.com/morizin/Associate-Rule-Learning.git
cd Associate-Rule-Learning

# Install required packages
pip install pandas numpy
```

```bash
# Run the Apriori algorithm
python apriori.py
```

The script will:
- Load transaction data from the CSV file
- Apply the Apriori algorithm
- Generate association rules
- Display frequent itemsets and rules with support, confidence, and lift metrics
Support indicates how frequently an itemset appears in the dataset:

```
Support(A) = (Transactions containing A) / (Total Transactions)
```

Confidence measures how often items in Y appear in transactions containing X:

```
Confidence(X→Y) = Support(X,Y) / Support(X)
```

Lift shows how much more likely Y is purchased when X is purchased:

```
Lift(X→Y) = Support(X,Y) / (Support(X) × Support(Y))
```
- Lift > 1: Items are positively correlated
- Lift = 1: Items are independent
- Lift < 1: Items are negatively correlated
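All three metrics can be computed directly from a list of transactions. A minimal plain-Python sketch (the toy baskets are illustrative, not the repo's dataset):

```python
def support(itemset, transactions):
    """Fraction of transactions containing every item in `itemset`."""
    hits = sum(1 for t in transactions if set(itemset) <= set(t))
    return hits / len(transactions)

def confidence(x, y, transactions):
    """Support(X ∪ Y) / Support(X)."""
    return support(set(x) | set(y), transactions) / support(x, transactions)

def lift(x, y, transactions):
    """Support(X ∪ Y) / (Support(X) × Support(Y))."""
    return (support(set(x) | set(y), transactions)
            / (support(x, transactions) * support(y, transactions)))

# Toy basket data.
baskets = [
    ["milk", "bread"],
    ["milk", "bread", "butter"],
    ["bread", "butter"],
    ["milk", "eggs"],
]

print(support({"milk", "bread"}, baskets))        # 0.5
print(confidence({"milk"}, {"bread"}, baskets))   # ≈ 0.667
print(lift({"milk"}, {"bread"}, baskets))         # ≈ 0.889
```

Note the lift below 1 here: milk and bread each appear in 75% of these toy baskets but co-occur in only 50%, so they are slightly negatively correlated in this sample.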
```
Rule: {milk} → {bread}
Support: 0.25      (appears in 25% of transactions)
Confidence: 0.80   (80% of milk buyers also buy bread)
Lift: 2.5          (2.5x more likely to buy bread when buying milk)
```
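These example numbers are mutually consistent under the formulas above; a quick arithmetic check:

```python
# Stated values for the rule {milk} -> {bread}.
support_xy = 0.25   # Support({milk, bread})
conf = 0.80         # Confidence(milk -> bread)
lft = 2.5           # Lift(milk -> bread)

# Rearranging the definitions:
# Confidence = Support(X,Y) / Support(X)  =>  Support(milk) = 0.25 / 0.80
support_milk = support_xy / conf            # 0.3125
# Lift = Confidence / Support(Y)          =>  Support(bread) = 0.80 / 2.5
support_bread = conf / lft                  # 0.32

# Plugging back into the base lift formula recovers the stated 2.5.
assert abs(support_xy / (support_milk * support_bread) - lft) < 1e-9
```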
This implementation can be used for:
- Retail product placement optimization
- Cross-selling strategies
- Bundle pricing recommendations
- Promotional planning
- Customer behavior analysis
Modify parameters in `apriori.py`:

```python
min_support = 0.003      # Minimum support threshold
min_confidence = 0.2     # Minimum confidence threshold
min_lift = 3             # Minimum lift threshold
```

- Data Preprocessing - Convert transactions into proper format
- Frequent Itemset Generation - Find all itemsets meeting minimum support
- Rule Generation - Create rules from frequent itemsets
- Rule Filtering - Apply confidence and lift thresholds
- Results - Display and analyze discovered patterns
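The frequent itemset generation step above can be sketched as a simplified level-wise search (a teaching sketch in plain Python, not the repo's `apriori.py`):

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori search: count k-itemsets, keep those meeting
    min_support, and build (k+1)-candidates only from the survivors."""
    n = len(transactions)
    tx = [set(t) for t in transactions]
    frequent = {}
    # Level 1: every distinct item is a candidate.
    current = [frozenset([i]) for i in {i for t in tx for i in t}]
    k = 1
    while current:
        # Count support for each candidate at this level.
        counts = {c: sum(1 for t in tx if c <= t) for c in current}
        survivors = {c: cnt / n for c, cnt in counts.items()
                     if cnt / n >= min_support}
        frequent.update(survivors)
        # Join step: union pairs of survivors into (k+1)-item candidates.
        keys = list(survivors)
        current = list({a | b for a, b in combinations(keys, 2)
                        if len(a | b) == k + 1})
        k += 1
    return frequent

# Toy data to exercise the search.
baskets = [["milk", "bread"], ["milk", "bread", "butter"],
           ["bread", "butter"], ["milk", "eggs"]]
freq = frequent_itemsets(baskets, min_support=0.5)
for itemset, s in sorted(freq.items(), key=lambda kv: -kv[1]):
    print(set(itemset), s)
```

The full algorithm also prunes candidates that have an infrequent subset before counting; this sketch relies only on the join step, which is enough to show the level-wise structure.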
- Performance decreases with large datasets (exponential complexity)
- Requires careful threshold tuning
- May generate many rules requiring manual interpretation
- Assumes all items are equally important
- Add visualization of association rules
- Implement FP-Growth algorithm for better performance
- Add interactive parameter tuning
- Create rule filtering and ranking system
- Add statistical significance testing
- Apriori Algorithm - Wikipedia
- Agrawal, R., & Srikant, R. (1994). Fast algorithms for mining association rules.
⭐ Star this repository if you found it helpful for learning about Association Rule Learning!