Clustering


View the Project on GitHub clembrain/ML-Clustering

ML-Clustering

🧠 Unsupervised Learning: Customer Segmentation with Clustering

Date: April 24, 2024
Author: Clement Airohuodion


🔍 View Full Project Files

This project demonstrates how clustering, an unsupervised learning technique, can be used to identify customer segments in online retail, specifically in the footwear industry. The goal is to support data-driven decisions in marketing, credit strategy, and customer retention.

Using a combination of K-Means and Hierarchical Clustering, this analysis reveals meaningful insights into customer behaviors, spending patterns, and demographic trends.


📊 Dataset


Shop data
Figure 1: The code reads the dataset and displays its first 5 rows using `.head()`.
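The loading step described in Figure 1 can be sketched roughly as follows. The file name and the values are not given in this write-up, so the snippet reads a small made-up CSV from memory; only the column names are taken from the figures on this page, and their exact spelling in the real file is an assumption.

```python
import io

import pandas as pd

# Made-up stand-in for the shop data; the real file and values are not shown here.
csv_text = """Balance,Purchases,Credit_limit,Payments,Minimum_payments
100,50,1000,200,20
3000,0,7000,4000,1000
2500,800,7500,600,600
1700,1500,7500,0,
800,20,1200,700,250
1800,1300,1800,1400,2400
"""

# With a real file this would be pd.read_csv("path/to/your_file.csv").
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())        # first 5 rows, as in Figure 1
print(df.shape)         # (number of rows, number of columns)
print(df.isna().sum())  # count of missing values per column
```

Checking `isna().sum()` at this stage is a convenient way to see which columns will need imputation later.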


🧹 Preprocessing

  1. Missing Values: Handled using mean imputation
  2. Outliers: Removed using IQR method
  3. Standardization: Applied StandardScaler for uniform scale
  4. Data Visualizations:
    • Pairplots & histograms to explore feature distribution
    • Interactive scatter plots & heatmaps using Plotly
    • Boxplots & KDEs to understand payment and balance behaviors
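The first three preprocessing steps above can be sketched as below. This is a minimal illustration on synthetic data, not the project's actual notebook code: the 1.5 × IQR fence is the conventional choice and is assumed here, and the injected missing values and outlier are purely illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption: the real dataset is not reproduced here).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Balance": rng.normal(1500, 400, 200),
    "Purchases": rng.normal(900, 250, 200),
})
df.loc[[3, 50], "Balance"] = np.nan   # inject two missing values
df.loc[10, "Purchases"] = 50_000      # inject one extreme outlier

# 1. Missing values: replace NaNs with the column mean.
df_imputed = df.fillna(df.mean())

# 2. Outliers: keep rows where every feature lies within 1.5 * IQR of the quartiles.
q1 = df_imputed.quantile(0.25)
q3 = df_imputed.quantile(0.75)
iqr = q3 - q1
inside = ((df_imputed >= q1 - 1.5 * iqr) & (df_imputed <= q3 + 1.5 * iqr)).all(axis=1)
df_clean = df_imputed[inside]

# 3. Standardization: zero mean and unit variance per feature.
X_scaled = StandardScaler().fit_transform(df_clean)

print(len(df), "rows before,", len(df_clean), "rows after outlier removal")
```

Scaling after outlier removal keeps the extreme values from distorting the mean and standard deviation that `StandardScaler` estimates.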

pairplot
Figure 2: The code above creates a pairplot to visualize relationships among key features such as Balance, Purchases, Credit_limit, Payments, and Minimum_payments.


corrplot
Figure 3: Correlation matrix for the raw data, where the correlation between Balance and Cash deposit is 0.33.


corrplot cleaned
Figure 4: Correlation matrix for the cleaned data, where the correlation between Balance and Cash deposit is 0.50.
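One plausible reason the coefficient differs between the raw and cleaned matrices is that a few extreme values can mask a linear relationship. The sketch below shows that effect on synthetic data; the column name `Cash_deposit`, the values, and the simple threshold filter are all assumptions for illustration and do not reproduce the 0.33 and 0.50 figures.

```python
import numpy as np
import pandas as pd

# Synthetic data with a moderate linear relationship (illustrative only).
rng = np.random.default_rng(42)
balance = rng.normal(1500, 400, 300)
cash = 0.5 * balance + rng.normal(0, 300, 300)
raw = pd.DataFrame({"Balance": balance, "Cash_deposit": cash})

# Inject a handful of extreme deposits into the first five rows.
raw.loc[:4, "Cash_deposit"] = 20_000

# Pearson correlation matrix before cleaning.
corr_raw = raw.corr()

# Crude cleaning step for the demo: drop the extreme rows.
cleaned = raw[raw["Cash_deposit"] < 10_000]
corr_clean = cleaned.corr()

print("raw:    ", round(corr_raw.loc["Balance", "Cash_deposit"], 2))
print("cleaned:", round(corr_clean.loc["Balance", "Cash_deposit"], 2))
```

The resulting matrices can be passed to a heatmap function from a plotting library to get figures similar to the ones above.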


🧠 Algorithms Applied

✅ K-Means Clustering


Kmeans_cluster
Figure 5: Visualisation of the clusters generated by K-Means, each shown in its own colour, with the centroids highlighted.
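A minimal K-Means sketch with scikit-learn is shown below. The number of clusters used in the project is not stated on this page, so `n_clusters=3` and the synthetic blob data are assumptions chosen only to make the example self-contained.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic 2D data with three groups (assumption: stands in for the scaled features).
X, _ = make_blobs(n_samples=300, centers=3, n_features=2, random_state=0)
X_scaled = StandardScaler().fit_transform(X)

# Fit K-Means; n_init=10 restarts the algorithm from 10 random initialisations.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_scaled)   # cluster index for each sample
centroids = kmeans.cluster_centers_     # one row per cluster centre

print(centroids)
print("inertia:", kmeans.inertia_)      # within-cluster sum of squared distances
```

A plot like Figure 5 can then be drawn by colouring a scatter plot of `X_scaled` by `labels` and overlaying `centroids` with a distinct marker.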


✅ Hierarchical Clustering


hierarchical_clustering
Figure 6: The dendrogram above shows the result of hierarchical merging using Ward's method.


hierarchical_clustering
Figure 7: Visualisation of the hierarchical clustering result in a 2D scatter plot.
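Hierarchical clustering with Ward's method can be sketched with SciPy as follows. The data is synthetic and the cut into three clusters is an assumption, since the number of clusters chosen in the project is not stated here.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage

# Three synthetic 2D groups of 20 points each (illustrative stand-in data).
rng = np.random.default_rng(1)
centres = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + rng.normal(0, 1, size=(20, 2)) for c in centres])

# Ward linkage merges, at each step, the pair of clusters that gives the
# smallest increase in total within-cluster variance.
Z = linkage(X, method="ward")

# Cut the tree so that at most 3 flat clusters remain (labels start at 1).
labels = fcluster(Z, t=3, criterion="maxclust")

# Compute the dendrogram layout without drawing it; drop no_plot to render
# it with matplotlib.
tree = dendrogram(Z, no_plot=True)

print(Z.shape)            # (n_samples - 1, 4): one row per merge
print(np.unique(labels))  # cluster ids
```

Long vertical gaps in the dendrogram indicate merges between well-separated groups, which is a common visual cue for choosing where to cut the tree.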


📈 Results & Insights


📌 Conclusion

Clustering enabled strategic segmentation of customers, unlocking critical insights to:

By integrating both K-Means and Hierarchical Clustering, this project provides a comprehensive view of customer behavior in online retail, aligning with real business challenges.


📁 Click to See Supporting Code & Notebooks

📧 Contact: C.O.Airohuodion@edu.salford.ac.uk
🔗 LinkedIn: linkedin.com/in/yourprofile
🔗 GitHub: github.com/Clemobrain