
**Introduction**

Segmentation is the process of grouping elements that share the same characteristics into a single segment. The resulting segments split the population into groups that share the same features and can therefore be handled the same way.

Marketers use this technique to better address and target their customers via personalization: when launching ads and communication campaigns, when designing a new offer or promotion, and in merchandising. Traditionally, the split is based on four main criteria:

· **Demographic**: e.g., families with more than 3 kids

· **Behavioral**: e.g., customers waiting for promotions

· **Psychographic**: e.g., customers who are keen on cafés

· **Geographic**: e.g., customers within 4 km of our shop

But this technique fails to detect hidden segments that mix multiple features, especially with the growth of the internet and big data. Here machine learning brings its clustering solution: a set of unsupervised classification methods that recognize patterns from the data distribution alone, based on a **similarity distance** measured between the population elements, following two rules:

· Homogeneity within clusters: data that belong to the same cluster should be as similar as possible

· Heterogeneity between clusters: data that belong to different clusters should be as different as possible.

**Why do we do segmentation?**

Because we can’t treat every customer the same way, with the same content, the same channel, and the same importance. Otherwise, they will find another option that understands them better.

Customers who use our platform have different needs and their own distinct profiles. We should adapt our actions accordingly.

We can build many different segmentations depending on what we are trying to achieve. If we want to increase the retention rate, we can segment based on churn probability and take action. But there are very common and useful segmentation methods as well. Now we are going to apply one of them to our business: **RFM**.

**RFM stands for Recency — Frequency — Monetary Value. Theoretically, we will have segments like below:**

· **Low Value**: Customers who are less active than others, not very frequent buyers/visitors, and who generate very low, zero, or perhaps even negative revenue.

· **Mid Value**: In the middle of everything. Often using our platform (but not as much as our High Values), fairly frequent, and generating moderate revenue.

· **High Value**: The group we don’t want to lose. High Revenue, Frequency, and low Inactivity.

As for the methodology, we need to calculate *Recency, Frequency, and Monetary Value* (we will call it Revenue from now on) and apply *unsupervised machine learning* to identify different groups (clusters) for each. Let’s jump into coding and see how to do **RFM clustering**.

To calculate recency, we need to find the most recent purchase date of each customer and see how many days they have been inactive. Once we have the number of inactive days for each customer, we will apply *K-means* clustering to assign each customer a *recency score*.
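A minimal sketch of this step, assuming a transactions table with `CustomerID` and `InvoiceDate` columns (the column names and sample values are assumptions, not from the article):

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical transaction data (column names are assumptions).
tx = pd.DataFrame({
    "CustomerID": [1, 1, 2, 3, 3, 3],
    "InvoiceDate": pd.to_datetime([
        "2021-01-05", "2021-03-01", "2021-01-20",
        "2021-02-10", "2021-02-28", "2021-03-05",
    ]),
})

# Most recent purchase per customer, then days of inactivity
# relative to the latest date seen in the dataset.
last_purchase = tx.groupby("CustomerID")["InvoiceDate"].max().reset_index()
snapshot = tx["InvoiceDate"].max()
last_purchase["Recency"] = (snapshot - last_purchase["InvoiceDate"]).dt.days

# Cluster the inactivity days into recency groups with K-means.
km = KMeans(n_clusters=2, n_init=10, random_state=42)
last_purchase["RecencyCluster"] = km.fit_predict(last_purchase[["Recency"]])
print(last_purchase[["CustomerID", "Recency", "RecencyCluster"]])
```

The snapshot date here is simply the newest date in the data; in production you would more likely use the extraction date.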

To create frequency clusters, we need to find the total number of orders for each customer.
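The frequency count can be sketched in one `groupby`, again assuming hypothetical `CustomerID` and `InvoiceNo` columns:

```python
import pandas as pd

# Hypothetical orders table (column names are assumptions).
tx = pd.DataFrame({
    "CustomerID": [1, 1, 2, 3, 3, 3],
    "InvoiceNo": ["A1", "A2", "B1", "C1", "C2", "C3"],
})

# Total number of distinct orders per customer = the Frequency feature.
frequency = (
    tx.groupby("CustomerID")["InvoiceNo"]
      .nunique()
      .reset_index(name="Frequency")
)
print(frequency)
```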

Let’s see what our customer database looks like when we cluster customers based on revenue. We will calculate the revenue for each customer, plot a histogram, and apply the same clustering method.
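As a sketch, assuming line items with `Quantity` and `UnitPrice` columns (these names and values are assumptions, not from the article), per-customer revenue can be computed and clustered the same way; the histogram step is omitted here for brevity:

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical line items: Quantity * UnitPrice gives revenue per row.
tx = pd.DataFrame({
    "CustomerID": [1, 1, 2, 3, 3],
    "Quantity":   [2, 1, 5, 1, 4],
    "UnitPrice":  [10.0, 20.0, 3.0, 50.0, 2.5],
})
tx["Revenue"] = tx["Quantity"] * tx["UnitPrice"]

# Total revenue per customer, then the same K-means step as before.
revenue = tx.groupby("CustomerID")["Revenue"].sum().reset_index()
km = KMeans(n_clusters=2, n_init=10, random_state=42)
revenue["RevenueCluster"] = km.fit_predict(revenue[["Revenue"]])
print(revenue)
```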

We rename the columns: Invoice Date becomes Recency, Invoice No becomes Frequency, and Total Amount becomes Monetary.

The distribution and descriptive statistics of the Recency, Frequency, and Monetary columns are:

Now we will assign each customer a score based on the values of the Recency, Frequency, and Monetary columns.

If a customer's Recency value is low, we assign a score of 1, because it means the customer purchased recently.

In the case of Frequency and Monetary, we assign a score of 1 to customers whose purchase frequency and amount spent are highest.

We will add the Recency, Frequency, and Monetary scores together and call the sum RFMScore; based on the RFMScore we will categorize customers into ‘Platinum’, ‘Gold’, ‘Silver’, and ‘Bronze’.
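A minimal sketch of this scoring-and-labelling step. The per-customer R/F/M scores and the band edges for the four tiers below are hypothetical, chosen only to illustrate the mechanism:

```python
import pandas as pd

# Hypothetical per-customer R/F/M scores (1 = best), as described above.
rfm = pd.DataFrame({
    "CustomerID": [1, 2, 3, 4],
    "R": [1, 3, 2, 4],
    "F": [1, 4, 2, 3],
    "M": [1, 4, 3, 2],
})

# Sum the three scores; a lower RFMScore means a more valuable customer.
rfm["RFMScore"] = rfm[["R", "F", "M"]].sum(axis=1)

# Map score bands to tiers (these band edges are an assumption).
def tier(score: int) -> str:
    if score <= 4:
        return "Platinum"
    if score <= 6:
        return "Gold"
    if score <= 9:
        return "Silver"
    return "Bronze"

rfm["Segment"] = rfm["RFMScore"].apply(tier)
print(rfm[["CustomerID", "RFMScore", "Segment"]])
```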

Here I try to categorize customers into different clusters based on scatter plots between pairs of features:

1. Recency vs Monetary... wait, the three pairwise combinations:

1. Recency vs Frequency

2. Frequency vs Monetary

3. Recency vs Monetary

Conclusion: from the graphs above, we are not able to visually separate our customers into distinct clusters.

K-means clustering is an unsupervised classification method that recognizes patterns from the data distribution alone, based on a **similarity distance** measured between the population elements, following the same two rules stated above: homogeneity within clusters and heterogeneity between clusters.

**Zoom on partitional clustering** (e.g., **K-means**):

K-means aims to partition N observations into K clusters **Sj** (K must be known in advance). Each observation belongs to the cluster with the nearest mean/centroid, and the algorithm minimizes the sum of squared distances (e.g., Euclidean) between the data points and their cluster centroid **µj**.
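In symbols, with µj the centroid (mean) of cluster Sj, the objective just described is:

```latex
\min_{S_1,\dots,S_K} \sum_{j=1}^{K} \sum_{x_i \in S_j} \lVert x_i - \mu_j \rVert^2,
\qquad \mu_j = \frac{1}{|S_j|} \sum_{x_i \in S_j} x_i
```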

**How to choose the number of clusters:**

Determining the optimal number of clusters for the input data is hard. Usually we want the clustering to be as compact as possible, and we rely on statistical methods such as:

· **Elbow method**: uses the total within-cluster sum of squares (WSS) as a function of the number of clusters, which measures how compact the clusters are. The K-means algorithm is run for a range of values of k (e.g., k ϵ [1…10]), and the WSS is calculated for each k and plotted (see figure below). The number of clusters is chosen such that adding another cluster no longer meaningfully improves the WSS.
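A small illustration of the elbow method on synthetic data (not the article's dataset). WSS is what scikit-learn exposes as `inertia_`; with three well-separated groups, the curve drops sharply up to k = 3 and then flattens:

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic 1-D data with three obvious groups (illustration only).
rng = np.random.default_rng(0)
X = np.concatenate([
    rng.normal(0, 0.3, 50),
    rng.normal(5, 0.3, 50),
    rng.normal(10, 0.3, 50),
]).reshape(-1, 1)

# WSS (sklearn's inertia_) for k = 1..9; the "elbow" is where
# adding another cluster stops reducing WSS much.
wss = []
for k in range(1, 10):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    wss.append(km.inertia_)

print([round(w, 1) for w in wss])
```

Plotting `wss` against `range(1, 10)` with matplotlib gives the elbow curve described in the text.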

· **Average silhouette method**: measures the quality of a clustering by showing how well each object lies within its cluster. A high average silhouette width indicates a good clustering.

· **Gap statistic method**: compares the total within-cluster variation for different values of k with its expected value under a null reference distribution of the data. The optimal number of clusters is the value that maximizes the gap statistic.

From Fig. 1.6 we see that our data is left-skewed, so we should first transform it toward a normal distribution and then standardize it, so that our machine-learning model can categorize the data accurately.
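One common way to do both steps, sketched with made-up skewed values (the figures below are an assumption, not the article's data): a `log1p` transform to pull in the long tail, followed by `StandardScaler`:

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical skewed RFM features (values are an illustration).
rfm = pd.DataFrame({
    "Recency":   [1, 2, 3, 50, 200],
    "Frequency": [100, 80, 60, 5, 1],
    "Monetary":  [5000.0, 3000.0, 1000.0, 50.0, 10.0],
})

# log1p makes each skewed distribution more symmetric...
rfm_log = np.log1p(rfm)

# ...then standardization puts every feature on the same scale
# (mean 0, standard deviation 1) so no feature dominates the distance.
scaled = StandardScaler().fit_transform(rfm_log)
print(scaled.mean(axis=0).round(6), scaled.std(axis=0).round(6))
```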

Now we apply the K-means machine-learning model to the data.

From the elbow method, we can see that 3 clusters gives the best trade-off for segmenting our customers: beyond that, adding clusters barely reduces the WSS.

Now we will segment our customers into the 3 clusters suggested by the elbow method.
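A sketch of this final step, fitting K-means with k = 3 on a standardized RFM matrix (synthetic here, with three planted groups) and reading off the segment sizes:

```python
import numpy as np
from sklearn.cluster import KMeans

# Standardized RFM matrix (rows = customers); synthetic for illustration,
# with three planted groups of 40 customers each.
rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal([-1.0, -1.0, -1.0], 0.2, (40, 3)),  # low value
    rng.normal([0.0, 0.0, 0.0], 0.2, (40, 3)),     # mid value
    rng.normal([1.5, 1.5, 1.5], 0.2, (40, 3)),     # high value
])

# k = 3 as suggested by the elbow method.
km = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = km.fit_predict(X)
print(np.bincount(labels))  # size of each segment
```

Mapping each cluster to Low/Mid/High Value is then a matter of inspecting the cluster means (e.g., `km.cluster_centers_`).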

We can start taking action with this segmentation. The main strategies are quite clear:
