Goal
We want to create events from aggregations so that the PQL and MQA models can use them. These events follow a naming format such as: email_at_least_Y_active_days_last_30_days / account_at_least_10_active_users_last_60_days / email_log_in_at_least_5_times_last_30_days.
The goal is to identify a few key turning points in the customer journey. A frequency analysis finds the frequencies of user and account activity that are most associated with conversion.
Frequency analysis
Very often, the more active a user or an account is, the higher the correlation with conversion. Although this is true, it is not very actionable in a scoring model, because only a tiny part of the population can be considered very active. How, then, do we differentiate the rest, which is the majority of the population?
What we actually want is to split the population using the metrics that best separate the people who are likely to convert from the ones who are not.
The graphs below help you identify the optimal number(s) of occurrences to monitor for this separation, based on the aggregation you’ve selected. The recommendation is to create one aggregated event for each optimal number of occurrences, that is, for those with the highest correlation strength factor.
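The selection described above can be sketched in a few lines of Python. This is a minimal illustration, not the product's implementation: it assumes the correlation strength is the standard phi coefficient of a 2x2 table, and the names `phi_coefficient`, `phi_by_threshold`, `best_threshold` and the sample data are hypothetical.

```python
from math import sqrt


def phi_coefficient(n11, n10, n01, n00):
    """Phi coefficient of a 2x2 table.

    n11: did X and converted        n10: did X, did not convert
    n01: did not do X, converted    n00: did not do X, did not convert
    """
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        # A row or column is empty, so no association can be measured.
        return 0.0
    return (n11 * n00 - n10 * n01) / denom


def phi_by_threshold(users, thresholds):
    """Score each candidate "at least k occurrences" event.

    users: iterable of (occurrences, converted) pairs, one per user or account.
    Returns a dict mapping each threshold k to its phi score.
    """
    users = list(users)
    scores = {}
    for k in thresholds:
        n11 = sum(1 for occ, conv in users if occ >= k and conv)
        n10 = sum(1 for occ, conv in users if occ >= k and not conv)
        n01 = sum(1 for occ, conv in users if occ < k and conv)
        n00 = sum(1 for occ, conv in users if occ < k and not conv)
        scores[k] = phi_coefficient(n11, n10, n01, n00)
    return scores


def best_threshold(scores):
    """Return the threshold with the highest phi score."""
    return max(scores, key=scores.get)


# Hypothetical sample: (number of log-ins in the last 30 days, converted?)
sample_users = [(0, False), (1, False), (2, False), (3, True),
                (4, False), (5, True), (6, True), (8, True)]
sample_scores = phi_by_threshold(sample_users, thresholds=range(1, 9))
print(best_threshold(sample_scores))
```

The threshold returned is the candidate for an aggregated event such as email_log_in_at_least_5_times_last_30_days. Several thresholds can score similarly, so inspecting the whole `sample_scores` dictionary is usually more informative than keeping only the maximum.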
More statistical explanations
As explained above, the conversion rate is not the best metric for this frequency analysis: it highlights the correlation with conversion without taking into account how much of the population the insight covers. The metric used in the graph below is the phi score, an association score. You can think of it as a correlation strength.
It ranges from −1 to +1, with negative numbers representing negative relationships, zero representing no relationship, and positive numbers representing positive relationships.
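For reference, assuming the phi score here is the standard phi coefficient of a 2x2 contingency table, it can be written as follows. The notation is introduced for illustration only: n_{11} counts people who did X and converted, n_{10} those who did X and did not convert, n_{01} those who did not do X and converted, and n_{00} those who did neither.

```latex
\phi = \frac{n_{11}\, n_{00} - n_{10}\, n_{01}}
            {\sqrt{(n_{11} + n_{10})\,(n_{01} + n_{00})\,(n_{11} + n_{01})\,(n_{10} + n_{00})}}
```

The numerator is positive when doing X and converting tend to occur together, and the denominator scales the result by the row and column totals, which is why an insight that covers only a small slice of the population cannot reach a high score on conversion rate alone.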
In the first case above, most people who did X converted. But not doing X does not imply you will not convert! The phi score will be close to 0.
In the second case above, most people who converted did X. But it does not mean that doing X implies you will convert. The phi score will be close to 0.
In the third case above, there is a strong correlation between doing X and converting. This is what we are looking for! The phi score will be close to 1.
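The three cases can be reproduced with made-up numbers. The sketch below assumes the phi score is the standard phi coefficient of a 2x2 table; the counts are hypothetical and chosen only to mimic each situation in a population of 1,000 people.

```python
from math import sqrt


def phi_coefficient(n11, n10, n01, n00):
    """Phi coefficient of a 2x2 table.

    n11: did X and converted        n10: did X, did not convert
    n01: did not do X, converted    n00: did not do X, did not convert
    """
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return (n11 * n00 - n10 * n01) / denom if denom else 0.0


# Case 1 (hypothetical): only 20 people did X and 18 of them converted,
# but 480 of the 980 people who did not do X converted as well.
case_1 = phi_coefficient(n11=18, n10=2, n01=480, n00=500)

# Case 2 (hypothetical): 98 of the 100 converters did X,
# but so did 852 of the 900 people who did not convert.
case_2 = phi_coefficient(n11=98, n10=852, n01=2, n00=48)

# Case 3 (hypothetical): 90 of the 100 people who did X converted,
# and only 10 of the 900 people who did not do X converted.
case_3 = phi_coefficient(n11=90, n10=10, n01=10, n00=890)

for name, value in [("case 1", case_1), ("case 2", case_2), ("case 3", case_3)]:
    print(name, round(value, 2))
```

With these counts the first two scores come out at roughly 0.11 and 0.05, while the third is roughly 0.89, which matches the pattern described above: a high conversion rate or a high coverage on its own is not enough, and the score is high only when both hold.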