Tip: Most professionals use tools like RJMetrics to perform cohort analysis automatically.
Step 1: Pull the Raw Data
Typically, the data required for cohort analysis lives in a database and needs to be exported into spreadsheet software. In this example, we use MySQL and Microsoft Excel.
If you're studying customer purchase behavior, you want to end up with a table of data that includes one record per customer purchase. Each record contains the customer's ID (typically either a unique number or an e-mail address), the date and time of the purchase, the amount of the purchase, and the customer's "cohort date" (this is typically the date of the customer's first purchase). In a typical "orders" database table, the MySQL query to pull such information might look something like this:
SELECT orders.customerid,
       orders.transactiondate,
       orders.transactionamount,
       cohorts.cohortdate
FROM orders
JOIN (SELECT customerid,
             MIN(transactiondate) AS cohortdate
      FROM orders
      GROUP BY customerid) AS cohorts
  ON orders.customerid = cohorts.customerid;
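For readers who want to check the shape of the result outside the database, here is a minimal Python sketch of what the query computes: each order row with the customer's earliest purchase date attached. The sample rows and the function name `attach_cohort_dates` are hypothetical and only illustrate the idea.

```python
from datetime import datetime

# Hypothetical sample rows: (customerid, transactiondate, transactionamount)
orders = [
    ("a@example.com", datetime(2012, 1, 10, 9, 30), 40.0),
    ("a@example.com", datetime(2012, 3, 15, 14, 0), 25.0),
    ("b@example.com", datetime(2012, 2, 2, 11, 15), 60.0),
]


def attach_cohort_dates(rows):
    """Return each order with its customer's first purchase date appended."""
    # Equivalent of the subquery: MIN(transactiondate) grouped by customer.
    first_purchase = {}
    for customer_id, purchased_at, _amount in rows:
        current = first_purchase.get(customer_id)
        if current is None or purchased_at < current:
            first_purchase[customer_id] = purchased_at
    # Equivalent of the join: one output record per order.
    return [
        (customer_id, purchased_at, amount, first_purchase[customer_id])
        for customer_id, purchased_at, amount in rows
    ]


records = attach_cohort_dates(orders)
```

Each element of `records` has the same four fields as a row returned by the query: customer ID, transaction date, transaction amount, and cohort date.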
Ideally, however, you would want to include additional attributes such as the customer's referral source, the first product they purchased, geographic and demographic information, and more. The more information about the customer you have, the more ways you'll be able to segment your cohorts. However, each of these additional attributes may require additional database joins. Tools like RJMetrics make these attributes available to you automatically.
Step 2: Create Cohort Identifiers
Open the data you've pulled into Excel. Since we pulled the "cohort date" attribute in the example above, we'll conduct the popular cohort analysis in which we compare groups of customers based on when they made their first purchase. Assuming we want to group our cohorts based on the month in which they made their first purchase, we'll need to translate each "cohort date" value into a "bucket" that represents the year and month of their first purchase. Assuming cohort date is in "Column D," the following Excel formula does the trick:
=YEAR(D2) & "-" & MONTH(D2)
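The same bucketing can be sketched in Python for anyone validating the spreadsheet output. The function names are hypothetical. The first function mirrors the formula above; the second is a variation not in the original formula, which pads the month to two digits so the labels sort chronologically when treated as text.

```python
from datetime import date


def cohort_bucket(cohort_date):
    """Year, hyphen, unpadded month, as the Excel formula produces."""
    return f"{cohort_date.year}-{cohort_date.month}"


def cohort_bucket_padded(cohort_date):
    """Variation (an assumption, not from the formula): two-digit month."""
    return f"{cohort_date.year}-{cohort_date.month:02d}"


label = cohort_bucket(date(2012, 1, 10))
padded_label = cohort_bucket_padded(date(2012, 1, 10))
```

With unpadded months, a text sort places "2012-10" before "2012-2", which is why the padded variant may be easier to work with in a pivot table.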
Step 3: Calculate Lifecycle Stages
Once we know the cohort that each customer belongs to, we also need to determine the "lifecycle stage" at which each event happened for that cohort member. For example, if a customer made their first purchase on January 10th, 2012, and their second purchase on March 15th, 2012, they would be in the "January 2012" cohort, their first purchase would be in the "Month 1" lifecycle stage, and their second purchase would be in their "Month 3" lifecycle stage, because it happened in their third month after becoming a customer. To calculate lifecycle stage, we'll need to determine the amount of time between the customer's first purchase and the purchase in question. Assuming transaction date is in "Column C" and cohort date is in "Column D," a function like the one below will do the trick:
=ROUND((C2-D2)/30,0)+1
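As a cross-check, the calculation above can be sketched in Python. The function name and the `period_days` parameter are hypothetical. One detail worth noting: Excel's ROUND rounds halves away from zero, while Python's built-in `round` rounds halves to the nearest even number, so this sketch uses `floor(x + 0.5)` instead, which matches Excel for non-negative values.

```python
import math
from datetime import date


def lifecycle_stage(transaction_date, cohort_date, period_days=30):
    """Elapsed 30-day periods, rounded to the nearest whole period, plus 1."""
    # Fractional days elapsed, like subtracting two Excel date values.
    elapsed_days = (transaction_date - cohort_date).total_seconds() / 86400
    # floor(x + 0.5) rounds halves up, matching Excel's ROUND for x >= 0.
    return math.floor(elapsed_days / period_days + 0.5) + 1


first = lifecycle_stage(date(2012, 1, 10), date(2012, 1, 10))
second = lifecycle_stage(date(2012, 3, 15), date(2012, 1, 10))
```

For the example customer, the first purchase falls in stage 1 and the March 15th purchase, 65 days later, falls in stage 3. Because the calculation rounds to the nearest period, a purchase made 15 or more days after the first one is already assigned to stage 2.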
When you're done, you should have a table in Excel that looks like the one below.
Step 4: Create a Pivot Table and Graph
Pivot tables allow you to calculate an aggregation such as a sum or average across multiple dimensions of your data. The pivot table we'd like to create here takes the SUM of transaction amount, with one row per cohort and one column per lifecycle stage. Its data can be visualized on a basic Excel line graph.
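To make the aggregation concrete, here is a minimal Python sketch of the same pivot: a sum of transaction amounts keyed by cohort and lifecycle stage. The sample rows and the name `pivot_sum` are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (cohort, lifecycle_stage, transaction_amount)
rows = [
    ("2012-1", 1, 40.0),
    ("2012-1", 3, 25.0),
    ("2012-1", 1, 10.0),
    ("2012-2", 1, 60.0),
]


def pivot_sum(records):
    """Sum transaction amounts into pivot[cohort][stage]."""
    pivot = defaultdict(lambda: defaultdict(float))
    for cohort, stage, amount in records:
        pivot[cohort][stage] += amount
    return pivot


pivot = pivot_sum(rows)

# Print one row per cohort and one column per stage, filling gaps with 0.
stages = sorted({stage for _, stage, _ in rows})
for cohort in sorted(pivot):
    print(cohort, [pivot[cohort].get(stage, 0.0) for stage in stages])
```

Each printed row corresponds to one line on the graph: a cohort's total spending at each lifecycle stage.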
There you have it: an extremely basic cohort analysis built from the ground up. There are hundreds of variations on cohort analysis that you can run based on your needs, a few of which are described below.
Bonus Step: Data Perspectives
The chart we've created is a cohort analysis, but it isn't very easy to interpret in this format. Another way to look at this chart would be to view each cohort's spending as a cumulative value over time. This will effectively build a curve that allows you to watch total customer lifetime spending grow over time per cohort.
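The cumulative view is a running total of each cohort's row. A minimal Python sketch, using hypothetical per-stage totals for a single cohort:

```python
from itertools import accumulate

# Hypothetical spending totals for one cohort at lifecycle stages 1 to 4.
stage_totals = [50.0, 0.0, 25.0, 10.0]


def cumulative(totals):
    """Running total: each entry is the sum of all stages up to that point."""
    return list(accumulate(totals))


lifetime_curve = cumulative(stage_totals)
```

The resulting list never decreases, which is what produces the growth curve described above.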
Even more helpful is to normalize this data by the size of the cohort. In order to do this, each data point for a cohort must be divided by the number of members in that cohort. That way, you can view the average value per cohort member side-by-side without a bias from the size of the cohort. To do this, you'll have to create a second pivot table to calculate cohort size and then divide one by the other.
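The normalization can be sketched in Python as two small steps: count the distinct customers in each cohort, then divide each cohort's values by that count. All names and sample values here are hypothetical.

```python
# Hypothetical (cohort, customer_id) pairs, one per purchase record.
purchases = [
    ("2012-1", "a"),
    ("2012-1", "c"),
    ("2012-1", "a"),
    ("2012-2", "b"),
]

# Hypothetical cumulative spending per cohort, by lifecycle stage.
cohort_totals = {
    "2012-1": [50.0, 50.0, 75.0],
    "2012-2": [60.0, 60.0, 60.0],
}


def cohort_sizes(records):
    """Count distinct customers per cohort (the second pivot table)."""
    members = {}
    for cohort, customer_id in records:
        members.setdefault(cohort, set()).add(customer_id)
    return {cohort: len(ids) for cohort, ids in members.items()}


def average_per_member(totals, sizes):
    """Divide every data point for a cohort by that cohort's size."""
    return {
        cohort: [value / sizes[cohort] for value in values]
        for cohort, values in totals.items()
    }


sizes = cohort_sizes(purchases)
averages = average_per_member(cohort_totals, sizes)
```

Counting distinct customers, rather than purchase records, matters here: a customer who buys twice should still count once toward the cohort's size.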
These data perspectives are another area where an automated tool like RJMetrics can be extremely valuable. Data perspectives such as "cumulative average per cohort member" can be applied with just a few clicks.