Marketing Funnel Analysis
Conversion Rates and Velocity by Cohorts
By Oz Guner in projects
April 16, 2023
Goal
The goal of this project is to study conversion rates and conversion rate velocities by cohorts. While you can use different tools to achieve the same results, I will be including the R code step by step to (hopefully) create impactful visualizations.
Setup
We have a few simple datasets in hand that we downloaded from our system of records. Let’s look at our base dataset:
#install the readr package to read our csv, which contains basic data
library(readr)
df <- read_csv("C:/Users/ozeng/Documents/R Repository/portfolio/portfolio/content/project/2023-04-16-marketing-funnel-analysis/base.csv")
#view data
df
## # A tibble: 4 × 3
## month created converted
## <chr> <dbl> <dbl>
## 1 January 2023 3235 60
## 2 February 2023 2899 57
## 3 March 2023 2888 30
## 4 April 2023 744 18
Looks like this data is aggregated by month. Let’s find the conversion rate for each month.
# create a new column for the conversion rate
df$conversion_rate <- df$converted / df$created
# print the updated data frame
df
## # A tibble: 4 × 4
## month created converted conversion_rate
## <chr> <dbl> <dbl> <dbl>
## 1 January 2023 3235 60 0.0185
## 2 February 2023 2899 57 0.0197
## 3 March 2023 2888 30 0.0104
## 4 April 2023 744 18 0.0242
Let’s create a visualization of the conversion rate by month so we have a more insightful look into our marketing funnel:
library(plotly)
# sort the data by month
df$month <- factor(df$month, levels = c("January 2023", "February 2023", "March 2023", "April 2023"))
# create the plot
plot_ly(df, x = ~month) %>%
add_trace(y = ~created, name = "Created", type = "bar", marker = list(color = "#4aa564"),
text = ~paste0(created)) %>%
add_trace(y = ~converted, name = "Converted", type = "bar", marker = list(color = "#fdb81e"),
text = ~paste0(converted)) %>%
add_trace(y = ~conversion_rate, name = "Conversion Rate", type = "scatter", mode = "lines", yaxis = "y2",
line = list(color = "#046b99"), marker = list(color = "#046b99"),
text = ~paste0(scales::percent(conversion_rate, accuracy = 0.01))) %>%
layout(title = "Conversion Rate by Month",
xaxis = list(title = "Month"),
yaxis = list(title = "Created/Converted Count", side = "right",
tickfont = list(size = 12),
tickformat = ",.0f",
showticklabels = TRUE,
showgrid = FALSE),
yaxis2 = list(title = "Conversion Rate (%)", side = "left", overlaying = "y",
tickfont = list(size = 12),
tickformat = ".2%",
tickvals = seq(0, 1, 0.1),
tickmode = "linear",
textfont = list(size = 12),
showticklabels = TRUE,
showgrid = TRUE,
zeroline = TRUE),
legend = list(x = 0.5, y = -0.2, orientation = "h",
font = list(size = 12),
xanchor = "center",
yanchor = "top"))
This chart communicates the leads created in a month and converted in the same month. February seems to have a higher conversion rate than January. April is going really well and March might need a little help.
There’s one caveat in looking at conversion rates in a simple way. The 57 units converted in February might not all be created in February. There’s another way to look at this problem: The cohort velocity method. Let’s create a spaghetti chart that showcases how fast leads from a cohort convert to a paid client:
## created_cohort leads_signups days_to_convert converted_to_paid
## 1 23-Jan 3235 53 5
## 2 23-Feb 2899 15 4
## 3 23-Jan 3235 19 4
## 4 23-Apr 744 2 4
## 5 23-Apr 744 0 4
## 6 23-Feb 2899 9 5
## 7 23-Jan 3235 1 4
## 8 23-Jan 3235 0 7
## 9 23-Feb 2899 2 5
## 10 23-Mar 2888 1 4
## 11 23-Feb 2899 0 6
## 12 23-Mar 2888 0 4
Notice how the data is structured. Each record indicates a month cohort, the total leads/signups in that month, and the total leads/signups converted at specific days each month. Whenever there’s at least one unit converted at a specific day interval, one record will be created. Now let’s calculate the conversion rate for each record:
## # A tibble: 107 × 3
## # Groups: days_to_convert [60]
## days_to_convert created_cohort conversion_rate
## <int> <chr> <dbl>
## 1 1 23-Jan 0.00124
## 2 1 23-Mar 0.00139
## 3 2 23-Apr 0.00538
## 4 2 23-Feb 0.00172
## 5 2 23-Jan 0.000927
## 6 2 23-Mar 0.00104
## 7 3 23-Apr 0.00269
## 8 3 23-Feb 0.000690
## 9 3 23-Mar 0.000693
## 10 4 23-Jan 0.000618
## # ℹ 97 more rows
In this case, a conversion rate is assigned for each record. If we are going to take a holistic view to look at the total leads converted at day X, we’d need to create a running conversion rate calculation, as follows:
Notice that using the cumsum()
method and grouping by cohort, you can create a running conversion rate in that specific cohort, which literally sums up the cumulative values.
Next, we’ll create a plotly chart:
Remember that, in the single conversion rate method, our conversion rate for January was 1.85% and February was 1.97%. However, the cohort-based conversion rate paints a clearer picture by eliminating the time bias. January actually has a higher conversion rate with 2.41% versus February’s 2.1%. Moreover, January was converting at higher levels at on average on Day 75, where February is today: 2.26% vs 2.1%.
Other ways to look at conversion velocity.
Another way to look at conversion velocity is by custom segmentation. Examples can include intent levels, demographics, and product actions. Let’s look at the same data from another perspective:
## channel leads_signups days_to_convert converted_to_paid
## 1 Channel 2 835 8 3
## 2 Channel 2 835 7 5
## 3 Channel 2 835 6 4
## 4 Channel 2 835 4 5
## 5 Channel 2 835 3 6
## 6 Channel 2 835 2 5
## 7 Channel 2 835 1 5
## 8 Channel 2 835 0 7
## 9 Channel 3 333 19 4
## 10 Channel 3 333 4 3
## 11 Channel 1 1949 53 5
## 12 Channel 1 1949 22 3
## 13 Channel 1 1949 15 3
## 14 Channel 1 1949 0 3
Similar to the other dataset, this dataset communicates the speed to convert broken down by channels instead of time. Regardless of when they were created we can look at how leads from different channels convert.
In this case, Channel 3 converts the best, followed by Channel 2 and Channel 1. Looking at this data with time controls could introduce even more insights as different channels might convert at different circumstances.
Conclusion
Conversion rates and conversion velocities are powerful tools in a marketer’s toolkit. It’s important to look at the same data from different perspectives to gain more insightful and actionable information.