Causally Motivated Attribution for Online Advertising

Siddhartha Agarwal

May 25, 2021

As mentioned in the previous blog post, algorithm-based methodologies for assigning credit to media channels when a user converts are becoming increasingly popular, replacing archaic methodologies such as first-touch and last-touch attribution. Dalessandro et al. presented a paper that goes beyond a regression framework to explain such attributions, and I’ll go over it in the next few sections.

Attribution and Causality

Dalessandro et al. propose a counterfactual analysis to produce estimates of the causal effect of advertising channels on user conversion. There are some strict assumptions that have to be met in order to obtain causality from the data, which Dalessandro et al. state as the following:

1. The ad treatment precedes the outcome (conversion of a user).
2. Any attribute that may affect both ad treatment and conversion outcome is observed and accounted for, i.e., there are no unknown variables acting as confounders.
3. Every user has a non-zero probability of receiving an ad treatment.

In real-life scenarios, conditions 2 and 3 are nearly impossible to verify in any attribution analysis. An ad campaign may be targeted at a certain demographic, violating condition 3, and confounders such as users’ biases towards certain products and services may well be unmeasurable, violating condition 2. In the interest of brevity, we will not dwell on the mathematical formulation of such an analysis, since its practicality is dubious. In the next section, I will discuss an approximate causal model that Dalessandro et al. introduce, which recasts the causal estimation problem as a channel importance problem that is better suited to real-world data.

Channel Importance Attribution

Before getting into any convoluted equations, I’ll quickly introduce important notation:

C = {C1, C2, …, Ck} is the set of media channels that have shown ads to a group of people
W is a vector of user attributes measured before exposure to any ads (for example, demographics or prior internet searches)
Y is a boolean indicating whether or not a user has converted after exposure to ads
(γ = Σ Y, n) is the dataset of n users who have seen the same ads from channels in C and have the same values W = w, producing γ = Σ Y total conversions
S is a subset of C that excludes Ck (that is, S ⊆ C \ {Ck})
ω_S,k is the probability, under some distribution Ω over possible orderings, that an ordering of C begins with the channels in S followed by Ck, i.e. with the sequence {S, Ck, …}

The expectation of channel Ck’s contribution to Y, taken over all possible subsets S of C that exclude Ck, is denoted Vk and is given by the equation below:
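The equation itself is missing from this copy of the post. Based on the notation introduced above, a reconstruction (an assumption, not a quotation from the paper) is given here, with conditioning on W = w left implicit and the expected conversions written as E[γ | ·] to match the worked example later in the post:

```latex
V_k \;=\; \sum_{S \subseteq C \setminus \{C_k\}} \omega_{S,k}
      \Big( E[\gamma \mid S \cup \{C_k\}] \;-\; E[\gamma \mid S] \Big)
```

Each term is the change in expected conversions when Ck is added to the channels in S, weighted by how likely it is that exactly the channels in S precede Ck.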

To understand this better, consider an example with only two channels, C1 and C2. The attribution values for the channels are:
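The two equations are missing from this copy. Treating Vk as a weighted sum, over subsets S that exclude Ck, of the change in expected conversions when Ck is added to S, the two-channel case would expand as follows (a reconstruction under that assumption, using ω_{∅,k} for the weight of the empty subset):

```latex
\begin{aligned}
V_1 &= \omega_{\emptyset,1}\big(E[\gamma \mid \{C_1\}] - E[\gamma \mid \emptyset]\big)
     + \omega_{2,1}\big(E[\gamma \mid \{C_1, C_2\}] - E[\gamma \mid \{C_2\}]\big) \\
V_2 &= \omega_{\emptyset,2}\big(E[\gamma \mid \{C_2\}] - E[\gamma \mid \emptyset]\big)
     + \omega_{1,2}\big(E[\gamma \mid \{C_1, C_2\}] - E[\gamma \mid \{C_1\}]\big)
\end{aligned}
```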

In this simplified form, we can see that the attribution values depend on the order in which the channels serve their ads to the user, because that order determines the ω weights.

Note that for observable ad campaigns, we already know the order in which channels deliver their ads, making the ω_S,k probabilities always 0 or 1. The paper discusses why using this observed information can actually be harmful when computing attribution values. Let’s take a look at an example.

Consider C = {C1, C2}. Let E[γ|∅] = E[γ|{C1}] = E[γ|{C2}] = 0, and E[γ|{C1,C2}] = δ > 0. Further, assume that C2 always serves its ads after C1. These assumptions tell us that C1 and C2 individually cause no conversions, but that together they do lead to some conversions. Using the formula described above, we get the following attribution values:
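The resulting equations are missing from this copy. Substituting the stated expectations into the weighted-sum definition of Vk gives the following (a reconstruction, assuming the two-channel expansion over S = ∅ and S = the other channel):

```latex
\begin{aligned}
V_1 &= \omega_{\emptyset,1}\,(0 - 0) + \omega_{2,1}\,(\delta - 0) = \omega_{2,1}\,\delta \\
V_2 &= \omega_{\emptyset,2}\,(0 - 0) + \omega_{1,2}\,(\delta - 0) = \omega_{1,2}\,\delta
\end{aligned}
```

Only the joint term survives, so the whole of δ is split between the channels according to ω_2,1 and ω_1,2.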

Since we observe the sequence in which the channels serve their ads (C2 always serves after C1), we have ω_2,1 = 0 and ω_1,2 = 1, giving us the equation in the form above. The attribution values are therefore V1 = 0 and V2 = δ. In other words, all the credit for the joint effect of C1 and C2 goes to C2, simply because C2 serves its ads after C1. This conclusion is harmful, since it generalizes: channels that serve their ads later receive greater credit for user conversions. The model effectively turns into last-touch attribution, which is pretty flawed.
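The example can be checked numerically with a minimal Python sketch. The function name `attribution`, the helper `expected_conversions`, and the value chosen for δ are all hypothetical and only serve the illustration; the weights ω_S,k are derived from a distribution over orderings that is passed in explicitly.

```python
def attribution(channels, expected_conversions, ordering_probs):
    """Credit each channel with its probability-weighted marginal gain.

    Summing P(ordering) * gain over orderings is equivalent to summing
    omega_{S,k} * (E[gamma | S + Ck] - E[gamma | S]) over subsets S,
    because omega_{S,k} is the total probability of the orderings in
    which exactly the channels in S come before Ck.
    """
    credit = {k: 0.0 for k in channels}
    for ordering, prob in ordering_probs.items():
        shown = frozenset()
        for k in ordering:
            with_k = shown | {k}
            gain = expected_conversions(with_k) - expected_conversions(shown)
            credit[k] += prob * gain
            shown = with_k
    return credit


DELTA = 0.1  # hypothetical joint effect; any positive value works


def expected_conversions(shown):
    # Only the pair {C1, C2} converts anyone; each channel alone does nothing.
    return DELTA if shown == frozenset({"C1", "C2"}) else 0.0


# Observed campaign: C2 always serves after C1, so one ordering has probability 1.
observed_orderings = {("C1", "C2"): 1.0}

credit = attribution(["C1", "C2"], expected_conversions, observed_orderings)
print(credit)  # {'C1': 0.0, 'C2': 0.1}
```

With the observed ordering, C2 collects all of δ and C1 gets nothing, which is the last-touch behaviour described above.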

Dalessandro et al. recognize that using these observable probabilities leads to poor recognition of interaction effects among channels, and instead propose a different way to calculate the quantity ω_S,k. The following equation is the crux of their idea:
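The equation did not survive in this copy of the post, so which formula originally appeared here is not certain. Assuming it is the Shapley-value form of Vk, in which every ordering of the channels is treated as equally likely, it would read:

```latex
V_k \;=\; \sum_{S \subseteq C \setminus \{C_k\}}
      \frac{|S|!\,\big(|C| - |S| - 1\big)!}{|C|!}
      \Big( E[\gamma \mid S \cup \{C_k\}] \;-\; E[\gamma \mid S] \Big)
```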

They define Ω as a uniform distribution over all possible orderings of C. They state that ω_S,k can now be calculated as:
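The formula is missing from this copy. Under a uniform distribution over orderings, the probability that exactly the channels in S precede Ck is the standard Shapley weight, so a reconstruction (assumed, not quoted from the paper) is:

```latex
\omega_{S,k} \;=\; \frac{|S|!\,\big(|C| - |S| - 1\big)!}{|C|!}
```

For two channels, this gives ω_S,k = 1/2 for both S = ∅ and S = the other channel, so in the earlier example V1 = V2 = δ/2 and the joint effect is shared equally.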

Fully understanding this equation requires a good understanding of Shapley values, a standard concept for allocating credit among players in cooperative game theory. Due to the limited scope of this blog, I will not discuss them here. But if there’s one thing to take away from the paper’s approach, it is that the observed probabilities ω_S,k should be ignored in favor of the equation provided by the authors above.
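For readers who prefer code to formulas, here is a minimal Python sketch of attribution with uniform-ordering (Shapley) weights. It assumes the standard Shapley weight |S|!(|C| − |S| − 1)!/|C|! for ω_S,k; the names `shapley_weight`, `shapley_attribution` and `expected_conversions`, and the value of δ, are hypothetical. It enumerates every subset, so it is only practical for a handful of channels.

```python
from itertools import combinations
from math import factorial


def shapley_weight(subset_size, n_channels):
    # Probability, under uniformly random orderings, that a given subset
    # of this size is exactly the set of channels preceding channel k.
    return (factorial(subset_size) * factorial(n_channels - subset_size - 1)
            / factorial(n_channels))


def shapley_attribution(channels, expected_conversions):
    n = len(channels)
    credit = {}
    for k in channels:
        others = [c for c in channels if c != k]
        total = 0.0
        for size in range(n):  # subset sizes 0 .. n-1
            for subset in combinations(others, size):
                s = frozenset(subset)
                gain = expected_conversions(s | {k}) - expected_conversions(s)
                total += shapley_weight(size, n) * gain
        credit[k] = total
    return credit


DELTA = 0.1  # hypothetical joint effect of C1 and C2


def expected_conversions(shown):
    # Same toy setup as the example: only the pair {C1, C2} converts anyone.
    return DELTA if shown == frozenset({"C1", "C2"}) else 0.0


credit = shapley_attribution(["C1", "C2"], expected_conversions)
print(credit)  # {'C1': 0.05, 'C2': 0.05}
```

Here each channel receives δ/2, regardless of the order in which the ads were actually served, and the credits sum to δ.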
