The Problem with Statistical Attribution

Online marketing has grown immensely in the last decade. It used to be easy: some AdWords, an SEO-friendly site, maybe an email campaign, and you were ready to go. Fast forward to today: targeted campaigns, personalized onsite experiences, automated email communication, and a bunch of retargeting solutions, all to get the visitor to convert. And which of them is responsible for a particular conversion? Everyone raises their hands.

Raising hands

Marketing attribution has also developed (kind of). It also used to be easy. One of your two marketing channels touched someone who converted. Great success, you have attributed value to the right solution. But now that there are multiple touch-points at play, it kind of sucks. Nevertheless, it looks like every year since 2013 has been "THE year of attribution" and "the death of the last click".

Curse of the last

The main advantage of "last-click" attribution is its simplicity. All the credit goes to the source of the last visit or click before the purchase (the last non-direct one in the more advanced version). Boom. It is easy to understand and easy to explain if somebody asks what the magic behind your report is. There are of course shortcomings to this approach, which have been covered quite extensively over the years. But it remains the de facto solution because nothing else is that simple and straightforward.
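To show just how simple the rule is, here is a minimal sketch in Python. The function name, the "direct" label, and the sample journey are made up for illustration and do not come from any particular analytics tool.

```python
from typing import List, Optional


def last_click(touchpoints: List[str], ignore_direct: bool = False) -> Optional[str]:
    """Return the channel that receives all the credit for a converting journey.

    touchpoints: channel names ordered from the oldest visit to the newest.
    ignore_direct: if True, skip "direct" visits (the "last non-direct" variant).
    """
    for channel in reversed(touchpoints):
        if ignore_direct and channel == "direct":
            continue
        return channel
    # Empty journey, or nothing but direct visits: fall back to the plain
    # last click (None when there are no touchpoints at all).
    return touchpoints[-1] if touchpoints else None


# Hypothetical journey that ended in a purchase.
journey = ["facebook", "adwords", "email", "direct"]
print(last_click(journey))                      # direct
print(last_click(journey, ignore_direct=True))  # email
```

Facebook and AdWords get nothing in either variant, which is exactly the shortcoming people keep writing about.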


The rest of the rule-based solutions are no better! "First-click" is tricky since you don't know when the journey started. The first click within a certain time window is not necessarily the first click of a journey (it might have started long before). "Even distribution" overvalues some channels and undervalues others. "Time decay" is kind of good enough but, well, it is still skewed towards the end and won't reward acquisition channels appropriately. "Position based" and "custom" sound awesome until you start experimenting, end up with an unmanageable number of rules, and no longer understand what exactly is happening.
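The rules above boil down to different ways of splitting one conversion across the touch-points of a journey. Here is a rough sketch; the half-life of two steps and the 40/20/40 split are assumptions picked for illustration, not industry standards, and the journey is invented.

```python
from typing import Dict, List


def even_distribution(n: int) -> List[float]:
    """Every touch-point gets the same share."""
    return [1.0 / n] * n


def time_decay(n: int, half_life: float = 2.0) -> List[float]:
    """Credit halves for every `half_life` steps away from the conversion."""
    raw = [0.5 ** ((n - 1 - i) / half_life) for i in range(n)]
    total = sum(raw)
    return [w / total for w in raw]


def position_based(n: int, first: float = 0.4, last: float = 0.4) -> List[float]:
    """Fixed shares for the first and last touch, the rest split over the middle."""
    if n == 1:
        return [1.0]
    if n == 2:
        # No middle touch-points: rescale the two end shares so they sum to 1.
        return [first / (first + last), last / (first + last)]
    middle = (1.0 - first - last) / (n - 2)
    return [first] + [middle] * (n - 2) + [last]


def credit(journey: List[str], weights: List[float]) -> Dict[str, float]:
    """Sum the weights per channel (a channel may appear more than once)."""
    result: Dict[str, float] = {}
    for channel, weight in zip(journey, weights):
        result[channel] = result.get(channel, 0.0) + weight
    return result


# Hypothetical converting journey, oldest touch first.
journey = ["facebook", "adwords", "email", "adwords"]
for rule in (even_distribution, time_decay, position_based):
    print(rule.__name__, credit(journey, rule(len(journey))))
```

Run it on the same journey and each rule tells a different story about Facebook, which is the whole problem: the answer depends on the rule you picked, not on the data.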

There must be something better. Something that allocates value in a smart way. It goes by different names (multi-touch attribution, statistical attribution, algorithmic attribution, data-driven attribution), but essentially it means that an algorithm allocates value based on data.

Statistical attribution

So here it comes. The "holy grail" of attribution modeling. The solution. The algorithm.

But there is a problem. Results are hard to communicate. Yes, there are some case studies, but there is no broad adoption. It is a bit like the alchemists of the Middle Ages: the sacred scientists making gold and the potion of eternal life, and the wealthy merchants and nobles who had access to them. "Cool companies" are using it, but no one knows exactly what happens behind fancy words like "mind-blowing insights", "actionable results", etc. People are still stuck with the "last click".

The problem has three layers. Let's dive in.

Obsession with an algorithm

Choosing an algorithm is fairly easy. Just do some googling and you will find the top algorithms for the problem, their descriptions, and in-depth YouTube videos about the principles behind them.
There are three main families of algorithms in this field: Markov models, survival analysis, and game theory. At Dripit we have tested all of them, including the Shapley value, which is also the magic behind the statistical attribution model in the Google 360 suite.
The problem with algorithms starts when you want to train the model on your specific data. The Shapley value is great when you have to split a taxi fare between several passengers. Try doing that with 10k passengers (online channels) where most of them haven't been in the same car (customer journey) before. To overcome these limitations, mathematicians model the data and try various statistical approximations to fit the needs of a particular algorithm.
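To make the taxi analogy concrete, here is a minimal sketch of an exact Shapley value calculation for three channels. The conversion counts are invented, and the way a coalition is valued (counting the conversions whose journeys used only channels from that coalition) is one common simplification chosen for this example, not necessarily what Google or Dripit use.

```python
from itertools import combinations
from math import factorial

# Hypothetical toy data: conversions per *set* of channels seen in a journey.
conversions = {
    frozenset({"adwords"}): 10,
    frozenset({"email"}): 4,
    frozenset({"adwords", "email"}): 6,
    frozenset({"adwords", "facebook"}): 5,
    frozenset({"adwords", "email", "facebook"}): 5,
}
channels = sorted(set().union(*conversions))


def value(coalition: frozenset) -> int:
    """Conversions that needed no channel outside `coalition` (an assumption)."""
    return sum(n for used, n in conversions.items() if used <= coalition)


def shapley(channel: str) -> float:
    """Average marginal contribution of `channel` over all coalitions of the others."""
    others = [c for c in channels if c != channel]
    n = len(channels)
    total = 0.0
    for k in range(len(others) + 1):
        weight = factorial(k) * factorial(n - k - 1) / factorial(n)
        for subset in combinations(others, k):
            s = frozenset(subset)
            total += weight * (value(s | {channel}) - value(s))
    return total


credits = {c: shapley(c) for c in channels}
print(credits)
print("coalitions to evaluate:", 2 ** len(channels))
```

With three channels there are 8 coalitions. The count doubles with every channel you add, and in real data most of those combinations have never shared a journey, which is where the approximations come in.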

Data modeling

People usually refer to the algorithm as a black box. In reality, it is the data modeling that is the true black box, and a piece of art. Data modeling takes most of the time. In the machine learning community there is a joke about data scientists: they spend 80% of their time preparing the data and the other 20% complaining that they need more time to prepare the data.
With statistical attribution, the result of data modeling is a data set which is normalized and aggregated to fit a particular algorithm. In this process some of the raw information is lost, and what remains is already a different representation of the data. The result is hard to describe to end users if they are not into data science. And this is where the problem arises: "I don't understand; I don't trust" - ID[U;T].
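Here is a small illustration of that information loss. It assumes the kind of aggregation a first-order Markov model might need (counting transitions between consecutive touch-points); the "(start)", "(conversion)" and "(null)" labels and both data sets are made up for the example.

```python
from collections import Counter
from typing import List, Tuple

Journey = Tuple[List[str], bool]  # (ordered channels, did the visitor convert?)


def transition_counts(journeys: List[Journey]) -> Counter:
    """Aggregate raw journeys into counts of step-to-step transitions."""
    counts: Counter = Counter()
    for channels, converted in journeys:
        end = "(conversion)" if converted else "(null)"
        path = ["(start)"] + channels + [end]
        counts.update(zip(path, path[1:]))  # consecutive pairs only
    return counts


# Two hypothetical raw data sets that tell opposite stories:
# in A the Facebook journey converts, in B the AdWords journey does.
dataset_a = [(["facebook", "email"], True), (["adwords", "email"], False)]
dataset_b = [(["facebook", "email"], False), (["adwords", "email"], True)]

print(transition_counts(dataset_a) == transition_counts(dataset_b))  # True
```

After aggregation the two data sets are indistinguishable. Try explaining that to the person who runs the Facebook campaigns.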

Players are treated equally

This is the most important one. Historically, algorithms have treated every channel equally. In other words, all channels compete for conversions. But as the number of channels and marketing solutions has grown, they have become more targeted in their objectives. Some acquire traffic, others engage visitors and move them down the funnel, and some retarget visitors who have gone off track. But algorithms still treat them equally. It is like taking a basketball team and evaluating every player by their ability to hit the basket. In reality, it is not about one superstar. It is about the team and each individual's contribution to a goal. But we don't see that in marketing attribution. Everything is measured and marketed around that one goal.

What's next?

So, is statistical attribution doomed? At Dripit we don't think it is. There are more and more channels for engaging with the customer, and so there is a need for performance analytics. Google is about to release Google Attribution. Facebook plans to do the same with its Atlas product, which will be called Advanced Measurement. In the meantime, there is growing skepticism about these big platforms attributing value to themselves. A bit like Charlie and Hank in one head. And some companies are even looking for ways to pull their budgets out of these silos altogether. The field of advertising channels and performance analytics is far from settled. But that's a topic for another blog post.

At Dripit we are focusing on customer journey analysis. In our experience, it can provide clear insights into marketing performance without any "secret formula", and it doesn't require "big data" in the first place. And we are not alone in pursuing this route. We have talked with agencies, e-retailers, and startups who have also recognized that there simply isn't that one and only algorithm. Other industry players are looking in this direction as well. As mentioned, Facebook is about to launch Advanced Measurement. But it has also launched an analytics product focused on analysing the customer journey and how Facebook ads influence it.

There are a couple of reasons why understanding the customer journey can help improve marketing ROI.

The story is clear

People understand a sales funnel, a customer journey, or a sales cycle. They look at that spill-over chart and get it. And it matters that they can make an argument for the usefulness of a particular channel or solution: it is essential at this particular stage because the micro-conversion rate is x, y, z.


A micro-conversion is getting a visitor to move from one stage to the next, for example from "visitor" to "visitor with a product in the basket". And there are individual solutions which can be identified as crucial components of a stage. Take a recommendation engine. It won't acquire or retarget your customers, but it is an essential tool that helps them make up their mind and add that product to the basket.
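As a rough sketch, micro-conversion rates are just stage-to-stage ratios along the funnel. The stage names and visitor counts below are hypothetical.

```python
from typing import Dict, List

# Hypothetical funnel: how many visitors reached each stage.
stages: List[str] = ["visitor", "product_in_basket", "checkout_started", "purchase"]
reached: Dict[str, int] = {
    "visitor": 10000,
    "product_in_basket": 1200,
    "checkout_started": 600,
    "purchase": 300,
}


def micro_conversion_rates(stages: List[str], reached: Dict[str, int]) -> Dict[str, float]:
    """Share of visitors who move from each stage to the next one."""
    return {
        f"{current} -> {following}": reached[following] / reached[current]
        for current, following in zip(stages, stages[1:])
        if reached[current] > 0  # skip stages nobody reached
    }


for step, rate in micro_conversion_rates(stages, reached).items():
    print(f"{step}: {rate:.1%}")

overall = reached["purchase"] / reached["visitor"]
print(f"overall conversion: {overall:.1%}")
```

In this made-up funnel the overall conversion rate hides the fact that the weakest step by far is getting a product into the basket, which is where something like a recommendation engine would be judged.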


By understanding the customer journey and its steps you can actually identify problems and fix them. There is a difference between simply increasing conversions and increasing the number of people who have added a product to the basket or signed up for an online seminar.

New types of algorithms

By changing perspective on the problem, there is also an opportunity for new types of algorithms that can more accurately predict which channels are a must at which stage. At Dripit we are working on exactly this.


The customer journey isn't the holy grail either. It is a challenge of its own.
What do you think? What has been your experience with statistical attribution? Share it in the comments below.