How to Set Up Meaningful (Non-Arbitrary) Custom Attribution in Google Analytics

How to Set Up Meaningful (Non-Arbitrary) Custom Attribution in Google Analytics – Moz

Moz logo

Menu open

Menu close

Products

Moz Pro

Moz Pro Home

Moz Local

Moz Local Home

STAT

Mozscape API

Free SEO Tools

Competitive Research

Link Explorer

Keyword Explorer

Domain Analysis

MozBar

More Free SEO Tools

Learn SEO

Beginner’s Guide to SEO

SEO Learning Center

Moz Academy

SEO Q&A

Webinars, Whitepapers, & Guides

Blog

Why Moz

Agency Solutions

Enterprise Solutions

Small Business Solutions

Case Studies

The Moz Story

New Releases

Log out

Products

Moz Pro

Your All-In-One Suite of SEO Tools

The essential SEO toolset: keyword research, link building, site audits, page optimization, rank tracking, reporting, and more.

Learn more

Try Moz Pro free

Moz Local

Complete Local SEO Management

Raise your local SEO visibility with easy directory distribution, review management, listing updates, and more.

Learn more

Check my presence

STAT

Enterprise Rank Tracking

SERP tracking and analytics for SEO experts, STAT helps you stay competitive and agile with fresh insights.

Learn more

Book a demo

Mozscape API

The Power of Moz Data via API

Power your SEO with the proven, most accurate link metrics in the industry, powered by our index of trillions of links.

Learn more

Get connected

Compare SEO Products

Free SEO Tools

Competitive Research

Competitive Intelligence to Fuel Your SEO Strategy

Gain intel on your top SERP competitors, keyword gaps, and content opportunities.

Find competitors

Link Explorer

Powerful Backlink Data for SEO

Explore our index of over 40 trillion links to find backlinks, anchor text, Domain Authority, spam score, and more.

Get link data

Keyword Explorer

The One Keyword Research Tool for SEO Success

Discover the best traffic-driving keywords for your site from our index of over 500 million real keywords.

Search keywords

Domain Analysis

Free Domain SEO Analysis Tool

Get top competitive SEO metrics like Domain Authority, top pages, ranking keywords, and more.

Analyze domain

MozBar

Free, Instant SEO Metrics As You Surf

Using Google Chrome, see top SEO metrics instantly for any website or search result as you browse the web.

Try MozBar

More Free SEO Tools

Learn SEO

Beginner’s Guide to SEO
The #1 most popular introduction to SEO, trusted by millions.
Read the Beginner’s Guide

How-To Guides
Step-by-step guides to search success from the authority on SEO.
See All SEO Guides

SEO Learning Center
Broaden your knowledge with SEO resources for all skill levels.
Visit the Learning Center

Moz Academy
Upskill and get certified with on-demand courses & certifications.
Explore the Catalog

On-Demand Webinars
Learn modern SEO best practices from industry experts.
View All Webinars

SEO Q&A
Insights & discussions from an SEO community of 500,000+.
Find SEO Answers

August 7-9, 2023
Lock in Super Early Bird savings for MozCon

Snag tickets

Blog

Why Moz

Small Business Solutions
Uncover insights to make smarter marketing decisions in less time.
Grow Your Business

The Moz Story
Moz was the first & remains the most trusted SEO company.
Read Our Story

Agency Solutions
Earn & keep valuable clients with unparalleled data & insights.
Drive Client Success

Case Studies
Explore how Moz drives ROI with a proven track record of success.
See What’s Possible

Enterprise Solutions
Gain a competitive edge in the ever-changing world of search.
Scale Your SEO

New Releases
Get the scoop on the latest and greatest from Moz.
See What’s New

New Feature: Moz Pro
Surface actionable competitive intel

Learn More

Moz Pro

Moz Local

Moz Local Dashboard

Mozscape API

Mozscape API Dashboard

Moz Academy

Avatar

Moz Home

Notifications

Account & Billing

Manage Users

Community Profile

My Q&A

My Videos

Log Out

By: Tom Capper
March 10, 2014

How to Set Up Meaningful (Non-Arbitrary) Custom Attribution in Google Analytics

SEO Analytics

The author’s views are entirely his or her own (excluding the unlikely event of hypnosis) and may not always reflect the views of Moz.

Attribution modeling in Google Analytics (GA) is potentially very powerful in the results it can give us, yet few people use it, and those that do often get misleading results. The built-in models are all fairly useless, and creating your own custom model can easily dissolve into random guesswork. If you’re lucky enough to have access to GA Premium, you can use Data-Driven Attribution, and that’s great—but if you haven’t got the budget to take that route, this post should show you how to get started with the data you already have.

If you’ve read up on attribution modelling in the past, you probably already know what’s wrong with the default models. If you haven’t, I recommend you read this post by Avinash, which outlines the basics of how they all work.

In short, they’re all based on arbitrary, oversimplified assumptions about how people use the internet.

The time decay model

The time decay model is probably the most sensible out of the box, and assumes that after I visit your site, the effect of this first visit on the chance of me visiting again halves every X days. The below graph shows this relationship with the default seven-day half-life. It plots “days since visit” against “chance this visit will cause additional visit.” If it takes seven days for the repeat visit to come around, the first visit’s credit halves to 25%. If it takes 14 days for the repeat visit to come around, the first visit’s credit halves again, to 12.5%. Note that the graph is stepped—I’m assuming it uses GA’s “days since last visit” dimension, which rounds to a whole number of days. This would mean that, for example, if both visits were on the day of conversion, neither would be discounted and both would get equal credit.

There might be some site and userbase out there for which this is an accurate model, but as a starting assumption it’s incredibly bold. As an entire model, it’s incredibly simplistic—surely we don’t really believe that there are no factors relevant in assigning credit to previous visits besides how long ago they occurred? We might consider it relevant if the previous visit bounced, for example. This is why custom models are the only sensible approach to attribution modelling in Google Analytics—the simple one-size-fits-all models are never going to be appropriate for your business or client, precisely because they’re simple, one-size-fits-all models.

Note that in describing the time decay model, I’m talking about the chance of one visit generating another—an important and often overlooked aspect of attribution modelling is that it’s about probabilities. When assigning partial credit for a conversion to a previous visit, we are not saying that the conversion happened partly because of the previous visit, and partly because of the converting visit. We simply don’t know whether that was the case. It could be that after their first visit, the user decided that whatever happened they were going to come back at some point and make a purchase. If we knew this, we’d want to assign that first visit 100% credit. Or it might be that after their first visit, the user totally forgot that our website existed, and then by pure coincidence found it in their natural search results a few days later and decided to make a purchase. In this case, if we knew this, we’d want to assign the previous visit 0% credit. But actually, we don’t know what happened. So we make a claim based on probabilities. For example, if we have a conversion that takes place with one previous visit, what we’re saying if we assign 40% credit to that previous visit is that we think that there is a 40% chance that the conversion would not have happened without the first visit.

If we did think that there was a 40% chance of a conversion being caused by an initial visit, we’d want to assign 40% credit to “Position in Path” exactly matching “First interaction” (meaning visits that were the user’s first visit). If you want to use “Position in Path” as your sole predictor of the chance that a visit generated the conversion, you can. Provided you don’t pull the percentages off the top of your head, it’s better than nothing. If you want to be more accurate, there’s a veritable smorgasbord of additional custom credit rules to choose from, with any default model as your starting point. All we have to do now is figure out what numbers to put in, and realistically, this is where it gets hard. At all costs, do not be tempted to guess—that renders the entire exercise pointless.

Tested assumptions

One tempting approach is simply to create a model based to a greater or lesser extent on assumptions and guesswork, then test the conclusions of that model against your existing marketing strategy and incrementally improve your strategy in this manner. This approach is probably better than nothing for improving your market strategy, and testing improvements to your strategy is always worthwhile, but as a way of creating a realistic attribution model this starting point is going to set you on a long, expensive journey.

The ideal solution is to do this process in reverse—run controlled experiments to build your model in the first place. If you can split your users into representative segments, then test, for example,

the effect of a previous visit on the chance of a second visit
the effect of a previous non-bounce visit on the chance of a second visit
the effect of a previous organic search visit on the chance of a second visit

and so on, you can start filling in your custom credit rules this way. If your tests are done well, you can get really excellent results. But this is expensive, difficult, and time consuming.

The next-best alternative is asking users. If users don’t remember having encountered your brand before, that previous visit they had probably didn’t contribute to their conversion. The most sensible way to do this would be an (optional but incentivised) post-conversion questionnaire, where a representative sample of users are asked questions like:

How did you find this site today?
Have you visited this site before?

If yes:

How many times?
How did you find it?
Did this previous visit impact your decision to visit today?
How long ago was your most recent visit?

The results from questions like these can start filling in those custom credit rules in a non-arbitrary way. But this is still somewhat expensive, difficult and time-consuming. What if you just want to get going right away?

Deconstructing the Data-Driven Attribution model

In this blog post, Google offers this explanation of the Data-Driven Attribution model in GA Premium:

“The Data-Driven Attribution model is enabled through comparing conversion path structures and the associated likelihood of conversion given a certain order of events. The difference in path structure, and the associated difference in conversion probability, are the foundation for the algorithm which computes the channel weights. The more impact the presence of a certain marketing channel has on the conversion probability, the higher the weight of this channel in the attribution model.The underlying probability model has been shown to predict conversion significantly better than a last-click methodology. Data-Driven Attribution seeks to best represent the actual behaviour of customers in the real world, but is an estimate that should be validated as much as possible using controlled experimentation.” (my emphasis)

Similarly, this paper recommends a combination of a conditional probability approach and a bagged logistic regression model. Don’t worry if this doesn’t mean much to you—I’m going to recommend here using a variant of the much simpler conditional probability method.

I’d like to look first at the kind of model that seems to be suggested by Google’s explanation above of their Data Driven Attribution feature. For example, say we wanted to look at the most basic credit rule: How much credit should be assigned to a single previous visit? The basic logic outlined in the explanation from Google above would suggest an approach something like this:

Find conversion rate of new visitors (let’s say this is 4%)
Find conversion rate of returning visitors with one previous visit (let’s say this is 7%)
Credit for previous visit = ((7-4)/7) = 43%

To me, this model is somewhat flawed (though I’m fairly sure that this flaw lies in my application of Google’s explanation of their Data-Driven Attribution rather than in the model itself). For example, say we had a large group of repeat visitors who were only coming to the site because of a previous visit, but that were converting poorly. We’d want to assign credit for these (few) conversions to the previous visits, but the model outlined above might assign them low or negative credit; this is because even though conversions among this group are caused by previous visits, their conversion rate is lower than that of new visitors. This is just one example of why this model can end up being misleading.

My best solution

Figuring out from our data whether a repeat visitor came because of a previous visit or independently of a previous visit is hard. I’ll be honest: I don’t know how Google does it. My best solution is an approximation, but a non-arbitrary one. The idea is using the percentage of traffic that is either branded or direct as an indicator for brand familiarity. Going back again to how much credit should be assigned to a single previous visit, my solution looks like this:

Calculate the percentage of your new visitor traffic is direct, branded organic or branded PPC (let’s say it’s 50%)

Note: Obviously most of your organic is (not provided), so I recommend multiplying your total organic traffic by the % of your known keyword traffic that is branded. As (not provided) approaches 100%, you’ll have to use PPC data to approximate your branded organic traffic levels.

Calculate the percentage of your 2nd-time-visitor traffic is direct, branded organic or branded PPC (let’s say it’s 55%)
Based on the knowledge that only 50% (in this case) of people without previous visits use branded/direct, approximate that without their first visit we’d only have seen (100%-55%)*(100/50)=90% of these 2nd time visitors.
Given this, 10% of visitors came because of a previous visit, so we should assign 10% credit for 2nd time visits to the first visit.

We can use similar logic applied to users with 3+ visits to calculate the credit deserved by “middle interactions”.

This method is far from perfect—that’s why I recommended two others above it. But if you want to get started with your existing data in a non-arbitrary way, I think this is a non-ridiculous way to get started. If you’ve made it this far and you have any ideas of your own, please post them in the comments below.

About Tom Capper —
I head up the Search Science team at Moz, working on Moz’s next generation of tools, insights, and products.

With Moz Pro, you have the tools you need to get SEO right — all in one place.

Start your free trial!

How to Set Up Meaningful (Non-Arbitrary) Custom Attribution in Google Analytics

How to Spot and Fix Keyword Cannibalization

Your Ultimate Content Calendar Template to Use in 2022

Project Management Software for Agencies: What to Look for

Do Reviews Impact Local SEO Rankings? New Study

How to Become an SEO Expert (What it Really Takes)

How My Mom Thinks Search Engines Work

类似文章