I recently found myself going deep down an internet rabbit hole.
Like anyone in charge of company communications, I get an alert any time our company is mentioned, linked to, or tweeted at on the internet. If you’re talking about Parse.ly, I want to know about it.
Last week, I started getting alerts about articles involving Elizabeth Warren and a claim she made in March. In a blog post on Medium (which has since been updated), Warren stated:
“Google and Facebook control websites that receive 70 percent of all internet traffic.”
The articles I was alerted to were teardowns of the statistic, claiming that she didn’t, in fact, have the right information. So who was right?
Down the rabbit hole
To figure this out, I dug in. Here’s how the statistic made its way to Warren’s team, traced from its origin forward:
In late 2015, we released a blog post on external referral traffic stats from our network of media companies and publishers. The study looked at data from August through October of 2015. At the time, Facebook sent the most external traffic to our network (roughly 40% of traffic to sites came from Facebook). And Google search sent around 34% of traffic. This data excludes traffic that cannot be identified, including “dark” or “direct” traffic from email, SMS, private messaging apps, and other sources that strip referral information. We also exclude traffic from within the sites, known as “recirculation” or internal linking.
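To make the exclusions above concrete, here is a minimal sketch of how a share of identifiable external referral traffic could be computed. It is not Parse.ly’s actual methodology or code. The category names, the `external_referral_shares` helper, and all pageview counts are hypothetical, chosen only so the Facebook and Google figures land near the 40% and 34% mentioned above.

```python
# Hypothetical pageview counts by referrer category.
# These numbers are made up for illustration; they are NOT Parse.ly data.
pageviews = {
    "facebook": 400,
    "google_search": 340,
    "other_external": 260,
    "direct_or_dark": 900,   # no referrer information available
    "internal": 1100,        # recirculation within the same site
}

# Categories left out of the external referral calculation (assumed labels).
EXCLUDED = {"direct_or_dark", "internal"}


def external_referral_shares(counts):
    """Return each identifiable external referrer's share of
    identifiable external traffic (excluded categories are dropped)."""
    external = {k: v for k, v in counts.items() if k not in EXCLUDED}
    total = sum(external.values())
    if total == 0:
        return {}
    return {k: v / total for k, v in external.items()}


shares = external_referral_shares(pageviews)

# The same Facebook count as a share of ALL pageviews, for contrast.
facebook_share_of_all = pageviews["facebook"] / sum(pageviews.values())

print(f"Facebook share of external referrals: {shares['facebook']:.1%}")
print(f"Facebook share of all pageviews:      {facebook_share_of_all:.1%}")
```

With these made-up counts, the same Facebook number is a much smaller fraction of all pageviews than of identifiable external referrals, which is why the choice of denominator matters when quoting a figure like this.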
Most people in the media industry now know this data and this storyline. However, Facebook no longer accounts for the most external traffic to media sites. Google again came to dominate referrals once Facebook changed its News Feed algorithm in late 2017/early 2018. This caused some massive upheaval in the media industry along the way.
In late 2017, a programmer and writer, Andre Staltz Medeiros, wrote a blog post called “The web began dying in 2014, here’s how.” In it, he cites a number of sources regarding internet traffic and trends. He included a link to our 2015 blog post and an image made from our data. It’s not the main focus of his post, but Medeiros mentions the tension between media and the two dominant tech giants, without diving too much into the particulars.
A few days later, Newsweek published an article titled, “Who controls the internet? Facebook and Google Dominance could cause the ‘Death of the web’.” Written by Anthony Cuthbertson, who was a staff writer for the publication at the time, the article mostly summarizes Medeiros’ blog post. (Cuthbertson is currently a Technology Correspondent for The Independent, according to LinkedIn.)
Cuthbertson writes in the Newsweek article: “Sites and services owned and operated by Facebook and Google—such as WhatsApp, YouTube and Instagram—now account for over 70 percent of all internet traffic, compared to a joint market share of around 50 percent in early 2014.”
He also embeds a tweet by Medeiros (who also goes by simply “Staltz”) showing the data from the 2015 Parse.ly blog post.
It’s slightly unclear which specific data point Cuthbertson meant to refer to in his quote. If it was meant to refer to the Parse.ly data, at a minimum it’s inaccurate, as our data separates Facebook traffic from Instagram traffic. And WhatsApp traffic can’t be tracked through referrals (it would appear as “direct” or “dark” traffic).
Then, in March of 2019, Warren’s team posted the Medium blog post. It links the “70” in the statement “Google and Facebook control websites that receive 70 percent of all internet traffic” to the Newsweek article. (The updated text has since been changed to the more accurate “70 percent of all Internet referral traffic.”)
She also used this statistic in ads she ran on Facebook.
So, to summarize: Warren’s statistic links to a Newsweek article, which references a blog post that discusses multiple sources of data, including ours, and which in turn links to our blog post containing data from 2015. Phew.
Through the looking glass
Politicians and movements use stories and storytelling supported by data to promote their policies. However, this is a case study in just how difficult it can be for even the wonkiest policymaker to understand the minutiae of an industry that changes so quickly.
In fact, when used correctly, our data still supports the argument that Google and Facebook account for an extremely outsized amount of traffic to media and content. As of this writing (10/11/19), roughly 82.5% of identifiable, external traffic referrals to media sites, arguably where most people get their information, are coming from those two sources.
Of course, in addition to traffic referrals, many people spend time directly on these sites getting news and information. Pew Research has also found that “More than half of U.S. adults get news from social media often or sometimes (55%), up from 47% in 2018.” Pew finds that Facebook and YouTube are the top two social networks where people find news.
With the change in Warren’s blog post to “70 percent of all Internet referral traffic,” the claim is certainly closer to accurate. However, we could still offer some clarification, perhaps to: “data suggests that over 70% of identifiable traffic to content and media sites comes through Facebook and Google Search.”
This citation led us down a rabbit hole of asking, “How accurate is our understanding of the statistics we use and the information we rely on?” For anyone making an argument to the American public or to governments, or even making an internal business decision, it serves as a reminder to be clear about what data and sources you’re looking at, and of course, that Google and Facebook really do dominate a significant amount of our audience’s attention.