The only way any of us can “see” online audiences is through data. Companies like Nielsen and comScore have traditionally done this by tracking a sample over time. However, increasingly, what we know about audiences comes to us through servers. Servers collect vast amounts of information on what users are doing online every second of every day. To some, these “big data” speak for themselves and offer a crystal-clear lens through which to see and manage audiences. But much of the buzz surrounding big data is “irrational exuberance.” To make effective use of these new resources, you should know their strengths and weaknesses. Here are four questions to keep in mind if you find yourself seeing the world through big data:
Census or Sample?
Big data are often thought of as a census. If that were true, it would be great. You wouldn’t have to draw a sample and carefully “weight” individual responses to reflect the population. But frequently big data really means a big sample. For example, some claim that gathering data from digital set-top boxes (STBs) can create a census of the TV audience. But in practice, STB ratings are based on samples that are cobbled together and extensively weighted to resemble the total TV audience. Those samples are large enough to offer a lot more granularity than traditional methods, but they’re not close to a census.
In fact, most big databases are adjusted in some way. You might think that the things trending on Twitter reflect a simple headcount. But trending metrics are tweaked in ways that aren’t widely reported. Providers of “currency” measures are typically audited, so it’s easier to know the recipe behind the numbers. But it’s a mystery how many of the newer metrics are cooked up. If you use them, you should assume you’re not getting an unadulterated look at the audience; you’re probably wearing corrective lenses.
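To make “weighting” concrete, here is a minimal sketch of one common adjustment, post-stratification, in which each respondent is weighted so that the sample’s group shares match the population’s. The panel, the two groups, and the population shares are invented for illustration; this is not how any particular ratings provider builds its numbers.

```python
from collections import Counter


def poststratify_weights(sample_groups, population_shares):
    """Return one weight per respondent so that weighted group shares
    match the (assumed known) population shares."""
    n = len(sample_groups)
    counts = Counter(sample_groups)
    weights = []
    for group in sample_groups:
        sample_share = counts[group] / n
        # Over-represented groups get weights below 1, under-represented above 1.
        weights.append(population_shares[group] / sample_share)
    return weights


def weighted_share(viewed, weights):
    """Weighted proportion of respondents who viewed the program."""
    return sum(w for v, w in zip(viewed, weights) if v) / sum(weights)


# Hypothetical panel: cable homes are 80% of the sample but 50% of the population.
sample_groups = ["cable"] * 8 + ["satellite"] * 2
viewed = [True] * 4 + [False] * 4 + [True] * 2
population_shares = {"cable": 0.5, "satellite": 0.5}  # assumed, for illustration

weights = poststratify_weights(sample_groups, population_shares)
raw = sum(viewed) / len(viewed)
weighted = weighted_share(viewed, weights)
print(f"raw share: {raw:.2f}, weighted share: {weighted:.2f}")
```

In this toy panel the same viewing records yield a different audience estimate once the weights are applied, which is why it matters whether the recipe behind a metric is disclosed.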
Preference or Behavior?
Social media platforms can capture comments that reflect people’s likes and dislikes, but most big data measure behaviors (e.g., views, downloads, shares, purchases). It’s tempting to interpret behaviors as an expression of preferences. In fact, economists use choices as a measure of “revealed preferences.” But people do things for all kinds of reasons. Ask yourself: Do people view something because they like it or because they just stumbled into it? Do they share something because they approve, disapprove or want to build their personal “brand”?
Even the meaning of “likes” can be a puzzle. Is it really about liking, or social affirmation, or just plain fraud? Big data are often a by-product of using digital platforms to deliver media or provide services. Their great appeal is that they’re cheap and abundant and seem capable of providing valuable new insights. But there’s also a danger in trying to wring too much out of data that weren’t designed to measure motives or states of mind.
Local or Global?
Any online publisher has a treasure trove of data. They can generally see how many visitors they have, how visitors move from page to page, and how much time they spend on each. But this is local on-site activity. It’s often harder to see what visitors are doing the rest of the time, and that could be important. For example, there’s interest in using “attention minutes” as a measure of engagement and perhaps a kind of currency. But are visitors who spend time on a site loyalists who are otherwise hard to reach or are they just people who spend a lot of time on the web? Without a more global source of data, it’s hard to know. If they’re heavy web users, they’ll be easy pickings for programmatic buyers who don’t care about editorial context.
Insight or Currency?
Even if you can’t necessarily see what your visitors are doing off site, there are plenty of insights to be gained from what a publisher can see. One powerful tool is A/B testing. Among other things, publishers have used it to assess the attention-grabbing power of headlines. But using big data for insights and using them for currencies are two entirely different propositions.
Changing currencies, even in modest, commonsense ways, is as much about politics as data. Reaching an industry-wide consensus takes time. For example, shifting to “viewable impressions” took the IAB well over a year to orchestrate. These negotiated currencies may be less than optimal for any one player. So most online publishers will have to reconcile two sets of metrics: those that provide insights and those that provide money. The two are not always the same.
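On the insight side, the headline A/B test mentioned above can be sketched in a few lines. The example below uses a two-proportion z-test, which is one standard way to compare click-through rates and not necessarily what any given publisher runs. The function name and the click and view counts are hypothetical.

```python
import math


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Compare click-through rates of headlines A and B.

    Returns (z, p_value), where p_value is two-sided under a
    normal approximation with a pooled standard error.
    """
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    if std_err == 0:
        raise ValueError("no variation in clicks; the test is undefined")
    z = (rate_b - rate_a) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical test: each headline is shown to 2,000 visitors.
z, p_value = two_proportion_z_test(clicks_a=100, views_a=2000,
                                   clicks_b=140, views_b=2000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests headline B really does draw more clicks on this site. It says nothing about why visitors clicked or what they do elsewhere, which is the gap between an insight and a currency.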
Big data offer valuable new ways to see audiences, and we should use them for all they’re worth. But like all data-driven decision-making tools, they have biases and blind spots. The best way to use those tools is to know what they can and can’t do for you.
James Webster is a Professor of Communication Studies at Northwestern University. He is the author of The Marketplace of Attention: How Audiences Take Shape in a Digital Age (2014, MIT Press).
Webster studies audience behavior and measurement. He’s on the editorial boards of the Journal of Communication and the Journal of Broadcasting & Electronic Media. He also works as a consultant to audience measurement and media companies. For more information and copies of selected work see his personal website.