Google’s deadline for deprecating third-party cookies has been looming on the horizon for the digital industry for a while now. But at this point in 2023, even with Google’s continued delays, it’s staring us in the face. According to the IAB, digital media has lost up to 60% of the signals that were accessible for targeting and measurement just a few years ago. And while publishers have always operated with tighter resources and budgets than advertisers, they are now feeling the heat, and we can see it in how they’re investing in their tech stacks.
The publisher community has long feared third-party cookie deprecation would lead to a bifurcation in the digital ad marketplace. On one side would be the walled gardens, with their large volumes of verified first-party data – not to mention advanced tech stacks. On the other would be the entirety of the open web, which would require collaboration and a scalable identity solution to hold its own.
Top priority in the publisher ad stack: making data work
In the latest installment of Lotame’s Beyond the Cookie report, our survey of publishers and marketers indicates that there really is a sense of urgency for open web media outlets to up their game. In the near future, 37% of marketers plan to increase spending in walled gardens, and 54% expect to reduce programmatic spend. Publishers know they need to onboard the right tools to compete – that is, to build audiences, and to analyze, enrich, and activate their data.
Fortunately, publishers are taking action. As they strategize about how to invest in their tech stacks, tools for managing and processing data top the list of priorities. In the next six to 12 months, our research finds that 35% of publishers will be looking to adopt a data management platform (DMP), 35% will be considering a customer data platform (CDP), and 32% will be exploring identity resolution tools. Not only that, but publishers cite DMPs and CDPs as the tools they would be least likely to retire from their tech stacks.
In the drive to process first-party data from as many sources as possible, CDPs have progressed through their hype stage and have demonstrated real value. Today, 44% of publishers and marketers alike use a CDP, and 45% say they plan to build or buy technology to perform CDP functions in the coming year.
Don’t believe the hype. Believe the results
Unfortunately, the hype around CDPs in 2022 created some confusion about their best use cases. A CDP shouldn’t necessarily be treated as a simple replacement for multiple other tools in the tech stack. For marketers, CDPs deliver value by helping improve customer experience. For publishers, however, it’s about both experience and data consolidation.
To maximize their investment, publishers need to put CDPs to use collecting data from all possible online and offline sources, as opposed to relying too heavily on, for example, data from logged-in known users to the exclusion of other sources, such as unknown users, that can provide scale. These use cases explain why publishers are investing in both CDP and DMP technology.
The rush to gain meaningful data insights that will attract advertiser spend has also driven a great amount of interest in clean rooms – despite reports that there’s still some lingering confusion about how they’re best implemented. At this point, 48% of publishers are using clean rooms. That means they’re more likely to use them than marketers (37% of whom are doing so).
A focus on balancing privacy and identity
Publishers remain concerned about the level of privacy provided by clean rooms. They’re also concerned about the reliability or actionability of the data – which they fear may be compromised by outdated emails or limited scale of authenticated IDs. Publishers do need to keep in mind that clean rooms are helpful in gaining more value from their data, by allowing participants to compare their data sets. But they’re not intended to effectively replace third-party cookies.
Publishers are also increasing their exploration of and investment in identity solutions. For a while, it may have seemed to some publishers that Google’s delays in deprecating third-party cookies put the task on a lower priority level. But interest in probabilistic solutions grew 50% YOY in 2022, while interest in contextual targeting and authenticated or email-based solutions both held steady over that same time period.
In 2023, digital ad spend is projected to continue rising, though not as rapidly as in 2021, when consumers’ digital behavior was also changing rapidly. Publishers are compelled to make wise, well-considered investments, to keep up with the industry’s evolution while keeping the business’s bottom line strong. Those who have been lagging need to step on the gas. It’s time to start testing and implementing the tech that suits their goals and advertiser partners, empowering the open web and putting their data to work to compete with the walled gardens.
Last year proved to be another turbulent one. With continued economic turmoil, publishers are beginning 2023 with a great deal of uncertainty about what the year will bring. And, given the slow but inevitable deprecation of third-party cookies, many publishers are seeking out alternative ways to reach their audiences with increasing urgency.
It’s not all doom and gloom though. Industry innovations promise to provide publishers with the data – and critical insight – they need to continue demonstrating value to brand advertisers seeking to leverage their platforms.
Let’s take a closer look at the publisher trends we expect to take hold in 2023.
1. Widespread adoption of attention metrics
Despite a growing mindset shift towards attention metrics in 2022, viewable impressions are still the primary metric used by publishers to analyze ad performance. However, we are likely to see the balance tip in 2023.
Attention metrics give publishers deeper insight into ad performance, moving beyond how long an ad is visible to how much attention it actually receives.
Research into the attention economy shows attention metrics are 200% more effective at predicting the performance outcomes of ads. This gives publishers a much clearer understanding of how users interact with their content, which is of course an invaluable part of the buy-in process with advertisers.
2. A growing focus on first-party data
Ever since Google first announced it would end support for third-party cookies in Chrome, publishers have been seeking alternative targeting solutions. Indeed, Google’s own deadline extension was driven in part by the need to further test its Privacy Sandbox initiative, which was proposed as a less intrusive solution for delivering targeted advertising.
Publishers are already busy future-proofing their marketing plans with the development of first-party data strategies, and we expect to see a growing focus in this area in 2023.
3. The rise of data clean rooms
Clearly, first-party data provides the basis for a solid targeting solution. However, many publishers are beginning to realize their data can be further enriched with second-party data.
By entering into second-party data partnerships, publishers can share aggregated (and therefore anonymous) data for mutual benefit. Not only does this give them a clearer picture of how readers engage with their content, but it also demonstrates value to prospective brand advertisers.
Of course, these partnerships ideally take place in data clean rooms, which allow multiple parties to match user-level data without actually sharing personally identifiable information (PII). This is incredibly powerful for tracking ad performance and cross-platform user journeys without the need for third-party cookies.
In the past couple of years, clean rooms have gained a great deal of traction. This has been driven primarily by walled gardens, where media giants such as Google, Amazon, and Meta provide access to event-level data held within their platforms so that advertisers can enrich their own first-party data.
More recently, we’ve seen a rise in publishers entering into private, more mutually beneficial partnerships using independent clean rooms. And in 2023, we expect this trend to continue.
4. A shift towards contextual targeting
Over the years, behavioral targeting has been the primary mechanism for online engagement. It has allowed marketers to deliver greater ad relevance, informed by a user’s browsing habits, and powered by third-party cookies.
Of course, openly following a user around the web is no longer acceptable to consumers, nor will it be possible once third-party cookies disappear completely. As a result, we’ve seen a marked shift away from this method. Enter the improved ad relevance that contextual targeting brings.
By serving ads that are highly relevant to the context in which they’re placed, publishers can create a user experience that better resonates with their readers. This approach cuts through the noise of up to 10,000 daily ads to increase brand recall by 70%.
Recently, we’ve seen more and more publishers experiment with contextual targeting as they get to grips with cookieless approaches. But in the year ahead, we expect to see them implement far-reaching contextual campaigns with growing confidence.
5. The jury’s out on NFTs in publishing for 2023
Originally viewed as the next fad in the cryptocurrency space, NFTs struck a chord with publishers in 2022. They created opportunities to connect with new audiences, offer unique experiences to readers, and build added value into content subscription models.
DC Comics launched its NFT Universe as a way to enhance readers’ comic book collections and unlock exclusive fan experiences. Meanwhile, TIME is working on a community-building initiative with its TIMEPieces project. And education publisher Pearson plans to turn its textbooks into NFTs to create a new revenue stream from secondhand sales of its titles.
However, after recent headline losses, publishers are likely to proceed with caution in 2023. It is important that they take the time to learn from the successes and failures of previous investments before they take the plunge.
As we embark on yet another uncertain year, publishers will be seeking out and testing a range of innovations. Only by working out what performs best with their readership, and what doesn’t, can they start to take engagement to the next level.
About the author
Navid Nassiri joined Switchboard as Head of Marketing in 2021. Switchboard’s data engineering automation platform aggregates disparate data at scale, reliably and in real time, to help teams make better business decisions. In his role at Switchboard, Navid is focused on driving growth and brand awareness through innovative marketing strategies. Navid is a seasoned entrepreneur and executive whose background includes leadership roles at PwC and NBCUniversal.
It remains to be seen when third-party cookies will be fully phased out. But it’s clear that the priority should remain building a solid and flexible first-party data strategy that can withstand constant shifts in privacy and audience accessibility. Publishers who do so will be best positioned to help advertisers fill the gap between privacy and personalization while uncovering new monetization opportunities. The Seller-Defined Audiences strategy is emerging as a way to reinforce the connection between publishers, advertisers, and the consumer.
Seller-Defined Audiences, commonly abbreviated as SDA, is the latest addressability specification created as a privacy-friendly alternative to replace third-party cookies. Collaboratively designed to drive the adoption of publisher first-party data, SDA is a cohort-based targeting approach that leverages well-established advertising standards to allow for scalable audience targeting without compromising data privacy and security.
Although SDA was proposed almost two years ago, the IAB Tech Lab has only just released standardized taxonomy and transparency guidelines for it. And while cohort-based targeting isn’t new, SDA offers exciting privacy-first opportunities for publishers who wish to monetize their first-party data.
As it addresses common bottlenecks, such as data scalability and operational overhead, SDA empowers publishers to market their data to multiple buyers across all major advertising environments, including browsers, apps, and OTT/CTV. It does so without relying on third-party cookies and IDs and doesn’t risk data leakage along the way.
How SDA works
Instead of sharing sensitive user-specific identifiers and personally identifiable information with advertisers, publishers can leverage SDA to organize their anonymized first-party data into standardized audience cohorts based on user interactions and other data points gathered on owned sites, apps, and platforms. This way, SDA allows publishers to easily assemble and curate their first-party data while maintaining complete control over it.
The process can be summarized in four steps:
1. Audience Segmentation
Publishers, with or without the help of their DMP, must map their first-party data into standardized demographic, purchase intent, and interest audience segments following the IAB Tech Lab’s Audience Taxonomy 1.1.
The taxonomy includes 1,600+ tiered segments that provide a common naming convention, allowing normalization, uniformity, and comparability across SDA from different providers. In addition to audience signaling, SDA can also support contextual and content signaling, but both use cases have yet to be standardized for industry use.
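To make the mapping step concrete, here is a minimal sketch; the segment names and taxonomy IDs below are illustrative placeholders, not actual Audience Taxonomy 1.1 entries:

```python
# Illustrative mapping from internal first-party segments to IAB
# Audience Taxonomy IDs. These IDs are placeholders; real mappings
# must come from the Tech Lab's published Audience Taxonomy 1.1.
AUDIENCE_TAXONOMY_MAP = {
    "auto_enthusiasts": "784",     # hypothetical automotive interest entry
    "frequent_travelers": "912",   # hypothetical travel interest entry
    "sports_readers": "433",       # hypothetical sports interest entry
}

def to_sda_segments(user_segments: list[str]) -> list[dict]:
    """Translate internal segment names into taxonomy segment objects."""
    return [
        {"id": AUDIENCE_TAXONOMY_MAP[name]}
        for name in user_segments
        if name in AUDIENCE_TAXONOMY_MAP
    ]
```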
2. Documenting SDA Metadata
Publishers include the segment ID and taxonomy ID in the ad call using their ad server or header bidding wrapper, following IAB’s Data Transparency Standard (DTS). Or they can leverage a Prebid RTD (real-time data) module created by a data provider.
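In OpenRTB terms, this metadata typically travels in the user.data object, with an ext.segtax value identifying the taxonomy. Here is a simplified sketch, expressed as a Python dict; the source name and segment IDs are placeholders, and the segtax value should be verified against the Tech Lab’s current segment taxonomy list:

```python
# Simplified sketch of SDA metadata as it might appear in an OpenRTB
# bid request. The source name and segment IDs are placeholders; the
# segtax value must match the Tech Lab's published taxonomy list.
sda_user_data = {
    "user": {
        "data": [
            {
                "name": "publisher.example",   # hypothetical data source
                "ext": {"segtax": 4},          # assumed to denote Audience Taxonomy 1.1; verify
                "segment": [{"id": "784"}, {"id": "912"}],
            }
        ]
    }
}
```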
3. SDA Activation
Publishers can activate SDA at the SSP level for buying on both open auction and PMP. In the first case, the SSP must relay the SDA metadata in the bid request. In the second, SDA can enrich direct deals and cross-publisher auction packages.
4. DSP Bidding
Upon receiving the bid request, the DSP can read the included SDA metadata (the segment and taxonomy IDs) and decide whether to bid on the ad call or on an SDA-enabled Deal ID.
On the advertiser side, SDA allows buyers to leverage publisher-standardized first-party data without the need to contact each provider to create one-to-one deals. Streamlined and scalable access to pre-packaged cohorts through deals or via Open Auction will drive the usability of publisher first-party data, ultimately making it more appealing for advertisers.
Opportunity exists, but publishers will need to be a driving force
By offering standardized, scalable, and easy-to-activate audiences, publishers can unlock new revenue streams and efficiently monetize their first-party data. However, the benefits of SDA will only be fully tangible when there is significant adoption of the IAB-sponsored specification across both publishers and advertisers.
The adtech platforms also play a big part in driving the use of SDA. As mentioned, SSPs need to be able to transmit SDA in the bid request, while DSPs need to be equipped to recognize SDA metadata to allow the selling and buying of data-enriched inventory.
Publishers can enlist help from DMPs to segment and map their user data to standardized cohorts. And rapidly evolving data clean room solutions will further empower data collaboration between providers, bringing cross-publisher SDAs to the market.
Curated marketplaces may prove to be indispensable in further streamlining SDA adoption. They offer publishers an easy yet controlled channel for distributing data to many advertisers. Given its advanced deal management capabilities, curation will lower the entry barriers for publishers who are not yet ready to share segments transparently on the open market but want to access large-scale demand for their SDA.
Publishers, with support from their SSPs and other tech vendors, will serve as the main driving force for educating the market and driving the adoption of SDA. This is an exciting new addition to existing addressability solutions and embracing it will benefit the entire industry. When paired with other solutions, SDA can significantly ease the transition to the soon-to-be cookieless world and help publishers maximize ROI while remaining in control of their first-party data.
Despite industry angst over the impending demise of the cookie, the fact is that third-party data was never that reliable to begin with. These two factors have made publishers sit up and think about their data strategy.
Should they rely solely on first-party data? Or should they look to augment with second-party data? (Second-party data refers to another company’s first-party data from their own subscribers, app users, website visitors etc., but with personally identifying information removed.)
For those looking to enhance their first-party data by tapping into other sources, data clean rooms are proving an increasingly viable solution. In fact, publishers are more likely than marketers to use them (48% vs. 37%, respectively).
What’s more, in an attempt to dispel confusion over them, IAB Tech Lab has announced plans to release its first Clean Room Standards by the end of 2022.
But how can you tell which clean rooms provide the high-quality, privacy-compliant data you need to inform your most important ad ops or revenue decisions? How easy is it to invite your brand partners to collaborate? And can you actually activate your campaigns from within a data clean room?
Here are the main characteristics that you should look out for in a clean room:
1. End-to-end data management workflow
As anyone who uses any management software will know – from project management to data management or finance – there’s nothing more infuriating than having to log out of one system and into another to manage your workflow.
Therefore, having an end-to-end workflow in one place is crucial for efficient operations. Opt for a solution that gives you complete control of your workflow and has easy integrations with your existing data automation software.
2. Purpose-limited “rooms”
The whole idea of a clean room is to create a safe, secure, and purpose-limited environment for second-party data. What we mean by “purpose-limited” is that you only have access to the specific type of data you need, rather than a whole bunch of data that is likely irrelevant. For instance, you can open a clean room for targeting, or one for measurement, or perhaps attribution, and invite your trusted partners to access the (anonymized) data. In each case, the data gets specifically matched to your use case.
This is in the consumer’s interest, as they know that their data will only ever be shared for the purposes for which they have given explicit consent (according to legislation such as GDPR or CCPA). However, it also means that, as a publisher, you only get the data you really need to make informed decisions.
3. Interoperability
The problem with certain data clean rooms today is that all participants must subscribe to (i.e. be a customer of) that clean room. Naturally, this can limit opportunities to collaborate (or it requires unnecessary time/expense to set each party up as a fully-fledged customer of the data clean room).
For instance, as a publisher, you will likely be looking to collaborate with your brand advertisers on projects such as creating ad campaigns for targeting, or creating analysis matches for forecasting purposes. But at the same time, you don’t need the hassle of persuading them all to become a customer in order to join your clean room.
Therefore, interoperability is an important feature of a clean room and you should look for a provider that has an open data infrastructure, allowing you to invite your brand partners to collaborate at will.
4. Activation
Gaining access to relevant data, and collaborating with your partners, is half the battle. But what happens when it comes to activating your data for campaigns? The ability to easily integrate with programmatic and other ad delivery systems – for instance GAM, The Trade Desk or DV360 – is just as important as getting access to the data in the first place. Without this activation functionality, you’ll need to build your own integrations, requiring huge engineering resources.
Therefore, you should look for a clean room provider that allows you to connect to these pre-integrated platforms, activate your campaigns, and start targeting your audience. These activities should all occur directly from the clean room, without actually moving your data.
The best place to begin is with your own first-party data. But for many publishers, augmenting with second-party data can provide deeper insights into the target audience. As long as you partner with a data collaboration platform that includes interoperability and activation capabilities as standard, the process shouldn’t be as daunting as it might at first seem.
About the author
Navid Nassiri joined Switchboard as Head of Marketing in 2021. Switchboard’s data engineering automation platform aggregates disparate data at scale, reliably and in real time, to help teams make better business decisions. In his role at Switchboard, Navid is focused on driving growth and brand awareness through innovative marketing strategies. Navid is a seasoned entrepreneur and executive whose background includes leadership roles at PwC and NBCUniversal.
Many industry insiders believe the honeymoon phase is over for data clean rooms. In the face of third-party cookie deprecation, clean rooms were supposed to solve the data scale issues brands and publishers predicted. If your first-party data is limited, just combine it with your business partners’ data, in a secure and privacy-compliant environment. That makes sense, right?
In reality, it’s never been quite so simple. And it certainly hasn’t helped the reputation of clean rooms that so many vendors positioned them as a cure-all for a host of issues the industry is contending with – identity, targeting, measurement, attribution, analytics, and so on. The results brands and publishers are seeing generally don’t, and can’t, match the hype.
The hype and the backlash threaten to diminish the role data clean rooms can and should play in today’s digital ecosystem. They have a purpose, but the industry could use some clarity on that purpose, and the value clean rooms deliver, so stakeholders don’t get taken in and disillusioned by snake oil peddlers.
The reality of data clean rooms
First, let’s clear up what clean rooms are not. They are not a comprehensive replacement for third-party cookies. They are not a data solution. What they are is a tool – or, more specifically, an environment. They are a place for two or more trusted business partners to compare data sets, without entirely sharing their data – each business ultimately retains ownership of its data, and contributes only the data it wants to contribute. The data is scrambled to keep personally identifiable information private, and the partners in the clean room need to comply with each other’s policies for consent.
A potluck dinner only works if enough guests bring enough food to satisfy everyone’s appetite, and it’s best when the food is good, and there’s an interesting variety. A clean room works the same way. Every participant must take care to make sure they’re bringing what others need and to remember the best data provides the best leverage.
A data clean room will not somehow clean anyone’s bad data. It’s a “clean” room because the data is supposed to go in clean — with personally identifiable information (PII) and sensitive data scrubbed, parameters for comparing data sets established ahead of time, and identity solutions highlighted. This makes the data actionable and helps give it value. If the data can’t be used to follow consumers or audiences across the buyer’s path, across datasets provided by different clean room participants, it won’t be able to replace the cross-channel tracking capabilities currently enabled by third-party cookies.
But if participants embrace the spirit of collaboration, and bring data to the table that can connect the dots and help to meet each other’s business objectives, the possibilities are endless. Two or more publishers, for example, can bring together data that strengthens their audience profiles and collectively raises the value of their inventory. Or, an e-commerce site and a publisher can compare data sets to track customer journeys and understand the performance of referral programs.
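As a minimal illustration of that “clean on the way in” principle, here is one common preparation pattern: normalizing and hashing email addresses so partners can match records without ever exchanging raw PII. This is a sketch of the general technique, not any vendor’s actual implementation:

```python
import hashlib

def normalize_email(email: str) -> str:
    """Lowercase and strip whitespace so both parties hash identically."""
    return email.strip().lower()

def hash_identifier(email: str) -> str:
    """One-way hash; the raw address never leaves the contributor."""
    return hashlib.sha256(normalize_email(email).encode("utf-8")).hexdigest()

def prepare_contribution(records: list[dict]) -> list[dict]:
    """Keep only the hashed join key plus the fields agreed upfront."""
    return [
        {"id": hash_identifier(r["email"]), "segment": r["segment"]}
        for r in records
        if r.get("email")  # drop rows with no matchable identifier
    ]
```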
The different data clean room environments
Several high-profile launches of clean rooms — from the likes of Google, Amazon, Disney, and NBCUniversal — have added to the hype. But we need to remember that the term “clean room” describes multiple types of environments:
Walled garden clean rooms deliver their value by allowing a business to layer additional data onto what already sits within the owner’s walls.
Clean rooms within independent environments deliver the most mutual value for participants.
Centralized clean rooms, where the terms are set by the platform’s owner, are less collaborative and are more of a means to make the data within walled gardens available to the clean room owner.
Stakeholders exploring the possibilities of clean rooms need to ask: Does entering this particular environment enhance the value of our own data sets, or would we simply be handing our data over to a platform’s owner?
Ensuring interoperability across stakeholders won’t be a cut-and-dried process. It will likely call for industry-wide initiatives. For a precedent of what that might look like, consider customer data platforms (CDPs). CDPs went through a similar hype-followed-by-reputation-management cycle, until the CDP Institute was established to ensure that CDPs delivered what they promised.
Before industry-wide trust and satisfaction in clean rooms overall drops further, the legitimate, trustworthy clean room providers should get in front of the issue, highlight the technology’s intended use cases, and decry the trend of dubbing any privacy-compliant data platform a clean room.
Clean rooms — with the right combination of participants — bring too much potential value to risk becoming victims of their own hype. They need to be marketed and promoted appropriately, promising only what they can deliver. And industry stakeholders need to seek out those unbiased, independent environments and interoperable solutions that will drive toward a collaborative, privacy-forward digital future.
About the author
Eliza Nevers drives Lotame’s global strategy and product roadmap to serve the evolving data enrichment needs of marketers, agencies, and publishers. Fluent in the full product life cycle, from strategy to development and build, she brings nearly two decades of hands-on experience in tech to her role as Chief Product Officer at Lotame. Prior to her tenure at Lotame, Eliza served as VP of Product for Verizon Media. During her 12+ years there, she steered the development and launch of AOL’s first DSP, SSP, and DMP, driving the evaluation and product integration of multiple strategic acquisitions as well as developing and launching AOL’s unified advertising platform, ONE by AOL for Advertisers. Most recently, Eliza launched her own consulting business, where she defined product strategy and agile product development processes for various companies in and out of adtech.
Publishers and consumers are on the same side in a fight against gatekeepers wielding market power through inscrutable algorithms, says Marta Tellado, president and CEO of Consumer Reports. Her new book, Buyer Aware, explores the promise of today’s connected products, along with the predatory practices, commonplace data harvesting, and unacceptable risks that accompany them.
“The marketplace is changing so dramatically, and so rapidly, but the balance of power has shifted away from consumers. I really wanted to kind of reveal to consumers: what does this new marketplace look like? What protections and rights do you have?” said Tellado in a phone interview.
Consumer respect
That’s where her new book Buyer Aware comes in. Tellado weaves together a history of previous consumer advocacy success stories with the challenges people face today while trying to navigate a world where the gatekeepers of digital media prefer profit over privacy and attention over accuracy.
“Consumer Reports has been around for 86 years. We’re incredibly proud of the consumer rights and protections that we’ve been able to forge with and for consumers. But the reality is, that many of those things do not translate into the digital space. Technology has raced so far ahead of us,” said Tellado.
All of these threads speak to the premise of the book: that the right of consumers to fair marketplaces should be deemed equivalent to people’s civil rights – instead of being an afterthought, as it often feels. Tellado says we’ve reached a critical moment on the frontier of consumer protection, one where Consumer Reports is in a position to be highly instrumental. Throughout the brand’s history, the ongoing efforts of its team have led to accountability in the market. And today’s market is in need of increased accountability.
Publishers implicated in today’s issues
While Buyer Aware includes an overview of the state of financial scams (it’s bad) and a status report on today’s latest safety issues and industry responses (also not great), a large portion of the book is dedicated to the threat of big tech’s market power and misinformation in the media ecosystem. These are two issues in which the publishing industry is directly intertwined – Consumer Reports included.
“What many [media companies] are struggling with, is that we’re in an age where we have monopoly platforms that are essentially gatekeepers,” said Tellado. “They stand between us and the consumers we’re trying to reach.”
Tellado says the digital market doesn’t have the standards, the rules, and the guidelines to sufficiently protect consumers from exploitation. And she points out that publishers are at the mercy of search engine optimization and algorithms which ingrain society’s biases. Certainly, consumer rights are important for society. But she says that even beyond this, fundamental American values are at stake.
“Writing the book was also a way to tell a larger story about our democracy, that it can only really thrive if you have a fair and just marketplace. Those two things are incredibly connected.” However, the main thing that Tellado wants readers to take away is that there are constructive paths forward.
She points to the emergence of so-called “hipster antitrust” as a force that shouldn’t be dismissed with patronizing nicknames. With Lina Khan’s ascension at the Federal Trade Commission, the potential of real action against the big tech giants is palpable. Tellado also endorses the Journalism Competition and Preservation Act, helmed by Minnesota Democratic senator Amy Klobuchar, as a legislative solution.
Equipping consumers with tools
In a very on-brand “consumer-empowerment” move, at the end of each chapter Tellado suggests ways an average person can shield themselves from the perils found in the previous pages. To combat big tech’s extensive reach over people’s private data, Tellado recommends people reduce the amount they hand over. More innovative options can be found by following a QR-coded weblink: Consumer Reports has developed online tools people can use to protect themselves. For example, it offers an app that reveals the information companies collect from individuals.
Perhaps this seems ironic, considering how much blame the book pins on other digital systems. But that’s the point: these new marketplaces are a reflection of how our digital tools are used, not proof that the tools are inherently bad. Despite the risks, the work by Consumer Reports and its peers in the media space shows us there is a flag to rally around in defense of the rights of the consumer.
We live in a world where there is no shortage of data: first-party, third-party, and zero-party. However, most publishing organizations run the risk of boiling the ocean, with too much data at their disposal and too few actionable insights gleaned from it. In my view, data visibility can bring alignment across multiple departments of a publishing organization and steer it toward common goals.
This post covers two specific use cases and actionable insights from first-party data that publishers can use to optimize their editorial and subscription strategies. These tangible use cases can help you lay a solid foundation for a data-driven organization.
Using data to optimize editorial strategy
Most publishers rely on last-touch attribution to understand which content converts users to subscriptions. No matter what attribution methodology you use, there are gaps in coordination between the editorial and marketing teams at most media organizations – which ultimately results in leaving subscription revenue on the table. Let’s take a 2×2 scatter plot of conversions versus pageviews as an example.
The ability to plot each piece of content on a conversions-versus-pageviews scatter plot can distill strategic outcomes for the entire organization. Each quadrant leads to actionable insights for different departments, as the sketch after this list illustrates:
High Conversions – High Pageviews: This segment of content is being marketed well and is generating high conversions. Maintaining status quo with all departments is recommended with content falling in this quadrant.
High Conversions – Low Pageviews: This segment of content is being marketed poorly but has high conversions. The marketing team should take action to promote this content across all of its channels.
Low Conversions – Low Pageviews: This segment of content is not converting. The action is on the editorial team to reconsider its strategy for this type of content. Alternatively, the subscriptions team can consider trying a lower priced offer for this type of content to test if the conversions go up.
Low Conversions – High Pageviews: This segment of content can be earmarked outside the paywall to maintain a healthy balance of site traffic.
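Here is a minimal sketch of that quadrant logic, assuming each article carries pageview and conversion counts and using site medians as the high/low thresholds (both are assumptions to tune per organization):

```python
from statistics import median

def classify_content(articles: list[dict]) -> dict[str, str]:
    """Assign each article a recommended action based on its quadrant."""
    pv_cut = median(a["pageviews"] for a in articles)
    cv_cut = median(a["conversions"] for a in articles)
    actions = {
        (True, True): "maintain status quo",
        (True, False): "marketing: promote across channels",
        (False, False): "editorial: rethink, or test a lower-priced offer",
        (False, True): "consider placing outside the paywall",
    }
    return {
        a["title"]: actions[(a["conversions"] >= cv_cut, a["pageviews"] >= pv_cut)]
        for a in articles
    }
```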
The main takeaway here is not about data collection but about data representation and the importance of near-real-time data visibility across the entire organization. A simple graphic like this can help editorial and marketing teams understand what actions need to be taken. Still, there needs to be alignment on common goals, constant coordination, and outcome-based service level agreements (SLAs) in place to execute on any editorial strategy.
Using simple data points to convert never-subscribers
Almost every publisher has two main types of audiences: ardent fans and casual users. For the sake of simplicity let’s call them subscribers and never-subscribers. To convert never-subscribers, you first have to know who they are. The good news is you don’t necessarily need sophisticated AI to understand who your never-subscribers are. There are many simpler options and proxies at your disposal.
Assuming you track the digital footprint of anonymous users, you can identify which users have rejected your subscription paywall once or twice in the last 30 days. That is a clear indication that the user is casual in nature and may need a different offer in order to convert to a paying user. Lowering the hurdle to a bite-size offer might be a prudent move at this stage.
Another proxy for identifying never-subscribers is to dissect the sources of your site traffic. Other than direct traffic, you will have traffic coming from social media, chat apps, content aggregator sites, search, paid media, etc. Assigning a score based on the source can quickly help decipher whether an anonymous user is a potential subscriber. For example, you may find Twitter users have a higher propensity to subscribe, while Facebook users are very unlikely to subscribe. With this in mind, you could show a lower-threshold offer to users landing from Facebook.
If you have global audiences, you could dissect anonymous users by geography. For example, you might find that domestic users are more inclined to take your all-you-can-eat subscription, while international users require a lower entry barrier. With this in mind, you might offer them an introductory bite-size subscription option.
Here is an example that publishers can use to segment their audiences from very basic traffic data:
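The sketch below combines the three proxies above (paywall dismissals, traffic source, and geography) into a single propensity score. The weights, sources, and thresholds are illustrative assumptions, not benchmarks; each publisher should tune them against its own conversion data:

```python
# Illustrative scoring: all weights and thresholds are assumptions.
SOURCE_SCORES = {"direct": 3, "search": 2, "twitter": 2, "facebook": 0}

def propensity_score(user: dict) -> int:
    score = SOURCE_SCORES.get(user.get("source", ""), 1)
    score += 2 if user.get("geo") == "domestic" else 0
    score -= user.get("paywall_dismissals_30d", 0)  # dismissals signal a casual reader
    return score

def pick_offer(user: dict) -> str:
    """Map the score to a subscription offer tier."""
    score = propensity_score(user)
    if score >= 4:
        return "all-you-can-eat subscription"
    if score >= 2:
        return "introductory offer"
    return "bite-size / micropayment offer"
```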
At the end of the day, if you are in the digital subscriptions business, your objective is to maximize your recurring revenue. This requires you to understand the lowest payment threshold an individual user is willing to commit to. With that information in hand, your marketing team must make a concerted effort to re-engage the user to move up your value chain.
About the author
Abhishek Sharma is a Co-Founder and CTO of Fewcents, a plug-and-play solution for publishers and creators to collect small payments in 80+ currencies. He has 18 years of experience in strategic consulting, vendor management, project and delivery management, business analysis, solution design, and application development. He specializes in building high-frequency trading systems, integrating with payment partners such as PayPal and Stripe, and working on regulatory risk engines using big data technologies. Prior to this, Abhishek worked at DBS Bank, Credit Agricole, and MSCI Inc.
For the better part of the last decade, there has been a clear case for revenue diversification as the key to sustainable success in media. Dependency on one or two revenue models is risky, especially amid times of economic uncertainty.
Revenue diversification is particularly critical as ad-supported business models provide less and less return. Programmatic ad buying increased efficiency, but in many cases it has led to lower CPMs for media companies. As a result, many have increased the number of ads per page, which limits advertising’s impact. Now, even that option is under threat as the digital media industry faces the challenges that come with cookie deprecation.
As important as it is, revenue diversification is challenging. Media companies have seized the opportunity to launch more revenue-generating products over the past few years, including newsletters, podcasts, leadership communities, and live events, to name a few. Each of these products generates additional revenue, which is great. However, they also provide a precious resource: better audience data and insights. That said, not every organization has the ability to fully leverage these insights to further capitalize through diversification.
Media companies that have a unified view of their data can better engage with audiences and drive them to new revenue-generating activities. Unfortunately, they rarely have a single view of reader and customer activity. The key to revenue diversification is uniting these viewpoints to maximize engagement and better understand and promote the products that are going to continue driving revenue.
Seizing the diversification opportunity
Before we dive deeply into the challenge, let’s spend some time laying out the opportunity, because it runs wide and deep. Media companies have explored myriad ways to drive revenue, from old-school methods like selling branded merchandise, to the new school, such as branded streaming services.
While the roadmap will look different for each media entity, there are three diversified revenue streams that every company should consider.
Live events
The first is events, which are experiencing a comeback as attendees get more comfortable with in-person interactions. Events are a way for a media brand to translate its content and expertise into a real-world experience. By building events around popular content topics and leveraging editorial staff as on-stage talent, media companies can create an engaging, replicable experience that drives revenue. While the pandemic may have paused in-person events, it showed that digital and hybrid events are a viable business model going forward. This opens the door to even greater scale for an events business.
Ecommerce
Second is ecommerce. More than opening an online store, the opportunity lies in seamlessly integrating ways for readers and subscribers to make purchases as part of their engagement with the content and the media brand. Many media companies leverage affiliate programs to match their content to products and monetize that connection.
More advanced media companies integrate products and ecommerce links directly into their content so that they get credit for these sales, as opposed to having their audience buy a product elsewhere. By leveraging the trust in their media brand, they can promote products and drive sales, which helps both the media company and the retail partner.
Community
Finally, there’s the media company’s community. This isn’t simply about building a dedicated members-only portion of a website, or a unique newsletter. Community is about more than creating an environment. It’s about exciting brand advocates and aligning with activists in an organic fashion.
In fact, there is a very large opportunity to build multiple communities that are aligned with a media property. Larger organizations can no longer view their audience as a monolith. Readers visit sites for a variety of reasons, and the larger audience may be composed of several smaller segments that engage with different content topics. Each of these audiences represents an opportunity to unite and align with a formal community, complete with targeted content, newsletters, events, and other revenue opportunities.
The roadblock to diversification
The challenge for any large media company is getting a single view across all of their readers so that they understand how to strategically market and promote these new revenue streams. Big media companies likely have numerous disconnected back-end technologies that offer a view into who their readers are, how they behave on a site, how they interact with newsletters, and what they consume at events.
This happens because companies often use point solutions to manage the individual revenue streams. These systems may have different information on the same individual reader, but the media company can’t unite those views and get a full picture. This siloed data is much harder to use across different internal disciplines.
For example, consider an event attendee with an interest in crypto. The business publication running the event wants to own the crypto reporting and thought leadership space through newsletters and interest-based communities. Right now, they don’t have a way to promote additional content, newsletters, podcasts, or subscriber communities to the attendees once they leave the event, because that data isn’t accessible to any of the other teams.
The event revenue is great, but there’s a missed opportunity for deeper engagement.
Engagement is the key to new revenue streams
I’ve written before about how the relationship with the content consumer matters more than ever. This is especially true with revenue diversification, where the goal is to use existing content and expertise to drive more revenue. Whenever audiences interact with content, they create data that helps media companies better understand their interests and identify new revenue opportunities.
It’s possible to find new revenue streams without a unified view of the audience. However, media companies are leaving money on the table if they don’t try to better unite their customer view in order to understand engagement.
The disappearance of third-party cookies and identifiers is reshaping digital advertising as we have come to know it. Add regulatory requirements around personal data processing and evolving user attitudes toward ad targeting into the mix, and we have a challenge on our hands. Publishers and advertisers must rapidly transform to continue to understand audiences, personalize content and ads, maintain monetization, and preserve and measure the profitability of their investments. Luckily, there are already solutions to help meet these challenges.
The first possible solution relies on taking full advantage of first-party data. However, in collaborative use cases, concerns remain around security, privacy, and confidentiality, above and beyond the technical limitations of sharing that data. So, how can an advertiser enrich, share, or activate its data on external sites without risk? How can a publisher make its data available to an advertiser for audience extension outside its proprietary environments? Data clean rooms provide one possible solution.
What are data clean rooms?
Data clean rooms are platforms that allow advertisers and publishers to compare their proprietary data without one party having direct access to the other’s data. More concretely, the different stakeholders have access to a secure, virtual environment in which their data is matched with others’ data. Even though none of the participants has direct access to the data, the results of these “matches” allow them to enrich their understanding of their audiences, refine their segmentation, and activate this data safely and at scale. This happens automatically, without cookies or complex integration into the existing technology stack.
It is important to note that each player remains in control of their data and can change their settings. Participants specify who can access the virtual room, which data is accessible and activatable, and over what period.
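To show the matching mechanics in miniature: in the sketch below, each party contributes only hashed identifiers, and only aggregate statistics leave the room. Real clean rooms add access controls, consent checks, and often differential privacy; the minimum audience threshold here is an illustrative assumption:

```python
def overlap_report(advertiser_ids: set[str], publisher_ids: set[str],
                   min_audience: int = 1000) -> dict:
    """Return only aggregate statistics; no row-level data is exposed."""
    matched = len(advertiser_ids & publisher_ids)
    if matched < min_audience:
        # Suppress small overlaps so no individual can be singled out.
        return {"matched": None, "note": "below minimum audience threshold"}
    return {
        "matched": matched,
        "match_rate": round(matched / max(len(advertiser_ids), 1), 3),
    }
```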
Bringing walled garden targeting and activation capabilities to the open web
Data clean rooms are not new. They have already contributed to the success of online platforms operating as walled gardens. Web giants like the social media networks have amassed massive repositories of first-party data and use clean rooms to give advertisers a way to reach users behind their walls.
As our industry enters a privacy-centric era, driven by more data regulation and advances in technology, we’re quickly seeing interest in data clean room capabilities spread across the open web.
For content and service publishers, this technology represents an opportunity to offer data similar to walled gardens, increase the value of their inventories, and win back market share. In addition, advertisers can benefit from the same targeting (precision and volume) and activation (retargeting and prospecting) capabilities offered by the major platforms in premium and brand-safe environments.
Better data activation
A collaborative data strategy – one that is based on first-party data – allows advertisers to manage their advertising investments in a secure, privacy-conscious way and ultimately achieve better results. To optimize their budgets and increase the performance of campaigns, advertisers need to understand their audiences in detail while respecting the confidentiality of personal data. To do this, the use of their proprietary data is an essential first step. Second, advertisers and publishers must seek to enrich this data with similar and/or complementary audiences. For example, to communicate with car enthusiasts, a car manufacturer is interested in combining its own data with a publisher whose car section attracts a large audience or specializes in topics related to bank loans.
The most common uses of data clean rooms are retargeting users and increasing the reach of a campaign. In the first case, the advertiser’s data is compared with a publisher’s in order to retarget the users they share. In the second, machine learning and predictive analytics technologies built into the data clean room model the two parties’ data using a “lookalike” or “statistical twin” method. This method creates new prospecting opportunities, as the advertiser can find new audiences with behavior similar to existing targets.
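As a toy illustration of the “statistical twin” idea, the sketch below keeps prospects whose engagement features sit close to any member of a seed audience. Production lookalike models use far richer features and trained models; the feature names and radius here are assumptions:

```python
import math

def _distance(a: dict, b: dict, features: list[str]) -> float:
    """Euclidean distance over the chosen (pre-normalized) features."""
    return math.sqrt(sum((a[f] - b[f]) ** 2 for f in features))

def lookalikes(seed_users: list[dict], prospects: list[dict],
               features: list[str], radius: float) -> list[dict]:
    """Keep prospects that sit within `radius` of at least one seed user."""
    return [
        p for p in prospects
        if any(_distance(p, s, features) <= radius for s in seed_users)
    ]

# Hypothetical usage with normalized engagement features:
# audience = lookalikes(seeds, prospects, ["pageviews", "sessions", "recency"], 0.5)
```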
The use of data clean rooms is a win-win approach for all stakeholders. However, data clean rooms require a sufficient volume of data and an independent and reliable technological partner.
The bottom line is this: the digital advertising industry is being reshaped and rebuilt. It’s a new world where no one approach will suffice. Only a portfolio approach that is made up of multiple solutions – clean room data matching, first-party data, persistent IDs, contextual targeting, etc. – will ensure success without third-party cookies.
The annual Digital News Report from the Reuters Institute for the Study of Journalism is a must-read for anyone in the news, media and digital publishing industries. Clocking in at 164 pages, the latest study, which came out today, covers a cornucopia of topics, informed by an online survey of more than 90,000 digital news consumers in 46 countries.
DCN members will want to read the complete report when they can, but ahead of that, we wanted to share some of the most relevant findings for digital content companies. To do this, I read the full report, identifying key trends and corresponding with the lead author, Nic Newman, to discuss these areas in more detail.
It is a decade since the first Digital News Report was published. Newman reflects that, since then, we have seen a relentless decline in consumption of traditional news sources such as TV, radio, and print and the growing importance of digital and social media.
“This has brought [a] greater variety of sources and perspectives than ever before, especially for educated and interested news consumers,” Newman says. “But at the same time,” he adds, “we see those that are less interested [in the news] often feeling overwhelmed and confused.”
It is against this backdrop that major themes emanating from the latest report — including growing news avoidance, as well as declining interest and lower levels of trust in the news — need to be considered. With that in mind, here are four developments publishers cannot afford to overlook, and recommendations to help tackle them.
1. Respond to the implications of news avoidance
Issue:
One of the biggest topics explored in the report is news avoidance. “Selective avoidance” is on the rise globally, with growing numbers deliberately steering clear of content that is often seen as difficult and depressing. Long-running and recurrent stories — such as those covering politics, the war in Ukraine, or the COVID pandemic — are also driving audiences to disconnect more frequently from the news.
Implications / Solutions:
To avoid audiences checking out, publishers need to recognize that some approaches in practice can be off-putting. Therefore, they may need to offer a different content mix and tone. Addressing this is challenging, Newman says, because audiences also want — and expect — the media to cover difficult stories. Nevertheless, Newman identifies three areas where journalists and publishers can tackle several core reasons people often give for news avoidance: accessibility, negativity, and bias.
First, he argues, we need to make news content more accessible and easier to understand. “This is one of the reasons why young people and less educated groups selectively avoid the news.” He also notes that content is typically produced for avid news consumers.
“Avoiding jargon and insider speak will help,” he says. More explanation, directly asking for — and addressing — audience questions, as well as producing fact-based content for video and podcast formats, could also be useful.
Secondly, telling stories differently might mean embracing approaches such as solutions and constructive journalism, as part of a mix of formats and content styles. Newman suggests outlets consider “finding more ways to cover difficult stories that provide hope or give audiences a sense of agency around stories like climate change.”
Lastly, we need to rebuild trust and credibility. Over a quarter (29%) of news avoiders believe the news is untrustworthy or biased. That rises to nearly four in ten (39%) in the United States.
“Some of that is about partisanship,” Newman says, “some is about sensationalist chasing eyeballs and clicks.” Potential remedies include “signalling opinion more clearly,” as well as “not labeling everything breaking news when it isn’t,” an approach CNN has recently broached.
2. Double down on revenue diversification
Issue:
Increasing reader revenue is a key strategic goal for many publishers. However, much of the digital spoils generated by subscriptions are enjoyed solely by the biggest national brands. In the U.S., around half of paid subscriptions go to just three titles: the New York Times, the Washington Post, and the Wall Street Journal. More widely, fewer than one in five digital news consumers (19%) pay for content.
Implications / Solutions:
This “winner takes most” dynamic can make it difficult for smaller and local publishers to compete. Furthermore, the rising cost of living may also mean that some audiences will look to cut back their expenditure on paid-for content.
That’s a development Newman says news providers are alive to.
“More publishers are recognizing that subscription on its own will not be enough,” Newman says, “especially as further growth is likely to [be] constrained by rising prices and the squeeze on household budgets. Developing multiple revenue streams will provide resilience and help publishers weather the coming storm.”
This impending subscription storm is not unique to news publishers; it affects all media players. A chart on page 21 of the report outlines this tension. It shows that while 14% of digital news users in the U.S. think that they will have more media subscriptions in the next year, a further 14% of users believe they will have fewer subscriptions in the same period.
3. Ensure you have an effective first-party data strategy
Issue:
As the experienced product marketer Aphrodite Brinsmead noted last year, browser updates and privacy regulations impact the ability to use personal identifiers and capture customer data through third-party cookies. Subsequently, Brinsmead reflected, “the race is on for publishers and brands to leverage first-party data.”
However, news consumers appear to be wary about providing personal information, such as email addresses, to publishers. Just under a third (32%) of the report’s sample indicated they trust news websites to use their personal data responsibly. This drops to fewer than one in five in France (19%) and the USA (18%).
Implications / Solutions:
Most publishers understand that they need to develop their first-party data capabilities. But knowing you need to get to grips with this, and effectively doing so, is not the same thing.
“The low numbers (28% average) who have currently registered with a news site show that most news websites simply do not have a clear enough value proposition to get people [to] give up their data,” Newman argues.
This principle aligns with the subscription challenge publishers face, too. It is hard to convince audiences to pay for content if the same material is available elsewhere for free. In these circumstances, it seems they are not even inclined to hand over their email address to access it.
To remedy this, Newman suggests, “publishers will need to use a mix of competitions, events and special features to get those numbers up. They also need to persuade people that they will treat personal data responsibly.”
4. Do things differently if you want to reach Gen Z
Issue:
As we outlined recently, Gen Z is a demographic with its own outlook and media habits. The Digital News Report reinforces this, with Dr. Kirsten Eddy, a Postdoctoral Research Fellow at the Reuters Institute, commenting on the growing gulf in media behaviours and preferences between many younger audiences and other demographics.
Implications / Solutions:
This cohort is less interested in traditional news subjects like politics. It also has a weaker connection with news brands. “They are also more skeptical of traditional sources,” Newman advises. “They are also shaped by [the] social aspect of news – ‘who is telling the story’ – and what others think about it.”
As a result, this is a demographic more likely to seek out diverse voices online. They are less concerned about impartiality and more comfortable with journalists expressing opinions on social media. A preference for more visual social networks has meant that across all markets the use of TikTok for news consumption has jumped among 18–24s from 3% in 2020 to 15% in 2022.
“But they are not simply all TikTokers,” Eddy cautions, recommending in a dedicated essay (found on pages 42-45 of the full report) that publishers connect with the topics young people care about, and develop content that is aligned to the style and tone of specific platforms. Publishers should do this, in preference to “expecting young people to eventually come around to what has always been done.”
The big four (and much more)
These four issues – reaching younger audiences, addressing news avoidance, ensuring you have an effective first-party data strategy, and diversifying revenue – matter to publishers large and small. As a result, these were the topics that emerged as most critical on a first read of the Digital News Report 2022.
They are, of course, just a fraction of the actionable insights that can be gleaned from this weighty annual research study. Readers may also want to delve further into issues such as trust and polarization, as well as data on the consumption of podcasts, online video and email news, and attitudes towards coverage of climate change.
Both the immersive web and Web3 are coming, and publishers need to be prepared for the changes that will inevitably impact their data.
But before we start, it’s important to make the distinction between the immersive web and Web3. They are not the same thing.
As a new iteration of the web built on blockchain technology, Web3 will, according to many, revolutionize the internet by transferring the ownership and power of private data to users. It aims to “decentralize” management, and so promises to reduce the control of big corporations, such as Google or Meta, and make the web more democratic. It is defined by open-source software, and it is both “trustless” (it doesn’t require the support of a trusted intermediary) and permissionless (it has no governing body).
Meanwhile, the immersive web or metaverse (which many conflate with Facebook’s new branding as “Meta”), is a version of the online world that incorporates advanced technologies to enhance user engagement and blur the line between the user’s physical reality and the digital environment.
But what are the implications for data-driven companies?
With Web3, the most obvious data implication is that publishers will now have to deal with distributed data and new applications, which will require new connectors. It will also impact yield management. The simplest example is downstream royalties (i.e., publishers rewarding customers for the resale of their data and passing that cost along to advertisers).
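To make the royalty example concrete, here is a minimal sketch, in Python, of how such a pass-through might be accounted for. The function names, royalty rate, and dollar figures are all hypothetical illustrations, not a description of any real Web3 marketplace.

```python
# Hypothetical illustration of downstream royalties: the publisher shares
# data-resale revenue with the contributing user, and recovers that cost
# in the price quoted to the advertiser. All rates and figures are invented.

def advertiser_price(base_cpm: float, royalty_rate: float) -> float:
    """Price per thousand impressions after passing the royalty cost along."""
    return base_cpm * (1 + royalty_rate)

def user_royalty(resale_revenue: float, royalty_rate: float) -> float:
    """Share of data-resale revenue owed back to the contributing user."""
    return resale_revenue * royalty_rate

ROYALTY_RATE = 0.05   # assumed 5% royalty on resold data
base_cpm = 4.00       # assumed base CPM in dollars
revenue = 1_000.00    # assumed revenue from reselling a data segment

print(f"CPM charged to advertiser: ${advertiser_price(base_cpm, ROYALTY_RATE):.2f}")
print(f"Royalty owed to users:     ${user_royalty(revenue, ROYALTY_RATE):.2f}")
```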
Meanwhile, the immersive web’s key impact will be an explosion in data volumes, which by some estimates will multiply global data usage twentyfold by 2032.
The jump in data from Web 1.0 to Web 2.0 was massive enough, but the leap to the immersive web is likely to bring an exponential increase.
So, when moving from terabytes and petabytes, to exabytes and beyond, what components do you need to unify your data?
Velocity and scale
Everywhere you look, there are data automation solutions promising access to “real-time” or “near-real-time” data. But the question shouldn’t just be: how can I get real-time access to data? Things are a bit more complicated than that.
Rather, you should be asking:
1. How can I scale ongoing operations with a data integrity engine I can trust to keep pace as my disparate data sources multiply and my data sets explode in volume?
Building one data pipeline manually is manageable, but adding more pipelines and connectors becomes unsustainable without automation. It can take up to a month to build each new connector by hand, which means the data from each new integration (Facebook, Snapchat, etc.) is out of date by the time your teams can access it. And if you need multiple new APIs for multiple purposes, all at the same time, chaos reigns before you know it, with no clear end in sight. (A sketch of the automated alternative follows these questions.)
Any publisher attempting to keep up with the influx of new and ever-changing APIs on the horizon in Web3 needs to build a strong and workable data unification platform, now.
2. How long will it take to build a strategic data asset in the first place, before my teams get access to the data?
There’s no use in having access to real-time data in six months’ time; to make informed business decisions, your teams need that data now. In reality, though, most publishers embarking on building (or buying) their own data unification platform accept that they’ll wait months before they get close to any actionable data. For example, it might take six data engineers three to six months to code a bespoke platform that is useful to their individual business teams.
In an age of automation, where real-time data is key to keeping up with the competition, these time frames are no longer acceptable.
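To illustrate the first question above, here is a minimal sketch of the declarative, automated approach: every source shares one generic extractor and loader, so onboarding a new integration becomes a registry entry rather than a month of bespoke engineering. All names here (Connector, fetch_json, load_warehouse, SOURCES) are hypothetical illustrations, not Switchboard’s actual API.

```python
# A minimal sketch of declarative connector registration, assuming a generic
# in-house framework. Every name below is invented for illustration.
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Connector:
    name: str          # integration name, e.g. "facebook_ads"
    endpoint: str      # where the source exposes its data
    fields: List[str]  # only the fields the business teams actually need

def fetch_json(endpoint: str, fields: List[str]) -> List[Dict]:
    """Stand-in for one shared, generic HTTP extractor used by every source."""
    return []  # a real implementation would page through the API here

def load_warehouse(table: str, rows: List[Dict]) -> None:
    """Stand-in for a generic warehouse loader."""
    print(f"loaded {len(rows)} rows into {table}")

# Adding a new integration is one registry entry, not a bespoke pipeline.
SOURCES = [
    Connector("facebook_ads", "https://example.com/facebook", ["date", "spend"]),
    Connector("snapchat_ads", "https://example.com/snapchat", ["date", "spend"]),
]

for source in SOURCES:
    load_warehouse(source.name, fetch_json(source.endpoint, source.fields))
```

The design point is that the marginal cost of connector number fifty is roughly the same as connector number two.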
Smart data pipelines
Typical data pipelines are “dumb”: they’ll pull all the data you need, but also a whole lot you don’t. For example, you might only need 100GB of a 1TB dataset to produce actionable data, but without a smart API for the job, the pipeline will pull the full terabyte, which you will then need to store in your data warehouse.
The costs of exponentially larger data volumes can soon spiral out of control if left unchecked. Instead, you need to build APIs for specific cuts of the data that your teams need. This is what we call a smart data pipeline.
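As a rough illustration, consider the difference between a full export and a filtered request. The endpoint and query parameters below are hypothetical; the point is that column projection and date filtering happen at the source, so only the needed cut ever reaches the warehouse.

```python
# A minimal sketch of a "smart" pull that pushes filtering down to the source,
# versus a "dumb" pull that exports everything. Endpoint and parameters are
# hypothetical; real sources expose different query options.
import urllib.parse

def dumb_pull(base_url: str) -> str:
    """Requests the full export -- the whole terabyte lands in your warehouse."""
    return f"{base_url}/export/all"

def smart_pull(base_url: str, columns: list[str], since: str) -> str:
    """Requests only the cut a team needs, e.g. ~100GB of a 1TB dataset."""
    query = urllib.parse.urlencode({
        "fields": ",".join(columns),  # column projection applied at the source
        "since": since,               # date filtering applied at the source
    })
    return f"{base_url}/export?{query}"

print(dumb_pull("https://example.com"))
print(smart_pull("https://example.com", ["user_id", "page_views"], "2022-01-01"))
```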
While the pace of adoption of Web3 is still unclear, the immersive web is just around the corner. It’s imperative that data-driven companies are prepared for what’s coming. That’s not just a few more rows of data to process and store, but a tsunami of new and larger data sets that will become overwhelming overnight without the right infrastructure in place.
Any publishers still attempting to carry out their data operations manually need to look to automation, wherever possible, before it’s too late.
About the author
Navid Nassiri joined Switchboard as Head of Marketing in 2021. Switchboard’s data engineering automation platform aggregates disparate data at scale, reliably and in real time, so teams can make better business decisions. In his role at Switchboard, Navid is focused on driving growth and brand awareness through innovative marketing strategies. A seasoned entrepreneur and executive, his experience includes leadership roles at PwC and NBCUniversal.
Bugs Bunny and Michael Jordan co-starred in Space Jam. Bill Clinton was re-elected to serve a second term as President of the United States. Tiger Woods became a professional golfer. The Summer Olympics were hosted in Atlanta. And washingtonpost.com went live. What do all of these events have in common? They all took place in 1996.
It has been 25 years since those first readers could get their news from The Washington Post online. Back then, Post articles couldn’t be “googled,” since Google, as a company, wouldn’t be founded for another two years. And sharing a news article with friends couldn’t involve Facebook or Twitter, as those networks wouldn’t come to market for eight and 10 more years, respectively. TikTok was only the sound an analog clock made, and early social-media adopters were closer to Tom being their first friend on MySpace than to influencers going viral and becoming millionaires from creating content on Instagram, Snapchat and/or YouTube.
News consumption was a one-size-fits-all paradigm: heard or seen via broadcast news on TV or radio, read from printed ink on paper, and skimmed from websites that were effectively static brochureware versions of their print big brothers (with some supplemental content online). There was no personalization. The model was one-to-many: here are the top things readers X, Y and Z need to know to stay informed. That model has changed, and continues to change. The Post has shown success in personalizing the news to readers’ interests through My Post, newsletter subscriptions and much, much more.
Stay tuned: Washington Post readers are about to see more personalization in 2022!
Creating and distributing the news: then vs. now
An “Apple-to-Apple” Comparison of Reading The Washington Post on December 20 in 1996 and 2021 through then-Modern Apple Technology:
The December 20, 1996, homepage of washingtonpost.com on an Apple Macintosh
The December 20, 2021, home screen of the WashPost iOS app on an Apple iPhone
Change is good. But change needs to be managed. A lot has changed in this last quarter century at the intersection of media and technology. The Post has responded to change by building new systems that manage how content is created, distributed and amplified. But one thing has remained constant — great reporters and editors create great journalism.
Another constant is that quality journalism will be seen or heard by consumers looking to stay informed, and it can shine through the cloudy haze of mis- and disinformation maliciously shared online.
Although these constants remain (good journalism from trusted institutional brands and other media players communicating the news), how consumers get their news has certainly changed with the times. In today’s digital media landscape, according to The Pew Research Center, “more than eight-in-ten U.S. adults (86%) say they get news from a smartphone, computer, or tablet ‘often’ or ‘sometimes.’”
As a media AND technology company, The Washington Post has not just followed consumers to their preferred destinations; it has been a leader in creating content and bringing it to readers. Readers with an interest in politics can get the Daily 202 newsletter emailed to them; food enthusiasts can cook with confidence with Voraciously recipes and guides; podcast listeners can subscribe to Post Reports, Please Go On, Can He Do That, and other audio news formats; and over 1.2 million fans of @washingtonpost on TikTok can be informed and entertained by short, witty videos from a creative team of content creators.
All of this work needs to live somewhere. Platforms, tools and services power this news before it reaches readers’ smartphones, computers, or tablets. The Washington Post has had to understand not just the scalable infrastructure needed to deliver the news where and how readers want it today; technology leadership has also had to set the organization up to succeed in the future with new and expanding infrastructure and Infrastructure-as-a-Service (IaaS) resources. It’s like the old sports adage: Wayne Gretzky wasn’t the fastest skater in the NHL, and he wasn’t the biggest professional hockey player. He was the greatest because he played not to where the puck was, but to where it was going.
The Post’s aspiration and north star is not just to continue to deliver excellence in journalism, but to deliver equal excellence in engineering and innovation. The Post is playing to where the innovation puck is going by, as Deloitte Insights suggests, “designing systems in which humans and machines work together to improve the speed and quality of decision-making.” The Post is doing this to improve the reader experience through personalization and to allow company leaders to turn more data into actionable intelligence at scale.
“I’ve always understood and appreciated the work that The Post contributes to the journalistic space, but interviewing [for my role at The Post] quickly made me realize the sophistication behind the engineering effort supporting that mission.”
— Washington Post Data Engineer Jack Miller, who joined The Post in 2021.
Data, data, everywhere. Data, data, time to share.
Moore’s Law observes that computing power essentially doubles every two years; over time, that has become more of a rule of thumb than a law. Another rule that has held steady is that the total amount of data created or copied doubles roughly every two years. By that measure, The Post has seen roughly a dozen doublings of total data since 1996 (a several-thousand-fold increase), and Post engineering leadership expects the trend to continue in the coming years.
Inside-and-outside of the newsroom, The Post — as a business — relies heavily on data-informed decision making at strategic and operational levels. Over the years, in addition to the increased need to approach data management in a holistic way, The Post has experienced a significant increase in subscriptions and traffic across various platforms and channels. This increased data volume and velocity coupled with new sources and complexities has created new challenges (and opportunities) to turn raw, siloed and unstructured data into business intelligence.
To address these challenges/opportunities and gain maximum journalistic and business benefits from reader interests, The Post began to develop a more integrated approach to data management in 2021 under the leadership of Beth Diaz (Vice President of Audience Development & Analytics), Venkatesh Varalu (Head of Data and Analytics), and in collaboration with leaders across Subscriptions, Advertising, Newsroom, Marketing, Finance, Product and Engineering.
This data was available and accessible prior to 2021, but The Post began to manage it in a more innovative, agile and programmatic way. Under this new approach, customer data is being positioned to power various marketing and reader personalization efforts through enhanced workflows, automations and data activations via homegrown tools or services and vendor platforms. The Post is calling this macro-initiative WaPo 360.
“I’ve always been a huge fan of data. Working as a newsletter analyst, I got the opportunity to explore The Post’s various data sets to answer interesting questions about how our readers behave, and to find evidence of what works best for keeping them engaged,” said WaPo 360 Senior Data Engineer Patrick Haney. “It was a fantastic experience. However, while working with these data sets, it became almost immediately clear that they weren’t arranged in an optimal format for analysis. Answering simple business questions could take hours instead of minutes due to the siloed nature of each data set, along with the business logic that needed to be applied in a consistent fashion and often it required reaching out to a subject matter expert for validation.”
“I was ecstatic when I learned about this new data integration initiative because it would solve all these aforementioned issues and enable analysts and non-analysts to quickly and efficiently use our data to answer vital business questions,” said Haney regarding his choice to transfer from one Post team onto another.
According to a recent Deloitte study, “most executives do not believe their companies are insight-driven. Fewer than four in 10 (37 percent) place their companies in the top two categories of the Insight-Driven Organization (IDO) Maturity Scale, and of those, only 10 percent fall into the highest category. The remaining 63 percent are aware of analytics but lack infrastructure, are still working in silos, or are expanding ad hoc analytics capabilities beyond silos.”
WaPo 360 will shorten the time it takes The Post to turn data and signals into insight-driven business decisions.
WaPo 360 and the engineering experience
When he applied to work at The Post, Jack Miller said his “interviewers stressed the importance of the WaPo 360 project across many different verticals within the organization. Being able to join a growing team supporting that project was a huge reason why I decided to pursue the position and so far it has been a great experience.”
Fellow team Data Engineer Zach Lynn agrees, saying, “the WaPo 360 project struck me as an excellent opportunity to learn and also support The Washington Post’s core mission.” Lynn’s interest included working in several business areas and collaborating with other software teams.
The first step of WaPo 360 has focused on stitching together data signals from various sources. Data that was previously unstructured and accessible only to data analysts is being democratized for Washington Post engineers and technical business users. This first pillar of work is essentially warming up the oven and organizing all of the ingredients, making it easier for business stakeholders in different departments to bake their own pies. Data from site and app traffic, newsletter engagements, ads, subscriptions, and other sources is becoming more structured in WaPo 360 through Customer 360, the first pillar of the initiative.
A Washington Post data analyst recently presented how his work has been impacted by WaPo 360. He described a nearly 96% improvement in the run time of a SQL query after switching from the siloed, unprocessed data he had been querying to the same signals, structured and pre-processed in WaPo 360. As noted earlier, these data sets were accessible before 2021, but with WaPo 360, The Post is turning data into intelligence and making it easier for staff to do their jobs. WaPo 360 is essentially replacing their hand tools with power tools.
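As a toy illustration of the Customer 360 idea, here is a hedged sketch in Python: siloed signals are stitched once, up front, so a downstream “query” becomes a cheap lookup against one pre-processed table. The table and column names are invented for illustration; The Post’s actual schema is not described in this article.

```python
# A toy sketch of Customer 360-style stitching: merge siloed signals into one
# pre-processed table so downstream queries hit a single source. All column
# and table names are hypothetical.
import pandas as pd

site_traffic = pd.DataFrame({"customer_id": [1, 2], "page_views": [42, 7]})
newsletters = pd.DataFrame({"customer_id": [1, 2], "newsletter_opens": [5, 0]})
subscriptions = pd.DataFrame({"customer_id": [1, 2], "plan": ["digital", "trial"]})

# Stitch the silos once, up front, with consistent business logic...
customer_360 = (
    site_traffic
    .merge(newsletters, on="customer_id", how="outer")
    .merge(subscriptions, on="customer_id", how="outer")
)

# ...so an analyst's "query" is a cheap lookup instead of a multi-silo join.
print(customer_360[customer_360["plan"] == "digital"])
```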
WaPo 360 and the business-user experience
The data that is becoming structured and pre-processed in Customer 360 isn’t just going to live on an island to be visited by data analysts and data engineers. The second pillar of WaPo 360 is to make that data accessible to those with a business need to access it, in anonymized ways, through improved self-service tools.
Joshua Zieve, Senior Data Analyst, joined the WaPo 360 team to “help catalyze The Washington Post’s data sources to better understand and serve our current and prospective readership.” Zieve has been active in coordinating with business and technical users on many fronts. “Working across the Analytics & Engineering teams, I’m grateful for the opportunity to develop systems that facilitate, deepen and expedite analytics for use-cases throughout the organization,” Zieve said.
Good data is the foundation for WaPo 360, and that leads to personalization benefits. Following the team’s work delivering structured data in Customer 360, WaPo 360 sends relevant data into a new Customer Data Platform (CDP) to power the business use cases Zieve references. The CDP then works as an engine that lets business users perform exploratory data analysis, build audience segments, create marketing and reader engagement campaigns, and analyze their success, then deliver an improved personalized experience to readers through integrations with Washington Post-built tools and the popular offsite services The Post uses to reach potential readers.
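Here is a hedged sketch of what that hand-off could look like in code: a segment rule applied to Customer 360 profiles, then an activation call to a CDP. The endpoint, payload shape, and segment rule are all assumptions for illustration, not The Post’s actual integration.

```python
# A hedged sketch of the CDP hand-off: build an audience segment from the
# structured Customer 360 data, then prepare a POST to a CDP for activation.
# The endpoint, payload, and segment rule are hypothetical.
import json
from urllib import request

def build_segment(profiles: list[dict]) -> list[int]:
    """Example rule: engaged readers who are not yet paying subscribers."""
    return [p["customer_id"] for p in profiles
            if p["newsletter_opens"] >= 3 and p["plan"] == "trial"]

def send_to_cdp(segment_name: str, customer_ids: list[int]) -> request.Request:
    """Builds the (hypothetical) CDP activation call; anonymized IDs only."""
    payload = json.dumps({"segment": segment_name, "ids": customer_ids}).encode()
    return request.Request(
        "https://cdp.example.com/v1/segments",  # placeholder endpoint
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

profiles = [
    {"customer_id": 1, "newsletter_opens": 5, "plan": "trial"},
    {"customer_id": 2, "newsletter_opens": 0, "plan": "digital"},
]
req = send_to_cdp("engaged_trialists", build_segment(profiles))
print(req.full_url, req.data)
```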
“[I’m] most excited about the self-service potential for The Post’s newsroom and business teams … with data in one place, which is aggregated and ready to be queried, users can get their data without waiting for The Post’s Analytics team to prepare the data. For the Analytics team, this will also reduce time spent for serving ad hoc requests from the newsroom/business side.”
— Sam Han, Director of Zeus Technology and Artificial Intelligence (AI) / Machine Learning at The Washington Post
WaPo 360 and the reader experience
The Post will be doubling down on personalization in 2022 — directly and adjacent to the work being conducted by the WaPo 360 team.
Early work is underway to improve the onboarding experience for new subscribers. And the team plans to unlock significant opportunities to retool, rethink and reshape how articles are suggested to readers — such as through improved content insights and an updated Content Taxonomy System with new article subjects/topics metadata powering future innovation.
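As a purely hypothetical sketch of how taxonomy metadata could power article suggestions, the snippet below scores candidate articles by the overlap between a reader’s topic affinities and each article’s subject tags. None of the topic names, weights, or logic come from The Post; this is simply one way such metadata might be used.

```python
# Hypothetical sketch: rank articles by the reader's affinity for the topics
# each article is tagged with. Topics and weights are invented.
reader_affinities = {"politics": 0.9, "food": 0.4, "climate": 0.7}

articles = [
    {"id": "a1", "topics": ["politics", "elections"]},
    {"id": "a2", "topics": ["food", "recipes"]},
    {"id": "a3", "topics": ["climate", "politics"]},
]

def score(article: dict) -> float:
    """Sum the reader's affinity for each topic the article is tagged with."""
    return sum(reader_affinities.get(t, 0.0) for t in article["topics"])

# Rank candidates for this reader, highest affinity overlap first.
for article in sorted(articles, key=score, reverse=True):
    print(article["id"], round(score(article), 2))
```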
Members of the WaPo 360 team recently presented the team’s work at a company-wide virtual forum. Washington Post Organizational Development Consultant Cameron Booher said, “Planning for any What’s Next event involves talking with many project teams about their ongoing and upcoming initiatives. And the usual format of What’s Next is to highlight three projects from different areas of the business. But it very quickly became evident through conversations just how significant of an undertaking WaPo 360 is. It’s extremely collaborative, and has been built upon expertise from almost every department at The Post. It will be rolled out in various phases, which speaks to the iterative process of develop-test-improve.”
“Some of the insights we’ll gain will help us improve reader and user experiences in spades,” Booher said.
This article originally appeared on Washington Post Engineering and is re-published with permission.