While in some ways the web has evolved organically, it also functions within accepted structures and guidelines that have allowed websites to operate smoothly and enabled discovery online. One such protocol is robots.txt, which emerged in the mid-1990s to give webmasters some control over which web spiders could visit their sites. A robots.txt file is a plain text document placed in the root directory of a website. It contains instructions for search engine bots on which pages to crawl and which to ignore. Significantly, compliance with its directives is voluntary. Google has long followed and endorsed this voluntary approach. And no publisher has dared to exclude Google, considering its 90%+ share of the search market.
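For readers unfamiliar with the format, here is a minimal, purely illustrative robots.txt file (the crawler name and paths are hypothetical). Each group pairs a User-agent line naming a crawler with the paths it may or may not fetch:

    # All crawlers may fetch everything except the /drafts/ directory
    User-agent: *
    Disallow: /drafts/

    # A specific crawler, identified by its user-agent token, is barred from the whole site
    User-agent: ExampleBot
    Disallow: /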
Today, a variety of companies use bots to crawl and scrape content from websites. Historically, content has been scraped for relatively benign purposes such as non-commercial research and search indexing, which promises the benefit of driving audiences to a site. In recent years, however, both previously benign crawlers and new ones have begun scraping content for commercial purposes such as training Large Language Models (LLMs), use in Generative Artificial Intelligence (GAI) tools, and inclusion in retrieval-augmented generation outputs (also known as “grounding”).
Under current internet standards such as the robots.txt protocol, publishers can only allow or block a named crawler from accessing all or part of a site. They are not able to communicate case-by-case exceptions (by company, bot and purpose) in accordance with their terms of use in a machine-readable format. And again: compliance with the protocol is entirely voluntary. The Internet Architecture Board (IAB) held a workshop in September on whether and how to update the robots.txt protocol, and it appears the Internet Engineering Task Force (IETF), which is responsible for the protocol, plans to convene more discussions on how best to move forward.
A significant problem is that scraping happens without notification to, or consent from, the content owners. It is often carried out in blatant violation of the website’s terms of use and applicable laws. OpenAI and Google recognized this imbalance when they each developed their own controls (utilizing the robots.txt framework) for publishers to opt out of having their content used for certain purposes.
Predictably, however, these controls don’t fully empower publishers. For example, Google allows a publisher to opt out of having its content used to train Google’s AI services. However, if a publisher wants to prevent its work from being used in Generative AI Search, which allows Google to redeploy and monetize the content, it has to opt out of search entirely. It would be immensely useful to have an updated robots.txt protocol that provides more granular controls for publishers in light of the massive scraping operations of AI companies.
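As publicly documented by the two companies (token names may change over time), these opt-outs are expressed as ordinary robots.txt rules: OpenAI’s GPTBot crawler and Google’s Google-Extended token can each be disallowed, for example:

    # Ask OpenAI not to crawl the site for model training
    User-agent: GPTBot
    Disallow: /

    # Ask Google not to use the site's content for its AI training
    User-agent: Google-Extended
    Disallow: /

The catch described above is that Google-Extended does not govern Search itself; the generative features inside Search are fed by the same Googlebot crawl that powers ordinary indexing, so blocking that crawl means dropping out of search results as well.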
The legal framework that protects copyrighted works
While big tech companies tout the benefits of AI, much of the content crawled and scraped by bots is protected under copyright law or other laws intended to enable publishers and other businesses to protect their investments against misappropriation and theft.
Copyright holders have the exclusive right to reproduce, distribute and monetize their copyrighted works as they see fit for a defined period. These protections incentivize the creative industries by allowing them to reap the fruits of their labors and enabling them to reinvest in new content creation. The benefits to our society are nearly impossible to quantify, as the varied kinds of copyrighted material enrich our lives daily: music, literature, film and television, visual art, journalism, and other original works provide inspiration, education, and personal and societal transformation. The Founding Fathers included copyright in the Constitution (Article I, Section 8, Clause 8) because they recognized the value of incentivizing the creation of original works.
In addition to copyright, publishers also rely on contractual protections contained in their terms of service, which govern how the content on their websites may be accessed and exploited. Additionally, regulation against unfair competition is designed to protect against the misappropriation of content for the purpose of creating competing products and services; it deters free riding and prevents the dilution of incentives to invest in new content. The proper application of, and respect for, these laws is part of the basic framework underlying the thriving internet economy.
The value of copyrighted works must be protected
The primary revenue sources for publishers are advertising, licensing and, increasingly, subscriptions. Publishers make their copyrighted content available to consumers through a wide range of means, including websites and apps that are supported by various monetization methods such as metered paywalls. It is important to note that even if content is available online and not behind a subscription wall, that does not extinguish its copyright protection. In other words: It is not free for the taking.
That said, there are many cases where a copyright holder may choose to allow the use of their original work for commercial or non-commercial purposes. In these cases, potential licensees contact the copyright holder to seek a negotiated agreement, which may define the extent to which the content may be used and any protections for the copyright holder’s brand.
Unfortunately, AI developers, in large part, do not respect the framework of laws and rules described above. They seek to challenge and reshape these laws in a manner that would be exceptionally harmful to digital publishers, advancing the position that content made publicly available should be free for the taking – in this case, to build and operationalize AI models, tools and services.
Publishers are embracing the benefits of AI innovation. They are partnering with developers and third parties, for both commercial and non-commercial purposes, to provide access and licenses for the use of their content in a manner that is mutually beneficial. However, AI developers currently have little incentive to seek permission or to pursue access and licensing solutions. Publishers need a practical tool to signal to bots, at scale, whether they wish to permit crawling and scraping for the purposes of AI exploitation.
Next steps and the future of robots.txt
The IETF should update the robots.txt protocol to create more specific technical measures that help publishers convey the purposes for which their content may or may not be used, including by expressing limitations on the scraping and use of their content for GAI purposes. While this should not be viewed as in any way reducing the existing legal obligations of third parties to seek permission directly from copyright holders, it could be useful for publishers to be able to signal publicly, in a machine-readable format, which uses are permitted – for example, that scraping for search indexing is allowed, whereas scraping to train LLMs or for other commercial GAI purposes is not.
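Purely to illustrate the kind of granularity being asked for (the directive names below are hypothetical and are not part of the current standard or any specific proposal), an updated protocol could let a site attach purposes to its rules:

    User-agent: *
    Allow: /
    # Hypothetical purpose-level signals
    Allow-Purpose: search-indexing
    Disallow-Purpose: ai-training
    Disallow-Purpose: generative-output

Whatever syntax the IETF ultimately settles on, the substance matters more than the spelling: a machine-readable statement of which uses are permitted, which a publisher can change without touching its search visibility.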
Of course, a publisher’s terms of use should always remain legally binding and trump any machine-readable signals. Furthermore, these measures should not be treated as creating an “opt out” system for scraping. A publisher’s decision not to employ these signals is not permission (either explicit or implicit) to scrape websites or use content in violation of the terms of use or applicable laws. And any ambiguity must be construed in favor of the rights holders.
To achieve a solution in a timely and efficient manner, the focus should be on a means to clearly and specifically signal permission for, or prohibition of, crawling and scraping for the purposes of AI exploitation. Others may seek to layer certain licensing solutions on top of this, which should be left to the market. In addition, there must be transparency around bots that crawl and scrape for purposes of AI exploitation: any solution should not depend on the whims of AI developers to announce the identities of their bots, nor permit them to operate in a manner that obscures their identity or the purpose of their activity.
And, critically, search and AI scraping must not be commingled. The protocol should not be allowed to be used in a manner that requires publishers to accept crawling and scraping for AI exploitation as a condition of being indexed for search.
Let’s not repeat the mistakes of the past by allowing big tech companies to leverage their dominance in one market to dominate an emerging market like AI. Original content is important to our future, and we should build out web standards that carry forward our longstanding respect for copyright into the AI age.