The Tow Center reveals ChatGPT’s major attribution errors
Despite partnerships and claims to support audience reach, OpenAI's ChatGPT search tool generates errors that risk publishers' reputations, audience trust, and revenue
December 10, 2024 | By Rande Price, Research VP – DCN

OpenAI's ChatGPT Search, an AI-driven alternative to traditional search engines, raises pressing concerns for news publishers, including attribution errors. OpenAI promotes its collaboration with select news organizations and uses mechanisms like robots.txt files to give publishers some control over their content. However, questions loom about its impact on journalism. These worries echo the backlash from two years ago, when publishers discovered their content had been used, without consent, to train OpenAI's models.
OpenAI markets ChatGPT Search as a platform to enhance publisher reach. Yet new research from the Tow Center for Digital Journalism, reported in the Columbia Journalism Review, reveals significant problems with the tool's ability to accurately attribute and represent content. These findings undermine trust between OpenAI and the publishers working with it, expose publishers whose content is misattributed or misrepresented to reputational damage, and pose broader challenges for newsrooms adopting AI technologies.
Unreliable attribution and false confidence
The Tow Center analyzed ChatGPT Search using 200 traceable quotes drawn from 20 publishers, including those with licensing agreements, those in litigation, and unaffiliated entities. The quotes were selected so that traditional search engines consistently surface the original articles in their top three results.
However, ChatGPT Search fails to correctly attribute 153 of the 200 quotes, roughly three-quarters of the sample. It fabricates citations, credits rewritten versions of articles, or misattributes sources. Notably, in only seven cases does it admit being unable to locate the source; elsewhere it prioritizes plausible but incorrect answers over transparency.
Unlike traditional search engines, which clearly indicate when no match is found, ChatGPT's confident delivery of inaccurate citations risks misleading users and damaging the credibility of the referenced publishers. These errors underscore the risks of integrating AI-driven search tools into journalism amid publishers' ongoing struggles over content protection.
Accurate attribution is critical for news organizations to maintain trust, brand value, and loyalty. However, ChatGPT Search frequently distances users from original sources by misidentifying premium publications or favoring syndicated or plagiarized versions.
For example, when identifying a New York Times quote, ChatGPT attributes it to a site that copied the article without credit. Such misrepresentation undermines intellectual property rights and rewards unethical practices. Similarly, it often cites syndicated versions of articles, such as attributing a Government Technology piece to MIT Technology Review, diluting the originating publisher’s visibility and impact.
These errors exacerbate publishers’ challenges with audience fragmentation and declining revenues. ChatGPT’s attribution flaws risk further eroding the vital connection between publishers and their readers.
Crawler policies and content control
In its marketing, OpenAI emphasizes its respect for publisher preferences expressed via robots.txt files. However, the Tow Center's findings suggest that this control is limited: even publishers that block OpenAI's crawlers are not immune to misrepresentation, and those that allow crawler access see little improvement in citation accuracy.
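For context, robots.txt blocking operates per crawler. A minimal sketch of the directives a publisher might use, assuming OpenAI's publicly documented user agents (GPTBot for model training, OAI-SearchBot for ChatGPT Search, and ChatGPT-User for user-initiated browsing), could look like this:

```
# Block OpenAI's model-training crawler site-wide
User-agent: GPTBot
Disallow: /

# Block the crawler that feeds ChatGPT Search results
User-agent: OAI-SearchBot
Disallow: /

# Block user-initiated browsing requests from ChatGPT
User-agent: ChatGPT-User
Disallow: /
```

As the Tow Center's findings make clear, though, such directives only govern direct crawling; they cannot stop ChatGPT from citing syndicated or copied versions of the same articles hosted elsewhere.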
For instance, despite blocking OpenAI crawlers, the New York Times experienced content misattributions. Publications like The Atlantic and the New York Post, which have licensing agreements and permit crawler access, also faced frequent errors.
This inconsistency highlights publishers’ limited control over how ChatGPT Search represents their content. Blocking crawlers does not guarantee protection, and opting in does not ensure better outcomes.
Transparency, trust and revenue impact of ChatGPT errors
A core problem with ChatGPT Search lies in its lack of transparency. When it cannot access or verify a source, the AI often constructs plausible but inaccurate responses, leaving users unable to judge their reliability.
Unlike traditional search engines, which clearly signal when no results match a query, ChatGPT usually fails to communicate uncertainty. This opacity risks misleading users and eroding trust in both the platform and the referenced publishers.
The rapid adoption of AI-driven tools like ChatGPT Search poses significant challenges for publishers. With an estimated 15 million U.S. users starting their searches on AI platforms, the potential disruption to search-driven traffic is profound. Publishers reliant on visibility for subscriptions, advertising, or membership revenue face growing threats. If readers cannot reliably trace content to its source, publishers lose critical opportunities for engagement and monetization.
What needs to change
To foster a sustainable relationship with newsrooms, the Tow Center's research recommends that OpenAI address these challenges:
- Commit to transparent attribution: ChatGPT must accurately cite original sources or explicitly indicate when an answer cannot be provided.
- Establish universal standards: OpenAI should adopt industry-wide protocols prioritizing canonical sources.
- Increase accountability: To address systemic issues, meaningful partnerships with newsrooms—beyond select licensing deals—are essential.
- Enable publisher control: Tools empowering publishers to dictate content access and representation will signal good faith.
ChatGPT Search’s flaws underscore the tensions between generative AI platforms and the news industry. While OpenAI claims to collaborate with publishers, its inconsistent handling of content undermines trust and fails to protect intellectual property.
As generative AI reshapes the future of search, publishers must advocate for stronger safeguards and fairer partnerships to ensure their work is accurately represented and valued. By addressing these issues, AI platforms and newsrooms can build a foundation for mutual benefit, one that ultimately benefits consumers as well.