Courtney and Kathryn Wall at Wikipedia Event 2025

The importance of Wikipedia in search engines and large language models

As a digital encyclopedia, Wikipedia’s structure makes it an appealing source of information to search engines and generative AI models. Some characteristics of Wikipedia that contribute to its importance to search and AI technologies include: incoming links, high-quality content, frequent updates, and good semantic markup.

Although search-ranking algorithms are complex, the volume of inbound links contributes to a webpage’s ranking in search results. The number of keywords related to a given search query and the frequency of updates and edits made to a page also contribute to its ranking. As a result, Wikipedia pages consistently rank highly on search engines.

Search engines try to deliver users to “high-quality” content. Assessments of quality are influenced in part by the trustworthiness of a page (which is usually determined by the volume of inbound links it receives). Additionally, the presence of advertisements can decrease the “quality” assigned to a given webpage. Google, for instance, explains that it determines which content seems “most helpful” to users by ranking content based on a mix of factors it calls “E-E-A-T” (experience, expertise, authoritativeness, and trustworthiness). Because Wikipedia is an encyclopedia that follows strict editorial rules, its pages are often the highest-ranking content according to Google’s E-E-A-T guidelines.

Wikipedia content also has an immense impact on large language model (LLM) artificial intelligence systems like ChatGPT. In 2023, Selena Deckelmann, the Wikimedia Foundation’s chief product and technology officer, even claimed that “every LLM is trained on Wikipedia content, and it is almost always the largest source of training data in their data sets.”

What counts as “research” to Wikipedia?

Wikipedia has three “core content policies” that govern what kinds of content it allows on its pages. These include:

  1. Neutral Point of View: Wikipedia articles must cover “significant views fairly, proportionately and without bias.”
  2. Verifiability: Every quotation, and any material that is “likely to be challenged,” requires attribution to “a reliable, published source.” For Wikipedia, verifiability means that other Wikipedia editors should be able to “check that information comes from a reliable source.”
  3. No Original Research: Since most claims on Wikipedia need to cite “verifiable” sources, pages are not allowed to “contain any new analysis or synthesis.” Wikipedia’s guidelines even go so far as to state that “Wikipedia does not publish original thought.”

Because Wikipedia is an encyclopedia, its function is to collect existing knowledge. It is not intended as a space for advancing original research. While these core content policies contribute to Wikipedia’s strength as an online encyclopedia, they can also inadvertently silence communities whose histories have been suppressed. This kind of archival violence is a key motivation behind From the Rock Wall to Wikipedia.

Dominant historical narratives often fail to represent the lived experiences of historically marginalized communities like the African American communities in Chapel Hill. Because these histories are less accurately documented in “published sources,” these silences are often extended in Wikipedia entries about the area’s local history. Moreover, because of the prominent role Wikipedia plays in search and AI technologies, the impact of these silences extends beyond the website itself. Through our partnership with the Marian Cheek Jackson Center and Northside community members, this project has demonstrated the value of oral histories in confronting some of these archival injustices.

Who can edit Wikipedia?

Another feature of Wikipedia is that any user can make edits to pages on the site.

Wikipedia is a wiki: an online publication that allows users to collaboratively create, edit, and organize its content. In theory, this opens the production of Wikipedia to everyone, and it creates immense opportunities for producing more accurate historical representation on the site. However, the accessibility of Wikipedia also makes it possible for editors who are unfamiliar with Chapel Hill’s local history to alter pages after they have been published. For this reason, part of our work at the DLC is to ensure that pages created for this project remain faithful to the local histories that inform them.

Project Impact

Since it began, From the Rock Wall to Wikipedia has produced 11 Wikipedia pages: six published in 2023 and five more published in 2025. We have monitored the performance of each of these pages using Wikipedia’s Pageviews tool, which generates data visualizations displaying the number of views a Wikipedia article has gathered within a specified date range.

The original Wikipedia pages, published in 2023, have gathered over 5,000 views in total, with the most-viewed being the page about the Hargraves Community Center. These pages have received more than 200 edits from over 60 editors and a median of four views daily (pictured below).

Screenshots of the data visualized with the Wikipedia Pageviews tool displaying the number of views gathered by the original six Wikipedia pages produced. Retrieved June 2025.
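
For readers who would rather gather similar figures programmatically, the following is a minimal sketch and not the method used for this project, which relied on the web-based Pageviews tool. It assumes the public Wikimedia REST API's per-article pageviews endpoint and its usual JSON response shape (an "items" list whose entries each carry a "views" count). The function names, the User-Agent string, and the example article title are illustrative assumptions.

```python
import json
import statistics
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumed endpoint root for the Wikimedia REST API's per-article pageviews.
API_ROOT = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def pageviews_url(title, start, end, project="en.wikipedia"):
    """Build a daily pageviews URL; start and end are YYYYMMDD strings."""
    # Wikipedia titles use underscores in place of spaces.
    article = quote(title.replace(" ", "_"), safe="")
    return f"{API_ROOT}/{project}/all-access/user/{article}/daily/{start}/{end}"


def summarize(payload):
    """Reduce an API response to day count, total views, and median daily views."""
    views = [item["views"] for item in payload.get("items", [])]
    if not views:
        return {"days": 0, "total": 0, "median_daily": 0}
    return {
        "days": len(views),
        "total": sum(views),
        "median_daily": statistics.median(views),
    }


def fetch_summary(title, start, end):
    """Fetch and summarize views for one article (requires network access)."""
    # Hypothetical User-Agent; replace it with real contact details before use.
    request = Request(
        pageviews_url(title, start, end),
        headers={"User-Agent": "pageviews-sketch/0.1 (example)"},
    )
    with urlopen(request) as response:
        return summarize(json.load(response))
```

A call such as `fetch_summary("Hargraves Community Center", "20230101", "20250630")` would return the total and median daily views for that date range, assuming the article title matches the page's actual title on Wikipedia.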

Some of these Wikipedia pages have already influenced Google’s “AI Overview” features in search results. For instance, the Hackney School page is the primary source that Google draws on to generate its AI response for the query “what is the hackney school” (pictured below).

A screenshot of the search engine results page (SERP) for the Google search query “what is the hackney school,” conducted in a private “Incognito” browsing session to minimize personalization. Retrieved June 2025.

The From the Rock Wall to Wikipedia project team was accepted into the Institute for Liberal Arts Digital Scholarship (ILiADS) in 2023 to further develop this project. The team’s participation in the Institute has led to an open-access publication (currently in press with Digital Humanities Quarterly) about using Wikipedia to promote community organizations and teach students digital literacy skills.