On January 15, 2026, Wikipedia turned twenty-five years old. On the same day, the Wikimedia Foundation announced commercial agreements with Amazon, Meta, Microsoft, Mistral AI, and Perplexity, giving their artificial intelligence models structured access to encyclopedia content through the Wikimedia Enterprise product. Google had already signed on in 2022. The companies pay for this access, though how much has not been made public. Founder Jimmy Wales commented that models trained on Wikipedia benefit from human curation work and that companies should contribute to the infrastructure costs they generate.
The logic of this transaction is simple: Wikipedia is among the most valuable datasets for generative artificial intelligence. It holds over 65 million articles in more than 340 languages, curated by humans according to rules of verifiability and neutrality. In an analysis by the Profound platform of 30 million citations generated by ChatGPT, Google AI Overviews, and Perplexity between August 2024 and June 2025, Wikipedia accounted for 47.9 percent of citations among the top ten sources most cited by ChatGPT. But the relationship between AI and Wikipedia is not just a commercial exchange.
The cycle of invisible extraction
For years, AI models have tapped into Wikipedia through automated content scraping without paying anything. When a chatbot answers a question, it rarely indicates that the explanation comes from Wikipedia. Encyclopedia knowledge dissolves into the conversational flow without generating links, site visits, new contributors, or donations.
The consequences are measurable. In October 2025, the Wikimedia Foundation reported an 8 percent drop in human views compared to the same period in 2024, after upgrading bot detection systems (much of the apparently human traffic in May and June came from bots designed to evade controls). Marshall Miller, senior director of product at the foundation, attributed the decline to the impact of generative AI and social media on information search habits, specifying that search engines provide answers directly to users, often based on Wikipedia content.
The dynamic is circular. AI feeds off Wikipedia to produce responses that replace visits to the site. With fewer visits, fewer people discover Wikipedia, the volunteer community thins, donations drop. The common good that feeds the machine is depleted by the machine itself.
The energy cost of the response that replaces the page
When a chatbot replaces a Wikipedia page with a generated response, the energy cost of the operation changes.
Estimates of the consumption of a single AI query vary and are debated. The most widely cited figure, that a query to ChatGPT consumes about 2.9 watt-hours of electricity (ten times a Google search), is based on SemiAnalysis calculations taken up by researcher Alex de Vries in a 2023 commentary in Joule and later cited in the IEA report on energy and AI. Epoch AI, however, recalculated the figure in February 2025, estimating about 0.3 watt-hours for queries using GPT-4o, thanks to more efficient models and hardware. Sam Altman gave a similar figure (0.34 Wh). The exact number is not externally verifiable because no company publishes the actual consumption of its models.
What is verifiable is the overall scale. According to the IEA, data centers consumed about 415 TWh of electricity in 2024, or 1.5 percent of global consumption. The projection for 2030 is 945 TWh, equivalent to Japan’s annual consumption. Servers dedicated to AI are responsible for nearly half of the projected net increase, and AI’s share of the total could rise from 5-15% in 2024 to 35-50% by 2030. In the European Union, data center consumption is estimated by the Commission at 70 TWh in 2024, with a projection of 115 TWh by 2030, in a context where the EU aims to reduce final energy consumption by 11.7 percent over the same period.
The single query is not the problem. Systematic substitution is: millions of static pages replaced by answers generated in real time, with a transfer of energy costs that occurs without the querier being aware of it.
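The scale of that substitution can be sketched with back-of-the-envelope arithmetic. The per-query figures below are the contested estimates discussed above (roughly 0.3 Wh and 2.9 Wh); the daily query volume is a hypothetical round number chosen only for illustration, not a measured one.

```python
# Back-of-the-envelope aggregate energy for AI-generated answers.
# Per-query figures are the contested estimates discussed above
# (Epoch AI: ~0.3 Wh; SemiAnalysis / de Vries: ~2.9 Wh).
# The daily query volume is hypothetical, chosen only for scale.

WH_PER_KWH = 1_000

def aggregate_kwh(queries: int, wh_per_query: float) -> float:
    """Total energy in kWh for a given number of queries."""
    return queries * wh_per_query / WH_PER_KWH

DAILY_QUERIES = 1_000_000_000  # hypothetical: one billion queries per day

low = aggregate_kwh(DAILY_QUERIES, 0.3)   # lower estimate
high = aggregate_kwh(DAILY_QUERIES, 2.9)  # higher estimate

print(f"low:  {low:,.0f} kWh/day")   # 300,000 kWh/day
print(f"high: {high:,.0f} kWh/day")  # 2,900,000 kWh/day
```

The point is not the precision of any single figure but that whichever estimate holds, the cost grows with every page view an answer replaces.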
Where climate information comes into play
The degradation of Wikipedia’s quality does not affect all entries equally. Climate change pages are among those that attract the most manipulation attempts, making them particularly dependent on volunteer surveillance. If that surveillance weakens, the damage spreads through every chatbot that uses that content as a source.
The risk is not theoretical. According to a Washington Post investigation, hundreds of Wikipedia articles have received the warning label introduced in 2024: “This article may incorporate text from a large language model.” Since 2024, volunteers have flagged more than 4,800 articles with suspected AI-generated content, as reported by Rest of World based on data provided by the Wikimedia Foundation. A Princeton University study (October 2024) estimated that about 5 percent of new English-language pages created in August 2024 contained AI-generated material, often with fewer citations, lower quality, and weaker integration into the encyclopedia’s knowledge network.
For languages with smaller editing communities, the situation is worse. According to MIT Technology Review, volunteers working on four African languages estimate that 40-60% of the articles in their respective editions are incorrect machine translations. In the Inuktitut edition, an Indigenous Canadian language, more than two-thirds of the pages longer than a few sentences contain machine-generated text. Marathi, Telugu, and Tamil, languages spoken by hundreds of millions of people, have only a few hundred active editors on Wikipedia.
These reduced editorial communities are the first line of defense for information quality. If they are overwhelmed by automatically generated content, the consequences also affect entries on the environment, resources, biodiversity, and extreme events. The languages of the global South, where climate vulnerability is highest, are also where the encyclopedia is most fragile.
The closed loop: model collapse
The most structural risk has a technical name: model collapse. In July 2024, researchers from the Universities of Oxford, Cambridge, and Toronto and Imperial College London published a study in Nature (Shumailov et al.) showing that AI models trained on data generated recursively by other models suffer irreversible degradation. Information that is rare or marginal in the original data distribution disappears, and the output becomes progressively more uniform and less accurate.
The mechanism, applied to Wikipedia, produces a closed loop: models train on the encyclopedia, the texts they generate end up in the same encyclopedia (inserted by editors using chatbots to write or translate), subsequent models train on that contaminated content. The study does not analyze Wikipedia directly, but the risk is structural: with each cycle, local specificities and less common information are progressively removed. The result is a flattening of knowledge that hits the margins first.
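The tail-loss mechanism can be made concrete with a toy calculation (an illustrative sketch, not the setup of the Nature study). If each “generation” fits a Gaussian by maximum likelihood to n samples drawn from the previous generation’s model, the expected variance shrinks by a factor of (n-1)/n per generation, so rare tail values become steadily less likely:

```python
# Toy model of recursive-training degradation (illustration only, not
# the experiment of Shumailov et al.). Fitting a Gaussian by maximum
# likelihood to n samples and then resampling gives, in expectation,
#   E[var_t] = var_0 * ((n - 1) / n) ** t
# so the distribution narrows with each generation and rare (tail)
# values are progressively lost.

def expected_variance(var0: float, n_samples: int, generations: int) -> float:
    """Expected variance after `generations` rounds of fit-and-resample."""
    return var0 * ((n_samples - 1) / n_samples) ** generations

# With 100 samples per generation, roughly two-thirds of the variance
# is gone after 100 generations, and almost all of it after 1000:
for t in (0, 100, 500, 1000):
    print(t, expected_variance(1.0, 100, t))
```

In the article’s analogy, the “samples” are encyclopedia texts: each pass through the loop keeps what is common and discards what is rare, which is exactly where local and region-specific knowledge lives.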
For environmental information, the stakes are high. The climate crisis produces different effects in each region, and documenting them requires local knowledge that no language model can generate on its own. If this knowledge is diluted in a recursive training cycle, the loss is not recoverable.
The extractive analogy
There is an uneasy symmetry between the way AI treats Wikipedia and the way the fossil economy treats natural resources. A common good, whether collaborative knowledge or a stable atmosphere, is exploited on an industrial scale by private actors without the costs of extraction appearing in the price of the product. Emissions are not in the price of the barrel; volunteer labor is not in the price of the query. Degradation is not immediately visible: global warming is slow, model collapse is gradual. And those who pay the highest price are those with the fewest resources to defend themselves: the communities most exposed to the climate crisis, and the smallest language communities on Wikipedia.
The Wikimedia Enterprise agreements are an attempt to correct this asymmetry, just as carbon taxes seek to internalize the cost of emissions. But the scale is different. The foundation’s commercial revenues remain a fraction of individual donations (over $150 million a year, from 8 million donors). The companies that train billion-parameter models on Wikipedia have not disclosed how much they pay to do so.
The immune system
Wikipedia is not passive. In 2023, volunteer editors created WikiProject AI Cleanup, a group dedicated to identifying and removing AI-generated content. In August 2024, Wikipedia changed its rapid deletion policy to allow immediate deletion of articles with obvious signs of AI writing. The Wikimedia Foundation’s three-year AI strategy (April 2025) calls for the use of artificial intelligence to support moderators, not replace human editors.
Marshall Miller of the Wikimedia Foundation has described the volunteers as the encyclopedia’s “immune system.” Like any immune system, it works until it is overwhelmed.
For the English edition, with over 284,000 editors making at least one edit per month, that responsiveness is significant. For many Asian, African, and Indigenous regional languages, the capacity is fragile. And generative AI is testing it.
What remains
Wikipedia is the only site in the top ten most visited sites in the world to be run by a nonprofit organization. It is a global knowledge infrastructure built on volunteer labor and principles of verifiability. This infrastructure feeds the AI models that then reduce its traffic and pollute its content, consuming an increasing share of global energy in the process. The degradation is simultaneous: of the information ecosystem and the energy ecosystem.
The answer is not to oppose AI. It is to recognize that the current model of extraction is not sustainable, just as extracting fossil fuels without paying the cost of emissions is not sustainable. Companies do not publish the energy consumption of their models. Chatbots rarely indicate where the information they return comes from. Enterprise agreements are beginning to include a financial contribution, but without the transparency needed to assess its appropriateness.
Free knowledge is not an infinite resource. Like the atmosphere, it needs to be taken care of.
