Wikipedia is getting paid by Big Tech for AI training data, as rising scraping costs force a rethink of “free” content.
Background: The Wikimedia Foundation is the non-profit that runs Wikipedia, the world’s largest publicly edited encyclopedia, which launched in 2001. Today, Wikipedia hosts more than 65 million articles across 300 languages and has become one of the internet’s most relied-upon reference points.
What happened: Wikimedia has announced new enterprise partnerships with major tech companies including Microsoft, Meta and Amazon. Under the deals, these companies will pay to use Wikipedia’s content to help train their AI models, a shift away from the content being freely scraped at scale.
What else: The move comes as AI web scraping puts growing pressure on Wikimedia’s servers and costs, which the organisation covers primarily through donations. Rather than footing the bill alone, Wikimedia is now asking the biggest beneficiaries of its data to contribute.
What's the key learning?
💡The AI era has dramatically increased the value of data. Large, trusted data sets are now essential fuel for modern AI systems.
💡Content platforms are starting to charge for AI access. Wikimedia follows moves like Reddit’s 2024 licensing deal with Google, a sign that the days of "free" scraping are ending.
💡AI leadership will hinge on data access, not just model speed. The next advantage won’t come from better algorithms alone, but from who controls the most useful information.