I understand why the Wikimedia Foundation created a dataset purpose-built for machine learning. It was given at gunpoint, though: the motivation was to discourage scraping of their sites. These are all pretty disgusting developments.

Apropos of nothing, here are some links:

thedabbler.patatas.ca/pages/poi…
github.com/ai-robots…
zadzmo.org/code/nepe…
git.madhouse-project.org/algernon/…
marcusb.org/hacks/qui…
codeberg.org/MikeCoats…
codeberg.org/konterfai…
github.com/Fingel/dj…