When AI Eats Its Own Slop, It's Called Model Collapse
Welcome Back!
Our collaborator, Taryn Talley, is out with an interesting piece about the repercussions of AI slop on the AI models themselves. We've covered the harms of AI and alternatives to using AI in your work. Today, we ask: what happens when AI models eat their own slop? And how does that impact us? Read on for more...

- Amazon Ring Partners with Flock (TechCrunch)
- Did Big Tech Enshittify the Entire Economy? (Cory Doctorow)
- From TikTok Ban to MAGA Ownership (the roots of change podcast)
- Writing vs AI: What's the Difference? (Pluralistic)
Connect On & Off Big Tech
Mastodon • BlueSky • PeerTube • YouTube • Instagram • TikTok

- NEW Content Strategy School: our new video series with strategic video bites to build your strategy (Flowering Members)
- Messaging Guidance on Minnesota's ICE Problem (ASO Communications)
- Digital Security Checklist for Activists (activistchecklist)
More Rooted Resources
Resource Hub • Signal Channel • Podcast • Free Templates

What Happens When AI Eats Its Own Slop? It's Called Model Collapse.
From collaborator Taryn Talley (she/her)
Just as people who live solely on highly processed foods risk poorer health outcomes, large language models that ingest a non-stop diet of AI-generated content put the health of their training data at risk.
| Aspect | Ultra-Processed Food Risk | AI-Generated Data Risk |
| --- | --- | --- |
| Source | Producing over-processed "food" designed for efficiency and cost often results in a loss of nutritional value. | Content produced by algorithms that favor high-probability patterns loses the "long tail" of human nuance. |
| Result | Short-term energy but long-term health decline (e.g., metabolic issues). | Models appear fluent at first, but over time reasoning, diversity, and accuracy "collapse." |
| Mechanism | The body lacks the complex micronutrients found in whole foods. | LLMs lack the "unlikely" but true edge cases that only human creativity and an error-prone life provide. |
| The Loop | A diet of ultra-processed food can lead to cravings for more of the same, reinforcing bad habits. | Models trained on AI-generated data start "hallucinating" on their own errors, which amplifies them. |
The Science Behind "Model Collapse"
In a research paper published in Nature in 2024, Shumailov et al. confirmed that when AI models are trained exclusively on data generated by previous AI models, they go through two specific stages (a toy simulation of the loop follows this list):
- Early Model Collapse: Models begin losing "minority" data, the rare, unique, and creative parts of human language. Model output starts to sound "average" at best.
- Late Model Collapse: The model starts confusing different concepts (e.g., answering a question about architecture with facts about biology) until every output is effectively useless.
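To make that two-stage loop concrete, here is a toy Python sketch of my own (an illustration, not the paper's actual experiment): a simple word-frequency "model" is repeatedly refit on text sampled from its own previous output, and the rare long-tail vocabulary disappears within a few generations.

```python
# Toy illustration of model collapse (not the Nature experiment itself):
# repeatedly fit a simple word-frequency "model" to text sampled from the
# previous model's own output and watch the rare vocabulary disappear.
import random
from collections import Counter

random.seed(0)

# A "human" corpus: a few very common words plus a long tail of rare ones.
common = ["the", "and", "people", "said"] * 50
rare = [f"rare_word_{i}" for i in range(100)]   # the long tail
human_corpus = common + rare

corpus = human_corpus
for generation in range(1, 9):
    counts = Counter(corpus)
    vocab = list(counts)
    weights = [counts[w] for w in vocab]
    # "Train" the next model on the previous model's output: sample a
    # new corpus from the current word frequencies.
    corpus = random.choices(vocab, weights=weights, k=len(human_corpus))
    tail_left = sum(1 for w in set(corpus) if w.startswith("rare_word"))
    print(f"generation {generation}: distinct words={len(set(corpus))}, rare words left={tail_left}")
```

Run it and the rare-word count drops generation after generation while the handful of common words takes over: the early-collapse loss of minority data, scaled down to a toy.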
As that research paper circulated, the term "model collapse" began to gain traction, prompting the major AI companies to shift their strategies. OpenAI and Google began prioritizing content licensing to ensure access to "clean" human-generated data, and no doubt to limit their future liability after their initial capture of copyrighted material (without citation or compensation). These same companies also sought to preserve "pre-AI internet" data (created before late 2022) for future training.
According to a Gemini prompt response (1):
As of 2026, the industry is seeing three specific areas where collapse is manifesting:
| Sign of Collapse | Real-World Observation |
| --- | --- |
| The "Tail" Vanishing | Models are becoming less capable of discussing rare languages, niche scientific theories, or ultra-specific coding edge cases. They default to the "average" answer more often than they did in 2023. |
| Bias Amplification | Since AI data reinforces majority patterns, models are showing increased "homogenization." They sound more like "the average of the internet," losing the unique voices and cultural nuances found in the original human-only datasets (a simple way to measure this is sketched after the table). |
| "Digital Dementia" | In recursive testing (feeding a model its own output repeatedly), models like Meta's OPT-125M eventually began babbling about "jack rabbits" after starting with a prompt about architecture. While flagship models are more stable, they still show slight degradation when exposed to "AI slop" on the web. |
What do the big three say about their teams' efforts to prevent model collapse?
I wanted to share the perspective of the top LLMs, so I asked Gemini, Claude, and ChatGPT the following question: "Hi (LLM), what steps have your engineers taken to prevent the degradation that leads to late-stage model collapse?"
Not surprisingly, Gemini's response was much more robust than the other LLMs'. ChatGPT came in second with a decent but high-level response, and Claude's was by far the most underwhelming. So, let's look at the techniques the top three are currently employing to prevent model collapse. I've also noted which LLM mentioned which technique in its initial response.
Data Provenance and "The Vault" Strategy
In my research for this article, I've encountered the terms "gold standard data" and "pristine gold dataset" multiple times. It makes sense: to prevent the ingestion of AI slop, companies need to maintain purely human-generated content as protected source data, reducing the risk of AI-polluted web scrapes.
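As a rough sketch of that "vault" idea, the example below filters a training corpus by a provenance tag and a pre-ChatGPT cutoff date. The field names, source labels, and cutoff are assumptions made for illustration; they are not any vendor's real pipeline.

```python
# Minimal sketch of a "pristine gold dataset" filter. The provenance tags,
# field names, and the late-2022 cutoff are assumptions for illustration,
# not any vendor's real pipeline.
from datetime import date

AI_BOOM_CUTOFF = date(2022, 11, 30)   # roughly when ChatGPT launched

documents = [
    {"text": "A 1998 field guide to mosses.", "source": "licensed_book", "created": date(1998, 5, 1)},
    {"text": "Essay by a verified human author.", "source": "human_verified", "created": date(2024, 3, 2)},
    {"text": "SEO article flagged as likely AI-generated.", "source": "web_scrape", "created": date(2024, 6, 9)},
]

TRUSTED_SOURCES = {"licensed_book", "human_verified"}

def is_gold_standard(doc: dict) -> bool:
    """Admit documents that predate the cutoff or come from vetted human sources."""
    return doc["created"] < AI_BOOM_CUTOFF or doc["source"] in TRUSTED_SOURCES

vault = [doc for doc in documents if is_gold_standard(doc)]
print(f"{len(vault)} of {len(documents)} documents kept for the protected set")
```

The hard part in practice is the provenance tag itself: deciding whether a 2024 web page was written by a human is exactly the judgment that AI-generated text makes difficult.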
For Flowering Members
This post is for subscribers on the Flowering (Tech Geeks & Communicators+) and Fruitful (Tech Geeks & Communicators+) tiers