1
If an AI model is repeatedly trained on data it generated itself, its performance can deteriorate and eventually collapse. The decline is gradual, much like inbreeding: the model loses diversity and produces increasingly biased results.
It is said that AI-generated content is already flooding the web and degrading Google’s search quality. Will this trend keep intensifying until only junk is left on the web and AI models collapse?
2
It is unlikely. Something similar has already happened in the history of the web, and it is part of the background to the rapid rise of early social networking services (SNS) such as Facebook and Twitter. Blogs and web pages were commercially abused, and the reliability of their information fell sharply. The web ecosystem of the time was badly damaged by SEO-driven content abuse and a flood of advertorial articles.
In that situation, early SNS was seen as a reliable source of information built on real people and real connections. In a way, the recent popularity of influencer marketing is similar. Influencers build a kind of trust relationship with their followers, and advertisements or recommendations based on that trust work much better than ordinary advertising.
3
In any case, a recent study found that keeping about 10% of the original data could prevent the model from collapsing. In other words, high-quality human-generated data has become more important for AI training. The more AI-generated data there is, the more valuable human-made “real” data becomes. Big tech companies are already working to secure high-quality text from prestigious media outlets and academic journals.
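The effect can be illustrated with a toy simulation. This is only a sketch under strong assumptions, not the setup of the study mentioned above: the "model" is a one-dimensional Gaussian fitted to its training data, the real data is assumed to follow N(0, 1), and the function name `simulate` and all parameter values are hypothetical. Only the 10% share is taken from the text.

```python
import random
import statistics


def simulate(generations, sample_size, real_fraction, seed=0):
    """Toy model-collapse loop (illustrative assumption, not a real training run).

    The "model" is a Gaussian fitted to its training data. Each generation is
    trained on samples drawn from the previous model, optionally mixed with a
    fixed share of fresh "real" data from the assumed true distribution N(0, 1).
    Returns the fitted standard deviation after each generation.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0 matches the real distribution
    n_real = int(round(sample_size * real_fraction))
    n_synth = sample_size - n_real
    history = []
    for _ in range(generations):
        # Data generated by the current model (self-generated data).
        synthetic = [rng.gauss(mu, sigma) for _ in range(n_synth)]
        # Fresh human-made "real" data, if any is kept in the mix.
        real = [rng.gauss(0.0, 1.0) for _ in range(n_real)]
        data = synthetic + real
        # Refit the model: maximum-likelihood mean and standard deviation.
        mu = statistics.fmean(data)
        sigma = statistics.pstdev(data)
        history.append(sigma)
    return history


if __name__ == "__main__":
    pure = simulate(generations=1000, sample_size=50, real_fraction=0.0)
    mixed = simulate(generations=1000, sample_size=50, real_fraction=0.1)
    print("final spread, self-generated data only:", pure[-1])
    print("final spread, 10% real data each round:", mixed[-1])
```

In this toy setting the spread of the model trained only on its own output shrinks toward zero, which is the loss of diversity described earlier, while the run that keeps 10% real data in every round stays close to the original spread. Real language models are far more complex, so this shows the mechanism rather than the actual threshold.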
4
As AI-generated data increases rapidly, there will be more junk content, but it will not break the web. Instead, AI and foundation models will serve as tools for identifying reliable sources of information.
Just as early SNS secured the reliability of information through connections, AI can evaluate the reliability of information by analyzing relationships and patterns in data. Much as people choose friends or accounts to follow on SNS, a foundation model can build a personalized trust network by learning a user’s interests, areas of expertise, and trusted information sources.
5
Where can a large amount of human-made “real” data come from? From big tech, which owns the platforms that collect large amounts of real data from users. Managing data quality and diversity becomes more important in AI model development, which raises development costs, and only big tech can afford those costs.
The already serious monopoly is getting worse. This is what we really need to worry about.