I think there’s a reasonable case to be made that language models will not make us smarter or dumber. They will offer no improvement to humanity overall, and this hype train will eventually pass. But part of me worries that they may actually freeze us in time.
These models are built by sampling a large set of writing (and images) at a fixed point in time. They are then optimized, with human guidance, to return “better” results. What counts as “better” is in the eye of the beholder(s), and is often determined by market or economic incentives rather than by what makes a response objectively richer or more sophisticated. These systems are incentivized to provide the result that appeals to the most people: a race towards the median in the best case, or a race to the bottom in the worst.
We’ve seen this in many other domains: Movies. Music. Food. As the world has grown more connected and industries strive to serve the greatest number of people, originality falls by the wayside in favor of whatever triggers the most user engagement and thus the most sales.
I have a slightly cynical outlook here, based on my own past experience. In 2004 I started a music recommendation company. The idea was to help people explore The Long Tail: the edge, the art on the periphery that the mainstream wasn’t picking up. I felt that I had created a wonderful system, but what I discovered then was a fundamental truth that Apple Genius and Pandora discovered years later: most people do not want to explore the long tail. There’s comfort in the familiar. If you like ’90s country music, nine times out of ten you don’t want to be bothered with a recommendation for a crossover ska-country fusion band; you just want to hear Garth Brooks.
Given these two forces, consumers wanting to be satiated by the familiar and producers wanting to make things that appeal to the widest audience, my concern is that AI language models will trigger a race towards the median in human knowledge.
Here’s a completely hypothetical example from a potentially not-too-distant future:
User: Who was Henry the 8th?
Computer: Henry VIII was King of England from 1509 to 1547. He had two of his wives executed. His disagreement with the Pope led to the English Reformation. [Merchandise] [Quotes]
Compare this to what you get from Wikipedia today, and you’ll see how much nuance is left out of this answer, and how much more sophisticated the writing is there. This hypothetical answer is distilled down to only a few “essential” points, as determined by the optimizer(s).
In my view, this is already happening on the Internet today, even without AI language models. YouTube, WikiHow, and a mess of other search-engine-optimized sites return wasteful, low-information-density garbage: time-consuming answers to even the simplest questions. When I read what GPT-4 is producing, it feels to me like I’ve fallen into the worst corners of WikiHow-laden search results.
If we increase our reliance on AI-generated responses, tuned in ways that appeal to the widest possible audience, it could make it harder for us, as a species, to gain new knowledge. For the future’s sake, I hope my fears prove to be unfounded.