Obligatory: the views and opinions expressed in this post are my own and do not represent the views and opinions of my employer.
In light of all the hype going around about ChatGPT, I wanted to offer my “hot take” on what the next 2-5 years of the web look like.
One aspect of the rise of generative models that isn’t getting the right amount of attention is the long-term effects on the information economy. I think that being able to automatically produce arbitrary content that is indistinguishable from human-generated content at scale is the death knell of the web as we know it.
The web today
As we speak, thousands of small businesses are being started with the sole purpose of exploiting ChatGPT as an "SEO expert". It is being used to write websites, social media posts and reviews to promote products and businesses.
For the last decade, this work has been outsourced to content farms in poorer areas of the world. The output of these farms has been slowly rotting away the value system that Google and Amazon (among others) rely upon: that the information you find on the internet is truthful and aligned with your interests. Now that GPT has entered the scene, the work of these content farms is about to accelerate. ChatGPT produces better "fake" content faster and cheaper than the best content farm. A lot faster. So fast that I suspect what took these SEO mills a decade to produce will be repeated in less than a year.
What does this mean for the future of the internet? Consider how hard it is to find reliable information on Google now compared to in 2013. Now, imagine the core problem behind that decline doubling in scale this year. Then 4x'ing in 2024. Then 8x'ing in 2025.
There’s a related problem that will occur: as AI-generated information drowns out human-generated information, humans will simply stop producing content. So not only is the noise floor about to rise by an order of magnitude, the signal is going to drop in tandem.
An Information Virus
When I say GPT may be an information virus, I am referring to the spread that is about to happen. The economic incentives of our information economy will drive thousands of businesses to create AI-generated content. As more of the internet becomes AI-generated, humans will no longer be able to effectively use it as an information storage system. They won't write because their voice will get drowned out by the AI cacophony. They won't read because – why would you read AI-generated content when you can just talk to your own virtual assistant?
I suspect the end result of this is that our society is about to lose its ability to disseminate information at large scale, at least using any of the techniques that have been available to us in the past couple of decades.
This virus feeds on money. As long as there are economic incentives to produce content, people will use generative models to spew it onto the internet. It is undetectable and therefore unstoppable. It will happen regardless of whether OpenAI exists or not*.
While I am specifically considering text in this post, I think this is going to happen across every data modality. Images, text and audio are going to fall first, and soon. Video is probably safe for a few years, but you are kidding yourself if you think it'll be safe forever.
* For what it's worth – I am personally happy that a company that is committed to "doing it right" is spearheading this change.
Why Google is Actually in Trouble
Popular news articles over the past two months have opined that Google has a new competitor. These articles are right, but they're looking in the wrong direction. Google is in big trouble, but not because it has competition. It is in trouble because the capability to produce false, computer-generated content that cannot be detected at scale has been democratized. Google is the gatekeeper of the information economy, and the information economy is what the GPT-virus will feed on.
As the internet “fills up” with undetectable AI-generated content, Google’s search algorithms will cease to work reliably. People currently complain that ChatGPT produces confidently wrong answers and say that Google doesn’t have this problem. These people are missing an important point: ChatGPT answers get posted to the web. Google cannot detect them. Therefore, very soon, Google will also be filled with these confidently wrong answers and there is no known technique for filtering them out.
In the information cesspool that is an internet filled with 99% AI-generated content, Google's search will no longer work. People will stop using it. Advertisers will stop advertising. This is Google's true existential threat, and it has nothing to do with OpenAI or Microsoft. It has to do with the fact that this technology exists and is being deployed broadly. If Google released a version of Bard tomorrow that blew ChatGPT or Bing out of the water, that fact would not change.
The conspiracy theorist in me thinks that Google management knows this. That is why they never released LaMDA or a similar chatbot on their own.
Do I think Google is doomed? No – not necessarily. I only think that they are no longer "safe". They need to build an entirely new line of products, then build a monetization structure on top of it – before they lose their current source of funding. It's going to be a rocky decade. Fortunately, the "talent density" at Google is still quite high in my opinion. The only thing holding them back is upper management and bureaucracy.
(For those of you who don’t know, I spent the last 5 years of my life working at Google – and I loved it. I love the company, I mostly loved the culture, and I have a deep respect for the folks who work there.)
Is this the actual rise of “Web 3.0”?
An amusing distraction during the COVID crisis was the rise of tech philosophers proclaiming that we were on the cusp of a "decentralized web". The part of this argument that was never answered for me is: "how are you going to convince my (60-year-old) Mom to start using your web 3.0?"
If my predictions above turn out to be right, web 3.0 might actually turn out to be a "thing" after all. Some form of decentralized web might very well become the only way for humans to communicate with each other via computers. The only way to halt the information virus is to make the economic value of producing content negative. That actually sounds a lot like web 3.0 as I understood it! How this plays out is far harder to predict, but I see a solution to at least a few of the problems posed by the GPT "information virus" in cryptocurrencies.
Ending on a positive note
This post is pretty high on the "doom-and-gloom" meter. The thing is: the web as we knew it from the early 2000s has already been dying. This hit home for me a few years ago when I realized that the best way to search Google was to prepend every search with "reddit" or to use YouTube instead. I think it is actually a positive development, on the whole, for our society to "get over" the old web. I can't say for sure that what is to come will be better, but I'm pretty sure that what exists now is not worth saving.
Outside of the fate of the web, I see GPT as a monumental force for good. In a few short years, every human being will be able to put a compressed version of all human knowledge on a USB stick and carry it out their front door. Everyone will have access to an "extremely good" source of information about math, science, medicine, programming, and more – indexed with an interface that can be accessed via natural language. Yes, it will be flawed – but it will still be very valuable to real people.
That is incredibly exciting to me. I think human society as a whole will take a leap forward as a result of this.