By far my most popular tweet ever:
The follow-on question is obvious: what’s the 10%? My answer is: I don’t know. Here’s how I’m finding out, both what I’m doing & how I’m thinking about it.
Thinking About It
Every technological revolution I’ve seen in my lifetime has had the same structure: something valuable that was expensive “suddenly” got much cheaper. (I put “suddenly” in quotes because in hindsight the progress was always obvious.) Examples:
Compute cycles with the microprocessor.
Graphics with the color bit-mapped display.
Communication bandwidth with the Internet.
Social connection with social media.
And now a bunch of things with AI.
It’s tempting to try to figure out where things are going to go next. Tempting, but futile. You can’t figure out the implications of a changed cost structure by thinking hard. We’ve all parachuted in the dark onto a large field. There are large veins of gold down there somewhere. There is no metal detector, so our best bet is to stick a shovel in the dirt & see what we turn up.
I’ve written about this situation extensively under the heading 3X: Explore/Expand/Extract. Shifting to exploration is particularly difficult for folks who have been living in ExtractLand for years or decades. The laws of Exploristan are completely different:
You don’t know where value is coming from.
You’re looking for a loop where more value tends to create even more value.
You don’t know how cause is connected to effect.
Any single experiment is likely to just turn up dirt.
Sometimes tiny changes make a big difference.
Effective behavior in Exploristan looks naive & ineffective from ExtractLand & vice versa:
As many experiments as possible. Exploring is partly a numbers game.
Don’t worry about duplication of effort. Since tiny changes can make a big difference, 2 teams trying “the same” idea are still creating value through their inevitable differences.
Same goes for duplication over time. “The same” experiment 6 months later is not the same. Things have changed. (This is no excuse for running the actual same experiment over & over because you’re afraid of failure.)
You’re not looking for ideas that make sense, you’re looking for ideas that can be cheaply invalidated. Nonsense is a resource. Listen for your own laughter.
You’re not trying to maximize the chance of “success” of any given experiment. Any extra effort you invest beyond “Nope, that doesn’t work either” or “Holy simolians! We didn’t expect that!” is one less experiment you get to run, one less shovelful you get to examine.
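The “numbers game” point above can be made concrete with a little arithmetic (my illustration, with made-up probabilities, not the author’s): if each cheap experiment independently strikes gold with some small probability, the odds that at least one pays off climb quickly with the number of shovelfuls.

```python
# Hypothetical numbers for illustration only: assume each experiment
# independently "strikes gold" with probability p.
def chance_of_at_least_one_hit(p: float, n: int) -> float:
    """Probability that at least one of n independent experiments succeeds."""
    return 1 - (1 - p) ** n

# With an assumed 5% hit rate per experiment:
print(round(chance_of_at_least_one_hit(0.05, 1), 2))   # 0.05
print(round(chance_of_at_least_one_hit(0.05, 10), 2))  # 0.4
print(round(chance_of_at_least_one_hit(0.05, 50), 2))  # 0.92
```

Under these assumptions, fifty quick digs are far more likely to turn something up than one carefully polished dig, which is why cutting the cost per experiment matters more than raising any single experiment’s odds.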
I keep hearing statements of the form, “Sure ChatGPT can do that, but not very well.” This statement is irrelevant in Exploristan. Replace it with these questions:
How fast is the tech getting better?
What ceiling is there on how good the tech can get?
(This is interesting in the case of ChatGPT because it has gotten obviously worse at some tasks in the last 6 months, like complicated programming tasks. That’s a sign that it’s bumping up against some natural limit which OpenAI may or may not be able to overcome.)
Exploring
I’m trying lots of little experiments:
I asked ChatGPT to write a rap in the style of Biggie about the Test Desiderata. I won’t burden you with the results, but it’s a good example. Writing a rap at all would have been impossibly expensive for me & now it’s cheap. Still not valuable, as far as I can tell, but it sure is cheap.
I have wired ChatGPT to the Action button on my iPhone. I’m reducing the cost of just trying ChatGPT for whatever pops into my head.
I’m tuning a model based on all of my writings. It gives better answers to some questions than the generic models. Will it prove valuable to people? I don’t know. Will it be able to turn into a sustainable business? I don’t know. I know that I have a unique ingredient—me—and I want to find out if that matters. (Thanks to the good folks at Incubyte for helping me launch this.)
I’m further tuning variants of the model to answer questions about blog posts (like this one). So far, some shockingly good answers & a whole bunch of nonsense. How fast can we improve & how much? Will people find this replacement for the comments section helpful? We’ll find out.
I have replaced Google search with ChatGPT entirely, just as a habit. Google search results have been getting steadily worse for a couple of years & I hadn’t really noticed.
I use Krea.ai to generate illustrations. Again, it’s the easiest possible interface.
Again with Krea I trained a couple of models on examples of my art, just to see if interesting things popped out. They kinda do.
I walk nervous non-technical people through their first interaction with ChatGPT. Here’s the picture Grandma & I came up with at Thanksgiving. She’s very into fitness:
Conclusion
There you have it:
The cost landscape has changed.
To explore the new landscape we all have to walk it.
More experiments = better.
Go get ‘em! I’m out there with my shovel too.