Part of why ChatGPT's writing ability surprises
Apologies for all of the essays in your inbox lately, but I am currently waiting for my results from my cat trials, fee waiver application from the FDA, and various forms from my manufacturer and API suppliers. Also, it’s slow season for test prep right now. As a result, I have way more free time than I expected.
In high school and college, I used to be very into breakdancing[1]. I’d watch endless footage of battles and go to endless hours of practice (or “session”). Watching people better than me spend hour after hour perfecting the minutiae of their moves was where I first got introduced to the idea of deliberate practice.
However, there was always an uneasy tension between breakdancing and deliberate practice. The breakdancing scene places a very high emphasis on originality. So, on the one hand, you had to master the fundamental movements, and there were only so many ways to do that. On the other hand, if you did nothing but deliberately practice the movements that other people had already mastered, you risked being called a “biter”, or copycat. By the time you participated in a battle, you needed to have your own unique movements.
Because of this tension, breakdancing, despite the countless hours young men (and some women) all over the world have devoted to it, has never really been fully systematized. Each new breakdancer who comes on the scene has to learn by osmosis and apprenticeship how to reach and then surpass the bleeding edge of the field. So, while the newest generation of breakdancers is undoubtedly more skilled in acrobatic movements (“power moves”) than any generation before, you’ll often hear from older breakdancers (“oldheads”) that some of the more stylized moves have been lost to time. It was impossible for anyone to copy them without being called a biter, so nobody learned how to do them.
After college, I took up Brazilian Jiu Jitsu, or BJJ. Once again, I found myself in a world where young men (and some young women) spent hour after hour perfecting the minutiae of their moves.
However, there was a crucial difference in this practice. BJJ has no concept of “biting” or copying[2]. Every top BJJ competitor has released at least one DVD explaining exactly how their moveset works and how they think about using it. In fact, it’s common for a competitor to use their success at a major tournament to plug their latest DVD release.
As a result, pretty much all of BJJ has been systematized. Every new move that comes out is named, dissected, and placed within greater systems of movesets. I’d go so far as to say that, in this environment, it’s impossible for an effective move to be lost to time. Someone will take it upon themselves to explain to everyone else how to use it effectively.
From the outside, it’s easy to think this is a natural state of affairs: “Of course breakdancing was not systematized and BJJ was! One’s a dance form and the other is a sport!” But I disagree. I think it’s the attitude that matters. Gymnastics is very similar to breakdancing, and yet it’s systematized to the extent that different moves are assigned point values by the judges.
Meanwhile, I don’t think it’s at all obvious that BJJ would be amenable to being systematized. The easiest way you can win in BJJ is to make your opponent give up, and the only major rule is that you can’t do that by hitting them[3]. You can choke them with your hands, your legs, your clothing, their clothing. You can bend pretty much any part of their body the wrong way. You’d expect there to be a huge number of ways to do this.
And you’d be right. But there are a limited number of ways to do this effectively, and it’s possible to use your bodily movements to constrain your opponent into a limited number of responses. The huge number of possible actions you can take and responses your opponent can make gets funneled into a limited number of actions for you and responses for your opponent.
And this works because almost all of our bodies are shaped in roughly the same way. Some of us are heavier or taller, and some of us are skinnier or shorter, but almost all of us have the same functional parts. The moves that I can do are also moves that you can do. The moves that work on me will also work on you. The ways that my arms and legs can move are roughly the same as the ways your arms and legs can move.
This brings me to ChatGPT. One of the most surprising things about ChatGPT is how well it can quickly bang out a mediocre essay on pretty much any topic. Granted, it still has trouble with hallucinating sources, but it’s surprising how convincing the essay sounds.
It’s surprising because ChatGPT is basically just starting off with a bunch of words[4]. It can string the words together however it likes. You’d expect that there are a huge number of ways to string words together.
And you’d be right. But there are a limited number of ways to string words together effectively into a convincing essay. And, as it turns out, training ChatGPT on a lot of essays has taught it the common, effective ways of stringing words together to form a convincing essay.
The reason this is surprising is that we’re not used to thinking about essays like this. We’re used to thinking about essays like breakdancing, where there are fundamentals that you have to learn (i.e. grammar and semantics), but we place such a high emphasis on originality that we kick people out of school for copying wholesale.
But, if we had thought about essays like BJJ, we’d realize that writing is, by and large, systematic. I don’t invent how to write an essay every time I write one of these blog posts. I frequently reuse the same techniques and strategies, techniques and strategies that I originally learned from reading a lot of essays that people wrote before me. The strings of words that they put together work roughly for me the same way that they did for them.
We have chosen to teach writing like breakdancing, and so we’ve been left with the mistaken impression that most writing can’t be systematized. But take an AI that is incredibly good at learning patterns of words and give it no compunctions about copying techniques wholesale, and we find quickly that writing, like all human creative endeavors, fits within bigger systems. AIs still can’t innovate or add to these systems, but it’s increasingly clear that they can use them very well.
[1] For any of you who care, this is not entirely correct. I was never a great bboy, and was much better at (and more into) popping and all-styles. This distinction is lost on anyone not in the scene, though.
[2] Although I’ve heard that this actually was a major concern before BJJ came to America. The Brazilians would guard their techniques jealously and threaten physical violence against anyone who would try to “steal” their moves. The Americans, on the other hand, just copied whatever worked from whoever was winning, and eventually their mindset won out.
[3] Or eye gouging, scratching, squeezing their genitals, pulling their hair, bending their fingers, or sticking your fingers in any orifice. There are other wrinkles to the rule set as well, so please ask before you do anything crazy in your first BJJ class.
[4] Well, actually, it’s starting with tokens. This is an important distinction, because (and I’m talking out of my depth here but I think I’m right) it allows ChatGPT to store a word like “underestimate” as two tokens it already knows, [under][estimate], rather than creating a new token for it. But that’s beside the point here.
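For the curious, here is a rough sketch of the token idea in Python. To be clear about what is made up: the vocabulary `TOY_VOCAB` and the function `split_into_tokens` are hypothetical and hand-picked for illustration, and the greedy longest-match rule is a simplification I chose for readability. As far as I understand it, real tokenizers learn their vocabulary from data with byte pair encoding, so treat this as a toy rather than a description of how ChatGPT actually does it.

```python
# Hypothetical, hand-picked vocabulary (an assumption for illustration only).
TOY_VOCAB = {"under", "estimate", "over", "state"}


def split_into_tokens(word, vocab=TOY_VOCAB):
    """Greedily split `word` into the longest known pieces, left to right."""
    tokens = []
    start = 0
    while start < len(word):
        # Try the longest candidate first, shrinking until one matches.
        for end in range(len(word), start, -1):
            piece = word[start:end]
            if piece in vocab:
                tokens.append(piece)
                start = end
                break
        else:
            # No known piece starts here: fall back to a single character.
            tokens.append(word[start])
            start += 1
    return tokens


print(split_into_tokens("underestimate"))
print(split_into_tokens("overstate"))
```

The point of the toy is only that a small set of reusable pieces can cover many words, which is the same reuse-what-works idea as the rest of this post.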