I love Brad DeLong’s writing, but I did a double take when he recently commented “‘A Young Lady’s Illustrated Primer’ continues to recede into the future”. The Primer he’s referencing is an electronic book from Neal Stephenson’s novel The Diamond Age, an AI tutor designed to educate and empower children, answering their questions and shaping their characters with stories and challenges. It’s a powerful and appealing idea in a lot of ways, and offers a very compelling use case for conversational machine learning models. I also think that a workable version of it now exists.
The recent advances with large language models have amazed me, and I do think we’re now a lot closer to an AI companion that could be useful for people of any age. If you try entering “Tell me a story about a unicorn and a fairy” into ChatGPT you’ll almost certainly get something more entertaining and coherent than most adults could come up with on the fly. This model comes across as a creative and engaging partner, and I’m certain that we’ll be seeing systems aimed at children soon enough, for better or worse. It feels like a lot of the functionality of the Primer is already here, even if the curriculum and the veracity of the responses are lacking.
One of the reasons I like Diamond Age so much is that it doesn’t just describe the Primer as a technology, it looks hard at its likely effects. Frederik Pohl wrote “a good science fiction story should be able to predict not the automobile but the traffic jam”, and Stephenson shows how subversive a technology that delivers information in this new way can be. The owners of the Primer grow up indoctrinated by its values and teachings, and eventually become a literal army. This is portrayed in a positive light, since most of those values are ones that a lot of Western-educated people would agree with, but it’s also clear that Stephenson believes that the effects of a technology like this would be incredibly disruptive to the status quo.
How does this all relate back to ChatGPT? Try asking it “Tell me about Tiananmen Square” and you’ll get a clear description of the 1989 government crackdown that killed hundreds or even thousands of protestors. So what, you might ask? We’ve been able to type the same query into Google or Wikipedia for decades to get uncensored information. What’s different about ChatGPT? My friend Drew Breunig recently wrote an excellent post breaking down how LLMs work, and one of his side notes is that they can be seen as an extreme form of lossy compression for all the data that they’ve seen during training. The magic of LLMs is that they’ve effectively shrunk a lot of the internet’s text content into a representation that’s a tiny fraction of the size of the original. A model like LLaMa might have been exposed to over a trillion words during training, but it fits into a 3.5GB file, easily small enough to run locally on a smartphone or Raspberry Pi. That means the “Tiananmen Square” question can be answered without having to send a network request. No cloud, wifi, or cell connection is needed!
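To put rough numbers on that “tiny fraction”, here is a back-of-envelope sketch in Python. The trillion-word and 3.5GB figures come from the paragraph above; everything else is my own assumption for illustration. In particular, I’m guessing at about six bytes of raw text per word, and I’m showing that a 7-billion-parameter model stored at 4 bits per weight is one configuration that happens to land on 3.5GB, without claiming that is the exact file meant here.

```python
# Back-of-envelope arithmetic only. Apart from the two figures taken from
# the text, every constant here is an illustrative assumption.

TRAINING_WORDS = 1_000_000_000_000  # "over a trillion words" (lower bound, from the text)
MODEL_FILE_BYTES = 3.5e9            # the 3.5GB file size (from the text)
BYTES_PER_WORD = 6                  # ASSUMPTION: ~5 letters plus a space of plain text


def model_size_bytes(num_params: int, bits_per_weight: int) -> float:
    """Storage needed for the weights alone, ignoring any file overhead."""
    return num_params * bits_per_weight / 8


def compression_ratio(corpus_bytes: float, model_bytes: float) -> float:
    """How many times smaller the model is than the raw training text."""
    return corpus_bytes / model_bytes


if __name__ == "__main__":
    corpus_bytes = TRAINING_WORDS * BYTES_PER_WORD
    ratio = compression_ratio(corpus_bytes, MODEL_FILE_BYTES)
    print(f"Assumed raw training text: {corpus_bytes / 1e12:.1f} TB")
    print(f"Model is roughly {ratio:,.0f}x smaller than that text")

    # ASSUMPTION: 7 billion weights at 4 bits each, one setup that gives 3.5GB.
    size = model_size_bytes(7_000_000_000, 4)
    print(f"7B weights at 4 bits each: {size / 1e9:.1f} GB")
```

Under those assumptions the model is on the order of a thousand times smaller than the text it was trained on, which is why “lossy” is the right word: it can’t reproduce most of that text verbatim, but it keeps enough to answer the question offline.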
If you’re trying to control the flow of information in an authoritarian state like China, this is a problem. The Great Firewall has been reasonably effective at preventing ordinary citizens from accessing cloud-based services that might contradict CCP propaganda because they’re physically located outside of the country, but monitoring apps that run entirely locally on phones is going to be a much tougher challenge. One approach would be to produce alternative LLMs that only include approved texts, but as the “large” in the name implies, training these models requires a lot of data. Labeling all that data would be a daunting technical project, and the end results are likely to be less useful overall than an uncensored version. You could also try to prevent unauthorized models from being downloaded, but because they’re such useful tools they’re likely to show up preloaded in everything from phones to laptops and fridges.
This local aspect of the current AI revolution isn’t often appreciated, because many of the demonstrations show up as familiar text boxes on web pages, just like the cloud services we’re used to. It starts to become a little clearer when you see how many models like LLaMa and Stable Diffusion can be run locally as desktop apps, or even on Raspberry Pis, but these are currently pretty slow and clunky. What’s going to happen over the next year or two is that the models will be optimized and start to match or even outstrip the speed of the web applications. The elimination of cloud bills for server processing and improved latency will drive commercial providers towards purely edge solutions, and the flood of edge hardware accelerators will narrow the capability gap between a typical phone or embedded system and a GPU in a data center.
Simply put, people all over the world are going to be learning from their AI companions, as rudimentary as they currently are, and censoring information is going to be a lot harder when the whole process happens on the edge. Local LLMs are going to change politics all over the world, but especially in authoritarian states that try to keep strict controls on information flows. The Young Lady’s Illustrated Primer is already here; it’s just not evenly distributed yet.