Thin Is In
Thick clients were the dominant form of device throughout the PC and mobile era; in an AI world, however, thin clients make much more sense.
Tuesday, February 17, 2026
There was, in the early days of computing, no debate about thick clients versus thin:
[A 1960s mainframe computer]
When a computer was the size of a room, there were no clients: you scheduled time or submitted jobs, and got back the results when it was your turn. A few years later, however, thin clients in the form of a monitor and keyboard arrived:
[A 1970s terminal for accessing a mainframe]
There is no computer in this image; rather, this is a terminal connected to a mainframe. That’s why it’s called a “thin” client: it’s just an interface, with all of the computing happening elsewhere (i.e. in another room). By the 1980s, however, “thick” clients were the dominant form of computing, in the form of the PC. All of your I/O and compute were packaged together: you typed on a keyboard connected to a PC, which output to the monitor in front of you.
A decade later, Sun Microsystems in particular tried to push the idea of a “network computer”:
Adrian Cockcroft, CC-SA 4.0
This was a device that didn’t really have a local operating system; you ran Java applications and Java applets from a browser, and they were downloaded from a central server as they were used. Sun’s pitch was that network computers would be much cheaper and easier to administer, but PCs were dropping in price so quickly that the value proposition rapidly disappeared, and Windows was so dominant that it was already the only platform that network administrators wanted to deal with. Thick clients won, and won decisively.
If you wanted to make a case for thin clients, you could argue that mobile devices are a hybrid; after all, the rise of mobile benefited from and drove the rise of the cloud: nearly every app on a phone connects to a server somewhere. Ultimately, however, mobile devices are themselves thick clients: they are very capable computers in their own right, that certainly benefit from being connected to a server, but are useful without it. Critically, the server component is just data: the actual interface is entirely local.
You can make the same argument about SaaS apps: on one hand, yes, they operate in the cloud and are usually accessed via a browser; on the other hand, the modern browser is basically an operating system in its own right, and the innovation that made SaaS apps possible was the fact that interactive web apps could be downloaded and run locally. Granted, this isn’t far off from Sun’s vision (although the language ended up being JavaScript, not Java), but you still need a lot of local compute to make these apps work.
AI vs. UI
The thick-versus-thin debate felt, for many years, like a relic; that’s how decisive was the thick client victory. One of the things that is fascinating about AI, however, is that the thin client concept is not just back, it’s dominant.
The clearest example of this is the interface that most people use to interact with AI: chat. There is no UI that matters other than a text field and a submit button; when you click that button the text is sent to a data center, where all of the computation happens, and an answer is sent back to you. The quality of the answer or of the experience as a whole is largely independent of the device you are using: it could be a browser on a PC, an app on a high-end smartphone, or the cheapest Android device you can find. The device could be a car, or glasses, or just an earpiece. The local compute that matters is not processing power, but rather connectivity.
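As a minimal sketch of the chat loop just described, assuming a hypothetical `data_center` function that stands in for the remote model (in practice this would be a network call), the device's entire job fits in one line: forward the text and return the reply. The names here are illustrative only and do not correspond to any real service.

```python
from typing import Callable


def data_center(prompt: str) -> str:
    """Hypothetical stand-in for the remote model; a real client would make a network call."""
    return f"(answer to: {prompt})"


def thin_client_turn(prompt: str, send: Callable[[str], str]) -> str:
    """Everything the device does in one chat turn: forward the text, return the reply."""
    return send(prompt)


# The same call works from any device that can carry text to the server and back.
print(thin_client_turn("What is a thin client?", data_center))
```

Nothing in `thin_client_turn` depends on how powerful the device is; only the `send` transport, which is to say connectivity, matters.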
This interaction paradigm actually looks a lot like the interaction paradigm for mainframe computers: type text into a terminal, send it to the computer, and get a response back. Unlike mainframe terminals, however, the user doesn’t need to know a deterministic set of commands; you just say what you want in plain language and the computer understands. There is no pressure for local compute capability to drive a user interface that makes the computer easier to use, because a more complex user interface would artificially constrain the AI’s capabilities.
Nicolas Bustamante, in an X Article about the prospects for vertical software in an AI world, explained why this is threatening:
When the interface is a natural language conversation, years of muscle memory become worthless. The switching cost that justified $25K per seat per year dissolves. For many vertical software companies, the interface was most of the value. The underlying data was licensed, public, or semi-commoditized. What justified premium pricing was the workflow built on top of that data. That’s over.
Bustamante’s post is about much more than chat interfaces, but I think the user interface point is profound: it’s less that AI user interfaces are different, and more that, for many use cases, they basically don’t exist.
This is even clearer when you consider the next big wave of AI: agents. The point of an agent is not to use the computer for you; it’s to accomplish a specific task. Everything between the request and the result, at least in theory, should be invisible to the user. This is the concept of a thin client taken to the absolute extreme: it’s not just that you don’t need any local compute to get an answer from a chatbot; you don’t need any local compute to accomplish real work. The AI on the server does it all.
Of course, most agentic workflows that work tread a golden path, but stumble on more complex situations or edge cases. That, though, is changing rapidly, as models become better and the capabilities of the chips running them increase, particularly in terms of memory. When it comes to inference, memory isn’t just important for holding the model weights, but also for retaining context about the task at hand.
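To give a rough sense of scale, here is a back-of-the-envelope sketch of those two memory demands. It assumes that "context" corresponds to the key-value cache of a decoder-only transformer and uses the standard approximation for its size; the model shape below (70 billion parameters, 80 layers, 8 key-value heads of dimension 128, 16-bit values) is a hypothetical configuration chosen for illustration, not a figure from any particular model.

```python
GIB = 1024 ** 3  # bytes per gibibyte


def weights_bytes(n_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed to hold the model weights (2 bytes per parameter = 16-bit)."""
    return n_params * bytes_per_param


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_tokens: int, bytes_per_value: int = 2) -> int:
    """Approximate key-value cache size for one sequence.

    The leading 2 accounts for storing one key and one value vector
    per layer, per key-value head, per token of context.
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_tokens * bytes_per_value


# Hypothetical model shape; every number here is an assumption.
weights = weights_bytes(70e9)
context = kv_cache_bytes(n_layers=80, n_kv_heads=8, head_dim=128,
                         context_tokens=128_000)

print(f"weights: {weights / GIB:.1f} GiB")
print(f"context for one 128k-token task: {context / GIB:.1f} GiB")
```

Under these assumptions the weights are a fixed cost shared by every request, while the context cost is paid again for each concurrent task and grows linearly with the length of that task's history, which is why long-running agents put so much pressure on memory.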
To date most of the memory that matters has been high-bandwidth memory attached to the GPUs, but future architectures will offload context to flash storage. At the same time, managing agents is best suited to CPUs, which themselves need large amounts of DRAM. In short, both the amount of compute we have and the capability of that compute still aren’t good enough; once compute crosses that threshold, though, demand will only get that much stronger.
This combination of factors will only accentuate the dominance of the thin client paradigm:
- First, if compute isn’t yet good enough, then workloads will flow to wherever compute is the best, which is going to be in large data centers.
- Second, if larger models and more context make for better results, then workloads will flow to wherever there is the most memory available.
- Third, the expense of furnishing this level of compute means that it will be far more economical to share the cost of that compute amongst millions of users, guaranteeing high utilization and maximizing leverage on your up-front costs.
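The third point is ultimately arithmetic, and a small sketch makes it concrete. The function below spreads a fixed up-front cost over only the hours the hardware is actually doing useful work; the price, lifetime, and utilization figures are hypothetical numbers chosen for illustration, not estimates of real hardware.

```python
def cost_per_useful_hour(upfront_cost: float, lifetime_hours: float,
                         utilization: float) -> float:
    """Up-front cost divided by the hours the hardware is actually busy."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return upfront_cost / (lifetime_hours * utilization)


# Hypothetical figures: the same hardware, owned two different ways.
UPFRONT_COST = 36_000.0    # assumed purchase price
LIFETIME_HOURS = 36_000.0  # assumed useful life, roughly four years

dedicated = cost_per_useful_hour(UPFRONT_COST, LIFETIME_HOURS, utilization=0.05)
shared = cost_per_useful_hour(UPFRONT_COST, LIFETIME_HOURS, utilization=0.50)

print(f"dedicated to one user: {dedicated:.2f} per useful hour")
print(f"shared in a data center: {shared:.2f} per useful hour")
print(f"advantage from sharing: {dedicated / shared:.1f}x")
```

In this model the advantage is simply the ratio of the two utilization rates, so the more idle a single user's hardware would sit, the stronger the case for pooling it across many users.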
[...]