Andy Matuschak

What can malleable software learn from Realtalk?

Added 2024-10-01 05:17:01 +0000 UTC

In 1994, an interviewer asked Alan Kay what he had found most surprising about technology in the classroom. He complained that “computers are treated much more like toasters, [with] predefined functions… or running packaged software, and less as a material to be shaped by students and teachers.” He’d originally envisioned the personal computer not as a vessel for monolithic tools, but as a new medium for expressing and manipulating dynamic systems. Thirty years later, app stores and cloud services have further entrenched the “toaster” paradigm. The functions are predefined, as in 1994, but most data now lives in sandboxes and proprietary cloud databases, where it can’t easily be manipulated across app boundaries.

What would it be like if your computer were, instead, more like material you could “shape”? One metaphor is that it would be more like a wood shop. You’d assemble materials and tools from various sources, arrange them just as you like, and combine them to express what you have in mind. Note that in a workshop, tools can be modified: if a tool’s grip is too small, you might wrap some tape around it. New tools can be improvised: you might make a jig to repeatedly execute a tricky angle. And if you like, you can bootstrap your own machine shop from a simple charcoal foundry.

Apps rarely work like this. If you like Zotero’s metadata inference feature, but prefer EndNote’s inline citation interaction, there’s no good way to combine those two features, because each app has its own database. You need to pick a silo. If you prefer Notion’s editing interface, but your collaborator prefers Google Docs, there’s no good way for each of you to write with your preferred tool. You need to pick a silo. That silo is generally produced by an industrial manufacturer, and it cannot be modified except, if you’re lucky, through proprietary and inevitably limited plugin systems.

There have been many efforts to break us out of these silos. People sometimes complain that we wouldn’t be in this mess if we’d stuck with Smalltalk’s original vision, or if we weren’t beholden to business incentives. But I don’t think it’s so simple. There are some serious unsolved conceptual problems at the foundation of personal computing—in the language of the medium. I don’t mean that in the sense of programming languages, as we usually think about them. Rather, how can we express dynamic systems so that they’re easy to read, write, modify, and combine—even when collaborating with others? A “language” for expressing dynamic systems doesn’t have to (just) mean programming: consider the classic UNIX philosophy of small programs connected through text streams.

Two (very different) modern attempts

In 2015, Clemens Klokmose published a remarkable paper which has instigated a new wave of effort in this space. His system, Webstrates, was built on a key practical insight: the web’s Document Object Model (DOM) already contains primitives for expressing rich content and computation; could we modify those elements to create the kind of fluid medium we want? He presented a provisional answer through a working system and several dramatic demonstrations, which have since led to dozens of elaborations and conceptually inspired variants.

To me, those last two sentences demonstrate why Klokmose’s work has catalyzed a generation of young researchers: the ideas are pragmatic, and very real. This movement—which I’ll imprecisely label “malleable software”[1]—is not interested in boiling oceans. They want a practical solution which can evolve gracefully from computers, operating systems, and software paradigms as they exist today. They build systems you can download and use, right now.

Meanwhile, Bret Victor and colleagues at Dynamicland would like very much to boil the computing oceans—to give us a glorious, if less familiar, new start. Their system is called Realtalk, and their aspirations are about much more than breaking us out of siloed applications. Realtalk is a physical, communal computing environment. It’s about interacting with real, tangible objects, with your real senses and body, in a real space, side by side with other real people. Realtalk programs are malleable, composable, and interoperable, but that’s in part a matter of convenience. The true aspiration is to enable universal literacy in a new medium: all users should be able to author anything in the environment, from scratch, including the operating system. Indeed, these users should have no need for a “software industry”. No few sentences can adequately summarize this project. This paragraph’s job is only to convince you to spend some time with the material on the (new!) Dynamicland web site, if you haven’t already.

Realtalk does exist, but it is not at all pragmatic. It is pointedly uninterested in a smooth evolution from software workflows as they exist today. You can’t download it or use it. In fact, there isn’t (yet) a published technical description of the system. I sympathize with Clemens Klokmose when he wonders aloud “how I (and others) could contribute to the Dynamicland vision without spending a lifetime creating a poor imitation.”

But I want to suggest that it’s worth inverting the question and engaging in some deliberate sacrilege. How can Dynamicland’s ideas contribute to much more pragmatic malleable software efforts? What if we ignore Realtalk’s noble contributions to tangible computing, social computing, liberatory computing, etc etc, and—playfully relishing the hideous myopia—strip-mine it for ideas[2] we might use to bring malleability to more traditional software environments?

This letter is an attempt to get the conversation started with a few initial sketches.

Loose coupling with a global reactive blackboard

One weak class of “malleable software” many of us use every day is the browser extension. For example, you might install an extension which hides Twitter’s “like” and “retweet” metrics, or which adds Spotify links to YouTube videos which include music. These extensions work by injecting custom JavaScript or stylesheets into a web page. The trouble is that the modification usually depends in very precise ways on the deep structure of the web page. And so, if the web site changes, the browser extension can easily break. For the same reason, you can’t straightforwardly use the Twitter-metric-hiding extension on Mastodon.

In their thesis on malleable software (p86), Memphis Tchernavskij points out that these same kinds of issues apply to Webstrates and many related projects. For example, suppose that you’re using a digital sticky note board to manage a project. You’d like to make a tool which lets you point to a sticky note and quickly add links to any related email conversations you’ve had. Ideally, you should be able to use that same tool with, say, papers you’re reading.

With Webstrates and other typical malleable software techniques, you’d generally need to deeply couple your tool to each piece of software you want to use it with. That is, you’d need to teach your tool how to read and write a sticky note in the sticky note app, and a paper in the paper reading app. In Webstrates, you’d do that by directly encoding assumptions about the interface’s DOM structure. As with browser extensions, these coupling details are often not only brittle but invisible, so it can be difficult to understand what’s going on when things don’t work as expected.

Realtalk programs use an unusual strategy to communicate with each other: a global reactive blackboard. Programs don’t connect directly with each other. Instead, they make claims, wish for things to happen, and describe their behavior through game-like rules. The result is a flexible and loosely coupled ecosystem, in which it’s surprisingly easy to write composable tools. Programs don’t have to “reach into each other” when they can read and write from a shared blackboard. Here’s a brief overview created by Tabitha Yong; see this article from Omar Rizwan for much more detail.

Let’s return to the “related emails” tool for a moment. Here’s how you might implement that tool in a loosely coupled fashion:

Sticky notes and PDFs and emails claim that their “textual content” is whatever it is.
When the tool is used on an object with “textual content”, it uses those claims to find related emails, then wishes that links to those emails be displayed alongside the object.
When something wishes that related emails be displayed on an object, a separate program responds by wishing that a particular overlay interface be drawn alongside it. (That wish is handled by a system-level graphics layer.) Note that this program can be swapped for another to alter the visual representation.
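
To make those three steps concrete, here is a minimal sketch of a blackboard with claims, wishes, and rules. It is only an illustration under loud assumptions: the `Blackboard` class, the tuple shape of a statement, the label strings, and the word-overlap test for "related" are all hypothetical, and none of this is Realtalk's actual language or API.

```python
from dataclasses import dataclass, field

# A statement is a tuple (kind, subject, label, value); kind is "claim" or "wish".
# This shape, and every name below, is a hypothetical stand-in, not Realtalk's API.


@dataclass
class Blackboard:
    rules: list = field(default_factory=list)

    def when(self, rule):
        """Register a rule: a function from the current statements to new ones."""
        self.rules.append(rule)
        return rule

    def settle(self, base):
        """Re-run every rule until none of them adds a new statement."""
        statements = set(base)
        while True:
            derived = set()
            for rule in self.rules:
                derived |= set(rule(statements))
            if derived <= statements:
                return statements
            statements |= derived


def words(text):
    return set(text.lower().replace("?", "").split())


board = Blackboard()


@board.when
def related_emails_tool(statements):
    # Reads only "textual content" claims; knows nothing about sticky notes or PDFs.
    texts = {s: v for k, s, lab, v in statements
             if k == "claim" and lab == "textual content"}
    emails = {s for k, s, lab, v in statements
              if k == "claim" and lab == "kind" and v == "email"}
    targets = {v for k, s, lab, v in statements
               if k == "claim" and s == "related-emails-tool" and lab == "points at"}
    wishes = []
    for target in targets:
        for email in emails:
            if target in texts and email in texts and email != target:
                # Toy relatedness test: the two texts share at least two words.
                if len(words(texts[target]) & words(texts[email])) >= 2:
                    wishes.append(("wish", target, "shows related email", email))
    return wishes


@board.when
def related_email_overlay(statements):
    # Swappable presentation: turns the abstract wish into a drawing wish.
    return [("wish", s, "draws overlay", f"link to {v}")
            for k, s, lab, v in statements
            if k == "wish" and lab == "shows related email"]


result = board.settle({
    ("claim", "sticky-1", "textual content", "spring launch budget"),
    ("claim", "email-1", "kind", "email"),
    ("claim", "email-1", "textual content", "Budget review for the spring launch"),
    ("claim", "email-2", "kind", "email"),
    ("claim", "email-2", "textual content", "Lunch on Friday?"),
    ("claim", "related-emails-tool", "points at", "sticky-1"),
})

for statement in sorted(s for s in result if s[0] == "wish"):
    print(statement)
```

Neither rule refers to the other, or to the sticky note program. Each one only reads and writes statements on the shared board, so replacing `related_email_overlay` with a different presentation leaves the tool untouched.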

There’s still a sort of brittleness problem here: what if the sticky note program uses a different label, claiming its “body” is what it is, not its “textual content”? But this is easy to fix without actually modifying the sticky note program. You can just make another program which translates claims about a sticky note’s “body” into claims about a sticky note’s “textual content”. This strategy is similar to Tchernavskij’s “entanglers” (p91), but generalized to adapting any kind of data relationship between programs.
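
A translating program of this kind can be very small. The sketch below assumes statements are plain tuples of the form (kind, subject, label, value) and that rules are functions re-run until nothing new appears. The names `settle`, `translate_label`, and `word_count_tool` are invented for illustration and do not come from Realtalk or from Tchernavskij's thesis.

```python
def settle(base, rules):
    """Apply rules repeatedly until no new statements appear (a fixpoint)."""
    statements = set(base)
    while True:
        derived = set()
        for rule in rules:
            derived |= set(rule(statements))
        if derived <= statements:
            return statements
        statements |= derived


def translate_label(from_label, to_label):
    """Build a rule that restates claims under a second label."""
    def rule(statements):
        return {("claim", subject, to_label, value)
                for kind, subject, label, value in statements
                if kind == "claim" and label == from_label}
    return rule


def word_count_tool(statements):
    # A stand-in consumer that only understands "textual content".
    return {("claim", subject, "word count", len(value.split()))
            for kind, subject, label, value in statements
            if kind == "claim" and label == "textual content"}


# The sticky note program claims a "body", which the tool does not recognize.
sticky = {("claim", "sticky-1", "body", "Call the venue")}

without_adapter = settle(sticky, [word_count_tool])
with_adapter = settle(
    sticky, [word_count_tool, translate_label("body", "textual content")]
)
```

Without the translator, the tool derives nothing. With it, the tool sees the sticky note's text, and neither the sticky note program nor the tool had to change.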

This approach seems to require boiling oceans and rewriting everything. But one could write, say, a Trello-specific adapter which translates its API into claims and wishes. Such adapters are cumbersome and somewhat limiting, but at least the ugliness would be (hopefully) isolated to the adapter, and the tool wouldn’t need to worry about it.

Composability through spatial arrangement

Another way that Realtalk achieves loose coupling is by making the spatial arrangement of programs a first-class object. Instead of “variables” and “arguments” and “pointers”, programs and data often refer to each other by physically pointing to each other. This often lets users describe systems in much simpler and more direct ways.

By way of example, Omar Rizwan demonstrates a dynamic map with multiple “layers”, e.g. for demographics, transit, major roads, etc. He points out that if you were implementing this kind of feature in normal software, you’d make a “layer list” panel with visibility checkboxes, a rearrange interaction, etc. But instead, with Realtalk, each layer is a physical object, and he just rearranges them in physical space to get the result he wants. He points out that “you automatically get the operations of the physical world: placing and picking up objects, moving and grouping objects in space, pointing objects at each other, and so on.” I really value Realtalk’s tactility, but I think Omar’s points mostly apply to virtual objects on a spatial canvas.

Composability in Webstrates often means describing a path in the DOM from one program to another. Perhaps those technical and coupling-laden references could be replaced, at least in part, with spatial arrangement.
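
Here is one way the map-layers example might look on a virtual canvas, as a rough sketch. The rule it uses (a layer is visible when its card overlaps the map, and layers stack in left-to-right order) is my own assumption for illustration, as are the `Rect` and `LayerCard` types. It is not how Realtalk or Omar's demo determines layer order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    width: float
    height: float

    def overlaps(self, other):
        return (self.x < other.x + other.width and other.x < self.x + self.width
                and self.y < other.y + other.height and other.y < self.y + self.height)


@dataclass(frozen=True)
class LayerCard:
    name: str
    frame: Rect


def visible_layers(map_frame, cards):
    """Layers whose cards touch the map, stacked in left-to-right order."""
    touching = [card for card in cards if card.frame.overlaps(map_frame)]
    return [card.name for card in sorted(touching, key=lambda card: card.frame.x)]


map_frame = Rect(0, 0, 100, 100)
cards = [
    LayerCard("roads", Rect(60, 10, 30, 20)),
    LayerCard("transit", Rect(10, 10, 30, 20)),
    LayerCard("demographics", Rect(300, 10, 30, 20)),  # parked off the map
]
before = visible_layers(map_frame, cards)

# "Dragging" the demographics card onto the map's left edge turns it on and
# puts it first in the stack; no checkbox or reorder widget is involved.
cards[2] = LayerCard("demographics", Rect(-10, 10, 30, 20))
after = visible_layers(map_frame, cards)
```

There is no layer list, visibility flag, or path from the map to its layers anywhere in this code. The only "reference" between the map and a layer is where the user put the card.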

Giving programs spatial location might also help with another of Tchernavskij’s criticisms: that in web apps (and so in Webstrates), there’s no consistent relationship between the structure of an interface and the structure of its code. That is, if you want to modify something you see on the screen, it’s hard to find the code responsible for it. One way to solve that is to colocate visual representations with the programs which create them.

Physical objects are naturally malleable

One key “move” in Realtalk’s evolution has been to move knowledge out of the computer system, and into physical space. If you want to build a digital sticky note app, you’d typically make a database, where each row has some text, and a position, and then some code to display those items in particular locations, and to allow users to modify the text and the position.

But with physical sticky notes, you can just put them in a position, and you can just write on them. No code is required for that. You just need code for the parts which are necessarily dynamic—like, say, highlighting stickies that you haven’t touched recently. And then, if the user feels inspired to draw a little diagram on the sticky, you don’t need to add an “attachments database” and extra interface elements for drawing. The user can just start drawing on the sticky note. Paper is naturally malleable, in terms of its physical representation. Objects—in the sense of object-oriented programming—are usually not.

Translating this idea to a purely virtual context, I wonder about adapting something like FigJam—or perhaps something like FigJam with the dynamic ideas from Apparatus and Cuttle. Rather than writing code to draw an interface and its data, perhaps users can directly draw the interface and its data, and then enliven them with computation.

If you draw a bunch of sticky notes on a spatial canvas, you don’t need a separate database of sticky notes: the sticky notes are just objects, there on the canvas. They’re just rectangles with a fill color and a child text element, or a child vector drawing, or whatever else the user happened to add. If the user wants to add a program which highlights stickies that haven’t been touched recently, they can do that without a formal database concept of a sticky note: perhaps they could add a program to the canvas which matches top-level rectangles with an old modification date, and adds an extra highlight overlay to them.
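
A sketch of that last program, assuming a canvas is just a list of top-level shapes which each carry a kind, a modification date, and whatever children the user drew. The `Shape` type, the `"stale-highlight"` overlay name, and the 14-day threshold are all made up for illustration; no real canvas tool's API is implied.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Shape:
    kind: str                      # "rectangle", "text", "path", ...
    modified: datetime
    children: list = field(default_factory=list)
    overlays: set = field(default_factory=set)


def highlight_stale(canvas, now, max_age=timedelta(days=14)):
    """Add or remove a highlight overlay on each top-level rectangle by age."""
    for shape in canvas:           # top level only; children are not inspected
        if shape.kind != "rectangle":
            continue
        if now - shape.modified > max_age:
            shape.overlays.add("stale-highlight")
        else:
            shape.overlays.discard("stale-highlight")


now = datetime(2024, 10, 1)
old_sticky = Shape("rectangle", datetime(2024, 8, 1),
                   children=[Shape("text", datetime(2024, 8, 1))])
new_sticky = Shape("rectangle", datetime(2024, 9, 28),
                   children=[Shape("path", datetime(2024, 9, 28))])
loose_text = Shape("text", datetime(2024, 1, 1))
canvas = [old_sticky, new_sticky, loose_text]

highlight_stale(canvas, now)
```

The program never asks whether a rectangle "is a sticky note" or what its children are. A sticky with a hand-drawn diagram on it is matched exactly like one with text, because the match is on what is visibly there on the canvas.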

Practical malleability requires small programs

Many end-user programming systems pride themselves on universal editability: the user can modify any piece of the system, in real time. But just because the user has the technical capacity to modify any piece of the system, that doesn’t mean they have the practical capacity. Leaving aside the barrier of programming itself, the central barrier is that even simple modern applications are often many thousands of lines of code. Bret observes:

C++ and Rust advertise "zero-cost abstractions". Their cost metric is tied to execution speed. But the cost we're concerned with here is related to simplicity, transparency, understandability, a grasp of the whole. The confidence of the user, the absence of myth and superstition. What are "zero-cost abstractions" here? What material are they built from?

Realtalk feels as malleable as it does in part because its programs are relatively short. One key reason that Realtalk programs are short is that the system moves much traditional interface complexity out of the computer and into the real world. We’ve discussed a few ways it does that. But there’s also some more traditional systems design insight: Realtalk’s reactive blackboard pattern simplifies much everyday event handling and state management.

Because Realtalk programs are short, the system can adopt the norm that an object’s program is always visible. You can’t make Chrome’s program always visible because it’s 23 million lines long. Visibility is a key component of understandability and practical malleability. I don’t know that we can translate that property literally to screens (they’re just so much smaller than a typical Realtalk workspace), but I do think that malleable software efforts must take these observations seriously.

In particular, I don’t see how one can get programs small enough using a typical JavaScript-based programming system, as Webstrates does. Maybe it’s a matter of finding the right API, or creating a dialect. Or maybe one can get surprisingly far by transliterating the Realtalk programming language into a virtual spatial canvas. It probably won’t work, but I’d guess one would learn a lot by trying.

————————

Thanks to my friends from Dynamicland. Please forgive the blasphemy. Thanks also to my malleable software friends. I’m sorry if this letter reads as critical: I admire your work a great deal!

————————

1: Sometimes “malleable software” labels a broader effort which also emphasizes the importance of realtime collaboration, flexible movement across various device form factors, data sovereignty, and other priorities. Sometimes “malleability” denotes the idea that a user can modify or reprogram a piece of software to suit their preferences, while separate terms like “composability” and “interoperability” describe the ability to flexibly recombine different tools, across application boundaries, to work on the same data. Sometimes the phrase isn’t used at all, but the cultural connection is clearly present.

2: Bret is quite concerned about this kind of behavior. He’s observed that because PARC’s ideas about the personal computer were brought into the mainstream in a partial fashion, with surface resemblance, most technologists can’t even see the richer foundational ideas and aims we’ve inadvertently discarded. He’s also expressed concern about third-party partial reimplementations of Realtalk for this reason, among others. I’m sympathetic. But I think—maybe naively—Realtalk will still look like an alien fantasy next to any “malleable software” projects which narrowly adopt its relevant ideas. I don’t think anyone can mistake software trapped in screens as “basically” Realtalk.