legendkeeper

Technical Notes #3: Caching

Added 2019-04-13 19:05:42 +0000 UTC

I know I said I wouldn't be doing more technical notes in my last post but... I lied. It's a rainy day and I'm waiting on some designs, so why not?

Let's talk about one form of optimization: caching. At its most barebones, a web application looks like this:

You boot up your browser, you send a request to the server, the server asks the database for data, shoves that data into a web page using HTML/CSS/JavaScript, and hands the page back to the client. This is how a server-side rendered application works.

(LegendKeeper is actually client-side-rendered, which means the server doesn't hand back an entire web page, but rather data the browser can use to create the web page itself. I don't really want to get into the differences here, but for more info: Client-Side vs Server-side Rendering.)

A Simple Approach

Let's model it like a little conversation: Let's say I boot up LegendKeeper and view my article on Red Larch, a location on the Sword Coast.

Client: Hey Server, can I have the content for the Red Larch article? Here's its ID.

Server: Give me one sec; I know who to ask for this... Hey Database! What's the content for the article with this ID?

Database: Here it is! Found the raw data in my storage.

Server: Great; I'll put this in the right format and send it to the client.

Client: Sweet; I'll render the Red Larch wiki page using this formatted data.

The content appears on the screen and everything is great. The next time the client asks for the Red Larch article, the whole conversation happens again. This works well because it means that the database only stores data, the server only transforms/delivers data, and the client only displays data. In this way, the database is *stateful*, and the server and client are *stateless*.
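Here's that conversation as a rough TypeScript sketch. It is not LegendKeeper's actual code; the `Article` type, the in-memory `database` map, and the `handleRequest` function are all made up for illustration, and the read counter is only there so we can see how often the database gets hit.

```typescript
// Hypothetical types and names, for illustration only.
type Article = { id: string; title: string; body: string };

// Stand-in for the database: the only piece that holds state.
const database = new Map<string, Article>([
  ["red-larch", { id: "red-larch", title: "Red Larch", body: "A location on the Sword Coast." }],
]);

// Counts how many times the database was asked for data.
let databaseReads = 0;

// Database: look up the raw data by ID.
function readFromDatabase(id: string): Article | undefined {
  databaseReads += 1;
  return database.get(id);
}

// Server: fetch, format, deliver. It remembers nothing between requests.
function handleRequest(id: string): string {
  const article = readFromDatabase(id);
  if (article === undefined) {
    return JSON.stringify({ error: "not found" });
  }
  return JSON.stringify({ title: article.title, body: article.body });
}

// Client: ask twice, and the whole conversation happens twice.
handleRequest("red-larch");
handleRequest("red-larch");
console.log(databaseReads); // 2
```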

But... this is also not great. Reading from the database and talking to the server costs resources. The application server and database have to use precious CPU and memory to perform these operations. If many people ask for this Red Larch data at once, you can overwhelm your system with requests and database reads. What can we do?

Time and Space

We can make a tradeoff. What if the conversation went like this?

Client: Hey Server, can I have the content for the Red Larch article? Here's its ID.

Server: Hey Database! What's the content for the article with this ID?

Database: Here it is!

Server: Great; I'll put this in the right format, and I'll hold on to it for a while in my memory cache.

Client: Sweet; I'll render the Red Larch wiki page using this formatted data.

*a little later*

Client: Hey can I get that Red Larch article again?

Server: Already got it, boiiiiii.

Client: Hot damn.

This results in a faster request, because talking to the cache is faster than talking to the database. We've optimized for time by caching the data, but we made a tradeoff to get there: the server now has to store state in its memory, which costs Space, while skipping the database saves a lot of Time. This is a Space/Time tradeoff, and you run into it constantly when developing software.

We've also made the server stateful, which can introduce bugs. For example, what if something changes the value in the database and the server doesn't know? It has stale data, so when the client asks again, they'll get incorrect information. You can mitigate this with cache expiration (Server: "I'll hold on to this for 5 minutes.") as well as methods for notifying the server of changes (Server: "Oops, I think something changed. I'm going to clear my cache.").
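To make expiration and clearing concrete, here's a minimal sketch of a server-side cache with a time-to-live. Everything in it is an assumption for illustration (the `TtlCache` class, `getArticle`, `updateArticle`, the in-memory `db`), not how LegendKeeper's server is written. The clock is passed in so the 5-minute expiry can be shown without actually waiting 5 minutes.

```typescript
// Hypothetical names throughout; the 5-minute TTL mirrors the example above.
type Entry<V> = { value: V; expiresAt: number };

class TtlCache<V> {
  private entries = new Map<string, Entry<V>>();
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so expiry can be exercised without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: treat it as a miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  // "Oops, I think something changed. I'm going to clear my cache."
  invalidate(key: string): void {
    this.entries.delete(key);
  }
}

// Fake clock and fake database, so the sketch runs on its own.
let clock = 0;
let dbReads = 0;
const db = new Map<string, string>([["red-larch", "A location on the Sword Coast."]]);
const cache = new TtlCache<string>(5 * 60 * 1000, () => clock);

// Server read path: check the cache first, fall back to the database.
function getArticle(id: string): string | undefined {
  const cached = cache.get(id);
  if (cached !== undefined) return cached;
  dbReads += 1;
  const fresh = db.get(id);
  if (fresh !== undefined) cache.set(id, fresh);
  return fresh;
}

// Server write path: change the database, then drop the stale cache entry.
function updateArticle(id: string, body: string): void {
  db.set(id, body);
  cache.invalidate(id);
}

getArticle("red-larch"); // miss: reads the database
getArticle("red-larch"); // hit: served from memory
console.log(dbReads); // 1

clock += 5 * 60 * 1000; // five minutes pass
getArticle("red-larch"); // entry expired, so the database is read again
console.log(dbReads); // 2

updateArticle("red-larch", "Updated description.");
console.log(getArticle("red-larch")); // Updated description.
```

Note that invalidation only works here because the write goes through the same server. If something else changes the database behind the server's back, only the expiry saves you.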

Caching is a common trade-off/optimization, and is really handy, but can also be a source of bugs. The value proposition is that the increased speed is worth the space and additional complexity (and this is often true.)

Cookin' with a Client Cache

But why stop there? What if the client also had a cache?

Client: Hey Server, can I have the content for the Red Larch article? Here's its ID.

Server: Hey Database! What's the content for the article with this ID?

Database: Here it is!

Server: Great; I'll put this in the right format, hand it to the client, and I'll hold onto it for a while.

Client: Sweet; I'll render the Red Larch wiki page using this formatted data, and *hold on to it for later*.

*a little later*

Client: Hey server, can I... Wait. Never mind. I already have it. *renders instantly*

That's an instant page load, baby! Once again, we've traded space and accuracy for speed. The same caveats apply: we are two degrees of separation away from the real source of truth (the database), meaning we are in an even more precarious accuracy situation. It's more difficult for the client to figure out if its cache is stale; how does my client know that Rebecca's client made a change?

There are ways around this, such as using WebSockets to send change notifications down to every client, but we're getting into "Is this speed worth the additional complexity?" territory. It's all about tradeoffs. At the very least, the cache only exists in memory, so it disappears once the page is refreshed and the web application is reloaded.
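Here's a small sketch of the "my client vs. Rebecca's client" problem and the notification fix. To keep it runnable anywhere, the `ChangeFeed` class below is a plain in-process stand-in for a real WebSocket connection, and `Client`, `saveArticle`, and the `server` map are equally hypothetical.

```typescript
type Listener = (articleId: string) => void;

// Stand-in for a WebSocket connection: when the server publishes a change,
// every subscribed client hears about it.
class ChangeFeed {
  private listeners: Listener[] = [];

  subscribe(listener: Listener): void {
    this.listeners.push(listener);
  }

  publish(articleId: string): void {
    for (const listener of this.listeners) listener(articleId);
  }
}

class Client {
  serverRequests = 0;
  private cache = new Map<string, string>();
  private fetchFromServer: (id: string) => string;

  constructor(fetchFromServer: (id: string) => string, feed: ChangeFeed) {
    this.fetchFromServer = fetchFromServer;
    // When someone else changes an article, drop our copy of it.
    feed.subscribe((id) => {
      this.cache.delete(id);
    });
  }

  view(id: string): string {
    const cached = this.cache.get(id);
    if (cached !== undefined) return cached; // "Never mind. I already have it."
    this.serverRequests += 1;
    const fresh = this.fetchFromServer(id);
    this.cache.set(id, fresh);
    return fresh;
  }
}

// Fake server state plus a save function that announces changes.
const server = new Map<string, string>([["red-larch", "A location on the Sword Coast."]]);
const feed = new ChangeFeed();
const fetchFromServer = (id: string): string => server.get(id) ?? "";

function saveArticle(id: string, body: string): void {
  server.set(id, body);
  feed.publish(id);
}

const mine = new Client(fetchFromServer, feed);

mine.view("red-larch"); // asks the server
mine.view("red-larch"); // instant, from the client cache
console.log(mine.serverRequests); // 1

saveArticle("red-larch", "Rebecca's edit."); // Rebecca's client saves a change
console.log(mine.view("red-larch")); // Rebecca's edit.
console.log(mine.serverRequests); // 2
```

Without the `feed.subscribe` line, that last `view` would happily return the old text forever, which is exactly the stale-cache bug described above.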

The LegendKeeper Way

How does LegendKeeper do it? LK uses caches on both the server and the client. The server cache is selectively cleared whenever users make changes to a world, and the cached data expires after a while regardless.

The client also has a cache, but we use it a little differently. Rather than using it as an alternative to the server's data, the client *always* asks the server for data, but it uses whatever is in the cache in the meantime.

Client: Hey server, what's the content for the Red Larch article? I've got the data already, so I'm going to render the screen regardless, but is there anything new?

Server: Let's see... *talks to database, caches, etc.*. Yeah, there's something new. Here you go.

Client: Sicc. I'm going to merge this new state with the old state and re-render the UI.

The UI renders nearly instantly, and is updated if necessary once the server returns its response. This doesn't save server resources, but the user experience is really nice. This approach is common with mobile and desktop apps, and when applied to web applications, is commonly referred to as "Offline-First". Why? Because every data request goes through the cache, so if you are offline and the cache has the data, the page still renders. It could be outdated, but at least you can see something.
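A rough sketch of that "render from cache, then ask anyway" flow is below. It is an illustration under a few assumptions, not LegendKeeper's client: `RevalidatingClient` and its `view` method are invented names, the server call is synchronous to keep things short (in a browser it would be an awaited network request), new data simply replaces old data instead of being merged, and any failed request is treated as "offline".

```typescript
type Source = "cache" | "server";
type Render = (content: string, source: Source) => void;

class RevalidatingClient {
  private cache = new Map<string, string>();
  private fetchFromServer: (id: string) => string;

  constructor(fetchFromServer: (id: string) => string) {
    this.fetchFromServer = fetchFromServer;
  }

  view(id: string, render: Render): void {
    // 1. Render whatever we already have, right away.
    const cached = this.cache.get(id);
    if (cached !== undefined) render(cached, "cache");

    // 2. Always ask the server, even if we had something cached.
    let fresh: string | undefined;
    try {
      fresh = this.fetchFromServer(id);
    } catch {
      fresh = undefined; // treat any failure as being offline
    }

    if (fresh === undefined) {
      // Offline: fine if the cached copy was rendered, an error if there was none.
      if (cached === undefined) throw new Error(`"${id}" is not cached and the server is unreachable`);
      return;
    }

    // 3. Re-render only if the server had something new.
    if (fresh !== cached) {
      this.cache.set(id, fresh);
      render(fresh, "server");
    }
  }
}

// Fake server with an on/off switch for the network.
let online = true;
const server = new Map<string, string>([["red-larch", "v1"]]);
const client = new RevalidatingClient((id) => {
  const body = server.get(id);
  if (!online || body === undefined) throw new Error("request failed");
  return body;
});

const renders: string[] = [];
const render: Render = (content, source) => {
  renders.push(`${source}: ${content}`);
};

client.view("red-larch", render); // nothing cached yet: waits for the server
client.view("red-larch", render); // cached copy renders; server has nothing new
server.set("red-larch", "v2");
client.view("red-larch", render); // cached v1 renders first, then v2 replaces it
online = false;
client.view("red-larch", render); // offline: cached v2 still renders
console.log(renders);
```

The server gets asked on every single `view`, which is why this approach buys perceived speed rather than saving server resources.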

LegendKeeper is not technically Offline-First, but it eventually will be, and this is half of the mechanism that will support it. (The other half has to do with how to handle changes offline. This is more complicated, but still doable. Another post. ;) I'm not tackling offline support for a while after release.)

LegendKeeper goes the extra mile and also asks the browser if it can keep its cache in the browser's database. When you refresh the page, because it's saved in the browser DB, you don't lose the cache. If the cache starts getting too big, it just starts evicting old values. If you clear your browsing data, the cache goes away, but it doesn't matter. You ask the server for the data, and the cache-building process starts over again. This means that LegendKeeper gets faster and faster the more you use it.
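Here's one way a size-limited, persistent cache could look. This is a sketch under assumptions, not the real thing: I'm using least-recently-used eviction as one reasonable reading of "evicting old values", and the `PersistentStore` interface is a made-up stand-in for the browser's database so the example can run outside a browser.

```typescript
// Hypothetical stand-in for the browser's database.
interface PersistentStore {
  load(): Array<[string, string]>;
  save(entries: Array<[string, string]>): void;
}

class BoundedCache {
  // A Map remembers insertion order, so the first key is always the oldest.
  private entries: Map<string, string>;
  private maxEntries: number;
  private store: PersistentStore;

  constructor(maxEntries: number, store: PersistentStore) {
    this.maxEntries = maxEntries;
    this.store = store;
    this.entries = new Map(store.load()); // restore whatever survived the last session
  }

  get(key: string): string | undefined {
    const value = this.entries.get(key);
    if (value === undefined) return undefined;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: string): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    // Too big? Evict from the old end until we fit again.
    while (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest === undefined) break;
      this.entries.delete(oldest);
    }
    this.store.save(Array.from(this.entries));
  }
}

// In-memory fake of the persistent store.
let saved: Array<[string, string]> = [];
const store: PersistentStore = {
  load: () => saved,
  save: (entries) => {
    saved = entries;
  },
};

const cache = new BoundedCache(2, store);
cache.set("red-larch", "Red Larch article");
cache.set("phandalin", "Phandalin article");
cache.get("red-larch"); // touching Red Larch makes Phandalin the oldest entry
cache.set("neverwinter", "Neverwinter article"); // over the limit: Phandalin is evicted

// "Refresh the page": a brand new cache object, same persistent store.
const afterRefresh = new BoundedCache(2, store);
console.log(afterRefresh.get("red-larch")); // Red Larch article
console.log(afterRefresh.get("phandalin")); // undefined
```

If the store gets wiped (say, you clear your browsing data), `load()` just returns nothing and the cache starts filling up from scratch.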

You might be wondering: "Braden, only a few people are using the app so far. Why are you optimizing so much?" I actually did all this when LegendKeeper was just a learning project for me early last year, before it blew up in October, so it's already baked in. Plus, caching is a pretty easy win as far as optimization goes.

I'm hoping that LegendKeeper will be the most useful, performant world-building app out there. There's probably more room for optimization, but I'll save that for after release. Done is better than perfect. :)

Let me know if you have any questions! You can always find me on here or on Discord.

Braden