I’m weary of always comparing the web to so-called “native” platforms like Android and iOS. The web is streaming, meaning it has none of the resources locally available when you open an app for the first time. This is such a fundamental difference, that many architectural choices from native platforms don’t easily apply to the web — if at all. But regardless of where you look, multithreading is used everywhere. iOS empowers developers to easily parallelize code using Grand Central Dispatch, Android does this via their new, unified task scheduler WorkManager and game engines like Unity have job systems. The reason for any of these platforms to not only support multithreading, but making it as easy as possible is always the same: Ensure your app feels great.
In this article I’ll outline my mental model why multithreading is important on the web, I’ll give you an introduction to the primitives that we as developers have at our disposal, and I’ll talk a bit about architectures that make it easy to adopt multithreading, even incrementally.
The Problem Of Unpredictable Performance
The goal is to keep your app smooth and responsive. Smooth means having a steady and sufficiently high frame rate. Responsive means that the UI responds to user interactions with minimal delay. Both of these are key factors in making your app feel polished and high-quality.
According to RAIL, being responsive means reacting to a user’s action in under 100ms, and being smooth means shipping a stable 60 frames per second (fps) when anything on the screen is moving. Consequently, we as developers have 1000ms/60 = 16.6ms to produce each frame, which is also called the “frame budget”. I say “we”, but it’s really the browser that has 16.6ms to do everything required to render a frame. Us developers are only directly responsible for one part of the workload that the browser has to deal with. That work consists of (but is not limited to):
- Detecting which element the user may or may not have tapped;
- firing the corresponding events;
- calculating styles;
- doing layout;
- painting layers;
- and compositing those layers into the final image the user sees on screen;
(and more …)
Quite a lot of work.
Note: RAIL has been a guiding framework for 6 years now. It’s important to note that 60fps is really a placeholder value for whatever the native refresh rate of the user’s display is. For example, some of the newer pixel phones have a 90Hz screen and the iPad Pro has a 120Hz screen, reducing the frame budget to 11.1ms and 8.3ms respectively. To complicate things further, there is no good way to determine the refresh rate of the device that your app is running on apart from measuring the amount of time that elapses between
The usual advice here is to “chunk your code” or its sibling phrasing “yield to the browser”. The underlying principle is the same: To give the browser a chance to ship the next frame you break up the work your code is doing into smaller chunks, and pass control back to the browser to allow it to do work in-between those chunks.
There are multiple ways to yield to the browser, and none of them are great. A recently-proposed task scheduler API aims to expose this functionality directly. However, even if we had an API for yielding like
await yieldToBrowser() (or something of the sort), the technique itself is flawed: To make sure you don’t blow through your frame budget, you need to do work in small enough chunks that your code yields at least once every frame. At the same time, code that yields too often can cause the overhead of scheduling tasks to become a net-negative influence on your app’s overall performance. Now combine that with the unpredictable performance of devices, and we have to arrive at the conclusion that there is no correct chunk size that fits all devices. This is especially problematic when trying to “chunk” UI work, since yielding to the browser can render partially complete interfaces that increase the total cost of layout and paint.
const worker = new Worker("./worker.js");
Before we get more into that, it’s important to note that Web Workers, Service Workers and Worklets are similar, but ultimately different things for different purposes:
While Workers are the “thread” primitive of the web, they are very different from the threads you might be used to from C++, Java & co. The biggest difference is that the required isolation means workers don’t have access to any variables or code from the page that created them or vice versa. The only way to exchange data is through message-passing via an API called , which will copy the message payload and trigger a
message event on the receiving end. This also means that Workers don’t have access to the DOM, making UI updates from a worker impossible — at least without significant effort (like AMP’s ).
Support for Web Workers is nearly universal, considering that even IE10 supported them. Their usage, on the other hand, is still relatively low, and I think to a large extent that is due to the unusual ergonomics of Workers.
Concurrency Model #1: Actors
My personal preference is to think of Workers like Actors, as they are described in the Actor Model. The Actor Model’s most popular incarnation is probably in the programming language Erlang. Each actor may or may not run on a separate thread and fully owns the data it is operating on. No other thread can access it, making rendering synchronization mechanisms like mutexes unnecessary. Actors can only send messages to each other and react to the messages they receive.
As an example, I often think of the main thread as the actor that owns the DOM and consequently all the UI. It is responsible for updating the UI and capturing input events. Another factor could be in charge of the app’s state. The DOM actor converts low-level input events into app-level semantic events and sends them to the state actor. The state actor changes the state object according to the event it has received, potentially using a state machine or even involving other actors. Once the state object is updated, it sends a copy of the updated state object to the DOM actor. The DOM actor now updates the DOM according to the new state object. Paul Lewis and I once explored actor-centric app architecture at Chrome Dev Summit 2018.
Of course, this model doesn’t come without its problems. For example, every message you send needs to be copied. How long that takes not only depends on the size of the message but also on the device the app is running on. In my experience, postMessage is usually “fast enough”, but there are certain scenarios where it isn’t. Another problem is to strike the balance between moving code to a worker to free up the main thread, while at the same time having to pay the cost of communication overhead and the worker being busy with running other code before it can respond to your message. If done without care, workers can negatively affect UI responsiveness.
The messages you can send via postMessage are quite complex. The underlying algorithm (called “structured clone”) can handle circular data structures and even
Concurrency Model #2: Shared Memory
In 2019, my team and I published PROXX, a web-based Minesweeper clone that was specifically targeting feature phones. Feature phones have a small resolution, usually no touch interface, an underpowered CPU, and no proper GPU to speak of. Despite all these limitations, they are increasingly popular as they are sold for an incredibly low price and they include a full-fledged web browser. This opens up the mobile web to demographics that previously couldn’t afford it.
To make sure that the game was responsive and smooth even on these phones, we embraced an Actor-like architecture. The main thread is responsible for rendering the DOM (via preact and, if available, WebGL) and capturing UI events. The entire app state and game logic is running in a worker which determines whether you just stepped on a mine black hole and, if not, how much of the game board to reveal. The game logic even sends intermediate results to the UI thread to give the user a continuous visual update.
Workers are a key tool to keep the main thread responsive and smooth by preventing any accidentally long-running code from blocking the browser to render. Due to the inherent asynchronous nature of communicating with a worker, the adoption of workers requires some architectural adjustments in your web app, but as a pay off you will have an easier time supporting the massive spectrum of devices that the web is accessed from. You should make sure to adopt an architecture that lets you move code around easily so you can measure the performance impact of off-main-thread architecture. The ergonomics of web workers have a bit of a learning curve but the most complicated parts can be abstracted away with libraries like Comlink.
- “ Slow?,” Surma
- “,” npm
There are some questions and thoughts that are raised quite often, so I wanted to preempt them and record my answer here.
My core advice in all matters of performance is: Measure first! Nothing is slow (or fast) until you measure. In my experience, however, postMessage is usually “fast enough”. As a rule of thumb: If
JSON.stringify(messagePayload) is under 10KB, you are at virtually no risk of creating a long frame, even on the slowest of phones. If postMessage does indeed end up being a bottleneck in your app, consider the following techniques:
- Breaking your work into smaller pieces so that you can send smaller messages.
- If the message is a state object of which only small parts have changed, send patches (diffs) instead of the whole object.
- If you send a lot of messages, it can also be beneficial to batch multiple messages into one.
- As a last resort, you can try switching to a numerical representation of your message and transferring an ArrayBuffers instead of sending an object-based message.
Which of these techniques is the right one depends on the context and can only be answered by measuring and isolating the bottleneck.
I want DOM access from the Worker.
This one I get a lot. In most scenarios, however, that just moves the problem. You are at risk of effectively creating a 2nd main thread, with all the same problems, just in a different thread. Making the DOM safe to access from multiple threads would require adding locks which would introduce a slowdown to DOM operations. This would probably hurt a lot of existing web apps. Additionally, the lock-step model has benefits. It gives the browser a clear signal at what time the DOM is in a valid state that can be rendered to the screen. In a multi-threaded DOM world, that signal would be lost and we’d have to deal with partial renders or other artifacts.
I really dislike having to put code in a separate file for Workers.
This content was originally published here.