Web Performance Calendar

The speed geek's favorite time of year
2021 Edition

Stoyan (@stoyanstefanov) is a former Facebook and Yahoo! engineer, writer ("JavaScript Patterns", "React: Up and Running"), speaker (JSConf, Velocity, Fronteers), toolmaker (Smush.it, YSlow 2.0) and a guitar hero wannabe.

In the spirit of reddit’s /r/explainlikeimfive let me try to summarize the art and craft of web performance optimization for the uninitiated reader.

The word “performance” can be misunderstood, so let’s define it first. Say you have work to do. Your performance would be measured by how quickly you can do this work. If Jane finished cleaning up her toys before John, we’ll say that Jane performed better. Faster. Zippier.

And without further ado, here are four principles of web performance optimization, or any work optimization really:

  1. Do less work. Avoid work completely, if at all possible.
  2. Work in parallel. Multitask, do more than one job at the same time.
  3. Pre-work. Try to prepare in advance.
  4. Post-work. Appear done, but finish later.

Simple enough? Let’s see more details as they apply to the web performance world.

Do less work

This is obvious, isn’t it? The less you do, the faster you’ll be done. Some may call you lazy, but you and I know you’re optimizing for performance.

Traditionally most of the web sites out there were/are “network-bound”. This means they spend most of their time moving bytes across the web while the user waits. And waits. So doing less work on the network is where optimization starts (and often ends, because it proves good enough).

Web apps (vs web sites) are increasingly CPU-bound, meaning they are so JavaScript-heavy that the slowest part (bottleneck) is executing all this JavaScript code. But still, all the JavaScript has to arrive over the network, so… back to the network.

And how do you do less work as far as the network is concerned?

A good place to start an optimization effort is to audit what’s going on as your page is being loaded with a tool such as WebPageTest.org. It shows you a “waterfall” view of what’s happening. Obviously the shorter the fall, the better. And just as obviously, fewer items (scripts, styles, fonts, images) in the waterfall make for a shorter waterfall.

The picture looks like a waterfall listing all the components required for a complete page. There are some interesting events (vertical lines), but in general the right-hand end of the picture is when the page is completely done. Your goal is to be done as soon as possible. This is accomplished by having:

  • Fewer things (a.k.a. components, assets) in that waterfall, meaning less work in general
  • Smaller things in that waterfall (e.g. fewer bytes per component)
  • As many of the components as possible falling down at the same time (see next section)

See also more details on how to read a waterfall.

Transfer fewer things over the network

So take a long hard look at the items in the waterfall. Do you see items that can be removed? Are you running multiple versions of jQuery? Using two CSS frameworks? Do you need this font? Any third-party scripts you can live without?
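
If you prefer numbers to eyeballing, here is a minimal sketch that totals up what a page loads, grouped by type. It assumes entries shaped like the browser's Resource Timing objects (`name`, `initiatorType`, `transferSize`); the function name and the sample data are made up for illustration.

```javascript
// Group resource entries by type and total up the bytes each type costs.
// `entries` is expected to look like PerformanceResourceTiming objects,
// which expose `name`, `initiatorType` and `transferSize`.
function summarizeResources(entries) {
  const summary = {};
  for (const entry of entries) {
    const type = entry.initiatorType || 'other';
    if (!summary[type]) {
      summary[type] = { count: 0, bytes: 0 };
    }
    summary[type].count += 1;
    summary[type].bytes += entry.transferSize || 0;
  }
  return summary;
}

// Hypothetical sample data; in a browser you would pass
// performance.getEntriesByType('resource') instead.
const sampleEntries = [
  { name: 'https://example.com/jquery-1.12.js', initiatorType: 'script', transferSize: 34000 },
  { name: 'https://example.com/jquery-3.6.js', initiatorType: 'script', transferSize: 31000 },
  { name: 'https://example.com/app.css', initiatorType: 'link', transferSize: 12000 },
];

const resourceSummary = summarizeResources(sampleEntries);
console.log(resourceSummary);
```

Two scripts with "jquery" in the name is exactly the kind of thing worth questioning. Keep in mind that browsers may report a `transferSize` of 0 for cached or cross-origin resources, so treat the totals as a rough guide.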

Make things smaller

Now that you’ve reduced the number of assets that need to travel to your user, take another look at them and make sure they are as small as you can make them. This includes:

  • Turn on compression for textual components (scripts, styles, JSON, etc). The server compresses the asset so fewer bytes travel over the network, and the browser decompresses it on arrival.
  • Minify scripts and styles. This process removes whitespace, comments and other non-essential information, renames variables to be shorter and so on.
  • Optimize images. Believe it or not it’s possible to have the exact same image with fewer bytes (lossless optimization). Also to have an image virtually indistinguishable from the original, but much smaller (lossy optimization).
  • Subset fonts, so you only use the characters you need.
  • Come to think of it, subset styles and scripts too: find dead code and delete it.
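
To get a feel for how much compression can save on text, here is a small sketch using Node's built-in `zlib` module. The payload is a made-up, highly repetitive stylesheet, so real files will usually compress less dramatically.

```javascript
const zlib = require('node:zlib');

// Hypothetical payload: repetitive text, similar to what CSS or JSON often looks like.
const original = Buffer.from('.button { color: red; margin: 0; padding: 0; }\n'.repeat(200));

// Compress the way a server would, then decompress the way a browser would.
const compressed = zlib.gzipSync(original);
const restored = zlib.gunzipSync(compressed);

console.log(`original: ${original.length} bytes`);
console.log(`gzipped:  ${compressed.length} bytes`);
console.log(`round trip intact: ${restored.equals(original)}`);
```

In practice you rarely call this yourself: compression is normally a setting on the web server or CDN.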

Come closer

If budget allows, host scripts, styles, images, documents (anything static that doesn’t change per user request) on a CDN (Content Delivery Network) so your bytes travel shorter distances and arrive earlier. You can start here.

Work in parallel

Now you’ve reduced the amount of work as much as you can, but you can still do better. The work that needs to be done is, first and foremost, to load all these components. Imagine component 1 takes 1 second to load, component 2 takes 2 seconds and component 3 takes 3 seconds. If you load them one after the other, as if in a queue at the cash register in the store, the total time will be 1 + 2 + 3 = 6 seconds. But if you manage to load them in parallel, you’re as slow as the slowest component, or 3 seconds.
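
The same arithmetic as a tiny sketch. It is an idealized model: it assumes every component can really load at the same time and that they don't compete for bandwidth. The function names are made up.

```javascript
// Load times in seconds for the three components from the example above.
const loadTimes = [1, 2, 3];

// One after the other: every component waits for the previous one.
function sequentialTime(times) {
  return times.reduce((total, t) => total + t, 0);
}

// All at once: you are only as slow as the slowest component.
function parallelTime(times) {
  return Math.max(...times);
}

console.log(sequentialTime(loadTimes)); // 6
console.log(parallelTime(loadTimes));   // 3
```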

So you need to watch your waterfalls for potential “blocking” behavior where the waterfall seems to be stuck waiting for a component to finish. The most common offenders are:

  • Redirects
  • Blocking (synchronous) scripts
  • CSS

Redirects are just silly. It’s not uncommon to have a ridiculous number of redirects, like http://site.com to http://www.site.com to http://m.site.com to https://m.site.com. The user may be waiting for seconds staring at a blank screen, while the application decides who’s responsible for handling this particular request.
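
To make the cost visible, here is a sketch that walks the redirect chain from the example and counts the hops. The redirect table and the function are purely illustrative; each hop stands for at least one extra round trip before any content arrives.

```javascript
// Hypothetical redirect table matching the chain described above.
const redirects = new Map([
  ['http://site.com', 'http://www.site.com'],
  ['http://www.site.com', 'http://m.site.com'],
  ['http://m.site.com', 'https://m.site.com'],
]);

// Follow redirects from `start` and return every URL visited, in order.
// `maxHops` guards against redirect loops.
function redirectChain(start, table, maxHops = 10) {
  const chain = [start];
  let current = start;
  while (table.has(current) && chain.length <= maxHops) {
    current = table.get(current);
    chain.push(current);
  }
  return chain;
}

const chain = redirectChain('http://site.com', redirects);
console.log(chain.length - 1); // 3 redirects before any content arrives
```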

Blocking scripts. In older browsers a regular <script src=""> just blocks everything else (because the script may do something as atrocious as document.write). New browsers are smarter than that, but your best bet is still to load scripts asynchronously so they don’t block.
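
One way to load a script asynchronously is to add the async attribute in the markup; another is to inject the script from JavaScript. Below is a minimal sketch of the second approach. The function name is made up, and the `doc` parameter exists only so the code can be tried outside a browser.

```javascript
// Append a script that downloads without blocking the parser.
// `doc` defaults to the browser's global `document`; it is a parameter
// only so the function can be exercised outside a browser.
function loadScriptAsync(src, doc = document) {
  const script = doc.createElement('script');
  script.src = src;
  script.async = true; // run whenever it arrives, don't hold up parsing
  doc.head.appendChild(script);
  return script;
}
```

In a page you would call something like `loadScriptAsync('/js/analytics.js')`, where the path is a placeholder. Async scripts run in whatever order they arrive, so this suits scripts that don't depend on each other.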

CSS blocks rendering so the user doesn’t see anything even if many assets have already arrived. But it can also block scripts. So it needs special attention to make sure it’s as small as possible.

This handy little tool can point out some things you can reshuffle in your page’s head to prevent blocking and improve parallelization.


Pre-work

Wouldn’t it be nice if some of the work were already done for you? One way to accomplish this is by using caching.

For repeat visits, it’d be great if the browser reuses what’s already been downloaded. Setting a far-future Expires HTTP header helps the browser cache static components for longer, avoiding the network altogether.
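
Here is a sketch of what such headers could look like. The one-year lifetime and the function name are assumptions, and the Cache-Control header is an addition of this example (it is the modern companion to Expires). Far-future caching is only safe when a changed file gets a new URL, for example a version number in the file name.

```javascript
const ONE_YEAR_SECONDS = 365 * 24 * 60 * 60;

// Build caching headers for a static asset that never changes at its URL.
// `now` is a parameter so the result is predictable in tests.
function farFutureHeaders(now = new Date()) {
  const expires = new Date(now.getTime() + ONE_YEAR_SECONDS * 1000);
  return {
    'Expires': expires.toUTCString(),
    'Cache-Control': `public, max-age=${ONE_YEAR_SECONDS}`,
  };
}

console.log(farFutureHeaders());
```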

Another example of pre-work is done for you by minification tools. For example it may be friendlier to read code that goes:

const day = 1000 * 60 * 60 * 24; // a day in milliseconds 

But there’s no need for this computation to happen every time someone runs your code. A decent minifier will turn this code into something like:

const a=864e5;

As you see: shorter variable name, no spaces, no comments and the shortest way to represent the integer 86400000.
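
If you want to convince yourself that the two forms are the same number, a quick check looks like this (both constants are copied from the snippets above):

```javascript
// What the author writes: readable, but computed at run time.
const day = 1000 * 60 * 60 * 24;

// What a minifier can emit instead: the precomputed value in exponent form.
const a = 864e5;

console.log(day === a);          // true
console.log(String(day).length); // 8 characters for 86400000
console.log('864e5'.length);     // 5 characters
```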

Yet another example of pre-work is preloading: page A loads some of the assets that page B will need. For example, start loading some scripts and styles for your main app while the user is still typing their password in the login screen.
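
A minimal sketch of that login-screen trick, using `<link rel="prefetch">` hints injected from JavaScript. Prefetch is one possible mechanism, chosen here because it targets a future page; the function name is made up and the `doc` parameter is only there for testing.

```javascript
// Hint the browser to fetch assets for a likely next page at low priority.
// `doc` defaults to the global `document`; it is injectable for testing.
function prefetchAssets(urls, doc = document) {
  return urls.map((url) => {
    const link = doc.createElement('link');
    link.rel = 'prefetch';
    link.href = url;
    doc.head.appendChild(link);
    return link;
  });
}
```

On the login page you might call `prefetchAssets(['/js/app.js', '/css/app.css'])`, with your own paths in place of these placeholders. The browser treats prefetch as a hint and may skip it, so nothing should depend on it.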


Post-work

The goal is to have the user use your page as soon as possible. The page may not be completely done but the user can start parsing visually and even interact with the page.

One approach is to have the critical resources (e.g. necessary CSS) early on or even inline in the HTML, while the rest can wait.

Another idea for more static pages (vs apps) is to use the ideas of progressive enhancement: load the bare necessities for a useful page and then load scripts that enhance the experience. Similarly for apps, you don’t need to download all the JavaScript for the whole app up front. Using modules and loading them only when a component is required is a good example of post-work laziness.
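
The load-on-demand idea can be sketched as a small wrapper that postpones the work until first use and never repeats it. The `lazy` helper and the fake editor module are made up for illustration; in a real app the loader would typically be a dynamic `import()`.

```javascript
// Wrap a loader so the expensive work happens on first use only.
// Later calls reuse the same promise instead of loading again.
function lazy(loader) {
  let promise = null;
  return function load() {
    if (promise === null) {
      promise = loader();
    }
    return promise;
  };
}

// Hypothetical stand-in for a dynamic import such as import('./editor.js').
let loadCount = 0;
const loadEditor = lazy(() => {
  loadCount += 1;
  return Promise.resolve({ open: () => 'editor opened' });
});

loadEditor();
loadEditor();
console.log(loadCount); // 1
```

You would call `loadEditor()` from the click handler of the "Edit" button, so users who never edit never pay for the editor code.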

Yet another idea is to use early flushing: spit out the HTML and CSS for the header while the server is still working on the full response (expensive database queries and so on). The user has something to be occupied with and can see that things are progressing.
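
Here is a minimal sketch of early flushing in a Node-style request handler. It assumes `res` has `write()` and `end()` methods like Node's `http.ServerResponse`; `getBody`, the markup and the paths are placeholders.

```javascript
// Send the top of the page right away, then the rest once the data is ready.
// `res` needs write() and end(), like Node's http.ServerResponse.
// `getBody` is a hypothetical function returning a promise of the slow HTML.
async function handleRequest(res, getBody) {
  res.write('<!doctype html><html><head><title>Shop</title>' +
    '<link rel="stylesheet" href="/css/header.css"></head>' +
    '<body><header>My Shop</header>');

  const body = await getBody(); // e.g. expensive database queries

  res.end(body + '</body></html>');
}
```

The first `write()` happens before the slow work starts, so the browser can begin fetching the stylesheet and painting the header while the server is still busy.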

It’s a start

There are many intricacies and details that were glossed over in an overview article such as this one. As you dig deeper you’ll learn a lot more, but now you have an overview of the optimization efforts.

To summarize: do less, then parallelize what’s left, and when all else is done, cheat (via pre- and post-work).