Lessons Learned from Building WebPerfDemo

31st Dec 2023 by Shane Niebergall

ABOUT THE AUTHOR

Shane Niebergall is a web developer, proud dad, and space nerd from Southern California, spending his spare time trying to make the web a better place.

Last year I came across Jeremy Thomas’ demonstration of Web Design in 4 minutes where he guides the reader step-by-step through beautifying a bland web page. It was powerfully effective, and I began thinking how neat it would be if there was a similar walkthrough based on web performance.

WebPerfDemo is the result of that inspiration, but it was no easy feat. Along the way I learned a lot of lessons that may be useful to the webperf community.

Most metrics need a median

When I first started using SpeedCurve years ago, it annoyed me that 3 runs was the minimum number of tests per URL. I had a lot of URLs to test, and tripling that was eating into my monthly allowance of checks.

After building WebPerfDemo, I understand why.

Every time you run a test, there is an absurd amount of variability. One page load may hit a busy network. On the next load your CPU is busy in the background. Refresh again and this time the CDN is having a hiccup. This became apparent when I was testing a step of WebPerfDemo that shaved off tons of page weight, yet sometimes loaded slower than the previous step. How could a 2 MB page load faster than a 1 MB page? With so much variability from load to load, even on the same page, it became a challenge to prove to the user that we were making an improvement.

The obvious solution is to load the page multiple times (at least 3, preferably 5) and then use the median. This throws out the outliers and reduces the noise, giving you more consistent data to work with.
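As a rough illustration of the median approach, here is a minimal sketch. The `median` helper and the load times are made up for this example; they are not taken from the demo.

```javascript
// Return the median of a list of numeric samples (e.g. load times in ms).
function median(samples) {
  if (samples.length === 0) {
    throw new Error("median() needs at least one sample");
  }
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  // Odd count: the middle value. Even count: average of the two middle values.
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Hypothetical load times (ms) from five runs of the same page.
const runs = [1840, 1210, 1275, 3920, 1302];
console.log(median(runs)); // 1302
```

The single 3920 ms outlier drags the mean of these runs up to about 1909 ms, while the median stays at 1302 ms, close to what a typical run looked like.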

I actually didn’t end up doing this with the demo, as I figured it would be a bit jarring for a page to load 5 times before you understood what was going on. Instead I added another metric, page weight, which was more stable than the Core Web Vitals.

How to scale inefficiencies

Most websites are optimized for efficiency in order to scale. I was doing the opposite: I wanted to show a horribly inefficient website to many people. How can you do that without killing the server?

Luckily Cloudflare came to my rescue.

Using their Workers, I could easily fake a server side delay. By serving everything through a worker proxy I could decide whether I want to slow down an HTML response (to simulate a slow backend) or speed up an image (to simulate a CDN). This was accomplished with this snippet:

// Delay the response if necessary
if (msDelay > 0) {
  await new Promise(resolve => setTimeout(resolve, msDelay));
}

However, this solution came with its own hurdles. If you take a look at Cloudflare’s order of operations, you’ll notice that caching is done before workers. If I were to use workers, I wouldn’t be able to take advantage of server-side caching, and each user request would hit my measly shared server. Not ideal.

Luckily domains are cheap, and I realized that if I host my content on one domain that is cached, and then use the worker to load that content, then I’d be golden. The webperfdemo.com domain is served by the worker, and I use another domain to host the actual content, which is cached. I wonder if anyone else has had to do this in order to reorder the Cloudflare order of operations.
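To make the two-domain idea concrete, here is a minimal sketch of how the routing decision might look. The host name `content.example.com`, the 2000 ms figure, and the `delayFor` and `planRequest` helpers are all assumptions for illustration, not the demo's actual code.

```javascript
// Hypothetical host that serves the real, cacheable content.
const CONTENT_HOST = "content.example.com";

// Hypothetical rule: delay HTML documents (slow backend),
// pass everything else straight through.
function delayFor(pathname) {
  const isHtml = pathname === "/" || pathname.endsWith(".html");
  return isHtml ? 2000 : 0;
}

// Map a request URL on the worker's domain to the same path on the
// content domain, and decide how long to hold the response.
function planRequest(requestUrl) {
  const url = new URL(requestUrl);
  const msDelay = delayFor(url.pathname);
  url.hostname = CONTENT_HOST;
  return { originUrl: url.toString(), msDelay };
}

// Roughly how a Worker's fetch handler might use it:
//   const { originUrl, msDelay } = planRequest(request.url);
//   const response = await fetch(originUrl);
//   if (msDelay > 0) {
//     await new Promise(resolve => setTimeout(resolve, msDelay));
//   }
//   return response;
```

Keeping the routing logic in a plain function like this makes it easy to check outside of a Worker; only the commented-out part depends on the Workers runtime.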

Browsers and Servers are inherently efficient

One of my goals was to show the difference that network compression makes: load a page with no compression, then load it again with gzip/brotli enabled. Naturally it should shave 50-60% off the size of the text-based resources.

Yet doing this in the wild proved difficult. I assumed it would be as simple as stripping the ‘accept-encoding’ header from the request, but various levels of the transport kept trying to compress the result. After some struggle, I gave up. If anyone has any ideas on how to make this work I’d love to hear them.

Browsers and servers also want to cache things as much as possible, rightfully so. However, in order to make this a true demo, I needed to ensure that each step used a fresh copy of resources. I first thought about renaming each resource for each step, but that wouldn’t prevent the same step from using the cache. I then explored appending a unique query string to each resource request, but that was messy. I ended up overriding the ‘cache-control’ headers.

Header for most requests, to ensure the response isn’t cached:

cache-control: public, no-store

Header for the last step to show repeat-view performance:

cache-control: public, max-age=604800
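A minimal sketch of how that choice might be made per step. The `cacheControlFor` helper and the step numbers are hypothetical; only the two header values come from above.

```javascript
// Pick the cache-control value for a demo step.
// `step` and `lastStep` are hypothetical step numbers.
function cacheControlFor(step, lastStep) {
  return step === lastStep
    ? "public, max-age=604800" // 604800 seconds = 7 days, for the repeat view
    : "public, no-store"; // every other step gets a fresh copy
}

// In a Worker, the value could be applied to a mutable copy of the
// origin response, along these lines:
//   const response = new Response(originResponse.body, originResponse);
//   response.headers.set("cache-control", cacheControlFor(step, lastStep));
//   return response;
```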

Some Core Web Vitals are ‘lifespan’ metrics

My original plan was to showcase each of the Core Web Vitals in the table of metrics. The Google Chrome team has provided a small JavaScript library that easily allows you to grab the web vitals of the page being loaded, so this should have been no trouble.

But when I started playing with it I realized that some of the metrics weren’t being reported. Time to first byte (TTFB) and largest contentful paint (LCP) were reliable, but the others were reluctant to give a value. That’s when I learned about lifespan metrics – those that don’t report upon page load, but keep recording until the page is unloaded because they measure the entire lifespan of the page.

Metrics like cumulative layout shift (CLS) and first input delay (FID) aren’t ones that report upon page load. In fact, this caused so much confusion that some people filed an issue with the web-vitals library.

I tried to mimic some interaction with the page in order to trigger interaction to next paint (INP), but couldn’t find a solution that worked reliably. This is the struggle of lab environments that try to measure these metrics without a real user. Ultimately I gave up and limited the metrics I reported to TTFB, LCP, Page Complete, and Weight, which can all be reported on load.
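For a sense of how load-time metrics like these can be read without waiting for the page to unload, here is a minimal sketch built on the browser's Navigation Timing and Resource Timing entries. The `collectLoadMetrics` helper and its field names are assumptions for illustration, not the demo's code, and LCP is left out because it needs a `PerformanceObserver` rather than a one-off read.

```javascript
// Summarise metrics that are final once the load event has finished.
// `perf` is anything shaped like the browser's `performance` object.
function collectLoadMetrics(perf) {
  const [nav] = perf.getEntriesByType("navigation");
  const resources = perf.getEntriesByType("resource");
  // transferSize is the over-the-wire size in bytes. It is typically 0
  // for local cache hits and for cross-origin resources that do not
  // send a Timing-Allow-Origin header.
  const weightBytes = [nav, ...resources].reduce(
    (total, entry) => total + (entry.transferSize || 0),
    0
  );
  return {
    ttfbMs: nav.responseStart, // time to first byte
    pageCompleteMs: nav.loadEventEnd, // end of the load event
    weightBytes,
  };
}

// In a browser, call it once the load event has fully finished:
//   window.addEventListener("load", () => {
//     setTimeout(() => console.log(collectLoadMetrics(performance)), 0);
//   });
```

The `setTimeout` in the usage comment matters: `loadEventEnd` is still 0 while the load handler itself is running.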

AI / LLMs are not going to take our jobs

I’m a backend developer, and front end is my weakness. So when I needed to come up with a design for our sample page, I decided to use some outside help. In the past I’ve bought templates or hired a designer, but with the advances in artificial intelligence and large language models I decided to give them a try.

ChatGPT was my tool of choice, and I was impressed with its rough draft of a typical web page. But when I needed to make any alterations, it was evident that it was no pro. Many times I had to edit its proposed HTML, CSS, and JavaScript to accomplish what I was looking for. It got me 80% of the way there, but it was up to me to finish the last 20%.

Fortunately, that last 20% is the hardest part – which means that ChatGPT is not going to render web developer jobs obsolete anytime soon. Tweaking the details of a page until it is exactly what the user is looking for requires some finesse that the LLMs simply don’t have yet.

Instead, these advances should be looked at as tools that will assist us, not replace us. The sooner we accept and adopt these tools, the better prepared we will be for the future.

Tip of the Iceberg

I hope that the demo I’ve created is helpful to some, and if it is well-received I’ll plan on updating it every year. Web performance is a field that literally changes every day, and what I’ve demonstrated in my example is just the tip of the iceberg.

Thanks for reading, and thanks for doing your part in making the web a faster, and better, place.

Web Performance Calendar