Frank van Gemeren (@frvge) got the #perfmatters vibe the moment he read High Performance Web Sites when it was released. Since then he has read most, if not all, of the books with “High Performance” in the title, optimized trivago for both visitors and developers, and worked on the portals of casual games at Spil Games as a Performance Engineer with a focus on CDNs. He is currently dipping his toes into various clouds and Kubernetes at HERE Technologies as Lead Software Engineer.
Introduction
What is this? A post with questions and no answers? Close. I still hope you learn something new. At the very least, you can use it as a sort of checklist to improve your web application’s performance. My aim is to go a bit deeper than most online performance tools like Lighthouse and WPT, but still leave the details up to the reader to investigate.
Let’s start. There are many areas where you can look at performance, from back-end to front-end and everything in between. We’ll begin with the front-end, because that’s usually the largest time drain.
HTML
The grandfather of all things “web”, HTML, is relatively easy to optimize. Do you really need 10 divs to mark up a button, or would more semantic HTML with a sprinkle of CSS also get the job done? Do you limit the number of DOM nodes, and do you make use of data-attributes to dynamically create/delete nodes when needed for visual states like hover and clicked? You can stream them in/out for larger sites, but in that case keep an eye on accessibility and findability (CTRL+F/⌘+F). Is the order of the various elements in the HTML optimal? Did you get your most important content as high up as possible? Are you preventing reflows/layout work by specifying correct dimensions, and are you using the CSS containment properties? Of course you follow the best practices for the placement of CSS and JavaScript. Do you make informed decisions to prefetch/preload/prerender follow-up pages and resources?
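As a small sketch of that last question: injecting a prefetch hint when a visitor hovers over a link, so the follow-up page is already in the cache by the time they click. The data-prefetch attribute is a made-up convention here; adapt the trigger and the selector to your own markup and navigation patterns.

```js
// Sketch: prefetch a likely follow-up page on hover.
// The data-prefetch attribute is a hypothetical convention, not a standard.
document.addEventListener('mouseover', (event) => {
  const link = event.target.closest('a[data-prefetch]');
  if (!link || link.dataset.prefetched) return;

  const hint = document.createElement('link');
  hint.rel = 'prefetch';            // or 'preload'/'prerender', depending on intent
  hint.href = link.href;
  document.head.appendChild(hint);
  link.dataset.prefetched = 'true'; // avoid injecting duplicate hints
});
```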
CSS
Moving on to CSS, this area is often seen as less important, a micro-optimization. While that may be true for most sites, the more DOM elements you have, the more important correct selectors and properties become. Understanding how selectors are evaluated (right-to-left) and the impact of the various kinds of selectors on element lookup can make a difference in scrolling performance, and thus in creating a jank-free scrolling experience. Do you use stacking contexts in the right way? Do you use visual properties that are computationally expensive to render? Do you have a structured way of defining your CSS (BEM, OOCSS, SMACSS, or JavaScript-compiled CSS classes)? It doesn’t matter much, as long as you use the right tool in the right place and you’re aware of its pros and cons. While we’re on the subject of CSS: don’t worry about often-repeated classes in your HTML; those get optimized away by compression. Concatenating or splitting the CSS payload into chunks is a tricky subject, so think about what works best for your use case.
Images
The world is becoming more and more visual. Images and video make up a huge share of the average website’s size. Knowing your audience is the most important thing here. Most multimedia optimization should be done ahead of time, since real-time optimization is often too slow. Depending on the device that will render the site, you can opt for safe formats like JPG (baseline/progressive) or PNG, but maybe you can switch to more optimized formats like WebP or SVG. Do you take client hints and other preferences into account, like data-saver, the Network Information API and “prefers-reduced-motion”? If you go one step further, how about the battery drain caused by the image decoding algorithm?
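One way to act on those preferences is a few lines of JavaScript. This is only a sketch: the element ID, class name and image variants are made up, and in practice you would combine this with server-side negotiation or client hints.

```js
// Sketch: respect data-saver and reduced-motion preferences when picking assets.
const saveData = navigator.connection && navigator.connection.saveData;
const reducedMotion = window.matchMedia('(prefers-reduced-motion: reduce)').matches;

const hero = document.querySelector('#hero'); // hypothetical hero image
if (hero) {
  hero.src = saveData ? '/img/hero-low.jpg' : '/img/hero-high.jpg';
}

if (reducedMotion) {
  // e.g. swap an animated background for a static one via CSS
  document.body.classList.add('no-motion');
}
```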
Do you want to use lazy-loading? If yes, the native version or something based on JavaScript? Maybe you want to use fancy image previews (LQIP, SQIP) during the lazy-loading to help with perceived performance? Do you serve the images in roughly the same dimensions as they will be displayed? When does it make sense to use the better compression of a short video over the simplicity of a GIF? And does it make sense to do all of this yourself, or to outsource it to a specialized third-party service that handles it for you? Do you use picture or srcset for responsive images? And how do you deal with video and fallbacks? Every site and its audience are different. Find what works best for you.
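If you go the JavaScript route for lazy-loading, IntersectionObserver is the usual building block. A rough sketch, assuming images are marked up with a data-src attribute:

```js
// Sketch: prefer native lazy-loading, fall back to IntersectionObserver.
// Assumes <img data-src="..."> markup produced by your templates.
if ('loading' in HTMLImageElement.prototype) {
  document.querySelectorAll('img[data-src]').forEach((img) => {
    img.loading = 'lazy';
    img.src = img.dataset.src;
  });
} else {
  const io = new IntersectionObserver((entries) => {
    entries.forEach((entry) => {
      if (!entry.isIntersecting) return;
      entry.target.src = entry.target.dataset.src;
      io.unobserve(entry.target);
    });
  }, { rootMargin: '200px' }); // start loading a bit before the image scrolls into view

  document.querySelectorAll('img[data-src]').forEach((img) => io.observe(img));
}
```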
JavaScript
Next up, the elephant in the room: JavaScript. What started as simple mouse-over effects and funny snowflakes on your screen turned into the engine of functionality and behavior, taking over the role of routing and controllers from server-side languages. As with everything, less is usually better. JavaScript performance keeps increasing as engines get better, and in some cases it comes close to native code. With WebAssembly it can come even closer in certain specialized use-cases. Still, there are many things you should have a look at.
Build-time tools come and go, but Webpack seems to have been pretty stable in popularity lately. Are you using its web performance features? It will help optimize the code through minification and tree-shaking, so there’s less code over the wire and less code to interpret. Do you use dynamic chunking? Although it’s a more advanced topic, when used wisely it saves you the cost of unnecessary downloads and interpretation by loading code on demand. This helps with the initial load. The right time to send out the request is still up for discussion.
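As an illustration, a small sketch using a dynamic import(); bundlers such as Webpack split this into its own chunk that is only downloaded when needed. The module and element names are hypothetical:

```js
// Sketch: load a heavy feature on demand instead of in the initial bundle.
// './charting' and the element IDs are placeholders for your own code.
document.querySelector('#show-chart').addEventListener('click', async () => {
  const { renderChart } = await import('./charting'); // separate chunk, fetched on demand
  renderChart(document.querySelector('#chart-container'));
});
```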
Next up, event handlers. Instead of having 1 event handler on each item in a list of 100 items, totalling 100 event handlers, you can use event capturing/bubbling on the list itself to save on memory. Have you had a look at the number of event handlers on your site lately? Are you using event throttling/debouncing to keep a smooth framerate? And are you cleaning up the event handlers when an element is removed from the DOM to prevent memory leaks?
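A minimal sketch of both ideas, event delegation and debouncing; the selectors and the openProduct handler are placeholders:

```js
// Sketch: one delegated handler on the list instead of one handler per item.
document.querySelector('#product-list').addEventListener('click', (event) => {
  const item = event.target.closest('li[data-id]');
  if (item) openProduct(item.dataset.id); // openProduct: your own (hypothetical) handler
});

// Sketch: debounce bursty events so handlers don't run on every frame.
function debounce(fn, wait) {
  let timer;
  return (...args) => {
    clearTimeout(timer);
    timer = setTimeout(() => fn(...args), wait);
  };
}

window.addEventListener('resize', debounce(() => {
  // recalculate layout-dependent state here
}, 150));
```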
Finally, a pet peeve for many performance engineers: third parties. Do you keep an eye on third parties like trackers and ads? Besides expensive redirect chains, they might also take up a significant amount of execution time on the main thread. Do you monitor and graph their download size, memory footprint, and execution time? Does any non-technical person have the power to dynamically inject potential new third parties into the page, for example via GTM? Have they been trained to look out for “bad” third parties? Is there a third-party performance budget? The clash between performance and business goals like remarketing is a tricky one, so make sure you have as much data as possible about the negative performance impact, in order to convince executives that there should be a limit on the number of third-party snippets.
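A rough way to start keeping an eye on third parties from the browser is the Resource Timing API. A sketch, assuming you maintain a list of your own hostnames and would ship the numbers to your monitoring instead of the console:

```js
// Sketch: log third-party resources with their duration and transfer size.
// The ownHosts list is an assumption; replace it with your real hostnames.
const ownHosts = ['www.example.com', 'cdn.example.com'];

new PerformanceObserver((list) => {
  list.getEntries().forEach((entry) => {
    const host = new URL(entry.name).hostname;
    if (!ownHosts.includes(host)) {
      // transferSize can be 0 for cross-origin resources without Timing-Allow-Origin
      console.log('third-party', host, Math.round(entry.duration), 'ms',
                  entry.transferSize, 'bytes');
    }
  });
}).observe({ type: 'resource', buffered: true });
```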
Caching and Storage
“The fastest work done is no work at all.” If you can cache a request, do it. However, there’s a catch: local storage might seem like a good option, but on some hardware the network is faster than the storage. In the age of decent storage in phones, caching is probably still better overall. Did you think about a good cache-busting strategy? Hashes in file names, query parameters or something else? If you use caches and cache busting, how does that affect the hit rate on your Content Delivery Network? Most CDNs can exclude or ignore certain parameters. Have a look to see whether everything is set up correctly. CDNs are very important, so they get their own section.
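A sketch of content-based cache busting in Node.js; bundlers usually do this for you with a content hash in the output file name, but the idea is simply “name the file after its contents so it can be cached forever”. The file path is a hypothetical build artifact:

```js
// Sketch (Node.js): derive a hashed file name from the file's contents,
// so the asset can be served with a long max-age and still update on deploy.
const crypto = require('crypto');
const fs = require('fs');
const path = require('path');

function hashedName(file) {
  const hash = crypto.createHash('md5')
    .update(fs.readFileSync(file))
    .digest('hex')
    .slice(0, 8);
  const { dir, name, ext } = path.parse(file);
  return path.join(dir, `${name}.${hash}${ext}`); // e.g. app.3f2a1c9d.js
}

console.log(hashedName('dist/app.js')); // hypothetical build output
```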
Execution
The days when code lived mostly on the server are behind us. Where you run your code is getting more fluid every year as new technologies become available. The fastest code is the code that produces an idempotent result that can be cached (either by a CDN or in ephemeral storage like Memcached or Redis), so try to get to that result.
JavaScript in the browser is often a good fit for quick replies and checks used in form validation and other interactions that should feel snappy. I’m personally a fan of progressive enhancement and I would never depend “only” on client-side JS-based checks, not least for security reasons. Having said that, a lot of behavior can nowadays be implemented without any programs running server-side, besides connections to databases and other back-end resources via APIs. There are pros and cons to SPAs, and it’s up to you to see whether one makes sense in your case.
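A small sketch of that progressive-enhancement mindset for form validation: the form keeps working without JavaScript and the server stays authoritative; the script only saves a round trip when the browser already knows the input is invalid. The form id is a placeholder.

```js
// Sketch: client-side validation as an enhancement, not a replacement.
const form = document.querySelector('#signup'); // hypothetical form that works without JS
if (form) {
  form.addEventListener('submit', (event) => {
    if (!form.checkValidity()) {
      event.preventDefault();  // skip the round trip only when we know it would fail
      form.reportValidity();   // show the browser's built-in validation messages
    }
  });
}
```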
Maybe you can offload some work to a Web Worker so long-running calculations don’t block the main thread? This comes with many limits so it heavily depends on what your application or website does.
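A minimal Web Worker sketch; largeDataSet, renderResult and heavy-work.js are placeholders for your own data, UI code and worker script:

```js
// Sketch (main thread): hand a long-running calculation to a worker.
const worker = new Worker('heavy-work.js');     // hypothetical worker script
worker.postMessage({ items: largeDataSet });    // largeDataSet: your input data
worker.onmessage = (event) => renderResult(event.data); // renderResult: your UI code

// heavy-work.js (the worker) would look roughly like:
// self.onmessage = (event) => {
//   const result = expensiveCalculation(event.data.items);
//   self.postMessage(result);
// };
```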
A weird middle ground between your page and the network, the Service Worker can be used for tasks relating to Progressive Web Apps. It can also deal with caching and rewriting of requests. It’s not a normal workhorse for code execution, so that’s all I’ll say about it.
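Still, a tiny cache-first example gives an idea of what it can do; real strategies (stale-while-revalidate, cache versioning, offline fallbacks) are more involved, and the asset paths here are placeholders:

```js
// Sketch: a minimal cache-first Service Worker for a couple of static assets.
self.addEventListener('install', (event) => {
  event.waitUntil(
    caches.open('static-v1').then((cache) =>
      cache.addAll(['/css/app.css', '/js/app.js']) // hypothetical assets
    )
  );
});

self.addEventListener('fetch', (event) => {
  event.respondWith(
    caches.match(event.request).then((hit) => hit || fetch(event.request))
  );
});
```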
Until now we’ve still been on the client. One step up is the Edge Worker. This is a service offered by a few CDNs which lets developers run code on the edge infrastructure of the CDN. This minimizes latency for dynamic content and is sometimes seen as “serverless” computing. The negative side is that resources and APIs might be limited. Do you have dynamic requests that are similar and that don’t need a lot of CPU and IO resources to return a result? Maybe an Edge Worker can help you increase performance.
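A sketch in the style of Cloudflare Workers’ service-worker syntax (other CDNs expose similar but not identical APIs): stripping query parameters that don’t change the response, so more requests can be answered from the edge cache instead of travelling to the origin.

```js
// Sketch: normalize the URL at the edge to improve cache hit rates.
addEventListener('fetch', (event) => {
  event.respondWith(handle(event.request));
});

async function handle(request) {
  const url = new URL(request.url);
  url.searchParams.delete('utm_source');   // parameters that don't affect the response
  url.searchParams.delete('utm_campaign');
  return fetch(new Request(url.toString(), request));
}
```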
Cloud-based serverless code execution removes the administrative overhead of running your own infrastructure and might be a good option.
Finally, the characteristics of your infrastructure and servers also influence performance. Here, more resources are usually better: more CPU, more memory, more IO, more bandwidth. Usually stability and high availability win here in the business sense. Running dedicated hardware (whether bare metal, virtualized or in containers) for long-running tasks is still better than sending the same tasks to low-end mobile phones, so check your use-cases and target audience.
Web servers and headers
Web servers come in many flavours. Apache, nginx and IIS are the well-known ones. These came up in the days when sites consisted mostly of static files. Nowadays you can easily run stateful web servers from most programming languages. Sometimes it’s not even necessary to run a web server yourself. Others can do it for you, as a Static-Site-as-a-Service provider, or you can serve your content directly from a CDN (with a push strategy). You will lose configuration power, but it’s an easy way to get going. Do you have the knowledge to tweak everything to perfection? Great, go with your own server software of choice. If not, take your pick of the many external services.
Next up, the headers. Unseen, but very important. Is your keep-alive on? Do you tweak your cache-control per file type or, possibly even better, dynamically? Is compression enabled, and is it dynamic (gzip/zopfli/brotli)? If you do redirects based on mobile vs desktop and on the “ending slash”, are those optimized so only one redirect is done? Are all redirects using the appropriate status codes? Are you using the HSTS header? And of course, are you using HTTP2 or maybe even HTTP3/QUIC? Do you prevent pre-flight requests by setting the appropriate CORS headers? Every little thing counts.
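To make a couple of these concrete, a bare-bones Node.js sketch that sets a few of the headers by hand; in practice your web server, framework or CDN usually takes care of this, and the exact values depend on your site:

```js
// Sketch (Node.js): setting caching and HSTS headers on a response.
const http = require('http');

http.createServer((req, res) => {
  res.setHeader('Cache-Control', 'public, max-age=31536000, immutable'); // for hashed assets
  res.setHeader('Strict-Transport-Security', 'max-age=63072000; includeSubDomains');
  res.setHeader('Content-Type', 'text/css');
  res.end('/* ... */');
}).listen(8080);
```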
Servers and instances
As the main conductor of everything that happens on a server, the operating system can also be tweaked. The TCP network stack can be adjusted to get just that little bit extra performance. Also, keeping smaller TCP buffers can now be beneficial, but you might need to partially re-architect parts of your infrastructure if you have separate servers or load balancers that handle the TLS offloading. This matters most for HTTP2.
More/better hardware usually results in a faster site. Keep current if you have your own bare metal, or keep an eye on the instance type that you use if you’re in the cloud. New instance types can give significant boosts to your final throughput. This can lead to financial savings because fewer instances are necessary to handle the load.
CDNs and DNS
There’s a saying: “The Internet is always broken, somewhere”. The more networks the connection from your prospective visitor has to travel through, the higher the chance that it is disrupted or slow. Using a CDN limits that, and due to the geographical spread of its Points of Presence, response times are usually faster if you store your static files on a CDN. However, the CDN’s optimized internal network topology can also speed up dynamic requests that end up at your own origin. This feature isn’t available on all CDNs, so do your own research. Do you use dynamic acceleration already?
CDNs have different use-cases. Some are extremely fast for small files, others specialize in video streams or security, some are cost-effective, and others are the best in a specific geographic area. How many CDNs did you test before making a decision?
The last question should preferably be answered with something like “a few”. Having multiple CDNs can improve performance because networks will always fluctuate in their congestion. Where CDN 1 might be the best for location X right now, five minutes later it might have a micro-outage and CDN 2 takes over the performance crown. Having multiple CDNs can be beneficial for optimizing performance and uptime, but also costs: if you have enough information about the performance characteristics seen by your visitors, you can play the CDNs off against each other and negotiate based on that. You’ll need Real User Monitoring (RUM) for that.
HTTP2’s prioritisation feature is a touchy subject when it comes to CDNs. Most of them haven’t properly configured it yet. If you use this feature, be selective in your CDN choice.
Talking about HTTP2 features, CDNs can pool and coalesce connections and act as a reverse proxy for multiple, different origins for dynamic requests. Fewer connections to your origin save TLS handshakes and thus time. Do you use the reverse-proxy functionality of CDNs effectively?
Next up, HTTPS/TLS. The more certificates are required to secure the connection, the slower it will be, due to the increased download size during the handshake. (With HTTP2, the headers are already optimized away by compression.) Do you use a lot of intermediate certificates? If yes, check with your security team whether this can be optimized.
Moving on to DNS: it’s the first step in the long journey to a successful connection. It all starts at the nameservers. If they are on a different top-level domain than your main site, you might lose a bit of time on nameserver lookups (though you gain resiliency in case the TLD’s nameservers go down, which is rare). When’s the last time you used “dig +trace” to investigate the whole DNS chain? Can it be optimized? Are your DNS servers on an anycast network and in multiple PoPs to shorten the network path for your visitors? Maybe the subresource requests in your HTML can be served directly from the CDN’s own hostname instead of a good-looking vanity URL like cdn.mysite.com, to save an extra DNS lookup? Also, is the TTL of your DNS records set to a sensible value? Too high and you’re down longer in case of issues; too low and there are performance penalties and higher DNS costs due to increased resource usage.
Multiple CDNs only work well if you can respond to DNS queries in a dynamic way. You can use programming and data sources to figure out which CDN is “best” according to your own rules, based on RUM performance data, synthetic data, geographic data, financial data and more. Besides CDNs, this also works with multi-cloud. The best part is that your connections are always self-healing and performance-optimized after the initial set-up. Did you look into (managed) DNS services that provide custom rules?
Till infinity
There are plenty of areas I haven’t mentioned yet: fonts, databases and their optimisation, and cloud-specific optimisations and best practices. There’s always something to optimise.
Optimisation without measuring the effect is meaningless. RUM and synthetic monitoring deserve their own sections, but I feel those already get plenty of in-depth posts elsewhere.
Summary
There are tons of factors that affect web performance, some in small ways and some in bigger ways. The usual suspects are HTML, CSS and JavaScript, but also headers, CDN configuration, DNS and where the code executes.
Web Performance Optimisation is never finished. Technology, frameworks and tooling improve at breathtaking speed. Let’s keep that up.