Web Performance Calendar, 2023 Edition

ABOUT THE AUTHOR
Erwin Hofman

Erwin Hofman (@blue2blond) has been a developer since 2000, created his own performant CMS (2003) that is used by his own agency (2005), became a pagespeed consultant for e-commerce in 2015, and co-founded RUMvision in 2021. He became a Google Developer Expert in 2023.

He mostly writes about pagespeed on LinkedIn

SPA site owners waiting for Core Web Vitals to fully support their sites

SPA site owners are most likely aware of the gap in Core Web Vitals data for their sites. Most metrics (FCP, LCP, FID) are only reported by the browser once per page load, so they can’t be measured for soft navigations even if sites were willing to put in extra effort. TTFB is similar, but could potentially be calculated for soft navigations. And reporting CLS and INP, while technically possible, comes with other challenges.

Introduction

Core Web Vitals metrics don’t treat SPA route changes the same way as traditional page loads in Multi-Page Applications (MPAs).

When they wrote the linked article in September 2021, Google had already asked themselves what they were doing to ensure MPAs do not have an unfair advantage compared to SPAs.

One of their answers to their own question was as follows:

Design new APIs that enable better SPA measurement.

Which is exactly what Google has been doing by working on and introducing a soft navigations experiment. If you’re new to soft navigations: this is the term used by the web community to distinguish SPA navigations from MPA navigations.

RUMvision

I’m a developer by origin, nowadays working both as a consultant and on RUMvision’s JavaScript. This JavaScript is responsible for collecting and submitting web-vitals data plus additional metrics and dimensions, and it is a layer on top of Google’s web-vitals library.

Origin trials

Soft navigation measurement support in the browser is in an experimental phase, also known as an origin trial. Origin trials allow developers, RUM providers and other web enthusiasts to try out new features and APIs and provide feedback to the people who came up with a new web platform idea. You could say it allows you to experiment with features today that could be part of a browser tomorrow.


Not every first concept is a success. Luckily, that wasn’t the case here: soft navigations are in fact nailing it

Origin trials are an area of focus for RUMvision because we believe they offer us a way to get ahead of the curve for our customers, while also allowing us to give feedback on upcoming APIs. So far, we’ve embedded 4 origin trials.
Obviously, we only start working on origin trials when they make sense for a RUM provider like us. We don’t look at every origin trial, but we do strongly consider ones that can help us provide more insights to our customers.

Long Animation Frames (LoAF) API and the Soft Navigations API are examples here. And if you ask me, both are masterpieces. Origin trials and web API incubations in general are something that I have easily taken for granted in the past, but it’s very interesting to be more involved in them. And quite fun to work with too.

Soft navigations

In this article, I will talk a bit about soft navigations and especially elaborate on how we built a layer on top of Google’s web-vitals soft-navs branch.

The soft navigations explainer page actually describes this perfectly, as well as why the community needs a soft navigations API.

The SPA web-vitals challenge

The often-heard Core Web Vitals complaint from SPA (and PWA) site owners is that their sites are penalized compared to non-SPAs.

Missing and incorrect data

The reason is that Core Web Vitals does not report metrics specifically for soft navigations. Well, it does, kind of. For example, CLS and INP are reported across the whole page lifetime. But they are then typically attributed to the initial “hard navigation” URL, as the URL or route change isn’t picked up by the Core Web Vitals APIs. This means you typically get one CLS and one INP measurement across the whole of the SPA, rather than one per page.

This could mean that when looking at Google Search Console, the CrUX API, PageSpeed Insights or other data coming from CrUX, you might end up looking at data that doesn’t reflect what users consider page views, compared to MPAs.

“But SPAs are much better in real life”

Mainly seeing experiences by landing URL is unfortunate though, as instant subsequent page navigations are a promise that comes with most SPA frameworks (whether that promise actually holds is a totally different discussion).

At RUMvision we have started measuring TTFB for soft navigations and often do see a much smaller time for these:


A screenshot from RUMvision, showing different TTFBs for different navigation types

As soft navigations show a better TTFB, this could indicate that SPAs are penalized by web vitals heuristics. But are they, when it comes to the Core Web Vitals?

As a matter of fact, the Google team has already leveled the playing field somewhat by changing LCP, CLS and INP characteristics:

  • LCP used to re-report a removed and re-hydrated candidate. Google changed this behaviour in January 2021 (Chrome 88): elements that were the LCP but are then removed are still considered the LCP if no bigger element is rendered afterwards.

  • One of the changes to the CLS metric is that, since May 2021, it is tracked in session windows of at most 5 seconds. Pages thus only get their worst burst of CLS, rather than the full accumulation. This helps longer-lived pages like SPAs that previously saw an ever-increasing CLS number.

  • The highest INP durations are already being ignored (one for every 50 interactions), which again benefits longer-lived pages like SPAs, as they tend to get more interactions within a single page life cycle.

So the CLS and INP metrics account somewhat for the use cases of SPAs.

However, the LCP metric is genuinely harder to pass when all other page navigations within your site are soft navigations. That’s because the LCP number will be based only on those initial navigations, where the DNS lookup still had to happen and render-blocking stylesheets weren’t cached yet, leading to a higher TTFB, FCP and LCP.

For MPAs, this initial heavy hit can be offset by future page loads within the site (where the connection is already established and common site assets like CSS, JS and logos are already cached). SPAs do not get such lighter, subsequent LCPs included, and so only get the full initial hit.

The web-vitals library

I do acknowledge the responsiveness challenge that comes with most JavaScript frameworks, and maybe even layout shifts due to page transitions. However, that should be attributed to how the site is actually experienced, not to limitations in how performance metrics are collected and reported.

As mentioned previously, it is already possible to segment some of the metrics (CLS and INP) by soft navigation route, but trying to do that through the web-vitals library came with other challenges:

  • You can either let the library report metrics once per navigation, or report all changes and triage the incoming data yourself. The latter could be used to report on soft navigations in a custom way (see the sketch after this list).

  • Unfortunately, even then some metrics (LCP, FCP, FID, TTFB) are not reported for subsequent navigations.

  • Even for the metrics that are (CLS, INP), only larger values are reported, meaning you only get a partial view of these metrics for soft navigations.
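
A minimal sketch of that reportAllChanges approach, using the standard web-vitals IIFE build; the /rum beacon endpoint and the sendToAnalytics helper are placeholders of my own:

(function() {
  var script = document.createElement('script');
  script.src = 'https://unpkg.com/web-vitals/dist/web-vitals.iife.js';
  script.onload = function() {
    // Hypothetical helper: beacon each report, keyed by the route at report time.
    function sendToAnalytics(metric) {
      navigator.sendBeacon('/rum', JSON.stringify({
        route: location.pathname, // rough heuristic for soft navigation triage
        name: metric.name,
        id: metric.id, // stable per (hard) page load; later reports supersede earlier ones
        value: metric.value
      }));
    }
    // Report every change instead of only the final value.
    webVitals.onCLS(sendToAnalytics, {reportAllChanges: true});
    webVitals.onINP(sendToAnalytics, {reportAllChanges: true});
  };
  document.head.appendChild(script);
}());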

Luckily, as the soft navigations API was introduced, web-vitals gained the soft-navs branch, which you might want to use instead.

  • This made it easier to collect LCP, CLS and INP for soft navigations as well. But TTFB would always be reported as 0ms.

  • The CLS and INP of a previous page will be reported before the new TTFB is reported. This makes it easier to send them to your analytics endpoint in chronological order.

  • This branch includes a unique uuid per page as metric.navigationId, which allows you to tie together page-specific data.

In other words: the soft-navs branch of web-vitals allows you to measure soft navigations with minimal effort and attribute them back to the appropriate route.

The code below (latest at time of writing) can be used to report metrics including soft navigations.

(function() {
  // Load the soft-navs build of the web-vitals library (attribution build).
  var script = document.createElement('script');
  script.src = 'https://unpkg.com/web-vitals@soft-navs/dist/web-vitals.attribution.iife.js';
  script.onload = function() {
    // Opt in to soft navigation reporting for each metric.
    webVitals.onTTFB(console.log, {reportSoftNavs: true});
    webVitals.onLCP(console.log, {reportSoftNavs: true});
    webVitals.onCLS(console.log, {reportSoftNavs: true});
    webVitals.onINP(console.log, {reportSoftNavs: true});
  };
  document.head.appendChild(script);
}());

Do note that to use this soft-navs branch, you need to either enable the Experimental Web Platform features flag or participate in the soft navigations origin trial.

You still need to send the data to an endpoint yourself and correctly batch together page-specific data to prevent incorrect attribution. The web-vitals documentation shares a code example to give you a head start, but this is exactly what a RUM provider like RUMvision can help with.
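
For example, a minimal sketch of such batching, keyed by the metric.navigationId from the snippet above (the /rum endpoint is a placeholder; queueMetric would be passed as the handler instead of console.log):

const batches = new Map();

// Group reported metrics per page view using the navigationId.
function queueMetric(metric) {
  const batch = batches.get(metric.navigationId) || [];
  batch.push({name: metric.name, value: metric.value});
  batches.set(metric.navigationId, batch);
}

// Flush everything once the page is hidden (placeholder endpoint).
addEventListener('visibilitychange', function() {
  if (document.visibilityState === 'hidden' && batches.size) {
    navigator.sendBeacon('/rum', JSON.stringify(Array.from(batches)));
    batches.clear();
  }
});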

The start of embedding SPA tracking in RUMvision

Until web-vitals published the soft-navs branch, tracking even limited CWV metrics in SPAs needed quite a bit of customization, and even then came with severe limitations.

But with the soft-navs branch of the web-vitals library published, it became way easier. On top of that, soft navigation experiments entered a second origin trial. So, over at RUMvision, we decided to start working on v4 of our tracking snippet to incorporate soft navigations.

Who should fix the 0ms TTFB?

Although the web-vitals library is tackling some challenges already, it was still missing some data.

Mainly, TTFB was known to report 0 milliseconds. To be honest, all the concerns described over at developer.chrome.com are justified (they’ve spent way more time in this area, after all). Their reasoning for reporting a TTFB of 0ms is based on the following (in my opinion, valid) concerns:

  • Does a fetch request always actually happen for a soft navigation?
  • And if one does, which one should be picked?
  • And was it actually related to the navigation?
  • Could the LCP be painted even when the fetch request fails?

All of the questions above are tricky to answer, and the answers even vary per stack and framework. For a general-purpose library, taking an opinionated stance on this for the general use case is understandable. The Google team also tries to keep the library as tiny as possible, so adding a lot of extra code to try to tackle this would grow the library more than desired.

With this in mind, how could we still end up in a situation where we do collect TTFB? This is where RUM providers can make a difference. Because, on top of the web-vitals library, we:

  • know which framework is going to run that JavaScript;
  • can allow site owners to add additional pointers as to which resource should be considered the main TTFB resource.

How RUMvision added TTFB support to SPA tracking

We’ve tested different stacks here to learn which request should be considered the most important one; for example, which resource actually contains the contents for the upcoming page transition. We tested sites running on NextJS, NuxtJS, Angular, Gatsby and custom PWAs.

No uniform way

Even within NextJS, you can find different scenarios when it comes to soft navigations and possible TTFB candidates:

  • One prefetched all data up-front;
  • One just started a request on click;
  • Another one prefetched on hover, but then did two additional calls to the same URL.

Especially the last one leaves you, well… unsure as to which one to consider the most important request.

And while the initiatorType within NextJS will always be fetch, I also ran into link (when prefetched) and xmlhttprequest within other stacks.

There just doesn’t seem to be a uniform way of telling which resource to use for TTFB calculations, once again confirming the web-vitals concerns and explaining why they chose to report 0ms as TTFB.

Configuring per platform

Site owners could already share their tech stack with RUMvision, for example to import Server-Timing metrics and dimensions that are often exposed by the CDN, host, stack or plugins that a site is using.

NextJS

Knowing the stack means we can use this information to automate parts of the TTFB analysis as well. We just learned that the exact moment resources are loaded within NextJS, and within SPAs in general, can be inconsistent.
But the pathname is often predictable. A NextJS example is as follows:

/_next/data/url-of-actual-page.json

As a result, NextJS is where our experimentation began, as the setup proved to be the easiest. This phase was successful, so we moved on to other stacks.

Patterns for other platforms

Based on additional research within other platforms, we decided to introduce endpoint patterns per template type, because we discovered quite soon that category pages had different endpoints than product pages.

An example we saw at a NuxtJS website is as follows:

/api/categories/{3}?lang={0}&slug={2}

We already supported regular expressions for identifying and grouping data per page template. We extended that feature and allowed site owners to also provide a pattern for the API endpoints of pages falling in that page template group.

With the pathname of a visited page in mind, our code will then translate it into an exact API endpoint. A full transformation example can be found in our docs.

The exact API endpoint (or a substring of it) will then be used to filter and return the correct fetch request(s) from the list of resources. That request will then be used for TTFB purposes.
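
As a rough sketch of the idea (the exact transformation lives in our docs; the {n} placeholder semantics here are my assumption), the pattern’s tokens could be filled from the visited pathname’s segments, after which the resulting endpoint filters the observed resources:

// Hypothetical interpretation: {n} refers to the n-th segment of the pathname.
function buildEndpoint(pattern, pathname) {
  const segments = pathname.split('/').filter(Boolean);
  return pattern.replace(/\{(\d+)\}/g, function(match, i) {
    return segments[Number(i)] !== undefined ? segments[Number(i)] : match;
  });
}

// Keep only the resource entries whose URL contains the expected endpoint.
function matchResources(resources, endpoint) {
  return resources.filter(function(entry) {
    return entry.name.indexOf(endpoint) !== -1;
  });
}

// e.g. buildEndpoint('/api/categories/{3}?lang={0}&slug={2}', '/en/c/shirts/123')
// would yield '/api/categories/123?lang=en&slug=shirts'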

Analyzing new fetch requests

The more technical explanation is that we observe and save all upcoming resources in the following way:

const fetchResources = [];

// Observe all resource timing entries; `buffered: true` also includes
// entries that were dispatched before this observer was created.
const resourceObserver = new PerformanceObserver(function(list) {
  list.getEntries().forEach(function(e) {
    fetchResources.push(e);
  });
});
resourceObserver.observe({
  type: 'resource',
  buffered: true
});

But such a list could easily grow to 50 or even 200 resources for a single user interaction/soft navigation. That’s why we ask site owners to specify the API endpoint and/or the initiatorType of their API resources, to prevent our script from running into max buffer size issues.
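
A sketch of how such a site-provided hint could be applied inside the observer callback (apiPattern stands in for the configured endpoint or initiatorType hint):

const apiPattern = /\/api\//; // placeholder for the site-configured hint

const filteringObserver = new PerformanceObserver(function(list) {
  list.getEntries().forEach(function(e) {
    // Only buffer likely API responses, keeping the array small.
    if (e.initiatorType === 'fetch' && apiPattern.test(e.name)) {
      fetchResources.push(e);
    }
  });
});
filteringObserver.observe({type: 'resource', buffered: true});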

But even that, plus intermittently truncating the array, could leave us with more than one resource in the fetchResources array.
An example: when navigating to a product listing page, a framework could eagerly fetch data for all products listed on that page. It’s then hard to tell which request was related to the next user interaction if we don’t have additional patterns to work with.

This is the reason why we introduced the endpoint patterns described earlier.

Waiting for the LCP

But when should the TTFB be reported? The web-vitals library won’t wait for any specific resource when dispatching the 0ms TTFB. And even once the (empty) TTFB is reported by the web-vitals library, we have no guarantee that the resource we are expecting has been downloaded already.

So we intercept and delay the reporting of the TTFB until the LCP is reported.

As both the underlying web vitals APIs and soft navigations are Chromium-only anyway, we thought this solution was a safe bet (for now).

Because once the LCP is known, we have its timing information (such as startTime and the actual file). And if there was in fact a dependency on a fetch request, we can assume it has finished by the time the LCP is reported.
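
A hedged sketch of that interception, reusing the soft-navs handlers from earlier (reportTTFB is a hypothetical finalizer, not part of the library):

let pendingTTFB = null;

// Hypothetical finalizer: pick the matching resource and beacon the TTFB.
function reportTTFB(ttfb, lcp) {
  /* resource selection and beaconing would happen here */
}

webVitals.onTTFB(function(metric) {
  pendingTTFB = metric; // hold the 0ms soft navigation TTFB; don't report yet
}, {reportSoftNavs: true});

webVitals.onLCP(function(lcp) {
  // Only finalize the TTFB that belongs to the same page view.
  if (pendingTTFB && pendingTTFB.navigationId === lcp.navigationId) {
    reportTTFB(pendingTTFB, lcp);
    pendingTTFB = null;
  }
}, {reportSoftNavs: true});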

Reporting the TTFB

Once we know the LCP, we can finalize the TTFB as well. That is the moment we transform the current URL into an API endpoint using the configured pattern, loop through the remaining fetchResources and retrieve one or multiple entries.

With those entries, we check which of them were fully done downloading before the LCP started, by comparing each entry.responseEnd with lcp.attribution.lcpResourceEntry.startTime (or with the LCP’s value if attribution wasn’t enabled).

That could still result in multiple TTFB candidates. We decided to pick the last one matching the above clause.
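
In code, that selection could look roughly like this (a sketch; candidates holds the filtered resource entries and lcp the reported LCP metric):

function pickTtfbEntry(candidates, lcp) {
  // Fall back to the metric's value when the attribution build isn't used.
  const lcpStart = (lcp.attribution && lcp.attribution.lcpResourceEntry)
    ? lcp.attribution.lcpResourceEntry.startTime
    : lcp.value;

  // Keep entries that finished downloading before the LCP started...
  const finished = candidates.filter(function(e) {
    return e.responseEnd <= lcpStart;
  });

  // ...and pick the last matching one as the TTFB entry.
  return finished[finished.length - 1];
}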

Once we have a single TTFB candidate, we supplement it with additional information before reporting it to RUM.

For example (a code sketch follows this list):

  • If the TTFB’s entry.responseEnd happened before the actual soft navigation startTime, it was fully prefetched;

  • If not, but entry.startTime happened before the soft navigation’s startTime, there was an attempt to prefetch it, but it wasn’t done downloading (for whatever reason; this will be collected via other dimensions, such as bad internet connectivity);

  • If it doesn’t meet the above scenarios, we consider it a request in a normal flow, but we add an index number (to cover the case described earlier where multiple entries were dispatched; this helps determine whether other requests sat in between or were downloaded simultaneously, giving developers pointers as to where to start debugging).

  • Additionally, we share whether the LCP might have depended on the TTFB entry, or (when entry.responseEnd is smaller than lcp.attribution.lcpResourceEntry.startTime) not.
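
A minimal sketch of that classification, assuming navStart is the soft navigation’s startTime and entry the chosen TTFB resource entry:

function classifyTtfbEntry(entry, navStart) {
  if (entry.responseEnd <= navStart) {
    return 'prefetched'; // fully downloaded before the navigation started
  }
  if (entry.startTime <= navStart) {
    return 'prefetch-incomplete'; // prefetch started, but wasn't done downloading
  }
  return 'normal-flow'; // requested as part of the navigation itself
}

// The LCP might have depended on the entry, unless the entry finished
// before the LCP resource even started downloading.
function lcpMightDependOn(entry, lcpStart) {
  return !(entry.responseEnd < lcpStart);
}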


A screenshot from RUMvision, showing the server response time (so, not full TTFB) for different resource priorities

In the case of the screenshot, files were clearly prefetched up front. And given the prefetch (finished) state, it works. But chances are that simultaneously and eagerly prefetching so many files up front impacts not only the prefetched files, but maybe also files within the critical path.

TTFB and LCP sub-parts

We now also have the information to change sub-parts of the TTFB and LCP metrics. For example, an LCP that happens after a soft navigation will likely have a resourceLoadDelay.
But unlike hard navigations, web-vitals can’t attribute any of that delay to the TTFB of a main document.

By this time, we do, so we alter the resourceLoadDelay (or elementRenderDelay when the LCP is not an image) and set the timeToFirstByte sub-part (which, as explained, would otherwise always be 0, as the default reported TTFB is also 0ms).

When it comes to the TTFB itself, we also calculate attribution timings to mimic the way web-vitals reports such metric data. But we alter it in two ways:

waitingTime

We calculate the difference between the TTFB entry’s startTime and the actual startTime of the soft navigation. This gets reported as the TTFB’s waitingTime (which already exists for hard-navigation TTFB).

Because if there’s a delay, site owners will want to know. But if that delay is hidden underneath an even higher INP, it would otherwise not be reported in a consistent way, making it harder to be aware of bottlenecks.

resourceLoadTime

resourceLoadTime originally is an LCP sub-part and isn’t around for TTFB entries.

But TTFB represents the time at which the first bytes of the response are returned, not when it was fully done downloading (aka responseEnd). However, the latter is actually what might be important if your SPA needs the full contents to be able to act on them and render images and text.

Without this information, site owners would still have the other TTFB sub-parts, but could be blind to issues with downloading the contents. That could regress over time, as the site’s traffic grows, responses grow, the underlying architecture slows down or audience conditions change.
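
Put together, a sketch of how these two sub-parts could be derived (navStart again being the soft navigation’s startTime; RUMvision’s exact field derivation may differ):

function buildTtfbAttribution(entry, navStart) {
  return {
    // Time the request spent waiting after the soft navigation started
    // (clamped at 0 for fully prefetched entries).
    waitingTime: Math.max(0, entry.startTime - navStart),
    // Full download duration of the response, not just time to first byte;
    // an assumption for how responseEnd is folded into a sub-part.
    resourceLoadTime: entry.responseEnd - entry.responseStart
  };
}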

Non-soft navigation SPA tracking

This article is already longer than expected, so I will keep this short. We did need additional code to track SPA navigations that don’t meet the soft navigation API heuristics.

Our docs elaborate on this as well, but we then fall back to listening to pushState or replaceState (to be configured by site owners), and we set the reportAllChanges flag when using the web-vitals library, applying additional triage to beacon metric information at the right moment.
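
A common sketch of such a fallback (not our exact code; onRouteChange is a hypothetical hook that triages and beacons the buffered metrics):

['pushState', 'replaceState'].forEach(function(method) {
  const original = history[method];
  history[method] = function() {
    const result = original.apply(this, arguments);
    onRouteChange(location.pathname); // treat this as a route change
    return result;
  };
});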

Conclusion

Within the sites where this is already running, we’ve seen very positive and consistent results. And while there are many RUM providers out there that collect data on all resources, we’ve been able to pinpoint it a bit more: relating resources to the correct URL and LCP, and shaping it to match web vitals heuristics.

Still experimental

Despite its consistency, I would still call this experimental, just like both the soft navigations API and the soft-navs branch. But putting it out there already might help other RUM providers, and might help us come up with improved heuristics on our end too.

Measuring soft navigation Core Web Vitals means it likely will not mirror CrUX data

The goal of the web-vitals library is different though:

The web-vitals library is a tiny (~1.5K, brotli’d), modular library for measuring all the Web Vitals metrics on real users, in a way that accurately matches how they’re measured by Chrome and reported to other Google tools

This is one reason the soft-navs branch is just that (a branch) and has not been merged into the main branch yet. We don’t know how (or even if) soft navigation Core Web Vitals will be reflected in CrUX.

RUM data can already differ from CrUX data. Tracking SPA navigations could cause this gap to become bigger, whether by showing more positive numbers or not.

In either case, SPA owners are at least able to measure this with the web-vitals soft-navs branch, or to benefit from the work that RUM providers do on top of it (in our case, TTFB included).

Although such data might not reflect your Core Web Vitals assessment or its SEO value, you still want to know about the things impacting your UX and revenue.
