Web Performance 2025: The Shift from Optimization to Prediction

7th Dec 2025 by Fabian Krumbholz

ABOUT THE AUTHOR

Fabian Krumbholz (@fabkru.bsky.social) is a Web Performance Consultant at Speed Kit and a recognized Google Developer Expert. He helps businesses deliver instant experiences.

Building instantly loading websites has always been my goal. For years, however, this remained an aspirational target rather than a technical reality for the open web. We spent a decade optimizing critical rendering paths and shaving milliseconds off Time to First Byte, but the physical limits of the network always kept “true instant” just out of reach.

But this changed in 2025—at least for Chromium browsers.

With the maturation of the Speculation Rules API and aggressive prerendering, we can finally get very close to instantly loading pages. It is not perfect yet: we still cannot prerender the very first page of a visit (the landing page). The impact on subsequent navigations, however, is profound.

Redefining “Good” Performance

Our RUM data from hundreds of e-commerce sites this year has shown that with predictive preloading, we are able to move a significant volume of page loads into the sub-300ms category. This shift has forced us to rethink how we grade performance.

Internally, we have split the standard “Good” section of the Largest Contentful Paint (LCP) into three distinct tiers to make this shift transparent:

Instant (<300ms)
Fast (<1000ms)
OK (<2500ms)
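As a minimal sketch of how these tiers could be applied to RUM samples: the cut-offs below are the ones listed above, while the function names, the type names, and the “beyond-good” catch-all label are illustrative assumptions rather than part of our tooling.

```typescript
// Tier cut-offs follow the list above; all identifiers are illustrative.
type LcpTier = "instant" | "fast" | "ok" | "beyond-good";

// Map a single LCP sample (in milliseconds) to a tier.
function classifyLcp(lcpMs: number): LcpTier {
  if (!Number.isFinite(lcpMs) || lcpMs < 0) {
    throw new RangeError(`invalid LCP value: ${lcpMs}`);
  }
  if (lcpMs < 300) return "instant";
  if (lcpMs < 1000) return "fast";
  if (lcpMs < 2500) return "ok";
  return "beyond-good"; // anything outside the three tiers
}

// Share of page loads per tier, e.g. for a dashboard.
function tierShares(samplesMs: number[]): Record<LcpTier, number> {
  const counts: Record<LcpTier, number> = {
    instant: 0,
    fast: 0,
    ok: 0,
    "beyond-good": 0,
  };
  for (const ms of samplesMs) {
    counts[classifyLcp(ms)] += 1;
  }
  const total = samplesMs.length || 1; // avoid division by zero on empty input
  return {
    instant: counts.instant / total,
    fast: counts.fast / total,
    ok: counts.ok / total,
    "beyond-good": counts["beyond-good"] / total,
  };
}
```

Tracking the share of loads per tier, rather than a single percentile, makes it visible when page loads move from “Fast” into “Instant”.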

The current “Good” threshold of 2.5 seconds is now five years old. It was originally chosen largely as a motivational target to encourage broader adoption, but with today’s new browser APIs, more powerful devices and infrastructure, it is no longer sufficient. It is time to rethink these thresholds, driven by two irrefutable arguments: Psychology and Business.

Psychologically, a response must happen within 100ms to feel truly instantaneous. However, achieving 100ms consistently is still incredibly difficult given current network latency and device constraints. A response under 1000ms, meanwhile, is sufficient to maintain the user’s flow of thought. Therefore, we propose 1000ms and 300ms as the new, practical thresholds for the next generation of web performance.

The data supports this distinction. Conversion rates rise far more steeply once LCP drops below 1000ms, and on e-commerce pages the largest Conversion Rate (CVR) uplift is found in this sub-1000ms range. A “Good” Core Web Vitals score (2.5s) is simply no longer good enough to compete at the highest level.

Overcoming the Hurdles: Prediction and Execution

Achieving these results isn’t as simple as just implementing the standard Speculation Rules API. The built-in triggers are often too close to the actual user navigation to be effective. To get the best results, you have to overcome substantial hurdles, primarily revolving around timing and accuracy.

Specifically, you need to use the immediate eagerness setting to give the browser enough time to fetch, parse, and fully prerender the page before the user clicks. However, this demands high accuracy; otherwise, you waste the user’s bandwidth and increase load on the origin server. We have had good results with a combination of AI models trained on RUM data and real-time user behavior signals.
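To illustrate the idea, here is a minimal sketch of turning predictions into a speculation rule set with immediate eagerness. The rule format (a “prerender” list with “source”, “urls” and “eagerness”) is the standard Speculation Rules syntax; the Prediction shape, the probability threshold and the URL cap are hypothetical and do not describe our actual models.

```typescript
type Eagerness = "immediate" | "eager" | "moderate" | "conservative";

interface PrerenderRule {
  source: "list";
  urls: string[];
  eagerness: Eagerness;
}

interface SpeculationRules {
  prerender: PrerenderRule[];
}

// Hypothetical output of a prediction model: a candidate next URL and
// the estimated probability that the user navigates to it.
interface Prediction {
  url: string;
  probability: number;
}

// Keep only confident predictions, most likely first, capped at maxUrls,
// so that false positives cost as little bandwidth and origin load as possible.
function buildPrerenderRules(
  predictions: Prediction[],
  minProbability = 0.8, // assumed threshold, tune against real data
  maxUrls = 2 // assumed cap
): SpeculationRules {
  const urls = predictions
    .filter((p) => p.probability >= minProbability)
    .sort((a, b) => b.probability - a.probability)
    .slice(0, maxUrls)
    .map((p) => p.url);

  return {
    prerender:
      urls.length > 0 ? [{ source: "list", urls, eagerness: "immediate" }] : [],
  };
}

// In a browser, the rules would be injected roughly like this:
//   const script = document.createElement("script");
//   script.type = "speculationrules";
//   script.textContent = JSON.stringify(buildPrerenderRules(predictions));
//   document.head.append(script);
```

The threshold and cap are the two knobs that trade prerender hit rate against wasted requests.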

A critical hurdle that is often underestimated is the sheer impact on server load. If you prefetch resources directly from the origin server rather than serving them from a CDN edge, aggressive speculation can easily 10x your server load. This not only skyrockets infrastructure costs but can also cause the server to become unresponsive if it isn’t capable of handling the extra throughput. This creates a dangerous scenario during high-traffic campaigns like Black Friday, where aggressive preloading could inadvertently trigger a self-inflicted DDoS attack, turning a sales peak into an outage.
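One possible safeguard, sketched here under assumptions: Chromium marks speculative requests with a Sec-Purpose request header (“prefetch”, or “prefetch;prerender” for prerenders), so an origin or edge can decline them when it is under pressure. The load metric, the threshold and the function names below are hypothetical; answering with a non-2xx status such as 503 is one way to make the browser drop the speculation.

```typescript
// Header names are assumed to be lower-cased, as many server frameworks provide them.
type RequestHeaders = Record<string, string | undefined>;

// True when the browser issued the request speculatively
// (Sec-Purpose: prefetch or Sec-Purpose: prefetch;prerender).
function isSpeculativeRequest(headers: RequestHeaders): boolean {
  const purpose = headers["sec-purpose"];
  return purpose !== undefined && purpose.toLowerCase().includes("prefetch");
}

// Decide whether to refuse a request. `originLoad` is a hypothetical
// utilisation figure between 0 and 1; real navigations are never shed here.
function shouldShedSpeculation(
  headers: RequestHeaders,
  originLoad: number,
  shedAbove = 0.7 // assumed threshold
): boolean {
  return isSpeculativeRequest(headers) && originLoad > shedAbove;
}
```

A handler would call shouldShedSpeculation before doing any expensive work and return the error status early, so that real user navigations keep priority during traffic peaks.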

The other major problem we had to solve this year was the potential side effects caused by running JavaScript in a hidden tab. Prerendering a page blindly can trigger analytics events, mess up product history, or execute heavy hydration logic before the user even sees the page.
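For pages that are fully prerendered, one common mitigation is to defer side effects until the page is actually shown. The sketch below relies on the browser’s document.prerendering flag and the prerenderingchange event; the helper name and the narrow interface (used so the logic can run outside a browser) are illustrative assumptions.

```typescript
// Minimal slice of the Document interface that this helper needs.
interface PrerenderAwareDocument {
  prerendering?: boolean;
  addEventListener(
    type: string,
    listener: () => void,
    options?: { once?: boolean }
  ): void;
}

// Run `task` immediately on a normal page load, or once the prerendered
// page is activated (the moment the user actually navigates to it).
function runWhenVisibleToUser(
  doc: PrerenderAwareDocument,
  task: () => void
): void {
  if (doc.prerendering) {
    doc.addEventListener("prerenderingchange", () => task(), { once: true });
  } else {
    task();
  }
}

// Browser usage (sendPageView is a hypothetical analytics call):
//   runWhenVisibleToUser(document, () => sendPageView(location.href));
```

The same wrapper can guard anything that should not happen for a page the user never sees, such as writing to a recently viewed products list.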

The new prerender-until-script feature addresses this problem directly. It allows the browser to fetch the document and process the DOM, but pauses before executing any scripts. This fixes the side effect issues (like premature tracking) and solves a major performance regression: when JavaScript-heavy pages were prerendered in the background, we saw the experience on the current page degrade significantly. Even though these background tasks don’t run on the current page’s main thread, the CPU contention was enough to cause jank. The new approach ensures the active page remains smooth, especially on older or less powerful devices.

The Compression Dictionary Breakthrough

The last big update of the year—and the perfect partner for prerendering—is Compression Dictionaries.

This technology lets us shrink HTML by 79% on average compared to the standard gzipped version. We are not just using this for the “last mile” to the user; we have integrated dictionary compression deep into our backend systems. We use dedicated shared dictionaries for our long-term HTML storage. To ensure maximum efficiency, we use our RUM data to identify the most frequently accessed documents, which helps us build the most effective dictionaries possible. We typically generate a single, optimized dictionary per site. By storing and transferring data internally in this format, we are seeing significant cost reductions in both storage and internal bandwidth.

For delivery, we push the dictionary-compressed HTML directly to our edge infrastructure. The edge handles the negotiation: if a browser supports the new standard, it gets the hyper-compressed version. If not, the edge automatically transcodes it to a cached Brotli version. This effectively allows us to remove complexity from our application servers—they only need to output the most efficient format, leaving compatibility concerns to the edge.
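The negotiation step can be sketched roughly as follows. With Compression Dictionary Transport, a supporting browser advertises “dcb” (dictionary-compressed Brotli) in Accept-Encoding and sends the hash of the dictionary it holds in the Available-Dictionary header. The function and type names are illustrative, quality values (q=) are ignored for brevity, and this is a simplified model rather than our edge implementation.

```typescript
type Encoding = "dcb" | "br" | "gzip" | "identity";

interface NegotiationInput {
  acceptEncoding: string; // raw Accept-Encoding request header
  availableDictionary?: string; // raw Available-Dictionary header, e.g. ":<base64 SHA-256>:"
}

// Extract the encoding tokens, dropping any parameters such as ";q=0.8".
function parseAcceptEncoding(header: string): Set<string> {
  const tokens = header
    .split(",")
    .map((part) => (part.split(";")[0] ?? "").trim().toLowerCase())
    .filter((token) => token.length > 0);
  return new Set(tokens);
}

// Pick the best encoding the client can decode. The dictionary variant is
// only usable when the client holds exactly the dictionary it was built with.
function chooseEncoding(
  input: NegotiationInput,
  storedDictionaryHash: string // base64 SHA-256 of the dictionary used at storage time
): Encoding {
  const accepted = parseAcceptEncoding(input.acceptEncoding);
  // Strip the surrounding colons of the structured-field byte sequence.
  const clientHash = (input.availableDictionary ?? "")
    .trim()
    .replace(/^:|:$/g, "");

  if (
    accepted.has("dcb") &&
    clientHash !== "" &&
    clientHash === storedDictionaryHash
  ) {
    return "dcb";
  }
  if (accepted.has("br")) return "br";
  if (accepted.has("gzip")) return "gzip";
  return "identity";
}
```

Because the response now depends on both request headers, any cache in front of this logic has to vary on Accept-Encoding and Available-Dictionary.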

We honestly expected a bigger direct impact on LCP from this technology, but the bottleneck is often processing rather than raw HTML transfer. On average we see an LCP improvement of about 40ms, though in edge cases it reached 250ms. The bandwidth savings, however, are a game changer for our preloading strategy. The massive size reduction allows us to preload up to 8 HTML files for the bandwidth cost of one. This drastically lowers the penalty for false positives in our prediction models and allows for much more aggressive predictive preloading without bloating network usage.
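The relationship between size reduction and preload budget is simple arithmetic, sketched below with illustrative function names. It assumes the reduction is measured against the same baseline transfer size: a file reduced by a fraction r costs (1 − r) of the original, so 1 / (1 − r) files fit into the bandwidth of one.

```typescript
// How many compressed files fit into the bandwidth of one baseline file,
// given a fractional size reduction (0.79 means 79% smaller).
function filesPerBaselineTransfer(sizeReduction: number): number {
  if (sizeReduction < 0 || sizeReduction >= 1) {
    throw new RangeError("sizeReduction must be in [0, 1)");
  }
  return 1 / (1 - sizeReduction);
}

// Inverse: the size reduction needed to fit `files` into one baseline transfer.
function reductionForMultiplier(files: number): number {
  if (files < 1) {
    throw new RangeError("files must be at least 1");
  }
  return 1 - 1 / files;
}

const atAverage = filesPerBaselineTransfer(0.79); // ≈ 4.76 files
const neededForEight = reductionForMultiplier(8); // 0.875, i.e. 87.5% smaller
```

Under this model, the average 79% reduction corresponds to roughly 4 to 5 files per baseline transfer, and 8 files corresponds to documents that compress to 87.5% below baseline, which fits the “up to” wording.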

2026: Closing the Visibility Gap

Looking back on 2025, it is clear that we have crossed a major threshold. We are no longer just shaving milliseconds off load times; we are fundamentally redefining the user experience by transitioning from “making things faster” to “making things instant.”

But as we celebrate the “Instant” revolution in Chromium, I am also excited for what is coming next year.

For too long, our view of web performance has been monocular, dominated by data from a single browser engine. I am incredibly excited about the work being done to close the Core Web Vitals blind spots in Safari and Firefox. Bringing these major platforms into the fold means we stop optimizing for a browser and start optimizing for all users.

Furthermore, the long-awaited standardization of Core Web Vitals for Single Page Applications (SPAs) is finally on the horizon with the Soft Navigation Experiment. We are moving away from the “hacky” workarounds of the past toward reliable, spec-compliant measurement for Soft Navigations.

When we combine instant loading capabilities with truly cross-platform, architecture-agnostic measurement, we finally get the full picture. Our RUM data becomes a source of truth rather than a source of hints. This is the foundation we need to identify and fix performance bottlenecks on every device, every browser, and every framework. 2025 was the year of speed; 2026 will be the year of clarity.

Web Performance Calendar
