With over 20 years of experience in technology, including 10 years as a full-time CTO, Ryan Townsend is a passionate leader and distinguished software engineer who has delivered growth for his clients measured in the hundreds of millions.
He now runs his own consultancy, TWNSND, and publishes videos and articles at Lessons of a CTO.
Compression Dictionary Transport became an official IETF Proposed Standard this September (congratulations to Yoav Weiss & Pat Meenan 🎉). If you don’t know what compression dictionaries are, I recommend watching Pat’s wonderful performance.now() talk from 2023 – the TL;DR is that you can achieve incredible compression ratios, resulting in file sizes a fraction of what we achieve with Brotli and GZIP today:
The results can be pretty dramatic, with responses anywhere from 60-90+% smaller than using the best non-dictionary compression currently available.
Getting Real (small) With Compression Dictionaries (2024) – Pat Meenan
They’ve been available in Chrome and Edge since version 130, released over a year ago. Firefox support is on the horizon and WebKit is supportive of the standard. Google, Bing, Shopify and Notion already have an implementation in production, proving it’s ripe for adoption. But it is early days: outside of these platforms, just 13 websites in the HTTPArchive (of ~12 million) have implemented the feature. So there’s an opportunity to get ahead of your competition and benefit over 75% of web users today.
In order to understand the challenges involved in adoption, I recently implemented them on my website. I can attest to the benefits: I’m now able to load sizable HTML pages with inlined global CSS using less bandwidth than the famously-minimal HackerNews homepage—Google themselves moved their logo from an external file to inline thanks to this benefit. But I can also tell you why adoption has been slow: this isn’t as simple as flipping a compression algorithm switch like when Brotli or ZSTD came along.
You’re going to need to make some architectural decisions. So let’s explore these pain points in some more detail.
Utopia
In an ideal world CDNs would handle this transparently. Picture a CDN that captures any response with a Use-As-Dictionary header into a Least-Recently-Used (LRU) cache (or the existing HTTP cache, if a minimum TTL can be depended upon), keyed by dictionary ID and SHA256 hash. Subsequent requests with matching Dictionary-ID and/or Available-Dictionary headers would trigger automatic dictionary-compressed responses.

This doesn’t yet exist as a turnkey solution at any major CDN provider, which means if you want this: you’re building it yourself.
The good news? Pat has an example Cloudflare Worker implementation that gets you 90% of the way there and, contrary to what this 3,700-word article might suggest… it’s actually not that complicated once you understand the pieces. In my view, the performance gains are substantial enough to justify the engineering effort.
Implementation
1. Define your dictionaries
This is your first real architectural decision, and it’s worth taking time to think through your specific use case.
Our goals are twofold:
- Minimise the difference between the dictionary and its target resources
- Maximise the chance an up-to-date dictionary will be available for a newly requested resource
Your architecture and traffic patterns play a significant role in ensuring dictionaries are downloaded ahead of the new requests they apply to. To squeeze the most out of this platform feature you’ll need to experiment with one, two or all three of these strategies in combination:
Resource upgrades
This is where you use a resource as a dictionary for a future version of itself, e.g. my-script.v1.js becomes the dictionary for my-script.v2.js.
This is very useful for sites with regular deployments and regular revisits, but generally won’t help our visitors during an individual browsing session.
This can also be tricky to implement if you’re using a bundler which doesn’t produce stable chunks. You may need to specify manual chunks to get the most benefit, and Yoav has an article on what to look out for.
Standalone dedicated dictionaries
This requires an additional download, but it happens during idle time at low priority and the payoff can be huge. My dictionary is a mere 15kb and unlocks 4x more effective compression for my HTML documents than regular Brotli.
For such a small extra request, you could justify creating dedicated dictionaries per page template and then pre-empting the next request by loading its template’s dictionary, for example loading a product detail page template dictionary on the product listing page.
Dictionaries are just raw plain text so there’s no special build process required, but there are options for how to prepare them:
- Generate it manually: e.g. copy-paste or concatenate a few text-based files together. This is great if the total file size (after compression) doesn’t amount to too much: it will be downloaded during idle time, but you still don’t want it eating megabytes of bandwidth in the hope that lighter follow-up requests pay back the debt.
- Automate it: both Brotli and ZSTD can ‘train’ a dictionary on your content (example command below, plus there’s a hosted UI here). You can set an output size (which caps the bandwidth needed to download it) and the trainer will figure out the most useful/repetitive content to include for compressing your resources.
In both cases, it’s worth being careful not to build/train your dictionary from potentially sensitive data. LOCTO is an Eleventy-based static site, so I’m training a dictionary on the sum of all my HTML output after each build. This ensures no internal logic, comments or (God forbid!) secrets are included—not only would that represent a huge security issue, it would also reduce the effectiveness of the dictionary.
Formatting gotcha: ZSTD currently outputs a ZSTD-specific format with 4 extra bytes at the start which make the dictionary invalid for browser use, so until Pat’s filed issue is resolved we have to strip them off ourselves (see tail below).
To give you a flavour, here’s the command I’m currently using:
zstd -q --train --train-cover=k=1000,d=8,steps=128,split=8 --maxdict=128K ./dist/**/*.html -o ./dict.dat && \
tail -c +5 ./dict.dat > ./dict.dat.tmp && \
mv ./dict.dat.tmp ./dict.dat
Given they’re just raw text, you can open the generated dictionary to verify the contents seem appropriate for your site.
In the likely case you have a more dynamic backend, here are a couple of ideas:
- Curl your website to collect some responses and pipe those in via xargs. This works pretty well when you’re initially evaluating the potential of compression dictionaries, but used long-term it can be less effective for releases that include a significant change, given you’ll effectively have trained the dictionary on the previous version of the website.
- Create a non-production page or collection of pages which render your templates and components with sample data, then extract their content as training data.
Pro-tip: however you decide to generate your standalone dictionaries, don’t forget they can themselves be compressed using another resource as a dictionary — see Yoav’s prior calendar article on using this technique with stylesheets. I’m using the HTML document to compress the standalone dictionary, which then compresses future HTML requests: this means the dictionary transfer is HALF the size it would be with normal Brotli compression.
Similar resources
This is more of a niche approach but it does mean your users can benefit within a browsing session without the need to download a standalone dictionary (or for you to generate one).
Given you can match on globbed paths (e.g. *.js or *.css) and/or fetch destination (e.g. script or style), you can use one existing resource as a dictionary to compress a sibling.
You’ll only want to consider this with publicly-cacheable or statically-generated resources—storing uncacheable responses as dictionaries could push more useful dictionaries out of the CDN cache and could even represent a security risk—so it’ll more commonly be stylesheets and scripts than HTML or JSON.
The effectiveness of this approach depends on how similar each combination of resources is (ssdeep can measure similarity), which landing pages most commonly download which resources, and what paths users commonly take through your website – e.g. you can’t depend on a resource from an infrequently browsed page being present as a dictionary.
2. Serving our dictionaries
Resources as dictionaries
To permit resources to be used as dictionaries, you need to define Use-As-Dictionary headers on their responses—the format is well-documented on MDN.
match="/path/to/my/script.*.js"– this format is useful for delta upgrades where the splat is used to match the hash/fingerprint.match-dest=("document")– this is what I’m using on my dictionary to instruct it to be used to compress all my HTML documents, it would also be a potential candidate for the ‘similar resources’ approach
A noteworthy limitation is that path matching is limited to a single URLPattern entry and regexp is disabled, so this can mean relocating resources to accommodate a pattern that encompasses all the necessary requests.
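For illustration, the two patterns above might translate into response headers roughly like this (shown Fetch-API style; the paths and the id value are hypothetical placeholders, so adapt them to where your resources actually live):

const headers = new Headers();
// A versioned script registering itself as a dictionary for its future versions:
headers.set("Use-As-Dictionary", 'match="/path/to/my/script.*.js"');
// Alternatively, a standalone dictionary covering every HTML document on the site:
headers.set("Use-As-Dictionary", 'match="/*", match-dest=("document"), id="site-dict-v1"');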
Add standalone dictionaries to HTML
If you’ve decided to go the route of generating a standalone dictionary, you can add it to your HTML with a simple <link rel="compression-dictionary"> tag.
Browsers will download this as a low-priority resource during idle time, which minimises its overhead, but don’t expect requests to wait around during page load for the dictionary to become available.
If you’ve implemented a Content-Security-Policy (CSP), you’ll need to ensure the connect-src directive (or default-src if you don’t override it) permits the location your standalone dictionary lives at. I have an open PR for MDN to note this detail.
3. Implement the compression
Where should our compression happen?
Where your compression happens determines your latency characteristics, cache hit ratios, and infrastructure costs.
The ideal answer is “at the edge, in your CDN”. Compression is stateless, so handling it as close to users as possible makes sense. Scale it with serverless functions that spin up in milliseconds. Simple, right?
Except not all platforms and CDNs were specifically designed with this use case in mind, so you’re threading a needle between several competing requirements:
- Routing & conditionals: You need the function behind routing logic that only triggers for requests with appropriate headers (
Accept-Encodingincludesdcb/dcz,Available-Dictionaryis present, etc.). Not all platforms support declarative routing based on headers. - Cache positioning: Ideally you want the function sitting behind your HTTP cache so dictionary-compressed responses get cached. But you also want access to read from that same cache to repurpose existing Brotli/GZIP responses without hitting origin. This is a tricky dependency cycle.
- Caching metadata: If you’re reading existing cached responses, you generally need to preserve their original headers—especially for key-based purging strategies—and apply them to your new dictionary-compressed versions.
- Dictionary access: Whatever performs compression needs fast access to your dictionaries. If you’re pulling from object storage on every cold start, you’re potentially adding hundreds of milliseconds of latency.
The trade-offs you make here depend entirely on your CDN’s capabilities, your traffic patterns, and how much latency you can tolerate during cold starts.
It may be that you have to compromise on some of these, for example re-requesting from origin rather than repurposing the Brotli/GZIP versions. Or maybe you need to perform the compression directly within your origin.
A real-world example: working around Netlify’s limitations
Let me walk you through my specific implementation to illustrate the kind of puzzle-solving you could face.
- Netlify’s ‘Edge Functions’ can be routed based on HTTP headers, but they can’t write to the distributed CDN HTTP cache, so the hit ratio is dire
- I could mitigate the low hit ratio if cold starts and storage access for the dictionaries were fast enough, but in practice this amounted to TTFBs regularly measured in seconds
- Their ‘Serverless Functions’ can write to the ‘durable’ cache, but their routing doesn’t support conditionals based on HTTP headers or manually passing through to statically-generated files at the same path.
So my compromise is to handle the routing in an edge function applied to all requests where Accept includes text/html, Accept-Encoding includes dcz and both Dictionary-ID and Available-Dictionary are present—thereby limiting executions. The edge function rewrites the URL to a specific path which reads through the durable cache to where the serverless function is listening. That serverless function re-requests the original HTML, compresses it and sets caching headers for Netlify’s CDN to cache it in a distributed manner.
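For flavour, here’s a rough, framework-agnostic sketch of that routing check (Fetch API style; the rewrite target and how the rewrite is applied are platform-specific and purely illustrative):

// Only hand off to the dictionary-compressing function when the request can actually use it.
function wantsDictionaryCompressedHtml(request) {
  const h = request.headers;
  return (
    (h.get("accept") || "").includes("text/html") &&
    (h.get("accept-encoding") || "").includes("dcz") &&
    h.has("available-dictionary") &&
    h.has("dictionary-id")
  );
}
// If it matches, the edge function rewrites to the (hypothetical) path the serverless function
// listens on, so the response flows through the durable cache; otherwise it passes straight through.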
Another compromise is to push the dictionaries (my HTML resources and generated standalone dictionary files) into Netlify’s Blob Storage during build, so they persist across releases. Metadata includes the Git commit hash, allowing cleanup of old dictionaries after a period of time.
Is this elegant? Not particularly. Is it more complicated than I wanted? Absolutely. Have I submitted multiple feature requests to improve this? You bet. But it stays within the platform constraints—i.e. no additional services added to the stack—while maintaining reasonable latency and hit ratios.
Your platform, tech stack or CDN will have different limitations and you may need to make different compromises, but hopefully this provides some creative ideas if you hit similar roadblocks.
Applying the compression
I’ve currently only implemented ZSTD, so I can’t compare dictionary compression performance between Brotli and ZSTD just yet. ZSTD claims to compress faster than Brotli at similar compression ratios, but that comparison only covers level 1, which doesn’t really help, and Paul Calvano found a more nuanced result. None of these benchmarks are specific to dictionaries, so results may vary further.
Ultimately, your choice will more likely depend on what libraries/binaries you have access to.
The dcb and dcz response formats aren’t simply the output of the compression. They are composed of three parts:
- Magic bytes:
  - 4 bytes for dcb: 0xff 0x44 0x43 0x42 (fun fact: this spells ‘ÿDCB’!)
  - 8 bytes for dcz: 0x5e 0x2a 0x4d 0x18 0x20 0x00 0x00 0x00 (this one is far less interesting, it was just an ‘unused’ frame type)
- The 32-byte SHA256 hash digest of the dictionary (which must match the Available-Dictionary header)
- The compressed response body
Once again, Pat has filed an issue with ZSTD to update the library to output this format natively, but in the meantime we have to concatenate these buffers ourselves.
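To make that concrete, here’s a minimal Node-style sketch of the dcz concatenation (the compressed body itself comes from whichever zstd binding you end up using):

import crypto from "crypto";

// Magic bytes for the dcz format, as listed above.
const DCZ_MAGIC = Buffer.from([0x5e, 0x2a, 0x4d, 0x18, 0x20, 0x00, 0x00, 0x00]);

function buildDczBody(dictionaryBuffer, zstdCompressedBody) {
  // Raw 32-byte digest of the dictionary (it must match the Available-Dictionary header).
  const hash = crypto.createHash("sha256").update(dictionaryBuffer).digest();
  return Buffer.concat([DCZ_MAGIC, hash, zstdCompressedBody]);
}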
You’ll need to find a library which works for your chosen programming language, unless you have access to—and want to directly interface with—native brotli/zstd binaries.
A word of warning from the trenches: LLMs will confidently hallucinate library capabilities here. I provided explicit context about my requirements and both Sonnet and Codex repeatedly recommended libraries that required native binaries, needed to write to the filesystem, didn’t support dictionaries, or even only handled decompression. I burned time on six different libraries before landing on one that actually satisfied my restrictions.
Once up and running, I witnessed some Chrome net::ERR_ZSTD_WINDOW_SIZE_TOO_BIG errors at the ultra (20-22) ZSTD levels, so I’m currently running at 19 — the aforementioned ZSTD issue should address this, but in the meantime there were minimal gains to be had above 19 anyway.
If you’ve trained your dictionary well, compressing with shared dictionaries can be so powerful that you may find the compression level doesn’t greatly influence the output size. But if you’re implementing at an enormous scale where every millisecond has a measurable cost, you may wish to adjust the compression level depending on how cacheable the response is. For uncacheable or low-TTL responses, you could lower the level to favour latency and cost over minimising the transfer size.
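If you do go down that route, the logic can be as simple as this sketch (the threshold and the lower level are illustrative assumptions; 19 is what I currently run for cacheable responses):

// Pick a zstd level based on how long the compressed result will be reused from cache.
function pickZstdLevel(cacheControl = "") {
  const match = cacheControl.match(/max-age=(\d+)/);
  const maxAge = match ? Number(match[1]) : 0;
  return maxAge >= 3600 ? 19 : 12; // spend more CPU on long-lived, widely reused responses
}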
4. Vary your CDN cache
This is where things get interesting from an infrastructure perspective. You essentially segment your cache based on dictionary availability.
You need to ‘key’ or Vary your CDN cache based on two headers:
- Accept-Encoding – to ensure it includes dcb and/or dcz for Brotli/ZSTD respectively
- Available-Dictionary – to ensure we don’t return a response compressed with another dictionary
Chrome only adds Available-Dictionary and Dictionary-ID headers to requests after a cache miss from its HTTP cache, so there’s no need to filter them out of your Vary header before delivery from the CDN.
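As a sketch of what the response side can look like in a worker or edge function (Fetch API style; some CDNs want explicit cache-key configuration instead of, or as well as, Vary):

function markDictionaryResponse(original, compressedBody, encoding = "dcz") {
  const headers = new Headers(original.headers);
  headers.set("Content-Encoding", encoding);
  // Key caches on both the negotiated encoding and the specific dictionary that was used.
  headers.set("Vary", "Accept-Encoding, Available-Dictionary");
  headers.delete("Content-Length"); // the body length has changed after recompression
  return new Response(compressedBody, { status: original.status, headers });
}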
Critical gotcha: Some CDNs actively filter compression methods from Accept-Encoding. Fastly and Akamai, for example, operate a safelist that by default doesn’t include dcb/dcz. You’ll need to explicitly reintroduce them. Check your CDN’s documentation for header manipulation capabilities.
5. Testing
First, enable the Response Headers > Content-Encoding column in Chrome DevTools’ Network tab, if you haven’t already. This lets you verify dcb/dcz encoding at a glance.

Next, open chrome://net-internals/#sharedDictionary – this will show you the shared dictionaries for your open tabs and let you clear them to simulate fresh page loads. Which brings us to debugging surprise #1: I found Chrome DevTools’ “Clear site data” (under Application > Storage) doesn’t actually clear dictionaries. It’ll clear local storage, IndexedDB, cookies, cache storage… but your dictionaries persist. I’ve raised this as a Chrome issue so please +1 to raise awareness.
Aside from the net::ERR_ZSTD_WINDOW_SIZE_TOO_BIG error I mentioned earlier, if you see Chrome errors in the status column, you probably have issues with missing the magic bytes or your SHA256 not being generated properly. I’ve had most success with OpenSSL on the command line and the Node crypto standard library:
import crypto from "crypto";
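// Keep the raw 32-byte digest (not hex) so it matches the bytes embedded after the magic bytes.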
crypto.createHash("sha256").update(dictionaryBuffer).digest();
Debugging surprise #2: Chrome intermittently opens a separate connection for standalone dictionaries. I’ve reported this as a Chrome bug.

6. Monitor
We should all be monitoring our (Core) Web Vitals with Real User Monitoring—if you’ve got this far into this technical article and you disagree, colour me impressed!—but beyond seeing an improvement to the likes of FCP and LCP for Chromium browsers when we release the feature, we should be looking to monitor and optimise its effectiveness thereafter.
The resource timing spec has been updated to expose content encoding and Chrome has already been updated so you can segment your RUM data to measure the impact of all your efforts on the likes of FCP and LCP.
Note: there are open issues for Safari and Firefox, but given neither have support for dictionary compression, it’s not currently a problem.
You may want to capture data on how many of your requests could be served with DCZ/DCB encoding vs how many actually are—effectively like a cache hit ratio, except for your dictionaries. There are two parts to this:
- How many requests include Available-Dictionary but don’t respond with DCZ/DCB encoding – this will alert you to dictionary caching issues, e.g. needing to increase the available storage to persist older entries.
- How many requests match your configured Use-As-Dictionary headers (i.e. based on path or resource type) but didn’t include an Available-Dictionary header – this will potentially highlight opportunities where visitors haven’t yet loaded a dictionary.
The Resource Timing API also exposes encodedBodySize and decodedBodySize allowing you to calculate and segment compression ratios by encoding type and monitor fluctuations over a period of time. This will provide a picture of how well-trained our dictionaries are and whether we’re applying an appropriate compression level. Unfortunately, Yoav found a bug in Chrome where browser-cached resources will have their encodedBodySize reported as equal to their decodedBodySize, so until this is resolved you may need to rely on a custom Server-Timing header.
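As a sketch of how you might collect that in the browser (assuming the new content-encoding field lands on PerformanceResourceTiming as contentEncoding, so double-check the final name against the spec and your RUM tooling):

const samples = [];
for (const entry of performance.getEntriesByType("resource")) {
  const { name, encodedBodySize, decodedBodySize } = entry;
  if (!encodedBodySize || !decodedBodySize) continue; // skip opaque cross-origin or affected cached entries
  samples.push({
    name,
    encoding: entry.contentEncoding, // newly specced field; may be undefined in some browsers
    ratio: decodedBodySize / encodedBodySize,
  });
}
// Beacon `samples` to your RUM endpoint and segment the ratios by encoding (dcz/dcb vs br/gzip).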
Impact
While Compression Dictionary Transport can unlock architectural possibilities and simplifications that we’d typically avoid due to bandwidth impact, we shouldn’t yet be rearchitecting around them given the lack of Safari and Firefox support.
So why bother with dictionaries today? The benefit lies in two areas:
- By at least considering how you’d implement them within your stack, you don’t wall yourself in and make future adoption more challenging.
- As Jordy rightly pointed out earlier in this year’s calendar: improving things beyond your P75 is a worthwhile task. Quite frankly, it’s wild how it’s commonplace to test browser compatibility down to single digit adoption but then ignore 25% of our performance data.
With compression dictionaries, you can help rein in those higher latency, higher packet loss and lower bandwidth environments that live beyond the P75.
When Firefox and Safari support lands, many of the tradeoffs between performance and complexity will rebalance, e.g.:
- Should we bundle these two scripts together for better compression or leave them separate for simpler builds and reduced cache busting?
- Should we calculate critical CSS for every page to reduce FCP, or just inline the entire global and template styles?
We can also be more bullish with our prefetching/prerendering to further hide latency from our visitors and increase our chances of those ‘instant’ page loads. Compression dictionaries pair well with Speculation Rules, especially given both are currently Chromium-only: we can prefetch more because doing so costs less.
How much less?
My DCZ-encoded HTML responses are 4x smaller than their non-dictionary Brotli counterparts served to Firefox/Safari.
A whopping 5.5kb of average overhead for prefetching an HTML page, including all the CSS inlined to render it.
Yes, my visitors have to download the 15kb dictionary, but so long as I’m doing a single prefetch, or they navigate to just one other page, this pays for itself in the additional compression capability.
Shopify has already recorded a 130-180ms LCP improvement from Speculation Rules alone, which makes me wonder how much further they can push it with less conservative prefetching, especially at those higher percentiles.
Worth The Effort?
Here’s what you need to decide: is a 4x improvement in compression ratio worth the infrastructure complexity in the short-term?
For me, that answer was obviously yes. Not just because professionally I needed to understand the rough edges for client work. Not just because I want to beat Sia on the Eleventy leaderboard. The ability to inline full CSS, triple my prefetching volume, and eliminate build-time complexity made this worthwhile despite the headaches along the way.
Your decision will differ based on your traffic patterns, infrastructure constraints, and team bandwidth. But if you’re in a position where this makes sense, the performance gains are substantial and the competitive advantage is real because almost nobody is doing this yet.
Today’s implementation path isn’t as clean as I’d like. You could hit CDN/platform limitations, library compatibility issues, and debugging challenges that aren’t well-documented. But the primitives work, the browser support is solid, and the performance characteristics are exceptional.
It’s only going to become better documented and more productised over time, but that doesn’t mean we should wait around: we can improve user experience today with a little additional effort and complexity, then refactor it away as CDN support improves—who doesn’t love deleting code?
If you implement this, please apply the Boy Scouts rule and leave things better than you found them: document your journey and raise issues and pull requests along the way to improve the ecosystem for everyone else.
Acknowledgements
- Yoav Weiss and Pat Meenan, firstly both for inventing and delivering this incredible feature and secondly for answering my endless questions along the way.
- Robin Marx for also helping with my implementation; he’d hit many of the issues I had and shared his early findings at performance.sync this year.
- Barry Pollard for going above-and-beyond validating, tuning and executing my HTTPArchive queries.
- And all four of these fine gentlemen for their extensive reviews of this article.