Doug Sillars (@dougsillars) is a leading mobile developer advocate and evangelist.
He is widely known as an expert in mobile application architecture, especially when it comes to performance. Doug has worked with thousands of developers around the world, helping them improve the speed, battery life and customer satisfaction of their applications (both native and web). The author of O'Reilly's "High Performance Android Apps", he has spoken at conferences around the world on mobile performance.
He is currently freelancing and traveling with his family of six (plus the dog!) as a digital nomad in Europe.
When it comes to the rules of building a fast web, many of us started with Steve Souders’ list of performance rules. These rules are great, and for the most part, still hold true years after their original publication. Rule #1 from this vaunted list is “Make Fewer HTTP Requests.” In the examples for rule #1 on Steve’s website, one of the tests shows inlining images with Base64 encoding. By encoding a file in Base64, you can add the asset into your HTML or CSS – reducing the number of requests to the server (and thus fulfilling “Rule #1”).
But, in general, using Base64 to encode files is an anti-pattern.
But why is this an anti-pattern? We’re reducing requests, right? As always, there is a price to pay for inlining content into your HTML or CSS. First, Base64-encoded files are generally 20-30% larger than if the file were simply downloaded – which, depending on the file size, may by itself make the cost in page load delay too high. Second, once a file is Base64-encoded and embedded into HTML or CSS, it becomes render blocking – it must be downloaded before the CSS can be parsed and the page displayed. Harry Roberts has a great post showing how Base64-encoding an image delays not only the CSS download but also CSS parsing – creating a slower website than if the image were simply requested separately.
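For intuition on where that overhead comes from: Base64 maps every 3 bytes of binary data to 4 ASCII characters. A minimal Python sketch (the file name is hypothetical) makes the cost visible:

```python
import base64

# Hypothetical file name - substitute any image on disk.
path = "hero.png"
with open(path, "rb") as f:
    raw = f.read()
b64 = base64.b64encode(raw)

# Base64 maps every 3 bytes to 4 ASCII characters, so the encoded string
# is ~4/3 (33%) the size of the original. Gzip on the wire claws some of
# that back, which is why the observed overhead is usually 20-30%.
print(f"raw bytes:    {len(raw):,}")
print(f"base64 bytes: {len(b64):,}")
print(f"overhead:     {len(b64) / len(raw) - 1:.1%}")
```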
In investigating some slow websites in the HTTP Archive, I ran across a few sites that took Base64 encoding to the extreme:
https://twitter.com/dougsillars/status/1070428949753278469
I became curious as to how prevalent this technique is, and how its usage impacts website load time. So I ran a search through the response bodies of 1.2M websites for Base64-encoded strings, using the regex “url\(data:[\s\S]+?\);”.
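As a rough sketch of that search, here is how the same pattern could be applied to a single response body in Python (the file name in the usage comment is hypothetical):

```python
import re

# The same pattern used in the search above, applied to one response body.
DATA_URI = re.compile(r"url\(data:[\s\S]+?\);")

def find_base64_assets(body: str) -> list[str]:
    """Return every inlined data-URI asset found in an HTML/CSS body."""
    return DATA_URI.findall(body)

# Hypothetical usage against a saved response body:
# body = open("response.css", encoding="utf-8").read()
# print(len(find_base64_assets(body)), "inlined assets found")
```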
I found 4.15M Base64-encoded files on 383,150 sites (31% of all sites). So, approximately ⅓ of all sites in the HTTP Archive are using an anti-pattern? Has the world gone mad? Let’s see how bad it really is…
MIME Types
When I began this study, I assumed that most of the files being encoded would be images. I also hoped that most would be SVG images (since the SVG format is XML-based, it benefits from Gzip compression, while JPG and PNG are already compressed, so Gzip will not reduce their file size much). Running another regex to extract the MIME types (and count their occurrences) confirmed my expectations.
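A sketch of that second pass: in a data URI, the MIME type sits between `data:` and the first `;` or `,`, so a capture group plus a `Counter` reproduces a table like the one below (the regex here is my own approximation, not the exact query from the study):

```python
import re
from collections import Counter

# The MIME type sits between "data:" and the ";base64" (or ",") marker,
# e.g. url(data:image/svg+xml;base64,PHN2ZyAuLi4+);
MIME = re.compile(r"url\(data:([^;,\)]+)")

def count_mime_types(bodies: list[str]) -> Counter:
    """Tally the MIME type of every inlined data URI across response bodies."""
    counts: Counter = Counter()
    for body in bodies:
        counts.update(MIME.findall(body))
    return counts

# count_mime_types(bodies).most_common() yields rows like the table below.
```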
MIME Type | Count |
---|---|
image/svg+xml | 2,277,575 |
image/png | 1,367,013 |
image/gif | 213,858 |
application/x-font-woff | 87,607 |
application/font-woff | 85,227 |
application/x-font-ttf | 41,189 |
image/jpeg | 17,739 |
font/opentype | 12,572 |
image/svg+xml | 8,812 |
application/vnd.ms-fontobject | 8,537 |
application/octet-stream | 8,122 |
font/truetype | 6,421 |
font/woff | 4,887 |
image/jpg | 2,690 |
font/ttf | 1,664 |
The results basically confirmed my hypothesis: 55% of the files are SVG, and adding PNG and GIF brings these three image formats to 93% of the total. What did surprise me was that the remaining 7% are nearly all fonts.
What are the tonnage costs of these files? We can calculate the length of each of these files (uncompressed, in bytes) and graph the percentiles:
Looking carefully at this chart, three lines sit distinctly lower across the entire range – they correspond to the SVG and PNG files. JPEG (in dark blue) is also fairly low. The largest files (across all percentiles) are the fonts!
The median Base64-encoded WOFF file is 29.5 KB, and the median TrueType font is nearly 70 KB. As Stoyan reported (nine years ago), TrueType fonts are uncompressed and will Gzip roughly 50% smaller, but WOFF files are already compressed internally, so they see essentially no savings when gzipped – these files remain large even when the CSS or HTML is compressed.
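A quick way to verify this for your own fonts is to gzip the Base64-encoded bytes and compare; a minimal sketch, assuming hypothetical font files on disk:

```python
import base64
import gzip

def base64_gzip_size(path: str) -> None:
    """Show how much (or little) gzip recovers once a file is Base64-encoded."""
    with open(path, "rb") as f:
        raw = f.read()
    b64 = base64.b64encode(raw)
    print(f"{path}: raw={len(raw):,}  base64={len(b64):,}  "
          f"base64+gzip={len(gzip.compress(b64)):,}")

# Hypothetical files: an uncompressed TTF should shrink substantially,
# while a WOFF (compressed internally) will see little benefit.
# base64_gzip_size("font.ttf")
# base64_gzip_size("font.woff")
```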
Why Fonts in Your CSS?
Zach Leatherman has a series of posts in which he initially advocates inlining WOFF files in CSS (though he very clearly states that the CSS must be asynchronous or non-blocking) in order to use fallback fonts and avoid a Flash of Invisible Text (FOIT).
Edit: In response to this post, Zach tweeted two more great posts on inlining font content:
Web Font Anti-pattern: Data URIs
Inline Data URI strategy on The Guide™
However, all of the posts I have found regarding inlined fonts either (1) ignore performance issues entirely, or (2) recommend that the CSS not be render blocking. In my manual examination of several sites, every Base64 font I looked at WAS render blocking – though it is possible that some out there are not.
Even though just 7% of all Base64-encoded files are fonts, the median font file is 7-30 times larger than the median image file – greatly increasing the blocking impact for each font.
Many pages use more than one font. Of the 87,000 pages with Base64-embedded fonts, ~49,000 (56%) inline more than one font, further compounding the issue, and 3,852 (4.4%) of the sites with fonts request 10 or more Base64-encoded fonts.
Images
There is a good reason why CSS and HTML load before images – it prevents images from blocking the load of your webpage. When you inline large images into your CSS, you effectively place those images in the critical rendering path, potentially blocking your page load.
For example, this jewelry site has a huge CSS file that takes 13s to load over a 3G connection:
When I reviewed the file, I found a number of very small PNG files… and one VERY large one: 644 KB of Base64 text, decoding to a 458 KB PNG.
When an image is encoded into your CSS, you also lose any opportunity to serve it responsively. This page forces a 2090×580 image onto every device – and it blocks the critical rendering path.
Death By A Thousand Cuts
Now, it is true that 600 KB Base64 images are rare; most inlined images are quite small. But even if the images in your CSS/HTML are small, and even if you are using SVG images (which compress well), they can still have a huge impact on your load time.
For example, I found a CSS file that is 288 KB on the wire. Once downloaded and uncompressed, it is 1.9 MB, of which 1.45 MB are inlined SVG files. Each SVG is small, but with over 1,000 of them, they add up to more than a megabyte very quickly. Using the Chrome coverage tool, we find that most (if not all) of these SVGs are never used.
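If you want to check your own stylesheets, a small audit script along these lines (the regex and file name are mine, not from the study) reports the count and decoded weight of everything inlined in a CSS file:

```python
import re

# Match the Base64 payload of each inlined asset in a stylesheet.
PAYLOAD = re.compile(r"url\(data:[^;,\)]+;base64,([^\)\"']+)\)")

def audit_css(path: str) -> None:
    """Report how many assets a CSS file inlines and their decoded weight."""
    with open(path, encoding="utf-8") as f:
        css = f.read()
    payloads = PAYLOAD.findall(css)
    # Every 4 Base64 characters decode to 3 bytes (ignoring padding).
    decoded = sum(len(p) * 3 // 4 for p in payloads)
    print(f"{len(payloads)} inlined assets, ~{decoded / 1024:.0f} KB decoded")

# audit_css("styles.css")  # hypothetical file name
```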
Base64 Speed
What is the impact of Base64-encoded files on your website? If we compare the SpeedIndex of websites with Base64 files against the dataset of all websites, we find that adding 10-50 KB of Base64 content raises the median SpeedIndex by 1.3s, and (as expected) larger amounts slow down page load even further.
Duplicate Base64
If a website references the same font or image multiple times on one HTML page, the first reference triggers a download from the server, and the rest are served from the locally cached copy – i.e., the file crosses the network only once.
However, if you refer to a Base64-encoded file more than once, the entire Base64 string must be pasted into each location – resulting in the same file being downloaded multiple times on every page load. How often does this occur?
Here is one page where a relatively small 500-byte background image is requested 115 times in the HTML document (bloating the page by over 50 KB of duplicate text):
Of the 4.1M Base64-encoded files in the HTTP Archive dataset, 1.1M (27%) appear more than once on a single page. Over 121,000 pages (~10% of all pages in the HTTP Archive) are affected by multiple Base64-encoded downloads of the same object.
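Catching this on your own pages is straightforward; a minimal sketch that flags any data URI pasted more than once into a page or stylesheet:

```python
import re
from collections import Counter

DATA_URI = re.compile(r"url\(data:[\s\S]+?\)")

def duplicate_data_uris(body: str) -> list[tuple[str, int]]:
    """Return inlined assets that appear more than once, with their counts."""
    counts = Counter(DATA_URI.findall(body))
    return [(uri[:60] + "...", n) for uri, n in counts.most_common() if n > 1]

# Hypothetical usage:
# body = open("index.html", encoding="utf-8").read()
# for uri, n in duplicate_data_uris(body):
#     print(f"{n}x  {uri}")
```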
Conclusion
Despite many blog posts describing Base64 encoding as an anti-pattern, it is still widely used on the web today (on 31% of sites in the HTTP Archive). Base64-encoded files are larger than separately downloaded files – often by 20-30%. Overuse of Base64 encoding results in larger files that slow page render times. Additionally, 10% of all the sites in the HTTP Archive carry duplicate copies of the same Base64-encoded content, further bloating and slowing down their pages.
If we go back to the original rule #1 for faster websites – “Make Fewer HTTP Requests” – it is pretty clear that many websites should audit their Base64-encoded fonts and images to determine how much Base64 content is blocking the rendering path of their HTML and CSS. Perhaps, now that HTTP/2 allows many files to download in parallel over a single connection, we can finally retire the Base64 anti-pattern from the web.