Why Inlining Everything Is NOT The Answer

3rd Dec 2011 by Guy Podjarny

ABOUT THE AUTHOR

Guy Podjarny (guypod) is a Web Performance and Security expert, specializing in Mobile Web Performance, and CTO at Blaze. Guy spent the decade prior to Blaze as a Software Architect and Web Application Security expert, driving the IBM Rational AppScan product line from inception to being the leading Web Application Security assessment tool. Guy has filed over 15 patents, presented at numerous conferences, and has published several professional papers.

Every so often I get asked if the best Front-End Optimization wouldn’t be to simply inline everything. Inlining everything means embedding all the scripts, styles and images into the HTML, and serving them as one big package.
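To make the term concrete, here is a minimal sketch of what inlining an image looks like, assuming the common base64 data: URI form. The function names and the stand-in image bytes are illustrative only, not taken from any particular tool.

```python
import base64


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data: URI that can be embedded in HTML or CSS."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def inline_img_tag(data: bytes, mime_type: str, alt: str = "") -> str:
    """Build an <img> tag whose src carries the image bytes instead of a URL."""
    return f'<img src="{to_data_uri(data, mime_type)}" alt="{alt}">'


if __name__ == "__main__":
    # Stand-in bytes; a real page would read the image file from disk.
    fake_image = b"GIF89a"
    print(inline_img_tag(fake_image, "image/gif", alt="logo"))
```

Note that base64 encodes every 3 bytes as 4 characters, so the embedded copy is roughly a third larger than the original file before compression.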

This question is a great example of taking a best practice too far. Yes, reducing the number of HTTP requests is a valuable best practice. Yes, inlining everything is the ultimate way to reduce the number of requests (in theory to one). But NO, it’s not the best way to make your site faster.

While reducing requests is a good practice, it’s not the only aspect that matters. If you inline everything, you fulfill the “Reduce Requests” goal, but you’re missing many others. Here are some of the specific reasons you shouldn’t inline everything.

No Browser Caching

The most obvious problem with inlining everything is the loss of caching. If the HTML holds all the resources, and the HTML is not cacheable by itself, the resources are re-downloaded every time. This means the first page load on a new site may be faster, but subsequent pages or return visitors would experience a slower page load.

For example, let’s look at the Repeat Visit of the New York Times’ home page. Thanks to caching, the original site loads in 2.7 seconds. If we inline the JavaScript files on that page, the repeat visit load time climbs to 3.2 seconds, and the size doubles. Visually, the negative impact is much greater, due to JavaScript’s impact on rendering.

Site: www.nyt.com IE8; DSL; Dulles, VA; Repeat View	Load Time	# Requests	# Bytes
Original Site	2.701 seconds	46	101 KB
Inlined External JS Files	3.159 seconds	36	212 KB

Even if the HTML is cacheable, the cache duration has to be the shortest duration of all the resources on the page. If your HTML is cacheable for 10 minutes, and a resource in the page is cacheable for a day, you’re effectively reducing the cacheability of the resource to be 10 minutes as well.
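The effect can be sketched in a few lines, assuming cache lifetimes are expressed in seconds (as in a max-age value). The resource names and numbers below are hypothetical and only mirror the 10 minutes versus one day example.

```python
def effective_max_age(html_max_age: int, resource_max_ages: dict[str, int]) -> dict[str, int]:
    """An inlined resource is only as cacheable as the HTML that carries it."""
    return {
        name: min(html_max_age, max_age)
        for name, max_age in resource_max_ages.items()
    }


if __name__ == "__main__":
    ten_minutes = 10 * 60
    one_day = 24 * 60 * 60
    # Hypothetical resources that would each be cacheable for a day if linked.
    linked = {"site.css": one_day, "app.js": one_day}
    print(effective_max_age(ten_minutes, linked))
```

Once inlined, both resources inherit the 600-second lifetime of the HTML, even though each could have been cached for 86,400 seconds as a separate file.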

No Edge Caching

The traditional value of CDNs is called Edge Caching: caching static resources on the CDN edge. Cached resources are served directly from the edge, and thus delivered much faster than routing all the way to the origin server to get them.

When inlining data, the resources are bundled into the HTML, and from the CDN’s perspective the whole thing is just one HTTP response. If the HTML is not cacheable, this entire HTTP response isn’t cacheable either. Therefore, the HTML and all of its resources would need to be fetched from the origin every time a user requests the page, while in the standard case many of the resources could have been served from the Edge Cache.

As a result, even first-time visitors to your site are likely to get a slower experience from a page with inlined resources than from a page with linked resources. This is especially true when the client is browsing from a location far from your server.

For example, let’s take a look at browsing the Apple home page from Brazil, using IE8 and a Cable connection. Modifying the site to inline images increased the load time from about 2.4s to about 3.1s, likely because the inlined image data had to be fetched from the origin servers and not the CDN. While the number of requests decreased by about 28% (from 36 to 26), the page was in fact slower.

Site: www.apple.com IE8; Cable; São Paulo, Brazil; First View	Load Time	# Requests	# Bytes
Original Site	2.441 seconds	36	363 KB
Inlined Images	3.157 seconds	26	361 KB

No Loading On-Demand

Loading resources on-demand is an important category of performance optimizations, which attempt to only load a resource when it’s actually required. Resources may be referenced, but not actually downloaded and evaluated until the conditions require it.

Browsers offer a built-in load-on-demand mechanism for CSS images. If a CSS rule references a background image, the browser only downloads it if at least one element on the page matches the rule. Another example is loading images on-demand, which only downloads page images as they scroll into view. The Progressive Enhancement approach to Mobile Web Design uses similar concepts for loading JavaScript and CSS only as needed.

Since inlining resources is a decision made on the server, it doesn’t benefit from loading on-demand. This means all the images (CSS or page images) are embedded, whether they’re needed by the specific client context or not. More often than not, the value gained by inlining is lower than the value lost by not having these other optimizations.

As an example, I took The Sun’s home page and applied two conflicting optimizations to it. The first loads images on demand, and the second inlines all images. When loading images on demand, the page size added up to about 1MB, and load time was around 9 seconds. When inlining images, the page size grew to almost 2MB, and the load time increased to 16 seconds. Either way the page makes many requests, but the load and size differences between inlining images and images on-demand are very noticeable.

Site: www.thesun.co.uk IE8; DSL; Dulles, VA; First View	Load Time	# Requests	# Bytes
Loading Images On-Demand	9.038 seconds	194	1,028 KB
Inlined Images	16.190 seconds	228	1,979 KB

Invalidates Browser Look-Ahead

Modern browsers use smart heuristics to try to prefetch resources at the bottom of the page ahead of time. For instance, if your site references http://www.3rdparty.com/code.js towards the end of the HTML, the browser is likely to resolve the DNS for www.3rdparty.com, and probably even start downloading the file, long before it can actually execute it.

In a standard website, the HTML itself is small, and so the browser only needs to download a few dozen KB before it sees the entire HTML. Once it sees (and parses) the entire HTML, it can start prefetching as it sees fit. If you’re making heavy use of inlining, the HTML itself becomes much bigger, possibly over 0.5MB in size. While downloading it, the browser can’t see and accelerate the resources further down the page – many of which are 3rd party tools you couldn’t inline.

Flawed Solution: Inline Everything only on First Visit

A partial solution to the caching problem works as follows:

The first time a user visits your site, inline everything and set a cookie for the user
Once the page loads, download all the resources as individual files
- Or store the data into a Scriptable Cache
If a user visits the page and has the cookie, assume they have the files in the cache, and don’t inline the data.
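The server-side decision in these steps can be sketched roughly as follows. The cookie name and the shape of the returned plan are hypothetical; the point is only that the cookie is the sole signal used.

```python
from typing import Optional, TypedDict

CACHE_COOKIE = "assets_cached"  # hypothetical cookie name


class ResponsePlan(TypedDict):
    inline: bool                # embed scripts, styles and images in the HTML?
    set_cookie: Optional[str]   # cookie to set on this response, if any
    fetch_after_load: bool      # download the individual files once the page has loaded?


def plan_response(request_cookies: dict[str, str]) -> ResponsePlan:
    """Decide how to serve the page based only on the presence of the cookie."""
    if CACHE_COOKIE in request_cookies:
        # Returning visitor: assume, without any proof, that the files
        # are still in the browser cache, and serve normal links.
        return {"inline": False, "set_cookie": None, "fetch_after_load": False}
    # First visit: inline everything, mark the user with the cookie,
    # and fetch the individual files after load so they get cached.
    return {"inline": True, "set_cookie": f"{CACHE_COOKIE}=1", "fetch_after_load": True}


if __name__ == "__main__":
    print(plan_response({}))
    print(plan_response({CACHE_COOKIE: "1"}))
```

Nothing in this logic checks what the browser cache really contains; the cookie stands in for the cache state.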

While better than nothing, the flaw in this solution is that it assumes a page is either entirely cached or entirely not cached. In reality, websites and cache states are extremely volatile. A user’s cache can hold less than a day’s worth of browsing data: an average user browses 88 pages/day, an average page weighs 930KB, and most desktop browsers cache no more than 75MB of data. For mobile, the ratio is even worse.
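The arithmetic behind that claim is worth spelling out. This is a back-of-the-envelope sketch that uses the three figures quoted above and assumes 1MB = 1024KB and that no two pages share cached resources, so it is an upper bound on the daily cache demand.

```python
PAGES_PER_DAY = 88    # average pages browsed per day (figure from the text)
AVG_PAGE_KB = 930     # average page weight in KB (figure from the text)
CACHE_LIMIT_MB = 75   # typical desktop browser cache limit (figure from the text)


def daily_browsing_mb(pages: int, page_kb: int) -> float:
    """Data downloaded in one day, assuming no resources are shared between pages."""
    return pages * page_kb / 1024


def days_of_browsing_cached(cache_mb: float, pages: int, page_kb: int) -> float:
    """How many days of browsing fit in the cache before older entries get evicted."""
    return cache_mb / daily_browsing_mb(pages, page_kb)


if __name__ == "__main__":
    per_day = daily_browsing_mb(PAGES_PER_DAY, AVG_PAGE_KB)
    days = days_of_browsing_cached(CACHE_LIMIT_MB, PAGES_PER_DAY, AVG_PAGE_KB)
    print(f"Daily browsing: {per_day:.1f} MB, cache holds {days:.2f} days")
```

With these numbers a day of browsing comes to roughly 80MB, which already exceeds the 75MB cache.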

Cookies, on the other hand, usually live until their defined expiry date. Therefore, using a cookie to predict the cache state becomes pointless very quickly, and then you’re just back to not inlining at all.

One of the biggest problems with this solution is that it demos better than it really is. In synthetic testing, like WebPageTest tests, a page is indeed either fully cached (i.e. all its resources are cached), or it’s not cached at all. These tests therefore make the inline-on-first-visit approach look like the be-all and end-all, which is just plain wrong.

Another significant problem is that not all CDNs support varying the cache by a cookie. Therefore, if some of your pages are cacheable, or if you think you might make them cacheable later, it may be difficult or even impossible to get the CDN to cache two different versions of your page and choose which one to serve based on a cookie.

Summary & Recommendations

Our world isn’t black and white. The fact that reducing the number of requests is a good way to accelerate your site doesn’t mean it’s the only solution. If you take it too far, you’ll end up slowing down your site, not speeding it up.

Despite all these limitations, Inlining is still a good and important tool in the world of Front-End Optimization. As such, you should use it, but be careful not to abuse it. Here are some recommendations about when to use Inlining, but keep in mind you should verify they get the right effect on your own site:

Very small files should be inlined.
The HTTP Overhead of a request & response is often ~1KB, so files smaller than that should definitely be inlined. Our testing shows you should almost never inline files bigger than 4KB.
Page Images (i.e. images referenced from the page, not CSS) should rarely be inlined.
Page Images tend to be big in size, they don’t block other resources in normal use, and they tend to change more frequently than CSS and Scripts. To optimize image file loading, load images on-demand instead.
Anything that isn’t critical for the above-the-fold page view should not be inlined.
Instead, it should be deferred till after page load, or at least made async.
Be careful with inlining CSS Images.
Many CSS files are shared across many pages, where each page only uses a third or less of the rules. If that’s the case for your site, there’s a decent chance your site will be faster if you don’t inline those images.
Don’t rely only on synthetic measurements – use RUM.
Tools like WebPageTest are priceless, but they don’t show everything. Measure real world performance and use that information alongside your synthetic test results.
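As a rough illustration, the size-based recommendations above could be turned into a first-pass rule like the sketch below. The 1KB and 4KB thresholds come from the text; the function name, the labels it returns, and the order in which the rules are checked are my own assumptions, and the result is a starting point to verify with measurements, not a verdict.

```python
DEFINITELY_INLINE_BELOW = 1 * 1024   # ~1KB of HTTP overhead per request (from the text)
ALMOST_NEVER_ABOVE = 4 * 1024        # "almost never inline files bigger than 4KB"


def inline_advice(size_bytes: int, is_page_image: bool = False,
                  above_the_fold: bool = True) -> str:
    """Return a first-pass suggestion: 'inline', 'link', 'defer' or 'measure'."""
    if not above_the_fold:
        # Not critical for the initial view: defer it or load it async.
        return "defer"
    if size_bytes < DEFINITELY_INLINE_BELOW:
        # Smaller than the overhead of the request itself.
        return "inline"
    if is_page_image or size_bytes > ALMOST_NEVER_ABOVE:
        # Page images and large files are better served as separate files.
        return "link"
    # Between 1KB and 4KB: test both variants on your own site.
    return "measure"


if __name__ == "__main__":
    for size in (500, 2048, 10 * 1024):
        print(size, inline_advice(size))
```

Whatever a rule like this suggests, the last recommendation still applies: confirm the effect with both synthetic tests and real user measurements.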

Web Performance Calendar