This blog series has sailed from the shores of networking, passed down waterfalls and reflows, and arrived in ECMAScriptland. Now, it turns out, there's one bridge to cross to get to DOMlandia.

(OK, I need to get some sleep, evidently. Anyway.) Ara Pehlivanian talked about strategies for loading JavaScript code. Yesterday’s post was about rendering and how you can prevent making things worse in JavaScript. Today’s post will be about DOM access optimizations and, if all is good, tomorrow’s post will round up the JavaScript discussion with some techniques for extreme optimization.

What’s with the DOM

The Document Object Model (DOM) is a language-independent API for accessing and working with a document. That could be an HTML document, an XML or SVG document, and so on. The DOM is not ECMAScript; ECMAScript is just one way to work with the DOM API. They both started in the web browser, but things are different now: ECMAScript has many other uses, and so does the DOM. You can generate a page server-side using the DOM if you like, or script Photoshop with ECMAScript.

All that goes to show that ECMAScript and DOM are now separate, they make sense on their own, they don’t need each other. And they are kept separate by the browsers.

For example, WebCore is the layout, rendering and DOM library used by WebKit, while JavaScriptCore (most recently rewritten as SquirrelFish) is the implementation of ECMAScript. In IE it's Trident (DOM) and JScript (ECMAScript); in Firefox, Gecko (DOM) and SpiderMonkey (ECMAScript).

The toll bridge

An excellent analogy I heard in this video from John Hrvatin of MSIE is to think of the DOM as one piece of land and JavaScript/ECMAScript as another, the two connected by a toll bridge. I tried to illustrate the analogy here.

[Illustration: DOMland and ECMALand connected with a toll bridge]

All your JavaScript code that doesn’t require a page – code such as loops, ifs, variables and a handful of built-in functions and objects – lives in ECMALand. Anything that starts with document.* lives in DOMLand. When your JavaScript needs to access the DOM, you need to cross that bridge to DOMlandia. And the bad part is that it’s a toll bridge and you have to pay a fee every time you cross. So, the more you cross that bridge, the more you pay your performance toll.

How bad?

So, how serious is that performance penalty? Pretty serious, actually. DOM access and manipulation is probably the most expensive thing you do in your JavaScript, followed by layout work (reflowing and painting). When you look for problems in your JavaScript (using a profiler instead of shooting in the dark, of course, but still), most likely it's the DOM that's slowing you down.

As an illustration, consider this bad, bad code:

// bad
for (var count = 0; count < 15000; count++) {
    document.getElementById('here').innerHTML += 'a';
}

This code is bad because it touches the DOM twice on every loop tick: it doesn't cache the reference to the DOM element, but looks the element up every single time. It also updates the live DOM, which causes a reflow and a repaint (probably buffered by the browser and executed in batches, but still bad).

Compare with the following code:

// better
var content = '';
for (var count = 0; count < 15000; count++) {
    content += 'a';
}
document.getElementById('here').innerHTML += content;

Here we touch the DOM only twice, at the very end. The whole time otherwise we work in ECMAland with a local variable.

And how bad is the bad example? Over 100 times worse in IE6, IE7 and Safari, over 200 times worse in Firefox 3.5 and IE8, and about 50 times worse in Chrome. We're not talking percentages here; we're talking 100 times worse.

Now obviously this is a bad, made-up example, but it does show the magnitude of the problem with DOM access.

Mitigating the problem – don’t touch the DOM

How do you speed up DOM access? Simply do less of it. If you have a lot of work to do with the DOM, cache references to DOM elements so you don't have to query the DOM tree every time to find them. Cache the values of DOM properties if you'll do a chunk of work with them. (By "cache" I mean simply assign them to local variables.) Use the selectors API where available instead of crawling the DOM yourself (and upgrade your JavaScript library if it's not taking advantage of the selectors API). Be careful with HTMLCollections.

// bad
document.getElementById('my').style.top = "10px";
document.getElementById('my').style.left = "10px";
document.getElementById('my').style.color = "#dad";

// better
var mysty = document.getElementById('my').style;
mysty.top = "10px";
mysty.left = "10px";
mysty.color = "#dad";

// better
var csstext = "; top: 10px; left: 10px; color: #dad;";
document.getElementById('my').style.cssText += csstext;

Basically, every time you find you’re accessing some property or object repeatedly, assign it to a local variable and work with that local variable.
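The selectors API mentioned above deserves a quick sketch too. Where document.querySelectorAll() is supported, one call returns all matching elements instead of a hand-rolled DOM crawl (the 'div.warning' selector is made up for this example; the loop is factored into a helper so the caching pattern stays visible):

```javascript
// loop over a node list with the length cached in a local variable
function eachNode(list, fn) {
    for (var i = 0, len = list.length; i < len; i++) {
        fn(list[i]);
    }
}

// one call instead of crawling the tree yourself; the result of
// querySelectorAll() is a static NodeList, not a live HTMLCollection
if (typeof document !== 'undefined' && document.querySelectorAll) {
    eachNode(document.querySelectorAll('div.warning'), function (el) {
        /* do stuff with el */
    });
}
```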

HTMLCollections

HTMLCollections are objects returned by calls such as document.getElementsByTagName() and document.getElementsByClassName(), and also by accessing the old-style collections document.links, document.images and the like. These HTMLCollection objects are array-like, list-like objects that contain pointers to DOM elements.

The special thing about them is that they are live queries against the underlying document, and they get re-run a lot, for example when you loop through the collection and access its length. The fact that you touch the length requires re-querying the document so that the most up-to-date information is returned to you.

Here’s an example:

// slow
var coll = document.getElementsByTagName('div');
for (var count = 0; count < coll.length; count++) {
    /* do stuff */
}

// faster
var coll = document.getElementsByTagName('div'),
    len = coll.length;
for (var count = 0; count < len; count++) {
    /* do stuff */
}

The slower version re-queries the document; the faster one doesn't, because we use the local value for the length. How much slower is the slower one? It depends on the document and how many divs are in it, but in my tests anywhere between 2 times slower (Safari) and 200 times slower (IE7).

Another thing you can do (especially if you’ll loop the collection a few times) is to copy the collection into an array beforehand. Accessing the array elements will be significantly faster than accessing the DOM elements in the collection, again 2 to 200 times faster.

Here’s an example function that turns the collection into an array:

function toArray(coll) {
    for (var i = 0, a = [], len = coll.length; i < len; i++) {
        a[i] = coll[i];
    }
    return a;
}

If you do that, you also need to account for the one-off cost of copying the collection to an array.
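Here's a usage sketch of that trade-off (toArray() is repeated for completeness, and since a collection works like any array-like object, it's stubbed here with a plain object for illustration): pay the copy cost once, then loop the array as many times as you like without re-querying the document.

```javascript
function toArray(coll) {
    for (var i = 0, a = [], len = coll.length; i < len; i++) {
        a[i] = coll[i];
    }
    return a;
}

// stand-in for e.g. document.getElementsByTagName('div')
var coll = {length: 3, 0: 'first', 1: 'second', 2: 'third'};

var arr = toArray(coll); // one-off copy cost here

// first pass over the plain array
for (var i = 0; i < arr.length; i++) {
    /* do stuff */
}
// second pass: no live collection, so no document re-query
for (var j = 0; j < arr.length; j++) {
    /* do stuff */
}
```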

Using event delegation

Event delegation is when you attach an event listener to a parent element and it handles all the events for its children, thanks to so-called event bubbling. It's a graceful way to relieve the browser of a lot of extra work. The benefits:

  • You need to write less event-attaching code.
  • You will usually use fewer functions to handle the events, because you attach one function to handle parent events rather than an individual function for each child element. This means fewer functions to store in memory and keep track of.
  • Fewer events for the browser to monitor.
  • It's easier to detach event handlers when an element is removed, and therefore easier to prevent IE memory leaks. Sometimes you don't even need to detach the handler when children change, as long as the event-handling parent stays the same.
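As an illustration, here is a minimal sketch of the pattern (the 'menu' id and the <li> check are made up for this example): one click handler on a parent <ul> services clicks on any of its <li> children, present or future. The walk from the click target up to the delegating parent is split into its own function:

```javascript
// walk from the event target up to the delegating parent,
// returning the enclosing LI (or null if the click missed the items)
function findListItem(node, root) {
    while (node && node !== root) {
        if (node.nodeName === 'LI') {
            return node;
        }
        node = node.parentNode;
    }
    return null;
}

// browser-only wiring (guarded so the logic above runs anywhere)
if (typeof document !== 'undefined') {
    var menu = document.getElementById('menu'); // a <ul>, hypothetical id
    menu.onclick = function (e) {
        e = e || window.event;                  // old-IE event object
        var target = e.target || e.srcElement;  // old-IE target property
        var li = findListItem(target, menu);
        if (li) {
            /* handle the click on this particular <li> */
        }
    };
}
```

One handler, no matter how many list items there are, and adding or removing items requires no extra attach/detach work.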

Thanks for reading!

  • Don’t touch the DOM when you can avoid it, cache DOM access to local references
  • Cache length of HTMLCollections to a local variable while looping (good practice for any collections or arrays looping anyway). Copy the collection to an array if you’ll be looping several times.
  • Use event delegation


ABOUT THE AUTHOR

Stoyan (@stoyanstefanov) is a frontend engineer, writer ("JavaScript Patterns", "Object-Oriented JavaScript"), speaker (JSConf, Velocity, Fronteers), toolmaker (Smush.it, YSlow 2.0) and a guitar hero wannabe.