Losing Your Head With PhantomJS and confess.js

30th Dec 2011 by James Pearce

ABOUT THE AUTHOR

James (@jamespearce) is Head of Mobile Developer Relations at Facebook. He lives in California and airports around the world.

We yearn for powerful and reliable ways to judge the performance and user experience of web applications. But for many years, we’ve had to rely on a variety of approximate techniques to do so: protocol-level synthesis and measurement, cranky browser automation, fragile event scripting – all accompanied by a hunch that we’re still not quite capturing the behavior of real users using real browsers.

Enter one of this year’s most interesting open-source projects: PhantomJS. Thanks to Ariya Hidayat, there’s a valuable new tool for every web developer’s toolbox, providing a headless, yet fully-featured, WebKit browser that can easily be launched off the command-line, and then scripted and manipulated with JavaScript.

I’ve used PhantomJS to underpin confess.js, a small library that makes it easy to analyze web pages and apps for various purposes. It currently has two main functions: to provide simple page performance profiles, and to generate app cache manifests. Let’s take them for a quick spin.

Performance summaries

Once installed, the simplest thing to do with confess.js is generate a simple performance profile of a given page. Using the PhantomJS browser, the URL is loaded, its timings taken, and a summary output emitted – all with one single command:

$> phantomjs confess.js https://calendar.perfplanet.com/2011/ performance

Here, the confess.js script is launched with the PhantomJS binary, directed to go to the PerfPlanet blog page, and then expected to generate something like the following:

Elapsed load time:   6199ms
   # of resources:       30

 Fastest resource:    408ms; https://calendar.perfplanet.com/wp-content/themes/wpc/style.css
 Slowest resource:   3399ms; https://calendar.perfplanet.com/photos/joshua-70tr.jpg
  Total resources:  69080ms

Smallest resource:    2061b; https://calendar.perfplanet.com/wp-content/themes/wpc/style.css
 Largest resource:    8744b; https://calendar.perfplanet.com/photos/joshua-70tr.jpg
  Total resources:  112661b; (at least)

Nothing revolutionary about this simple output – apart from the fact that, of course, under the covers, this is coming from a real WebKit browser. We’re getting solid scriptable access to every request and response that the browser is making and receiving, without having to make any changes to the page under test.
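To give a feel for that instrumentation, here is a minimal sketch of how per-resource timings might be gathered with PhantomJS's page.onResourceRequested and page.onResourceReceived callbacks and boiled down to a summary like the one above. It is not the code confess.js itself uses: the helper names recordRequest, recordResponse and summarize are hypothetical, and the browser wiring at the bottom only runs when the script is launched with the phantomjs binary.

```javascript
// Hypothetical helpers for illustration; not taken from confess.js.
var resources = {};

// Remember when each request started, keyed by PhantomJS's request id.
function recordRequest(req) {
    resources[req.id] = { url: req.url, start: +new Date(req.time), end: null, size: null };
}

// Note the body size when the response starts and the finish time when it ends.
function recordResponse(res) {
    var r = resources[res.id];
    if (!r) { return; }
    if (res.stage === 'start' && res.bodySize) { r.size = res.bodySize; }
    if (res.stage === 'end') { r.end = +new Date(res.time); }
}

// Reduce the finished resources to a count, total, fastest and slowest.
function summarize(map) {
    var summary = { count: 0, totalMs: 0, fastest: null, slowest: null };
    Object.keys(map).forEach(function (id) {
        var r = map[id];
        if (r.end === null) { return; }   // still in flight: skip it
        var ms = r.end - r.start;
        summary.count += 1;
        summary.totalMs += ms;
        if (!summary.fastest || ms < summary.fastest.ms) { summary.fastest = { ms: ms, url: r.url }; }
        if (!summary.slowest || ms > summary.slowest.ms) { summary.slowest = { ms: ms, url: r.url }; }
    });
    return summary;
}

// Only wire up the browser when running under PhantomJS.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    var url = require('system').args[1];   // first argument after the script name
    page.onResourceRequested = recordRequest;
    page.onResourceReceived = recordResponse;
    page.open(url, function (status) {
        console.log(JSON.stringify(summarize(resources), null, 2));
        phantom.exit(status === 'success' ? 0 : 1);
    });
}
```

Keeping the bookkeeping in plain functions, separate from the PhantomJS wiring, means the summary logic can be exercised without launching a browser at all.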

So already you might be able to imagine there’s a lot more that can be done with this instrumentation. I had some light-hearted fun getting confess.js (with a verbose flag) to emit waterfall charts of a page and its resources, for example – all in technicolor ASCII-art:

  1|-------                                                         |
  2|       ------------                                             |
  3|                 -----------                                    |
  4|                 ---------------------                          |
  5|                  -----------                                   |
  6|                  -------                                       |
  7|                  -------                                       |
  8|                  -------                                       |
  9|                  -------                                       |
 10|                                     ----------                 |
 11|                                     ----------------------     |
 12|                                     ----                       |
    ...

  1:   1679ms;       -b; http://cnn.com/
  2:   3115ms;       -b; http://www.cnn.com/
  3:   2716ms;       -b; http://z.cdn.turner.com/...css/hplib-min.css
  4:   5465ms;       -b; http://z.cdn.turner.com/...5/js/hplib-min.js
  5:   2952ms;       -b; http://z.cdn.turner.com/.../globallib-min.js
  6:   1681ms;      21b; http://content.dl-rms.co...r/5721/nodetag.js
  7:   1698ms;       -b; http://icompass.insightexpressai.com/97.js
  8:   1743ms;       -b; http://ad.insightexpress...px?publisherID=97
  9:   1706ms;       -b; http://js.revsci.net/gat...gw.js?csid=A09801
 10:   2494ms;    7732b; http://i.cdn.turner.com/...ader/hdr-main.gif
 11:   5694ms;   44091b; http://i2.cdn.turner.com...quare-t1-main.jpg
 12:   1023ms;     858b; http://i.cdn.turner.com/...earch_hp_text.gif
    ...

While this might seem a poor alternative to the rich diagnostics that can be gained from, say, the WebKit Web Inspector tools, it does provide a nice way to get a quick overview of the performance profile – and potential bottlenecks – of a page. More importantly, it can easily be extended, run from the command-line, automated and integrated as you wish.
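If you're curious how a chart like the one above could be drawn, here is a rough sketch. It assumes each resource has already been reduced to a start and end time in milliseconds; the renderWaterfall function and its width parameter are made up for this example, and confess.js may well do it differently.

```javascript
// Hypothetical helper: draws one row per resource, scaled to `width` columns.
function renderWaterfall(resources, width) {
    var first = Math.min.apply(null, resources.map(function (r) { return r.start; }));
    var last = Math.max.apply(null, resources.map(function (r) { return r.end; }));
    var span = Math.max(last - first, 1);   // avoid dividing by zero

    return resources.map(function (r, i) {
        // Map start and end times onto column positions.
        var from = Math.min(Math.floor((r.start - first) * width / span), width - 1);
        var to = Math.max(Math.ceil((r.end - first) * width / span), from + 1); // at least one dash
        var row = '';
        for (var col = 0; col < width; col += 1) {
            row += (col >= from && col < to) ? '-' : ' ';
        }
        var label = ('   ' + (i + 1)).slice(-3);   // right-align the row number
        return label + '|' + row + '|';
    }).join('\n');
}

console.log(renderWaterfall([
    { start: 0, end: 400 },
    { start: 400, end: 1000 },
    { start: 500, end: 800 }
], 10));
```

Every row shares the same time axis, running from the earliest request to the latest response, so overlapping bars show which resources were in flight at the same time.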

App cache manifest

Similarly, we can also use a headless browser to analyze the application’s actual content in order to perform a useful task. Although there’s a run-time ‘Chinese wall’ in PhantomJS between the JavaScript of the harness and the JavaScript of the page, it’s permeable enough to allow us to evaluate script functions against the DOM and have simple result structures returned to confess.js.
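As a small illustration of crossing that wall, the sketch below hands a function to PhantomJS's page.evaluate, which runs it inside the page and passes back a plain array. The function name collectAttributeUrls is my own invention for this example rather than something lifted from confess.js, and the PhantomJS part only runs under the phantomjs binary.

```javascript
// Runs inside the page: it can see `document` but nothing from the harness,
// and should only return simple, JSON-serializable values.
function collectAttributeUrls() {
    var urls = [];
    var nodes = document.querySelectorAll('script[src], img[src], link[href]');
    for (var i = 0; i < nodes.length; i += 1) {
        var url = nodes[i].src || nodes[i].href;   // resolved, absolute URLs
        if (url && urls.indexOf(url) === -1) { urls.push(url); }
    }
    return urls;
}

// Harness side: only runs when launched with the phantomjs binary.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    page.open(require('system').args[1], function (status) {
        if (status === 'success') {
            var urls = page.evaluate(collectAttributeUrls);
            console.log(urls.join('\n'));
        }
        phantom.exit();
    });
}
```

Because the function is shipped across to the page before it runs, it cannot close over variables from the harness. Anything it needs has to be passed in as an argument or computed inside the page.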

Why might we want to analyze a page’s DOM in an automated way? Well, take the app cache manifest mechanism, for example: it provides a way to mandate to a browser which resources should be explicitly cached for a given application, but, despite a deceptively simple syntax, it can be frustrating to keep track of all the assets you’ve used. To maximize the benefits of using app cache, you want to ensure that every resource is considered: whether it’s an image, a script, a stylesheet – or even resources further referred to from inside those.

This is the perfect job for a headless browser: once a document is loaded, we can examine it to identify the resources it actually uses. Doing this against the real DOM in a real browser makes it far more likely to identify dependencies required by the app at run-time than would be possible through statically analyzing web markup.

And again, something like this could easily become part of an automated build and deploy process. For example:

$> phantomjs confess.js https://calendar.perfplanet.com/2011/ appcache

…will result in the following manifest being generated:

CACHE MANIFEST

# This manifest was created by confess.js, http://github.com/jamesgpearce/confess
#
# Time: Fri Dec 23 2011 13:46:42 GMT-0800 (PST)
# Retrieved URL: https://calendar.perfplanet.com/2011/
# User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X) AppleWebKit/534.34 (KHTML, like Gecko) PhantomJS/1.4.0 Safari/534.34

CACHE:
/photos/aaron-70tr.jpg
/photos/alex-70tr.jpg
/photos/alois-70tr.jpg
[...]
https://calendar.perfplanet.com/wp-content/themes/wpc/globe.png
https://calendar.perfplanet.com/wp-content/themes/wpc/style.css

NETWORK:
*

Depending on your app, there might be a lot of output here. But the key parts, as far as the eventual user’s browser will be concerned, are the CACHE and NETWORK blocks. The latter is always set to the * wildcard, but the former list of explicit resources is built up automatically from the URL you ran the tool against.
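To show how little there is to the format itself, here is a sketch that turns a list of harvested URLs into a manifest of the same shape. The buildManifest function and its comments parameter are hypothetical names for this example; this is not the generator that ships with confess.js.

```javascript
// Hypothetical helper: builds manifest text from a list of resource URLs.
function buildManifest(urls, comments) {
    var lines = ['CACHE MANIFEST', ''];   // the first line is mandatory
    (comments || []).forEach(function (c) { lines.push('# ' + c); });
    if (comments && comments.length) { lines.push(''); }

    lines.push('CACHE:');
    // Sort and de-duplicate so repeated runs give a stable file.
    urls.slice().sort().forEach(function (url, i, sorted) {
        if (i === 0 || url !== sorted[i - 1]) { lines.push(url); }
    });

    // Everything not listed explicitly is allowed to go to the network.
    lines.push('', 'NETWORK:', '*');
    return lines.join('\n') + '\n';
}

console.log(buildManifest(
    ['/photos/alex-70tr.jpg', '/photos/aaron-70tr.jpg', '/photos/alex-70tr.jpg'],
    ['Retrieved URL: https://calendar.perfplanet.com/2011/']
));
```

Sorting the entries is a design choice of this sketch: it keeps the output stable between runs, which makes the file easier to diff if it is checked in as part of a build.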

For app cache nirvana, you’d simply need to pipe this output to a file, link to it from the manifest attribute of the <html> element of your target document, and of course ensure that the file, when deployed, is served with a content type of text/cache-manifest.

As an aside, the list of dependent resources itself is harvested by confess.js in four ways. Firstly, once the document is loaded in PhantomJS, the DOM is traversed, and URLs sought in src and href attributes on script, img, and link elements. Secondly, the CSSOM of the document’s stylesheets is traversed, and property values of the CSS_URI type are sought. Thirdly, the entire DOM is traversed, and the getComputedStyle method picks up any remaining resources. And finally, the tool can be configured to watch for additional network requests – just in case, say, some additional content request has been made by a script in the page that would not have been predicted by the contents of the DOM or CSSOM.
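For the style-based passes, the interesting step is pulling URLs out of CSS property values such as a computed background-image. The sketch below does that with a regular expression over the value string. That is an assumption made for this example, not necessarily how confess.js does it, and extractCssUrls is a hypothetical name.

```javascript
// Hypothetical helper: pulls every url(...) target out of a CSS property value.
function extractCssUrls(value) {
    var urls = [];
    // Matches url(x), url('x') and url("x"), with optional padding spaces.
    var pattern = /url\(\s*(['"]?)(.*?)\1\s*\)/g;
    var match;
    while ((match = pattern.exec(value)) !== null) {
        // Skip inline data: URIs, which are not separate resources to cache.
        if (match[2] && match[2].indexOf('data:') !== 0) { urls.push(match[2]); }
    }
    return urls;
}

console.log(extractCssUrls('url("sprite.png"), url( bg.gif )'));
console.log(extractCssUrls('none'));
```

Inside the page, a function like this could be applied to the values that getComputedStyle returns for each element. To use it from page.evaluate it would have to be defined within the evaluated function, since nothing from the harness is visible there.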

(Naturally, there are many useful ways to configure the manifest generation as a whole. You can filter in or out URLs in order to, say, exclude certain file types or resources from remote domains. You can also wait for a certain period after the document loads before performing the extraction, in case you know that a deferred script might be adding in references to other resources. There’s information about all this in the docs.)

Onwards and upwards

We’ve just touched on two simple examples of what can be done with a headless browser approach in general. The technique provides a powerful way to analyze web applications, and get closer to being able to understand real users’ experience and real apps’ behaviour.

I’d certainly urge you to check out PhantomJS, try scripting some simple activities, and think about how you can use it to understand and automate web site and application behaviour. (I’m not even sure I mentioned yet that it has a screen-shotting capability too…)

And of course, feel free to give confess.js a try too – with its humble goal of making it easier to help automate some of those common tasks. I’m always accepting pull requests!

But whatever your tools of choice… do have fun on your performance adventures, push the envelope, make the web a wonderful place – and I hope you all have an excellent 2012.

Web Performance Calendar