Nowadays websites ranging from a craftsman’s business site to large portals embed third-party content. This content can be obvious or completely invisible, but in either case it can take your site down. Let’s look at three short examples of third-party content you might find on a website:
Example #1: Facebook Like button
Probably the best-known third-party embed is the Facebook Like button. If you have a blog or an image gallery, you might want to allow quick sharing of your content on Facebook. To implement the share button you can include the following snippet on your website:
<!-- Include the Facebook SDK -->
<div id="fb-root"></div>
<script>
(function(d, s, id) {
  var js, fjs = d.getElementsByTagName(s)[0];
  if (d.getElementById(id)) return;
  js = d.createElement(s);
  js.id = id;
  js.src = "//connect.facebook.net/de_DE/all.js#xfbml=1";
  fjs.parentNode.insertBefore(js, fjs);
}(document, 'script', 'facebook-jssdk'));
</script>
<!-- Include the button being displayed on the website -->
<div class="fb-like" data-href="http://www.xing.com" data-send="true" data-width="450" data-show-faces="true"></div>
After doing this, every page load will also request the assets for the share button. In this case they are loaded asynchronously. Nevertheless, this postpones the page’s load event and makes it dependent on the load time of the Facebook servers. window.onload will only fire once the share button has been loaded and interpreted completely. A delayed window.onload event can be nasty: the browser may still show loading indicators, and any code triggered by that event (tracking, UI components or monitoring) will be delayed as well.
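One way to keep such a widget from delaying window.onload at all is to inject its script only after the load event has already fired. A minimal sketch of that idea follows; injectAfterLoad and its explicit win/doc parameters are our own illustrative helper, not part of the Facebook SDK:

```javascript
// Sketch: inject a third-party script only after window.onload has fired,
// so the widget can never delay the page's load event.
// `win` and `doc` are passed in explicitly to keep the helper testable.
function injectAfterLoad(win, doc, src) {
  win.addEventListener('load', function () {
    var js = doc.createElement('script');
    js.async = true; // never block subsequent parsing either
    js.src = src;
    doc.body.appendChild(js);
  });
}

// Usage in the browser:
// injectAfterLoad(window, document, '//connect.facebook.net/de_DE/all.js#xfbml=1');
```

The trade-off is that the button appears slightly later; for a share widget that is usually acceptable.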
Example #2: advertising
At XING we offer advertising, and our best placements appear on the most frequented pages: the logged-in homepage, the logout-success page and our hub pages, including jobs and events. We integrate these ads using our own JS wrapper:
<script>
new xing.controls.RenderFrame({
  src: URL_TO_AD_SERVER,
  target: DIV_CONTAINER,
  height: HEIGHT,
  width: WIDTH
});
</script>
This function generates an iframe and uses the src parameter as the iframe’s source URL. This way our ads run in a safely sandboxed environment and load asynchronously. Nevertheless, this technique still delays the page’s load event, as the browser waits for the iframe’s onload event to finish.
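A wrapper like this essentially boils down to generating iframe markup with the ad server URL as its src. A rough sketch of the core idea follows; buildAdFrameHTML is a hypothetical helper for illustration, not XING’s actual RenderFrame implementation:

```javascript
// Sketch: build the markup for a sandboxed ad iframe. A real wrapper
// does more (sizing, messaging, error handling); this is only the core idea.
function buildAdFrameHTML(opts) {
  return '<iframe src="' + opts.src + '"' +
         ' width="' + opts.width + '" height="' + opts.height + '"' +
         ' sandbox="allow-scripts"' + // keep the ad isolated from the page
         ' frameborder="0"></iframe>';
}

// e.g. buildAdFrameHTML({ src: '//ads.example.com/slot1', width: 300, height: 250 })
// could then be written into the target container element.
```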
Example #3: tracking
Tracking scripts can be quite complex. Every commercial website uses some form of tracking, ranging from Google Analytics to SiteCatalyst or some kind of home-brewed solution. If we take a look at this example on Webmetrics, such a tracking system will most likely look a bit like the following snippet:
<!-- Load the JS lib for your tracking suite -->
<script src="http://.../tracking.js"></script>
<!-- Gather information about the stuff you want to track -->
<script>
x.site = "Test Page";
x.server = "http://tracking.example.com/beacon.url";
// to be continued
</script>
<!-- Initiate the transfer of gathered information to the server -->
<script>
var x_code = collectTrackingInfo();
if (x_code) document.write(x_code);
</script>
Everything here happens completely synchronously, blocking all subsequent code and therefore also delaying window.onload.
So we have seen two ways of loading third-party JavaScript: synchronously (#3, tracking) and asynchronously while still postponing the window.onload event (#1, Like button). There is a third way, probably the best solution, which I will touch upon at the end of this article. But first I want to explain how things can go wrong when embedding third-party content.
SPOF
SPOF stands for single point of failure. With all the cross-site communication that goes on nowadays due to sharing, embedding and tracking, developers have to be very mindful when using third-party snippets: any one of them can become a single point of failure. Whenever you embed third-party content, as in any of the examples above, you have to be aware of two well-known facts:
- Unless marked with “async” or “defer”, script tags load and execute synchronously and thus in a blocking way. Subsequent content in the DOM always waits for the preceding JS block to finish loading and executing. The same goes for the DOMContentLoaded event. Tracking and ads wait for this event before submitting their data; if it never fires, you lose whatever data you collected. The event itself always depends on the externally referenced JavaScript: if that takes long to load, the subsequent code is delayed too, and if the script is served by a third-party server that is unavailable, the subsequent code is delayed until the request times out.
- Any server – be it your own or a third party’s – will have issues. Even Facebook or Google servers may be unavailable or respond with long delays. Going by Murphy’s Law, this is bound to occur exactly when your own site runs just fine.
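For reference, the blocking default and the async and defer attributes mentioned above look like this in markup (the URL is just a placeholder):

```html
<!-- default: blocks parsing until the script is downloaded and executed -->
<script src="//example.com/widget.js"></script>

<!-- async: downloads in parallel, executes as soon as it arrives -->
<script async src="//example.com/widget.js"></script>

<!-- defer: downloads in parallel, executes after parsing, before DOMContentLoaded -->
<script defer src="//example.com/widget.js"></script>
```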
It’s important to remember that third-party servers can and will crash, no matter how well you’ve done your homework or how well you’ve accounted for failures in your own data center.
At this point I’d like to provide a simple but very effective way to check for a SPOF:
Example of third-party content on your website
<script src="//www.example.com/thirdparty/sharebutton.js"></script>
Let’s do some basic measurements here:
<script> var startTime = new Date().getTime(); </script>
<script src="//www.example.com/thirdparty/sharebutton.js"></script>
<script> alert((new Date().getTime() - startTime) + "ms used to download and execute the sharebutton code"); </script>
To simulate a failure, we just replace the URL of the third-party plugin with http://blackhole.webpagetest.org, a URL provided by @PatMeenan of webpagetest.org. The cool thing about this URL is that it never responds: requests to it hang until they time out after 30 seconds or more. This makes it a great way to simulate the failure of a third-party content provider.
<script> var startTime = new Date().getTime(); </script>
<script src="http://blackhole.webpagetest.org"></script>
<script> alert((new Date().getTime() - startTime) + "ms used to download and execute the sharebutton code"); </script>
You will immediately see and understand the difference once you have tried this out.
The “freeze incident”
Now that you know about possibilities of failure and how to simulate them I’d like to share with you one of the weirdest issues we’ve come across in a long time.
From October 17 on we had users reporting that http://www.xing.com froze their WebKit browsers. As usual we tried to reproduce the issue in every way possible, but it simply didn’t happen for us. Our tracking JS is delivered from our own servers, so we didn’t see any risk there. Only the request transferring the tracking data went directly to our tracking provider’s server, which didn’t look harmful to us. Ads were rendered in iframes, so there was no chance they could freeze the whole page.
So we reported back to our User Care department to put out the usual statement: “Please check your browser, plugins and system. We can’t reproduce this error.” But as reports kept coming in, we kept looking. After about the 100th reload I noticed a request to our external tracking domain that had been running for a long time. While waiting for the DOMContentLoaded event, without hitting the reload button, I noticed that the page had frozen and was in fact unresponsive. What a true WTF?! moment!
This is when I remembered a talk about SPOFs that Pat Meenan gave at a web performance meetup in Hamburg. I immediately gave the blackhole technique a try by changing my /etc/hosts and mapping our tracking domain to blackhole.webpagetest.org. I then checked our production site and was astonished: the whole page froze, exactly as the users had reported. Even after the 30-second timeout the page kept freezing, and it was reproducible. It was another WTF?! moment. We don’t load any third-party content in a blocking way, so what had happened?
We spotted two potential pitfalls:
- a tracking server that won’t respond to requests
- our newly updated tracking library
The newly updated tracking library, which is served from our own, perfectly working servers, implemented some magic called “link tracking” in WebKit browsers. There was only a tiny little entry about it in the library’s update notes.
With this update, the tracking JS put an observer on every link of the rendered page. Clicks on those links were captured by the tracking script, and the observers were only released once the tracking server responded. If the server took more than 500 ms to respond, the observers were never released: they kept catching every click event, making the page unresponsive. So with the setup above – packaging the tracking information, submitting it to the server and waiting for the server’s response – the page was not responsive at all while the tracking request was in flight.
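The failure mode can be boiled down to a tiny model: a “gate” that swallows clicks until the tracking response releases it. Everything below – createClickGate and its API – is our own simplified reconstruction for illustration, not the vendor’s code:

```javascript
// Sketch: a click "gate" like the one the link tracking effectively built.
// Clicks are held back until release() is called; if the tracking server
// never answers, release() never runs and every click is swallowed.
function createClickGate() {
  var released = false;
  var swallowed = 0;
  return {
    // followLink stands in for the browser's default navigation
    onClick: function (followLink) {
      if (released) {
        followLink();
      } else {
        swallowed++; // click captured, navigation suppressed
      }
    },
    release: function () { released = true; },
    swallowedClicks: function () { return swallowed; }
  };
}
```

With a healthy tracking server, release() runs within a few hundred milliseconds and users never notice. With a hanging server, the gate stays closed forever, which is exactly the freeze our users reported.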
But as the tracking server was not within our control and we couldn’t simply turn off tracking altogether, we solved this by manually overriding the new link-tracking feature. With this change the tracking lib would never disable any links by capturing all clicks, even if the tracking server was unreachable. What made this issue so hard to track down was that none of us suspected that anything was waiting for the request to the tracking server to complete.
Tools and Tricks to the rescue
There are many tools out there that can make your life easier, especially when it comes to spotting possible SPOFs. As mentioned above, it is very wise to regularly check your third-party content against http://blackhole.webpagetest.org. Luckily, there is a Chrome plugin that does the trick for you: SPOF-O-MATIC.
The plugin shows a warning whenever a document contains content that can block or take down your site. It can even simulate downtime of third-party servers.
Another option is to use webpagetest.org; its tests include SPOF scenarios in their measurements.
It is a great idea to take Pat Meenan’s advice on testing frontend SPOFs. It is always worth surfing your own websites with an adjusted /etc/hosts: just point some popular third-party domains to blackhole.webpagetest.org. This is how we actually confirmed that “the curious case” was in fact a SPOF.
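As a concrete illustration, such an /etc/hosts entry might look like this. Note that tracking.example.com is a placeholder for your real tracking domain, and you should look up the current address of blackhole.webpagetest.org yourself, since it may change; 72.66.115.13 is the address webpagetest.org has published:

```
# /etc/hosts
# Point the tracking domain at the WebPagetest blackhole to simulate a hang
72.66.115.13   tracking.example.com
```

Remember to remove the entry again after testing, or your own tracking data will silently go nowhere.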
In the end, you should always load your third-party scripts completely asynchronously and without affecting the DOMContentLoaded event.
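One technique that achieves this is loading the script inside a dynamically created iframe, as popularized by Meebo and written up by Stoyan Stefanov. A rough, simplified browser sketch follows; loadInIframe is our own illustrative helper with an explicit doc parameter, and the real technique includes more edge-case handling:

```javascript
// Sketch: load a third-party script inside its own dynamically created
// iframe, so that in most browsers the parent page's load event never
// depends on the third-party server.
function loadInIframe(doc, src) {
  var iframe = doc.createElement('iframe');
  iframe.style.display = 'none'; // invisible helper frame
  doc.body.appendChild(iframe);
  var idoc = iframe.contentWindow.document;
  idoc.open();
  // write the script tag into the iframe's own document;
  // the backslash keeps the closing tag from ending an inline script block
  idoc.write('<script src="' + src + '"><\/script>');
  idoc.close();
  return iframe;
}

// Usage in the browser:
// loadInIframe(document, '//tracking.example.com/tracking.js');
```

The loaded script then has to reach back into the parent page via its iframe’s parent reference, which is why this approach suits trackers and beacons better than widgets that render inline.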
To summarize, there are essentially three ways of loading your 3rd party code.
| Method | synchronous/asynchronous | delays window.onload |
|---|---|---|
| Example #1: Facebook Like Button | asynchronous | yes |
| Example #3: Tracking | synchronous | oh yes |
| Load JS in iframe (acc. to Stoyan/Meebo’s approach) | asynchronous | no 🙂 |
Method #1 is actually fine, even though it still influences window.onload. It at least decouples the loading itself completely from the third-party server: if those servers have performance issues or downtime, you won’t be affected. This in itself is already very valuable.
After our recent experiences we now plan to migrate our tracking and advertising wrappers according to the third method and adjust our monitoring to properly report such incidents.
Given these adjustments, and now being very aware of this topic, we are confident that we can prevent such failures in the future.
To recap, the following things can be recommended:
- Replace the URLs of your third-party references with e.g. http://blackhole.webpagetest.org to check for SPOFs
- Use tools like SPOF-O-MATIC to stay constantly aware of SPOFs, even while surfing privately
- Most importantly: do this not only for content you embed, but also for data you send somewhere else. Something may be waiting for a proper response even on a simple transfer of tracking data
- Be aware of updates to third-party scripts, be it GA, Omniture or others, and make sure to read their update notes carefully. They can easily contain surprises your product people love, but that suddenly place observers on every link of your website