Domain sharding has long been considered a best practice for pages with lots of images.  The number of domains that you should shard across depends on how many HTTP requests the page makes, how many connections the client makes to each domain, and the available bandwidth.  Since it can be challenging to change this dynamically (and can cause browser caching issues), people typically settle on a fixed number of shards – usually two.  
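
As a rough illustration of that trade-off, the sketch below multiplies the shard count by a per-host connection limit and divides the available bandwidth evenly across the resulting connections. The limit of six connections per hostname and the even split are simplifying assumptions, not figures from this article, and the function names are hypothetical.

```python
def max_parallel_connections(shards: int, per_host_limit: int = 6) -> int:
    """Upper bound on simultaneous connections to the sharded image hosts.

    per_host_limit=6 is an assumed browser default, not a measured value.
    """
    return shards * per_host_limit


def fair_share_kbps(link_kbps: float, shards: int, per_host_limit: int = 6) -> float:
    """Bandwidth per connection if the link were split evenly (a simplification)."""
    return link_kbps / max_parallel_connections(shards, per_host_limit)


if __name__ == "__main__":
    # Hypothetical 1500 kbps link, compared at two and four shards.
    for shards in (2, 4):
        print(shards, max_parallel_connections(shards), fair_share_kbps(1500, shards))
```

Under these assumptions, doubling the shard count doubles the number of connections competing for the same link, which is why more shards are not automatically better.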

An article published earlier this year by Chromium contributor William Chan outlined the risks of sharding across too many domains, and Etsy was called out as an example of a site that was doing this wrong.  To quote the article: “Etsy’s sharding causes so much congestion related spurious retransmissions that it dramatically impacts page load time.”  At Etsy we're pretty open with our performance work, and we’re always happy to serve as an example.  That said, getting publicly shamed in this manner definitely motivated us to bump the priority of reinvestigating our sharding strategy.  

Making The Change

The code changes to support fewer domains were fairly simple, since we had already abstracted away the process that adds a hostname to an image path in our codebase.  Additionally, we had the foresight to exclude the hostname from the cache key at our CDNs, so there was no risk of a massive cache purge as we switched which domain our images were served on.  We were aware that this would invalidate the cache in browsers, since they do include the hostname in their cache key, but this was not a blocker for us because of the improved end result.  To ensure that we ended up with the right final number, we created variants for two, three, and four domains.  We were able to rule out the option to remove domain sharding entirely through synthetic tests.  We activated the experiment in June using our A/B framework, and ran it for about a month.
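
To make the idea of abstracting the hostname concrete, here is a minimal sketch of a helper that maps an image path to a shard. It is not Etsy's actual code: the hostnames, the `image_url` name, and the CRC32 choice are all assumptions for illustration. The one property that matters is that a given path always maps to the same hostname, so browsers can keep reusing their cached copy.

```python
import zlib

# Hypothetical shard hostnames; a real list would live in configuration.
IMAGE_SHARDS = ["img0.example-static.com", "img1.example-static.com"]


def image_url(path: str, shards: list[str] = IMAGE_SHARDS) -> str:
    """Return an absolute URL for an image path, choosing a shard deterministically."""
    if not path.startswith("/"):
        path = "/" + path
    # crc32 is stable across processes and machines, unlike Python's built-in hash().
    index = zlib.crc32(path.encode("utf-8")) % len(shards)
    return f"https://{shards[index]}{path}"


if __name__ == "__main__":
    print(image_url("/il/123/photo.jpg"))
    # Changing the number of shards only requires passing a different list.
    print(image_url("/il/123/photo.jpg", ["img0.example-static.com"]))
```

With every image URL built through one function like this, testing two, three, or four domains comes down to changing the shard list for each experiment variant.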

Results

After looking at all of the data, the variant that sharded across two domains was the clear winner.  Given how easy this change was to make, the results were impressive:

  • 50-80ms faster page load times for image-heavy pages (e.g. search), 30-50ms faster overall.
  • Up to 500ms faster load times on mobile.
  • 0.27% increase in pages per visit.

As it turns out, William’s article was spot on – we were sharding across too many domains, and network congestion was hurting page load times.  The new CloudShark graph supported this conclusion as well, showing a peak throughput improvement of 33% and radically reduced spurious retransmissions:

Before – Four Shards

[Figure: CloudShark graph, four shards]

After – Two Shards

[Figure: CloudShark graph, two shards]

Lessons Learned

This story had a happy ending, even though in the beginning it was a little embarrassing.  We had a few takeaways from the experience:

  • The recommendation to shard across two domains still holds.
  • Make sure that your CDN is configured to leave hostname out of the cache key.  This was key to making this change painless.
  • Abstract away the code that adds a hostname to an image URI, so that the set of domains can be changed in one place.
  • Measure everything, and question assumptions about existing decisions.
  • Tie performance improvements to business metrics – we were able to tell a great story about the win we had with this change, and feel confident that we made the right call.
  • Segment your data by desktop and mobile, and ideally by region if you can.  The dramatic impact on mobile was a huge improvement that would have been lost in an aggregate number.
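
The cache key point can be illustrated with a small sketch. Real CDN configuration is vendor-specific and is not shown in this article, so the two functions below are hypothetical stand-ins that only model the concept: a CDN key that ignores the hostname, and a browser-style key that includes it.

```python
from urllib.parse import urlsplit


def cdn_cache_key(url: str) -> str:
    """Model a CDN cache key that leaves the hostname out (path and query only)."""
    parts = urlsplit(url)
    key = parts.path
    if parts.query:
        key += "?" + parts.query
    return key


def browser_cache_key(url: str) -> str:
    """Model a browser-style cache key, which includes scheme and hostname."""
    parts = urlsplit(url)
    key = f"{parts.scheme}://{parts.netloc}{parts.path}"
    if parts.query:
        key += "?" + parts.query
    return key


if __name__ == "__main__":
    old = "https://img3.example-static.com/il/123/photo.jpg"  # hypothetical old shard
    new = "https://img1.example-static.com/il/123/photo.jpg"  # hypothetical new shard
    print(cdn_cache_key(old) == cdn_cache_key(new))          # same CDN object
    print(browser_cache_key(old) == browser_cache_key(new))  # separate browser entries
```

In this model, moving an image between shards leaves the CDN entry untouched but forces browsers to fetch the image again, which matches the behavior described above.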

Until SPDY/HTTP 2.0 comes along, domain sharding can still be a win, so long as you test and optimize the number of domains to shard across for your site.

ABOUT THE AUTHOR
Jonathan Klein

Jonathan Klein (@jonathanklein) is a senior software engineer on the performance team at Etsy.  He also started and organizes the Boston Web Performance Meetup Group, and contributes to WebPagetest, the HTTP Archive, and CSSLint, among other projects.

14 Responses to “Reducing Domain Sharding”

  2. Guypo

    Jonathan, great post and insight, and always great to see Etsy getting faster and faster :)

    A few questions pop to mind, would be great if you have answers you can share:
    1) It sounds like the results are based off RUM data. Did your synthetic tests show different numbers? i.e. would you have chosen 4 shards over 2 if you just based it off Synthetic?

    2) Was there a different performance impact on cellular networks, or in general high RTT networks? Sharding is meant to help especially in those environments.

    3) What data are the CloudShark graphs based off? WebPageTest results?

  3. Jonathan Klein

    Hi Guypo,

    Great questions!

    1) Our synthetic tests basically backed up the results, depending on what the initial conditions were. Obviously the connection and location has a big impact on how well domain sharding works, but we didn’t find a case where four domains outperformed two.

    2) We were able to slice it by mobile vs. desktop browsers, and mobile benefited much more from the reduction in shards. This makes sense to me, I would think in high RTT, low bandwidth environments you want fewer shards, right? The extra DNS lookup(s) and TCP connection(s) are extremely expensive under those conditions, and then you get network congestion as you try to fetch many resources in parallel.

    3) Yup, WebPagetest. Here are the raw results if you are interested:

    Four Domains
    Two Domains

  4. William Chan

    Great article! I love how you guys did a lot of legwork to measure performance and drill down into how performance varied under different conditions. It’s something I had wanted to analyze in my original post, but it was already pretty long.

    One of the things I forgot whether or not I mentioned was that it’s possible that 4 shards might be better for a small segment of the user population that has really fast (low latency + high BW) connections. I didn’t go into detail, but the source of the spurious retransmissions there is the queueing delay due to insufficient bandwidth. This causes the server TCP stack to retransmit. The thing is, these aren’t actual losses, and the receiver (the browser) gets the data eventually, but at a growing delay (as the queue builds up faster than it is serviced). This growing delay eventually becomes high enough to trigger retransmission timers at the sender, which causes it to spuriously retransmit. So goodput takes a nosedive :(

    It’s actually conceivable that you could tune the sharding level more optimally for different clients based on their estimated bandwidth. But due to the problem I described, if you overestimate, the effect could be very costly. So it’s better to be conservative.

    Due to what I’ve explained, I’m very curious how your data varies dependent on different users (mobile vs desktop, geolocation, etc).

    Great work and thanks for sharing! And sorry again if you feel I was picking on y’all. I promise to pick a different website next time :)

  5. Aakash

    Jonathan, Any specific reason you ran the test for a month?

  6. Jonathan Klein

    @William – Thanks! We didn’t want to get into the complexity of tuning the sharding level for different clients, partly due to what you mention. We’re also hoping that our CDNs will support SPDY soon, so we can remove sharding entirely for newer clients and deal with the complexity at that point. Due to sampling we don’t have statistical significance across too many different geographies, so the mobile vs. overall improvement is really the main way we sliced the data.

    @Aakash – that was long enough to allow people to rebuild their browser caches with the new domains, and get significance from a statistical point of view on the business metrics we monitor.

  7. Steve

    When you suggest the 2 shards works well can you clarify what that means?

    Does that mean you have your main domain and 1 other domain for static content? (e.g. 2 domains total) or do you mean there would be 2 static domains in addition to the main domain?

    Likewise have you seen any benefit to splitting by content type? E.g. Loading JS and CSS from one domain and images from another?

  8. Jonathan Klein

    @Steve – We currently have one domain for the base page (www.etsy.com), one domain for CSS/JS (site.etsystatic.com) and two domains for images (img0-1.etsystatic.com). Since most of our JS is at the bottom of the page, having it on a separate domain from the images means that it’s not contending with them for a TCP connection when the lookahead parser starts downloading them. Assuming enough bandwidth, we can download JS and images in parallel easily.

    That’s the main benefit of putting JS/CSS on a different domain from images – making sure that you have enough connections to avoid contention.

  9. Ross

    Fantastic and timely article Jonathan. Have you experimented with only having one static domain for images?
    Also, we are planning on throwing www behind a CDN as well. I love the thinking behind putting JS/CSS on a different domain. With our CDN setup we’ll try leaving the JS/CSS on www to remove one more DNS lookup. I’ll reply later with some metrics.

  10. jQuery’s Content Delivery Network: You Got Served! | Official jQuery Blog

    […] Just remember that as with any good thing, going overboard is a bad idea. Some research shows that just two domains may be the sweet spot. Use a tool like WebPageTest to test your site to get the best […]
