Cliff Crocker (@cliffcrocker) is Chief Product Officer at SpeedCurve. As an active contributor in the web performance community he evangelizes the importance of speed as it relates to user behavior and ultimately business ROI. He spends his time building product strategy, understanding customer challenges and helping to build performance culture at organizations large and small. He has been working in performance for the last 15 years, with experience at companies like Keynote Systems, Walmartlabs, SOASTA and Akamai. In his spare time, Cliff enjoys skiing and camping in the mountains of Colorado where he resides with his wife and two boys.
I confess, I’m not a statistician. While I pride myself on the ‘A’ I received in my college statistics class, admittedly it was on a pretty steep curve. That said, I’ve been looking at performance data for many years and have found myself on both sides of the debate about whether or not the practice of sampling performance data is inherently a good or bad idea.
When it comes to real user monitoring (RUM), I’m convinced that the benefit of collecting ALL THE THINGS by default does not always justify the marginal cost of collection, computation, storage, etc.
Like any experiment, how you sample RUM data – as well as how much data to sample – depends on the answers you seek. While certainly not an exhaustive list, here are some questions you might ask when looking at implementing a sampled approach to real user monitoring…
Are sessions important to you?
I’m a big fan of not only looking at individual page performance, but also looking at performance in the context of a user’s entire visit to the site. This helps you do things like:
- correlate various performance metrics with user behavior,
- understand the user journey and popular pages so you can better model your synthetic testing, and
- identify bottlenecks impacting a site’s business goals.
However, you may want the option to look at individual pages in isolation if you’re part of a product team just focused on one area. That’s fine, too!
A lot of performance users really love that they can track business metrics specific to a session alongside their performance data. While real user monitoring tools aren’t typically looked at as a replacement for marketing analytics tools, having session metrics available with your performance data is pretty great for anyone who cares about performance. This way you can easily base your performance budgets on a business outcome, such as a cart conversion or a click on a “call to action” link.
In the case where sessions are important to you, finding a RUM solution (or creating one) that does session-based sampling is key. Because sessions vary in length, this does mean that the share of page views you collect will likely not match your sample rate exactly.
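To make the idea concrete, here is a minimal sketch of session-based sampling at a collection endpoint. It is only one way to do it, and everything in it is an assumption for illustration: the function name `session_is_sampled`, the use of a SHA-256 hash of the session ID, and the made-up beacons. The point is that the keep/drop decision is made once per session, so you either keep every page view in a visit or none of them.

```python
import hashlib


def session_is_sampled(session_id: str, sample_rate: float) -> bool:
    """Return True if every beacon from this session should be kept.

    The decision depends only on the session ID, so it is the same for
    every page view in the visit (hypothetical helper, not a vendor API).
    """
    # Hash the session ID to get a stable, evenly spread 64-bit number.
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")
    # Keep the session if its bucket falls in the first `sample_rate`
    # fraction of the 64-bit range.
    return bucket < sample_rate * 2**64


# Made-up beacons for illustration: (session_id, page)
beacons = [
    ("s1", "/home"), ("s1", "/product"), ("s1", "/cart"),
    ("s2", "/home"),
    ("s3", "/home"), ("s3", "/search"),
]

# Sample roughly half of all sessions; each session is kept or dropped whole.
kept = [b for b in beacons if session_is_sampled(b[0], 0.5)]
```

Because a kept session may contain one page view or twenty, the fraction of page views in `kept` will drift around the session sample rate rather than match it.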
Where should you do the sampling?
Another big thing to think about when applying sampling is where you want to sample. Having the ability to sample at the collection point as well as in the client/browser itself provides you with a lot of flexibility.
In most cases, sampling from the collection endpoint (either through a vendor, your CDN, or your origin) gets the job done and is a fine approach. However, if you want to be very careful about how you spend your “beacon budget”, you may elect to sample at the client. This allows you to do things like increase the sample rate for an A/B test group that is extremely small, so you have enough data to compare it with the control group. You might also want to sample based on the location of the user agent, the type of user agent, or even the type of user you are serving content to.
One thing to be careful of: if you do manipulate your sampling to collect more data for specific segments, make sure you can exclude these segments or keep them completely separate when drawing conclusions about the entire population of your users. Otherwise the oversampled groups will skew your aggregate numbers. One great way to do this is to flag these groups in your data so that you can isolate them from your greater population. Having this flexibility is really great, and helps you make sure you are getting the most out of your monitoring budget.
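Here is a rough sketch of what that flag can look like in practice. The field names (`segment`, `sample_rate`, `lcp_ms`) and the numbers are invented for illustration. The sketch shows two options: filter the boosted segment out by its flag, or keep it but weight each beacon by the inverse of the rate it was sampled at. I use a mean only to keep the arithmetic easy to follow; the same weighting idea applies to percentiles.

```python
def weighted_mean(rows, metric):
    """Mean of `metric`, weighting each beacon by 1 / its sample rate."""
    total_weight = sum(1 / r["sample_rate"] for r in rows)
    weighted_sum = sum(r[metric] / r["sample_rate"] for r in rows)
    return weighted_sum / total_weight


# Made-up beacons: general traffic sampled at 10%, a small A/B test
# group sampled at 100% and flagged so it can be told apart later.
beacons = [
    {"segment": "general", "sample_rate": 0.1, "lcp_ms": 2000},
    {"segment": "general", "sample_rate": 0.1, "lcp_ms": 3000},
    {"segment": "ab_test", "sample_rate": 1.0, "lcp_ms": 4000},
    {"segment": "ab_test", "sample_rate": 1.0, "lcp_ms": 4000},
    {"segment": "ab_test", "sample_rate": 1.0, "lcp_ms": 4000},
    {"segment": "ab_test", "sample_rate": 1.0, "lcp_ms": 4000},
]

# Naive mean treats every beacon equally, so the boosted group dominates.
naive = sum(b["lcp_ms"] for b in beacons) / len(beacons)

# Option 1: use the flag to leave the boosted segment out entirely.
general_only = [b for b in beacons if b["segment"] == "general"]
general_mean = sum(b["lcp_ms"] for b in general_only) / len(general_only)

# Option 2: keep everything, but correct for the different sample rates.
corrected = weighted_mean(beacons, "lcp_ms")
```

With these made-up numbers the naive mean lands well above the corrected one, because the test group makes up most of the beacons but only a small slice of the real traffic.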
Considerations for determining your sample size
How distributed is your traffic?
I’ve heard great accounts of businesses that assume their traffic is 100% regional, only to find they have an entire contingent of users in another country! (Watch this great talk from Harry Roberts at performance.now() to hear such a tale. It’s at minute 10:45, but the entire talk is fantastic, so make sure to watch the whole thing!)
For most sites, you want to understand where your users are. If you sample too aggressively, the dataset for these important cohorts of users is often too small to yield relevant insight. That said, when you have to make decisions about where to spend your monitoring budget, drawing insights from your primary geographic region in isolation may be warranted.
Are you running experiments in production?
This is a rapidly growing practice that is becoming more and more important for product teams. Whether you’re pushing canary builds or running A/B test variants, you want to ensure that you have enough data in those buckets to factor performance into the decision criteria. We’ve seen sites make business decisions based on performance data that was so inconclusive that it would have been better if it had been ignored.
Are you only interested in a specific type of traffic?
Most organizations want to understand how their site is performing across multiple form factors (desktop/tablet/mobile) as well as browser types. Oftentimes an important subset of users on a particular technology represents a much smaller share of your traffic. Given the massive fragmentation of mobile devices (namely Android), understanding performance for these users is crucial to creating a more inclusive web. Getting a proper sample size for these cohorts is often hard to do if you aren’t paying attention. That said, for some teams the focus may be more specifically targeted toward a certain technology or user agent, so a smaller sample for a more popular user agent or device may work just fine.
Do you have a relatively small population of REALLY important users?
Whether you run a retail, financial services, media, or other kind of site, you likely have a smaller set of users that you are VERY interested in. For a retailer with a really low conversion rate (say 1-2%), getting an adequate dataset of converting sessions may be really hard if your sample size is too small. Similarly, media sites care a lot about registered users on the other side of a metered paywall, but most likely they make up a MUCH smaller segment of the population. These examples exist across all of our sites, and it’s important that we have the ability to represent them in our datasets.
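A bit of back-of-the-envelope arithmetic shows how quickly a small cohort shrinks under sampling. The sketch below assumes uniform session-based sampling, and the traffic figures and both function names are hypothetical. It is not a statistical power calculation, just a sanity check on how many sessions from the cohort you can expect to see.

```python
def expected_converting_sessions(daily_sessions, sample_rate, conversion_rate):
    """Expected number of converting sessions that end up in the sample."""
    return daily_sessions * sample_rate * conversion_rate


def required_sample_rate(target_sessions, daily_sessions, conversion_rate):
    """Sample rate needed to expect `target_sessions` conversions per day.

    Capped at 1.0, because you cannot collect more than all of your traffic.
    """
    converting_per_day = daily_sessions * conversion_rate
    return min(1.0, target_sessions / converting_per_day)


# Hypothetical site: 50,000 sessions a day with a 2% conversion rate.
at_ten_percent = expected_converting_sessions(50_000, 0.10, 0.02)  # about 100
at_one_percent = expected_converting_sessions(50_000, 0.01, 0.02)  # about 10

# Rate needed to expect about 400 converting sessions a day on that site.
needed = required_sample_rate(400, 50_000, 0.02)  # about 0.4
```

Ten converting sessions a day is not much to base a performance budget on, which is why a cohort like this may deserve a higher sample rate than the rest of your traffic.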
The important thing is to get started
Whether you’re considering getting started with RUM or changing the approach you use today, don’t let the decision of how much to sample send you into paralysis. Start small, see what you can learn, and continue to expand as necessary.