An engineer's guide to optimization

25th Dec 2013 by Tony Gentilcore

ABOUT THE AUTHOR

Tony (@tonygentilcore) is a hopeless speed junkie and the lead of Google's Chrome Speed Team.

A solid engineering mindset is key to successful performance optimization.

This has been driven into me by the wisdom of engineers such as Bharat Mediratta, Darin Fisher, James Simonsen, Nat Duca, Pat Meenan, Simon Hatch and Steve Souders (to name a few), as well as by trial-and-error during almost 10 years of performance work.

When I was a naive developer, Fasterfox taught me not to go to 11. Later, trying to codify the initial set of rules into PageSpeed Insights highlighted the power of using tools, not rules. More recently, time spent racking up smaller performance wins in Google Web Search and Chrome encouraged me to think bigger.

Here’s the 5-step framework I now use to approach any significant engineering optimization problem. Hopefully you’ll find it useful too.

1. Identify a scenario whose optimization moves the needle on a business metric.
Let’s be realistic: we don’t live in a Whuffie-based economy yet. If what you plan to optimize is not tied to a business metric (read: dollars), figure out a way to establish correlation or, better yet, causation. The #webperf community has convincingly demonstrated that a web app’s initial load time leads directly to dollars for many businesses. There are lots of ways to accomplish this for all kinds of scenarios. Be creative.

Not only does this research help get resources for your work, it also clarifies where to invest effort and gives you more satisfaction in your accomplishments. If you truly can’t explain how your work applies to business metrics, then work on a more pressing problem first. I assure you, there’s no shortage of them.
2. Measure the scenario.
The scenario must be precise. Speed Index of a cold app load is one such scenario.

Then, establish a repeatable synthetic benchmark of the scenario. It must be possible to rapidly test ideas against it during development. This allows your crazy ideas to fail with minimal effort (unlike the full buildout required for early flying machines).
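
A repeatable benchmark can start out very small. The following is a minimal sketch, assuming the scenario can be driven from a Python callable. The names `benchmark`, `reset_state`, and `example_scenario` are hypothetical, and the workload is a stand-in for something like a scripted cold load.

```python
import statistics
import time


def benchmark(scenario, runs=20, reset_state=None):
    """Time `scenario` repeatedly and return summary statistics in milliseconds.

    `reset_state`, if given, is called before every run so that each sample
    starts from the same (for example, cold) state.
    """
    samples = []
    for _ in range(runs):
        if reset_state is not None:
            reset_state()  # e.g. clear caches so every run is a cold run
        start = time.perf_counter()
        scenario()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "median_ms": statistics.median(samples),
        "min_ms": min(samples),
        "max_ms": max(samples),
        "runs": runs,
    }


def example_scenario():
    # Hypothetical stand-in for the real scenario under test.
    sum(i * i for i in range(10_000))


if __name__ == "__main__":
    print(benchmark(example_scenario))
```

Reporting the median rather than the mean keeps a single noisy run from dominating the result, which matters when you are comparing small changes between iterations.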

The benchmark must also run continuously against your shared development prototype and be monitored for regressions. Advocate for an organizational no-regression policy on key benchmarks to ensure improvements don’t evaporate. Such a policy enables the effort to scale up to be shared by a larger team while mitigating the tragedy of the commons.
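
One way to make a no-regression policy enforceable is a simple gate in the continuous build. This is a sketch only: the function name `check_regression` and the 5% default tolerance are assumptions for illustration, and the right threshold depends on how noisy your benchmark is.

```python
def check_regression(baseline_ms, current_ms, tolerance=0.05):
    """Compare a benchmark result against its baseline.

    Returns (ok, change), where `change` is the fractional change relative to
    the baseline (positive means slower) and `ok` is False when the slowdown
    exceeds `tolerance`.
    """
    if baseline_ms <= 0:
        raise ValueError("baseline_ms must be positive")
    change = (current_ms - baseline_ms) / baseline_ms
    return change <= tolerance, change


if __name__ == "__main__":
    # Illustrative numbers only, not measurements.
    ok, change = check_regression(baseline_ms=1000.0, current_ms=1100.0)
    print(f"change: {change:+.1%}, ok: {ok}")
```

A build step could fail whenever `ok` is false, which turns the policy from a convention into something the whole team runs into automatically.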

Finally, monitor the real world. On the web this is commonly referred to as Real User Monitoring (RUM). Continuously validate your synthetic benchmarks against reality. When one graph moves, so should the other. If they don’t move together, improve the models used in your benchmarks.
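
A rough way to check whether the two graphs move together is to correlate them. The sketch below computes a Pearson correlation between a synthetic series and a RUM series. The data is made up for illustration, and the variable names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


# Made-up daily medians in milliseconds, for illustration only.
synthetic_ms = [1200, 1180, 1150, 1300, 1290, 1100]
rum_ms = [2100, 2080, 2000, 2350, 2300, 1950]

if __name__ == "__main__":
    print(f"correlation: {pearson(synthetic_ms, rum_ms):.2f}")
```

A value near 1 suggests the benchmark tracks reality. A low or negative value is a hint that the benchmark's model (network conditions, device mix, cache state) needs work.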
3. Calculate light speed.
Establish a set of constraints that you’re willing to work within for the initial investigation. This is your universe. For example, network latency and bandwidth may be one such set of constraints. Then forget about your current architecture and estimate the absolute performance bound of your scenario. This is your universe’s speed of light.
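
As a worked example of such an estimate, suppose the universe is defined only by network latency and bandwidth. The sketch below uses a deliberately simplified model that is an assumption of this example, not a formula from any standard: a minimum number of round trips, plus the time to push the payload through the pipe. It ignores server time, rendering, and protocol effects such as slow start.

```python
def light_speed_ms(rtt_ms, bandwidth_bps, payload_bytes, min_round_trips):
    """Lower bound on load time under a latency-plus-bandwidth model.

    All parameters are assumptions you choose for your own universe:
    rtt_ms          -- round-trip time in milliseconds
    bandwidth_bps   -- bandwidth in bits per second
    payload_bytes   -- bytes that must be transferred
    min_round_trips -- fewest round trips any architecture would need
    """
    if bandwidth_bps <= 0:
        raise ValueError("bandwidth_bps must be positive")
    transfer_ms = payload_bytes * 8 / bandwidth_bps * 1000.0
    return min_round_trips * rtt_ms + transfer_ms


if __name__ == "__main__":
    # Illustrative inputs: 100 ms RTT, 5 Mbps, 500 KB payload, 2 round trips.
    bound = light_speed_ms(
        rtt_ms=100.0,
        bandwidth_bps=5_000_000,
        payload_bytes=500_000,
        min_round_trips=2,
    )
    print(f"light speed: {bound:.0f} ms")
```

With these inputs the bound works out to roughly 200 ms of latency plus 800 ms of transfer. If the current implementation takes several seconds, the gap between the two numbers is the room you have to work with.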

Sometimes this calculation can be extremely challenging. You might be stuck with something that requires guesstimates of key terms (like the Drake equation). So don’t worry if you’re unable to establish the absolute light speed on your first iteration. It took science over 200 years to do this for the actual speed of light. Be like Galileo: err on the conservative side, and focus on establishing something that you’re sure is theoretically possible.
4. Approach the speed of light.
Deeply profile your scenario and identify the largest bottlenecks that are holding it back from the speed of light. Apply Amdahl’s Law rigorously. This may require considerable investment in learning about or building new tools. Don’t skimp; your investment will pay dividends!
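
Amdahl’s Law says that if a fraction p of the total time is sped up by a factor s, the overall speedup is 1 / ((1 - p) + p / s). A small sketch makes the consequence concrete. The function name and the example fractions are illustrative, not taken from a real profile.

```python
import math


def amdahl_speedup(fraction, local_speedup):
    """Overall speedup when `fraction` of total time is made `local_speedup`x faster."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if local_speedup <= 0:
        raise ValueError("local_speedup must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


if __name__ == "__main__":
    # A bottleneck taking 40% of the time, made twice as fast.
    print(f"{amdahl_speedup(0.40, 2.0):.2f}x")
    # The same bottleneck eliminated entirely: the ceiling for this fix.
    print(f"{amdahl_speedup(0.40, math.inf):.2f}x")
    # A 5% component eliminated entirely barely registers.
    print(f"{amdahl_speedup(0.05, math.inf):.2f}x")
```

The ceiling of 1 / (1 - p) is why profiling comes first: no amount of work on a component that takes 5% of the time can deliver more than about a 5% improvement overall.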

Estimate how close you think you can get to light speed before reaching the point of diminishing returns (where an additional unit of effort yields lower per-unit returns).
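
To make the point of diminishing returns less of a gut call, you can track what each unit of effort actually bought. The sketch below assumes you record the benchmark result after each unit of effort (say, each week of work). The numbers, the function names, and the cutoff are all hypothetical.

```python
def marginal_gains(times_ms):
    """Milliseconds saved by each successive unit of effort."""
    return [before - after for before, after in zip(times_ms, times_ms[1:])]


def units_worth_doing(times_ms, min_gain_ms):
    """Count the leading units of effort whose gain is at least `min_gain_ms`."""
    count = 0
    for gain in marginal_gains(times_ms):
        if gain < min_gain_ms:
            break
        count += 1
    return count


if __name__ == "__main__":
    # Made-up benchmark results: the starting point, then after each unit of effort.
    history_ms = [3000, 2200, 1800, 1650, 1600]
    print(marginal_gains(history_ms))
    print(units_worth_doing(history_ms, min_gain_ms=100))
```

Shrinking gains per unit of effort are the signal to look for. Where exactly to draw the line is a judgment call tied back to what a millisecond is worth to the business metric from step 1.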

Ruthlessly fix all bottlenecks that are holding you back from reaching the point of diminishing returns. Bottlenecks may lurk at any level of the stack, so you must be willing to go wherever the problem is. In this pursuit, keep a healthy disregard for organizational politics and boundaries. Never blame a performance bottleneck on another system and walk away. Instead, either fix the problem directly, show how to fix it, or shine a spotlight on it (remember, your spotlight equates to dollars now).

Once the point of diminishing returns is reached, it’s time to either move on to a new scenario (goto step 1) or to change the laws of physics by thinking bigger and widening the constraints used for your light speed calculation (goto step 3).
5. Profit! Literally.

Web Performance Calendar
