I thought I’d take the opportunity this year to give a little visibility into how WebPagetest gathers performance data from browsers. Other tools on Windows use similar techniques, but the details here may not be representative of how those tools work.
First off, it helps to understand the networking stack on Windows from a browser’s perspective:
It doesn’t matter what the browser is: if it runs on Windows, the architecture pretty much HAS to look like the diagram above, where all of the communications go through the Windows socket APIs (for that matter, just about any application that talks TCP/IP on Windows looks like the picture above).
The key to how WebPagetest works is its ability to intercept arbitrary function calls and inspect or alter the request or response before passing it on to the original implementation (or choosing not to pass it on at all). Luckily, someone else did most of the heavy lifting and provided a nice open source library that takes care of the details for you, but it basically works like this:
- Find the target function in memory (trivial if it is exported from a dll)
- Copy the first several bytes from the function (making sure to keep x86 instructions intact)
- Overwrite the function entry with a jmp to the new function
- Provide a replacement function that includes the bytes copied from the original function along with a jmp to the remaining code
It’s pretty hairy stuff, and things tend to go VERY wrong if you aren’t extremely careful, but with well-defined functions (like all of the Windows APIs) you can pretty much intercept anything you’d like.
One catch is that you can only redirect calls to code running in the same process as the original function. That is fine if you wrote the code, but it doesn’t help a lot if you are trying to spy on software that you don’t control, which leads us to…
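To make the four steps concrete, here is a minimal sketch that performs the same byte shuffling on a fake address space (a Python `bytearray`) instead of live process memory. It is an illustration under stated assumptions, not the library's code: the names `install_hook`, `trampoline` and `instruction_lengths` are hypothetical, the instruction lengths are supplied by hand where a real hooking library would use a disassembler, and it assumes 32-bit x86 with a 5-byte `jmp rel32`. Real code would also have to make the page writable, fix up any relative instructions it moves, and cope with other threads running the function while it is patched.

```python
import struct

JMP_OPCODE = 0xE9  # x86 "jmp rel32"
JMP_SIZE = 5       # opcode byte plus a 4-byte signed displacement
NOP = 0x90


def encode_jmp(src, dst):
    """Encode a jmp located at address src that transfers control to dst."""
    # rel32 is measured from the end of the jmp instruction itself.
    return struct.pack("<Bi", JMP_OPCODE, dst - (src + JMP_SIZE))


def bytes_to_copy(instruction_lengths):
    """Smallest run of whole instructions that covers the 5-byte jmp."""
    total = 0
    for length in instruction_lengths:
        if total >= JMP_SIZE:
            break
        total += length
    if total < JMP_SIZE:
        raise ValueError("function is too short to hook")
    return total


def install_hook(memory, target, replacement, trampoline, instruction_lengths):
    """Patch `memory` so that calls to `target` land in `replacement`.

    Returns the number of bytes moved into the trampoline.
    """
    n = bytes_to_copy(instruction_lengths)
    saved = bytes(memory[target:target + n])

    # Trampoline: the displaced instructions, then a jmp to the rest of the function.
    stub = saved + encode_jmp(trampoline + n, target + n)
    memory[trampoline:trampoline + len(stub)] = stub

    # Entry point: jmp to the replacement, padded with nops to an instruction boundary.
    patch = encode_jmp(target, replacement) + bytes([NOP]) * (n - JMP_SIZE)
    memory[target:target + n] = patch
    return n


# A fake 64-byte address space with a 9-byte function at offset 0:
# mov edi,edi / push ebp / mov ebp,esp / sub esp,0x10 / ret
memory = bytearray(64)
memory[0:9] = bytes.fromhex("8bff558bec83ec10c3")
moved = install_hook(memory, target=0x00, replacement=0x20, trampoline=0x30,
                     instruction_lengths=[2, 1, 2, 3, 1])
print(moved, memory[0:5].hex(), memory[0x30:0x3A].hex())
# prints: 5 e91b000000 8bff558bece9cbffffff
```

The replacement function calls the trampoline whenever it wants the original behavior, which is why the copied bytes have to end on an instruction boundary: the trampoline executes them verbatim before jumping back into the untouched remainder of the function.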
Lucky for me, Windows provides several ways to inject arbitrary code into processes. There is a good overview of several different techniques here (there are actually even more ways to do it than that, but it covers the basics). Some of the techniques insert your code into every process, but I wanted to be a lot more targeted and instrument just the specific browser instances that we are interested in. After a bunch of experimentation (and horrible failures) I ended up using the CreateRemoteThread/LoadLibrary technique, which essentially lets you force any process to load an arbitrary dll and execute code in it (assuming you have the necessary rights).
Resulting Browser Architecture
Now that we can intercept arbitrary function calls, it just becomes a matter of identifying the “interesting” functions, preferably ones that are used by all of the browsers so you can re-use as much code as possible. In WebPagetest we intercept all of the Winsock calls that have to do with resolving host names, connecting sockets and reading or writing data:
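The interception itself is native code, but the bookkeeping around it can be sketched in a few lines. The example below is only an analogy in Python, not WebPagetest's implementation: `CallRecorder` is a hypothetical name, and `fake_winsock` is a stand-in object whose two functions are named after Winsock calls purely for illustration (the exact set of functions WebPagetest hooks is not listed here). It shows the pattern of swapping a function for a replacement that notes the arguments and timing and then passes the call on to the original.

```python
import time
import types


class CallRecorder:
    """Swap functions for wrappers that log each call, then call the original."""

    def __init__(self):
        self.events = []      # (function name, positional args, seconds spent)
        self._installed = []  # (owner, name, original) so hooks can be removed

    def hook(self, owner, name):
        original = getattr(owner, name)

        def replacement(*args, **kwargs):
            start = time.perf_counter()
            try:
                # Pass the call through to the original implementation.
                return original(*args, **kwargs)
            finally:
                self.events.append((name, args, time.perf_counter() - start))

        setattr(owner, name, replacement)
        self._installed.append((owner, name, original))

    def unhook_all(self):
        # Restore in reverse order so nested hooks unwind cleanly.
        while self._installed:
            owner, name, original = self._installed.pop()
            setattr(owner, name, original)


# Stand-in for a networking API; nothing here touches the network.
fake_winsock = types.SimpleNamespace(
    getaddrinfo=lambda host: "192.0.2.10",
    connect=lambda address, port: True,
)

recorder = CallRecorder()
recorder.hook(fake_winsock, "getaddrinfo")
recorder.hook(fake_winsock, "connect")

address = fake_winsock.getaddrinfo("www.example.com")
fake_winsock.connect(address, 80)
recorder.unhook_all()

for name, args, seconds in recorder.events:
    print(name, args, f"{seconds * 1000:.3f} ms")
```

Because every browser has to go through the same small set of socket functions, one recorder of this shape can serve all of them, which is the code reuse the paragraph above is after.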
This gives us visibility into all of the network activity from the browser, and we essentially just keep track of what the browsers are doing. Other than having to decode the raw byte streams, it is pretty straightforward, and it gives us a consistent way to do the measurements across all browsers. SSL does add a bit of a wrinkle, so we also intercept calls to the various SSL libraries that the browsers use so that we can see the unencrypted version of the data. That is a little more difficult for Chrome since the library is compiled into the Chrome code itself, but luckily they make debug symbols available for every build so we can still find the code in memory.
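As a rough idea of what decoding the raw byte streams involves, the sketch below reassembles the bytes seen by an intercepted write call into HTTP requests. It is a simplified illustration, not WebPagetest's parser: `RequestSniffer` and `on_send` are hypothetical names, and it assumes plain HTTP/1.x requests without bodies. The main point is that a single request can arrive split across several writes, so the bytes have to be buffered per socket until a complete header block is available.

```python
from collections import defaultdict


class RequestSniffer:
    """Rebuild HTTP/1.x request heads from chunks written to each socket."""

    def __init__(self):
        self._pending = defaultdict(bytearray)  # socket id -> bytes not yet parsed
        self.requests = []                      # (socket id, method, path, host)

    def on_send(self, socket_id, data):
        """Feed in the bytes from one intercepted write on `socket_id`."""
        buffer = self._pending[socket_id]
        buffer.extend(data)
        while True:
            end = buffer.find(b"\r\n\r\n")  # a blank line ends the header block
            if end < 0:
                return  # header block still incomplete; wait for more data
            head = bytes(buffer[:end]).decode("latin-1")
            del buffer[:end + 4]
            self._record(socket_id, head)

    def _record(self, socket_id, head):
        lines = head.split("\r\n")
        parts = lines[0].split(" ")
        if len(parts) != 3:
            return  # not an HTTP/1.x request line; ignore it
        method, path, _version = parts
        headers = {}
        for line in lines[1:]:
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        self.requests.append((socket_id, method, path, headers.get("host")))


sniffer = RequestSniffer()
sniffer.on_send(7, b"GET /index.html HT")  # first half of the request
sniffer.on_send(7, b"TP/1.1\r\nHost: www.example.com\r\n\r\n")
print(sniffer.requests)
# prints: [(7, 'GET', '/index.html', 'www.example.com')]
```

The same buffering would apply to the decrypted data captured from the SSL library hooks, since at that point it is ordinary HTTP again.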
The same technique is used to intercept drawing calls from the browser so we can tell when it paints to the screen (for the start render measurement).
Get the Code
Since WebPagetest is under a BSD license, you are welcome to reuse any of the code for whatever purposes you’d like. The project lives on Google Code here: http://code.google.com/p/webpagetest/ and some of the more interesting files are: