Web Performance Calendar

The speed geek's favorite time of year
2021 Edition
ABOUT THE AUTHOR
Leon Fayer (@papa_fire) currently leads engineering at Teaching Strategies. Leon has over two decades of expertise concentrated on architecting and operating complex, web-based systems to withstand crushing traffic (often unexpectedly). Over the years, he's had a somewhat unique opportunity to design and build systems that run some of the most visited websites in the world and has the opinion that nothing really works until it works for at least a million people.


Not all content is created equal, nor should it all be served the same way. At a high level, you can separate your content into three groups:

  • infinitely static: images are the perfect example. Once an image has been published, you should never update it (for multiple reasons), so you should be able to cache it almost indefinitely.
  • temporarily static: content that renders into static output, such as CSS and articles. It may change in the future (requiring you to purge the cache), but it can generally be cached on demand for an extended period of time.
  • dynamic: content that is individual to each user or page load. Personalization, user profile/history, and top/recent content cannot be cached for any extended period of time and require re-computation or rendering on each request.

Not only should these groups be treated differently from a caching-strategy perspective, they should also be treated differently when served from origin.
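
As a rough illustration of the caching side, the three groups might map to response headers like this. It is only a sketch: it assumes mod_headers is loaded, and the file extensions, URL prefixes, and lifetimes are placeholders rather than recommendations.

```apache
# Sketch only: assumes mod_headers is loaded. Extensions, URL prefixes,
# and max-age values are illustrative assumptions.

# infinitely static: published images never change, cache for a year
<FilesMatch "\.(png|jpe?g|gif)$">
    Header set Cache-Control "public, max-age=31536000, immutable"
</FilesMatch>

# temporarily static: cache for a while, purge when the content changes
<FilesMatch "\.css$">
    Header set Cache-Control "public, max-age=86400"
</FilesMatch>

# dynamic: per-user output, keep it out of shared caches
<LocationMatch "^/(profile|history)">
    Header set Cache-Control "private, no-cache"
</LocationMatch>
```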

The overhead

This applies to any web server (nginx, Node.js, Apache), but for ease of example, let’s use Apache. Chances are, your httpd.conf starts with something like this (adjusted for the language you use).

# load modules
LoadModule php_module /opt/apache22/libexec/amd64/mod_php.so
LoadModule apreq_module /opt/apache22/libexec/amd64/mod_apreq2.so
LoadModule ssl_module /opt/apache22/libexec/amd64/mod_ssl.so

# serve on port 80 for non-ssl content
Listen 0.0.0.0:80

# setup mod_rewrite engine
RewriteEngine On
RewriteLogLevel 4
RewriteLog /www/logs/apache/rewrites.log

# validate against rewrite checks
Include /www/etc/httpd-rewrite-global.conf

# define server/thread limits
StartServers 5
ServerLimit 40
MaxClients 40
MinSpareServers 5
MaxSpareServers 10

# if the file doesn't exist in the static directory,
# process dynamic content through the dispatcher function
RewriteCond /www/htdocs/static/$1 !-f
RewriteRule /(.*)$ /www/htdocs/dynamic/dispatcher.php [L]

# if file is static and is found - serve it
RewriteRule /(.*)$ /static/$1 [L]

Most people don’t think about it, but with this setup, every request is going to load (or use) the configured modules and be compared against a (usually) long list of rewrite rules (which have a tendency to accumulate rather quickly). So the question you have to ask yourself is: does serving logo.png require loading mod_php, and does it need to be matched against dozens of legacy URL strings the business decided to keep active after the last redesign? Or, in fewer words, does every static asset need the same overhead as dynamic content? The answer is no, no it does not.

Furthermore, is the server/thread configuration you use for your dynamic content the same as the one you would need to serve your assets from origin? Again, the answer is likely no.

Separation of responsibility

Separating the serving of different types of content allows you to optimize and tune serve time for each individual type. The way to accomplish this is to run a separate web server for each type of content that requires individual optimization. For example, instead of having a generic httpd.conf that listens for requests on port 80, you create httpd-static.conf listening on port 80 and httpd-dynamic.conf listening on a different port (let’s say 8081).

httpd-static.conf

Listen 0.0.0.0:80

# static content configuration parameters
StartServers 2
ServerLimit 7
ThreadLimit 1200
ThreadsPerChild 200
MaxClients 1200
MaxRequestsPerChild 0
MaxSpareThreads 1200

RewriteEngine On

# If it’s a local static file, go ahead and serve it, 
# otherwise send to dynamic web server
# (the [P] flag requires mod_proxy and mod_proxy_http)
RewriteCond /www/htdocs/static%{REQUEST_URI} !-f
RewriteRule ^/(.*)$ http://localhost:8081/$1 [P,L]

httpd-dynamic.conf

# load modules
LoadModule php_module /opt/apache22/libexec/amd64/mod_php.so
LoadModule apreq_module /opt/apache22/libexec/amd64/mod_apreq2.so
LoadModule ssl_module /opt/apache22/libexec/amd64/mod_ssl.so

# serve on port 8081 for non-ssl dynamic content
Listen 0.0.0.0:8081

# setup mod_rewrite engine
RewriteEngine On
RewriteLogLevel 4
RewriteLog /www/logs/apache/rewrites.log

# validate against rewrite checks
Include /www/etc/httpd-rewrite-global.conf

# define server/thread limits
StartServers 5
ServerLimit 40
MaxClients 40
MinSpareServers 5
MaxSpareServers 10

# serve content through dispatcher
RewriteRule /(.*)$ /www/htdocs/dynamic/dispatcher.php [L]

Use httpd-static.conf as a gateway: it quickly serves static content and passes dynamic requests on to httpd-dynamic.conf for processing, with minimal proxy overhead.
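
For the two configurations to run side by side, each instance needs a few settings of its own. The following is a minimal sketch of what those additions might look like. Every file path here is an assumption modeled on the paths used in the listings above, and the LoadModule lines should be skipped for any module that is compiled into your httpd binary.

```apache
# Sketch only: additions that let the two instances run side by side.
# All file paths below are assumptions modeled on the paths used above.

# --- httpd-static.conf ---
# the [P] flag proxies through mod_proxy, so the gateway needs these loaded
LoadModule rewrite_module    /opt/apache22/libexec/amd64/mod_rewrite.so
LoadModule proxy_module      /opt/apache22/libexec/amd64/mod_proxy.so
LoadModule proxy_http_module /opt/apache22/libexec/amd64/mod_proxy_http.so

# files that pass the -f check are served straight from this directory
DocumentRoot /www/htdocs/static

# each instance needs its own pid file; separate error logs help debugging
PidFile  /www/logs/apache/httpd-static.pid
ErrorLog /www/logs/apache/static-error.log

# --- httpd-dynamic.conf ---
PidFile  /www/logs/apache/httpd-dynamic.pid
ErrorLog /www/logs/apache/dynamic-error.log

# optional: accept connections only from the local gateway
# (this would replace the Listen 0.0.0.0:8081 line)
Listen 127.0.0.1:8081
```

Each instance is then started with its own configuration file through httpd's -f option, for example httpd -f /www/etc/httpd-static.conf (the location of the file is hypothetical).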

You can also configure the static server to match on individual file extensions instead of files in a certain directory. And in addition to having separate configuration parameters tuned to each type of content, this setup lets you easily add different logging, monitoring, and caching rules for different types of content.
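
A sketch of the extension-based variant, with a dedicated access log for the static instance, could look like the following. The extension list and the log path are assumptions, and the log lines require mod_log_config.

```apache
# Sketch only: httpd-static.conf routing by extension instead of by file lookup.
# The extension list and log path are assumptions.
RewriteEngine On

# anything that does not look like a static asset goes to the dynamic server
RewriteCond %{REQUEST_URI} !\.(png|jpe?g|gif|ico|css|js)$ [NC]
RewriteRule ^/(.*)$ http://localhost:8081/$1 [P,L]

# a separate access log for the static instance
LogFormat "%h %l %u %t \"%r\" %>s %b" common
CustomLog /www/logs/apache/static-access.log common
```

The trade-off is that the gateway no longer checks the filesystem to decide where a request goes, so a request for a missing asset gets a 404 from the static server instead of falling through to the dispatcher.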
