We are in an era where everyone involved in the SDLC understands their role in building a high-performance system. It is no longer the responsibility of the Performance Engineering team alone, working in a silo, to assess & certify system performance a few days before the move to production.
The onset of Agile & DevOps development practices has put a lot of emphasis on early performance analysis & has thereby changed the job responsibilities of a Performance Engineer. But irrespective of whether the development model is traditional waterfall or Agile/DevOps, adopting early performance testing & analysis practices brings substantial benefits (I have shared more details about the key focus areas during early performance analysis in “Key focus areas during Sprint level (early) performance tests”).
While a Performance Engineer works towards assessing system performance from early in the software life cycle, priorities sometimes change due to organizational constraints & environmental limitations. As a result, many key facts are forgotten while measuring & certifying the system for its performance. Some of the most commonly forgotten facts are covered here.
Web Performance measurement as part of CI environment
With the advent of Web 2.0, web pages have become very dynamic & client-side heavy, so there should be a strong, continuous focus on achieving the best web performance. Minimizing the response times perceived by the end user is very important for an e-business to win.
During system performance assessment, web performance (client-side browser performance) is often left off the priority list, or forgotten until the move to production, because of the complexity of the distributed web application, its accessibility modes, the high user scalability targets to be achieved, etc. The performance impact of browsers, mobile devices & network types is not taken seriously enough during performance analysis. Focusing only on server-side performance to certify the system for multi-user load levels is just not enough. It is equally important to run web performance tests as part of the regression suites in continuous integration performance testing, to measure web performance deviations across builds. The trends can be stored in time series databases like Graphite/InfluxDB & visualized in Grafana for intuitive real-time reporting. There are many good online tools available for web performance test analysis.
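As a rough illustration of tracking web performance deviations across builds, the sketch below compares the median of each page metric in the current build against a baseline build & formats a sample in Graphite's plaintext protocol (`path value timestamp`). The metric names, the sample values & the 10% tolerance are assumptions made for the example, not figures from any real project.

```python
from statistics import median

def graphite_line(path, value, timestamp):
    """Format one sample in Graphite's plaintext protocol: 'path value timestamp'."""
    return f"{path} {value} {int(timestamp)}\n"

def find_regressions(baseline, current, tolerance=0.10):
    """Return metrics whose current median exceeds the baseline median
    by more than `tolerance` (0.10 means 10%)."""
    regressions = {}
    for metric, base_samples in baseline.items():
        if metric not in current:
            continue  # metric not measured in this build
        base = median(base_samples)
        now = median(current[metric])
        if base > 0 and (now - base) / base > tolerance:
            regressions[metric] = (base, now)
    return regressions

# Hypothetical measurements (milliseconds) from repeated page loads per build.
baseline = {"home.load_ms": [1200, 1250, 1180], "home.ttfb_ms": [210, 200, 220]}
current = {"home.load_ms": [1500, 1450, 1520], "home.ttfb_ms": [215, 205, 212]}

regressions = find_regressions(baseline, current)
for metric, (base, now) in regressions.items():
    print(f"{metric}: baseline {base} ms, current {now} ms")
```

A CI job could fail the build when `regressions` is non-empty & send each `graphite_line(...)` string to the time series database so that the trend is visible in Grafana.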
Measurement of Performance Test Environment differences from Production Environment
While assessing server performance, it is important to be aware of the performance test environment architecture & server capacity, and of how they differ from the production environment architecture & capacity. Having an isolated performance test environment is a must. Using a production-like environment, or at least knowing the capacity ratio between the performance test environment & the production environment, is essential.
The following test environment options are commonly used:
- Production environment for carrying out performance tests (during off-peak hours)
- Cloud-based, production-like test environment
- Scaled-down test environment
The third option is the most prevalent, though there is a risk in mapping performance test results from a downsized environment to the production environment. The Performance Engineer is responsible for adopting appropriate techniques to extrapolate the performance metrics to the production environment. Several factors have to be considered when mapping test results to the production environment; they are detailed in the article “How to wisely handle Performance Test Environment Challenges”.
Performance Testing Tool is just an enabler. It’s not as intelligent as you.
Many Performance Engineers think that knowing a few performance testing tools is all there is to performance testing. In reality, this mindset leads to devastating performance problems in production in spite of performance test assessments having been carried out. Performance Engineers should study the system architecture, technology stack & other system internals to derive the right performance test strategy to certify the SUT. Reviewing the application architecture & infrastructure architecture can reveal several performance & capacity problems, which can then be validated through performance tests. Performance should be thought about from early in the software development life cycle & not just a few days before the move to production.
There are various mature practices describing what kind of performance engineering activities can be performed throughout the SDLC phases, including the initial (early) sprints when the system is not even available. Real traffic & usage patterns have to be studied to finalize the types of tests to run, including load tests, stress tests, volume tests, spike tests, endurance tests, capacity tests, etc. Not all applications require all types of tests. Based on the application usage pattern & test objectives, the appropriate types of tests need to be included to certify the application for performance, scalability, availability & capacity. See “Simple tips to run realistic performance tests”.
Use of accurate & realistic Workload model for Performance testing
Often the server workload used for carrying out performance tests is not realistic. Post-mortem analysis of production performance failures has revealed this truth. The workload model is neither realistic nor mathematically validated with the help of queuing theory principles before the performance tests are conducted. Analyzing production traffic logs can reveal the real user access patterns & traffic intensity levels.
On one side, there are lots of research papers & scientific studies on how to closely mimic real users with virtual users, simulate realistic user distribution patterns & build a highly accurate workload model. On the other side, performance tests for business-critical applications are executed without even a proper user distribution across the key business use case flows. Several pioneers have published scientific techniques & methodologies through the Computer Measurement Group (CMG), a non-profit organization for Performance & Capacity professionals, & at the International Conference on Performance Engineering (ICPE), which can help in building an accurate workload model.
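One common way to sanity-check a workload model with queuing theory is Little's Law in its interactive form, N = X * (R + Z), where N is the number of concurrent virtual users, X the throughput, R the response time & Z the think time. The sketch below is a minimal illustration; the function names, the 10% tolerance & the numbers are assumptions for the example.

```python
def required_virtual_users(throughput_per_sec, response_time_sec, think_time_sec):
    """Little's Law (interactive form): N = X * (R + Z)."""
    return throughput_per_sec * (response_time_sec + think_time_sec)

def workload_is_consistent(virtual_users, throughput_per_sec,
                           response_time_sec, think_time_sec, tolerance=0.10):
    """Check that the planned user count agrees with Little's Law
    within `tolerance` (0.10 means 10%)."""
    expected = required_virtual_users(
        throughput_per_sec, response_time_sec, think_time_sec)
    return abs(virtual_users - expected) <= tolerance * expected

# Hypothetical target taken from production logs:
# 50 requests/sec, 2 s response time, 8 s think time.
users_needed = required_virtual_users(50.0, 2.0, 8.0)
print(f"Virtual users needed: {users_needed:.0f}")
print("200 users consistent?", workload_is_consistent(200, 50.0, 2.0, 8.0))
```

If a test plan claims the target throughput with far fewer users than the law implies, either the think times in the scripts are unrealistically short or the stated throughput will not actually be reached.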
Ongoing Performance Management & Real User Experience Monitoring
There is a need to measure end-to-end performance on an ongoing basis (after the move to production) in order to be immediately aware of any potential performance bottlenecks across tiers, network, browsers, mobile devices, etc. End user monitoring, which captures end user navigation patterns & experiences, can create an important feedback loop to the design & development teams. Establishing appropriate feedback loops from production monitoring to improve the performance test workload model & performance test strategy will help in conducting accurate performance assessments. With the onset of APM tools, awareness has improved & the time taken to diagnose performance problems in the production environment has reduced from days to hours. Using artificial intelligence/machine learning techniques for performance monitoring, anomaly detection & performance prediction can further strengthen your strategy.
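To make the anomaly detection idea concrete, here is a deliberately simple statistical stand-in (a trailing-window z-score, not a machine learning model). The window size, the threshold & the response time samples are assumptions chosen for the example; a production setup would more likely rely on the detection features of an APM tool.

```python
from statistics import mean, stdev

def find_anomalies(samples, window=5, threshold=3.0):
    """Return indexes of samples that deviate from the mean of the
    preceding `window` samples by more than `threshold` standard deviations."""
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            continue  # flat history: z-score is undefined, skip this point
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies

# Hypothetical response times (ms) sampled from production monitoring.
response_times_ms = [200, 205, 198, 202, 201, 199, 203, 950, 204, 200]
anomalies = find_anomalies(response_times_ms)
print("Anomalous sample indexes:", anomalies)
```

Note that a spike inflates the statistics of the windows that follow it, so this simple approach can miss a second anomaly arriving shortly after the first.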
Connecting the dots – Performance testing results to Application Capacity Planning
Though there is awareness of early performance testing (in early development sprints, by including performance in the ‘definition of done’), when it comes to application capacity planning there is still a prevalent misconception that the cloud service provider will take care of it. Irrespective of whether the application is hosted in an in-house data center or in a private or public cloud, performance benchmark results need to be used for application capacity planning analysis, to assess the server capacity headroom for the projected business load. Adopting simple queuing theory principles can help accelerate performance test analysis, which in turn provides an input for application capacity planning. A simple case study is available in “How to use Queuing Theory to accelerate performance test analysis”.
The application capacity planning activities should provide inputs to infrastructure capacity planning. Since system performance is not always linear, appropriate models & techniques need to be in place to predict & forecast different what-if scenarios & manage capacity effectively. It’s all about the art of connecting the dots to get the right picture & doing proactive capacity planning with effective capacity management strategies.
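A small what-if sketch shows how benchmark results can feed capacity planning & why performance is not linear. It uses the Utilization Law (U = X * D) to derive the service demand D from one measured test point, then a single-queue M/M/1 approximation, R = D / (1 - U), to project response time at higher loads. Treating the server as a single M/M/1 queue is a simplifying assumption, & the measured values are made up for the example.

```python
def service_demand(utilization, throughput_per_sec):
    """Utilization Law rearranged: D = U / X (seconds of resource per request)."""
    return utilization / throughput_per_sec

def mm1_response_time(demand_sec, throughput_per_sec):
    """M/M/1 approximation: R = D / (1 - U), with U = X * D."""
    utilization = throughput_per_sec * demand_sec
    if utilization >= 1.0:
        raise ValueError("offered load saturates the resource")
    return demand_sec / (1.0 - utilization)

# Hypothetical benchmark result: CPU 45% busy at 90 requests/sec.
demand = service_demand(0.45, 90.0)
max_throughput = 1.0 / demand  # upper bound set by this one resource

print(f"Service demand: {demand * 1000:.1f} ms per request")
print(f"Throughput ceiling: {max_throughput:.0f} requests/sec")
for load in (90.0, 150.0, 180.0, 195.0):
    rt_ms = mm1_response_time(demand, load) * 1000
    print(f"{load:5.0f} req/s -> utilization {load * demand:.0%}, response time {rt_ms:.1f} ms")
```

Doubling the load from 90 to 180 requests/sec doubles utilization but increases the projected response time more than five times, which is the kind of non-linearity that makes what-if modelling worthwhile before committing to an infrastructure size.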
Use of Performance Modelling to complement Performance Testing
Many Performance Engineers think Performance modeling is a replacement for performance testing & hence tend to argue about the accuracy of performance models compared to actual performance test results. Some think it’s not for them due to the fear of mathematics. Knowing simple equations that relate various performance metrics can do wonders in performance analysis. Performance Engineers need to understand how simple performance modeling activities can help accelerate performance problem analysis early & quickly. A performance model available at the design or development phase of the SDLC, even with a prediction accuracy of 50-60%, is good enough to provide early performance feedback. Here are some basic Performance Modelling essentials for Performance Engineers.
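As an example of how little mathematics a useful model needs, the sketch below implements exact Mean Value Analysis (MVA) for a single-class closed queuing network, one well-known textbook technique for predicting throughput & response time as the user count grows. The tier names in the comments, the service demands & the think time are illustrative assumptions.

```python
def mva(service_demands, users, think_time_sec=0.0):
    """Exact single-class MVA for a closed network of queuing resources.

    service_demands: seconds of service each resource needs per request.
    Returns (throughput_per_sec, response_time_sec) at `users` concurrent users.
    """
    queue_lengths = [0.0] * len(service_demands)
    throughput = 0.0
    response_time = 0.0
    for n in range(1, users + 1):
        # Residence time at each resource: own service plus waiting behind
        # the requests already queued there with n - 1 users in the system.
        residence = [d * (1.0 + q) for d, q in zip(service_demands, queue_lengths)]
        response_time = sum(residence)
        throughput = n / (think_time_sec + response_time)
        queue_lengths = [throughput * r for r in residence]  # Little's Law
    return throughput, response_time

# Hypothetical demands: 5 ms at the application tier, 3 ms at the database tier.
demands = [0.005, 0.003]
for users in (1, 50, 200, 400):
    x, r = mva(demands, users, think_time_sec=1.0)
    print(f"{users:4d} users -> {x:6.1f} req/s, response time {r * 1000:7.1f} ms")
```

The predicted throughput flattens as it approaches 1 / 0.005 = 200 requests/sec, the ceiling imposed by the busiest resource, which identifies the likely bottleneck before a single load test is run.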
Though the above facts seem to be very important, some of these tend to be missed or considered as low priority, which leads to incomplete or unrealistic performance measurement.
The key finding from the “State of Performance Engineering 2015-2016” user survey conducted by HP is: “There is a clear trend towards much broader application of the term ‘Performance Engineering’ & its everyone’s job”.
Andreas Grabner talks about the changes that are happening & the key traits a Performance Engineer should have in “Trades of a Performance Engineer in 2020!”.
No doubt, the roles & responsibilities of Performance Engineers are expanding & we need to be geared up to face the new challenges by adopting smart, innovative & continuous performance measurement techniques that assure accurate performance assessment.