Load Test
Friday, November 8, 2019
19:52
Recommended additional configuration
When using a multi-core agent workstation, configure the agent to use Server Garbage Collection:
- Open the following file in Notepad: <Program Files>\Microsoft Visual Studio 9.0 Team Test Load Agent\LoadTest\QTAgent.exe.config
- In the XML document, locate the /configuration/runtime tag.
- Create a new child tag: <gcServer enabled="true"/>
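For reference, a minimal sketch of what the relevant fragment of QTAgent.exe.config looks like after the change (the surrounding elements already exist in the file; only the gcServer element is added):

```xml
<!-- QTAgent.exe.config (fragment) -->
<configuration>
  <runtime>
    <!-- Use server garbage collection on multi-core agent machines -->
    <gcServer enabled="true"/>
  </runtime>
</configuration>
```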
A Guide to SharePoint Performance Testing (Part I) -- What is "Acceptable" Anyway?
Recently I was involved in a couple of engagements to performance test SharePoint for a couple of companies. In both cases the basic question was, "Will the performance be acceptable with our current configuration?". The question for us was really how do we answer that question to our satisfaction, and to the satisfaction of the client?
To answer the client's question, we had to be able to answer the following questions:
- What was "acceptable" performance?
- What are the normal usage profiles?
- What is "normal" user load?
- How will we test the usage profiles?
- How do we know when the farm is "overloaded"?
Acceptable Is in the Eye of the Beholder
Oddly, the answer to the first question is not as easy as it would seem to be. The simple answer is that you pick a page load time and then set that as the SLA (say 3 seconds). But is that for every page, or for a specific page? For one client the "meat" of the implementation was a set of standard searches for documents, and so the SLA was based upon how quickly a user could get a response to a set of standard searches. For the other client, the measure was a bit more nebulous. They wanted the system to be responsive, but had no hard standard as to what that meant. So, in their case we chose a couple of basic transactions (logging in, viewing some common pages) and tracked the response time for those testing steps.
Another measure of "acceptable" performance is how stressed the servers will be in handling the load. So, not only do we measure the page load time for various transactions, but we will also be looking at a series of counters on the Web Front End Servers, and the SQL database to ensure that we are not over utilizing those servers (more on that in a later entry).
Establishing an Acceptable User Model
The next challenge in both cases was to establish a basic mix of user roles, and determine what they do on a regular basis. For one client, the average user would be spending most of their time searching for documents, but they would also spend a small portion of their time browsing the system looking for other data. In that case the usage profile concentrated on the four main types of searches (ID, Name, Multi Field, Sorted Results). For the other client our usage profiles were more about browsing the system and using it to look up and search for other employees. Those usage profiles were focused on that activity (Lookup Employee, Browse org chart, View employee schedule, link to external application).
Once we had established the types of usage profiles, we had to determine the relative weight of each profile as it was being used. This basically meant that we took each profile and assigned a percentage value for how often it would be run by the test rigs (a short sketch after the list shows how a rig might apply these weights). For example it might look like:
- ID Lookup 40%
- Name Lookup 30%
- Multi Field Lookup 15%
- Sorted Results 10%
- Browse Content 5%
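The load tools handle this mix for you (Visual Studio calls it a test mix), but as a rough, tool-agnostic illustration of what the rig does on each iteration, here is a small Python sketch that picks a profile according to those percentages; the profile names are from the example above and everything else is illustrative:

```python
import random
from collections import Counter

# Usage profiles and their relative weights (percentages, summing to 100)
PROFILE_MIX = {
    "ID Lookup": 40,
    "Name Lookup": 30,
    "Multi Field Lookup": 15,
    "Sorted Results": 10,
    "Browse Content": 5,
}

def pick_profile() -> str:
    """Choose the usage profile for the next vUser iteration according to the weights."""
    profiles = list(PROFILE_MIX)
    weights = list(PROFILE_MIX.values())
    return random.choices(profiles, weights=weights, k=1)[0]

if __name__ == "__main__":
    # Sanity-check the mix over 10,000 simulated iterations
    counts = Counter(pick_profile() for _ in range(10_000))
    for name, count in counts.most_common():
        print(f"{name}: {count / 100:.1f}%")
```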
Establishing a Baseline
Once this was programmed into the test rigs, we were able to fire up some tests and see what our performance looked like. So in both cases a set of tests were run and we took a look at the results. In this initial phase of testing the primary goal was to establish a performance baseline with essentially no load on the system, so that we could measure how the servers behave when no one is using them. This usually meant programming a test run with only 1 vUser per test rig instance.
Make sure that you save the results of these baselines, and also save the user profiles and their usage model, because these will be the starting point for our performance tests after we troubleshoot.
Troubleshoot and Tweak
Now that we have run our initial tests, we might not be completely pleased with the results. We might see that even with no load on the system certain pages or transactions are not performing to our SLA. This is a strong indicator that we need to Put Our Landing Page on a Diet. While this type of testing is technically performance testing, these tests and their results are really more like traditional debugging and we won't be including these results in our final report (we will talk about that in a later post as well). This is still vital and critical testing and we should approach it with the same amount of rigor that we do all testing (we do approach all testing with rigor, don't we?). Once we have improved and optimized our system performance, we may need to re-baseline the tests. This may be because our original tests are no longer valid due to changes in the user profile mix, or because our initial tests actually showed poor performance that we had to correct. No matter the reason, we need to have a solid baseline for our next set of tests.
We will look at how to estimate user load in Part II
References
- Step-by-Step: Configure Performance Testing using Visual Studio 2008 (step-by-step-configure-performance-testing-using-visual-studio-2008.aspx)
- How SharePoint is like a Chinese Buffet
SharePoint Performance Counters
Performance metric | Warning | Metric Description |
---|---|---|
SharePoint Publishing Cache: Number of Cache Compactions | >2 | Indicates that the SharePoint cache may be too small. |
SharePoint Publishing Cache: Object Cache Flushes/Sec | >0 | Cache flushing during peak-use hours slows down performance. |
SharePoint Publishing Cache: Object Cache Hit Ratio | <1 | A small value indicates searches for non-published items. |
SharePoint Publishing Cache: BLOB Cache % Full | >80% | Indicates that the SharePoint cache may be too small. |
ASP.NET Applications: Cache API Trims | >1 | Indicates insufficient memory allocated to the ASP.NET output cache. |
ASP.NET: Requests Queued | >400 | Indicates the need for more web servers, since a high number of queued requests leads to slow page loads. |
ASP.NET: Requests Rejected | >2 | Indicates the need for more web servers because the existing servers are too busy. |
ASP.NET: Worker Process Restarts | >1 | Any restarted worker process can indicate a potential problem. |
Memory: Pages/Sec | >20 | A high rate of pages read from or written to disk indicates a shortage of RAM. Note: tends to jump over 200 often. |
Microsoft SQL Server Performance Counters
Performance metric | Warning | Metric Description |
---|---|---|
SQL Server General Statistics: User Connections | N/A | Number of users currently connected to the SQL Server. The warning value is server dependent. |
SQL Server Buffer Manager: Page Life Expectancy | <400 | Time a page stays in the buffer cache; a low number indicates a possible RAM shortage. |
SQL Server Buffer Manager: Lazy Writes/Sec | >15 | Number of pages flushed from the buffer per second by the lazy writer. Zero is ideal. |
SQL Server Buffer Manager: Buffer Cache Hit Ratio | <90% | Percentage of data successfully served from cached pages. A low value indicates a RAM shortage. |
SQL Server SQL Statistics: Batch Requests/Sec | >1000 | Indicator of the CPU load placed on SQL Server by the volume of queries. |
SQL Server SQL Statistics: SQL Compilations/Sec | >100 | A large number of compilations per second indicates high resource usage. |
SQL Server Access Methods: Page Splits/Sec | >200 | A large number of page splits indicates high resource usage. |
SQL Server Access Methods: Full Scans/Sec | >100 | The full scans value can be ignored unless it peaks proportionally to CPU. |
SQL Server Locks: Lock Waits/Sec | >1 | Number of lock requests per second that could not be satisfied immediately and forced the caller to wait. |
SQL Server Locks: Number of Deadlocks/Sec | >1 | Number of deadlocks on the SQL Server per second. |
SQL Server Cache Manager: Cache Hit Ratio | <85% | Ratio between cache hits and lookups for plans. |
SQL Server Databases: Transactions/Sec | N/A | Number of transactions per second on the SQL Server; the warning value is server dependent. |
Learn more about SharePoint performance monitoring and metrics on Microsoft TechNet.
From <https://www.syskit.com/blog/sharepoint-performance-monitoring/>
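To make these warning values actionable during a test run, a small script can flag counter samples that cross a threshold. The sketch below is illustrative only: it uses a handful of the counters from the tables above, and the sample data and function names are made up.

```python
# Warning thresholds taken from the tables above. The operator records which
# direction counts as a warning (">" = value too high, "<" = value too low).
THRESHOLDS = {
    "ASP.NET: Requests Queued": (">", 400),
    "ASP.NET: Worker Process Restarts": (">", 1),
    "Memory: Pages/Sec": (">", 20),
    "SQL Server Buffer Manager: Page Life Expectancy": ("<", 400),
    "SQL Server Buffer Manager: Buffer Cache Hit Ratio": ("<", 90.0),
    "SQL Server SQL Statistics: SQL Compilations/Sec": (">", 100),
}

def warnings_for(sample: dict) -> list:
    """Return a message for each counter value in the sample that crosses its warning threshold."""
    messages = []
    for counter, value in sample.items():
        if counter not in THRESHOLDS:
            continue
        direction, limit = THRESHOLDS[counter]
        breached = value > limit if direction == ">" else value < limit
        if breached:
            messages.append(f"{counter} = {value} (warning threshold {direction}{limit})")
    return messages

if __name__ == "__main__":
    # Illustrative counter sample captured mid-run
    sample = {
        "ASP.NET: Requests Queued": 520,
        "SQL Server Buffer Manager: Page Life Expectancy": 250,
        "Memory: Pages/Sec": 12,
    }
    for message in warnings_for(sample):
        print(message)
```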
A Guide to SharePoint Performance Testing (Part II) -- How many users can we really handle?
This is the second in a series of entries about Performance Testing in SharePoint. You can read Part I here.
To recap, we want to test the performance of a SharePoint intranet, so we created a set of user profiles and determined how often each profile will be executed. Then we ran a set of tests using those profiles and made some performance improvements on the system. However, we have yet to answer the question of whether our system is powerful enough to actually handle our user load. Worse yet, some of our test results might have been reviewed by the business users, who saw that the page load time with, say, 300 users was over 30 seconds, and panic has set in because they think that the system won't handle more than 300 users and we have 10,000!
Avoiding the First Pitfall of Testing -- Concurrent versus Total Users
We have been trapped by the first pitfall that performance tests usually encounter: the pitfall of confusing virtual user load (or vUsers) with the number of users who can access the system. If we run a test with 100 vUsers, that does not equate to 100 users on the system. Since the test rigs are going to be configured to make as many hits as possible on the system, it more closely resembles 100 concurrent users hitting the system.
So, if we have 100 concurrent users hitting the system, how many users does that map to? That is a darn good question. To answer that, we need to do one of two things. We could figure out our current user base, and then determine how often they click into the site being tested in a given period of time (say an hour). That would give us the Requests Per Second (RPS) and a target for our performance testing.
So, let's say that we have 10,000 users that are going to hit a system. If we look at their current usage and determine that between 8 AM and 9 AM (the heaviest usage period) there are about 75,000 hits to the system, we can work backwards to an RPS of about 20.8. Then we want to ensure that our system will handle approximately 21+ RPS at an "acceptable" page load time.
But what do we do when we don't have empirical data that can be used to measure hits to a system? The simple answer is that we make an estimate of average user activity and active user load on a system and use that to back into an RPS number. In this case, take the total number of users (10,000) and estimate, based on their roles, how many of them will be actively using SharePoint at any given time. In the first client example, where SharePoint is the primary application that they will be using to get their work done, that percentage will be fairly high (say 50%), but for most implementations it will be much lower (say 10%).
Company | Total users | Active users | Active user count |
---|---|---|---|
Company 1 | 10,000 total users | 50% active users | 5,000 = 10,000 * .50 |
Company 2 | 10,000 total users | 10% active users | 1,000 = 10,000 * .10 |
That means that at any given moment for our two companies, the first will have 5,000 active users, and the second will have 1,000 active users.
Once we have an estimate of active users, we next need to determine how often a user actually clicks on a link and expects a result.
For Company 1, we can look at usage patterns and determine that the average time spent looking at a document that is retrieved before the next search is about 2 minutes. This means that of the 5,000 active users, each one is clicking and searching every 120 seconds. That translates into 5,000 / 120 seconds = 41.67 requests per second.
For Company 2 our usage scenario is a bit different. They are really browsing for information and so while fewer requests are made they are made a bit faster. We estimate that users will spend 60 seconds per request so our 1,000 users / 60 seconds = 16.67 requests per second.
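Both routes to an RPS target come down to simple arithmetic. Here is a small sketch of the two calculations using the numbers from the examples above; the function names are just for illustration:

```python
def rps_from_hits(hits_in_peak_hour: float) -> float:
    """Empirical route: observed hits in the busiest hour divided by 3,600 seconds."""
    return hits_in_peak_hour / 3600

def rps_from_estimate(total_users: int, active_fraction: float, seconds_per_request: float) -> float:
    """Estimate route: active users divided by the average think time between requests."""
    active_users = total_users * active_fraction
    return active_users / seconds_per_request

# 10,000 users generating 75,000 hits in the 8-9 AM peak hour
print(f"Empirical: {rps_from_hits(75_000):.1f} RPS")                  # ~20.8

# Company 1: 50% active, one request roughly every 120 seconds
print(f"Company 1: {rps_from_estimate(10_000, 0.50, 120):.2f} RPS")   # ~41.67

# Company 2: 10% active, one request roughly every 60 seconds
print(f"Company 2: {rps_from_estimate(10_000, 0.10, 60):.2f} RPS")    # ~16.67
```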
Finally we have a goal that is definite and measurable for our tests. We know what our RPS needs to be, and that will enable us to perform the next set of tests, which will attempt to determine just how much traffic our servers can handle. It also lets us set a goal for the number of concurrent users, or vUsers, that we want to use for our testing. For Company 1 we want to have about 40-50 vUsers and ensure that the system can maintain 40-50 RPS inside of our page load SLA. For Company 2, that drops to 15-20 vUsers and 15-20 RPS inside of our page load SLA.
Next, Part III, comparing Apples to Apples
References
Microsoft Performance and Capacity Estimation Guide -- http://technet.microsoft.com/en-us/library/cc261795.aspx
From <https://www.catapultsystems.com/blogs/a-guide-to-sharepoint-performance-testing-part-ii/>
A Guide to SharePoint Performance Testing (Part III) -- Apples to Apples
This is part three in a continuing series, you should read Part I and Part II first.
So far, in our quest to performance test SharePoint we have established a number of important criteria. First off, we know how users will actually be using our system and from that we have been able to create some test scripts and usage profiles. We then determined about how much stress our farm will have to be able to endure to satisfy our user load requirements. Now for the best part...we get to test the system and see if it meets our needs.
Test It 'Till It Breaks?
For many projects this means that someone will fire up a test harness, take the user scripts, pick some level of load that is higher than the expected requirement, see if the page load times meet the SLA, and declare that the farm is adequate. While this process does meet the basic requirements for our testing, it fails to answer another very important question: how much more load can the farm take? To answer that question we have to keep testing the farm with higher user loads and see how that affects performance. One of the Myths of Performance Testing (from the Performance Testing Zone) is that performance testing is about stressing the system until it breaks. As the author points out:
The principle objective of performance testing is to get an insight as to how an application would behave when it would go live.
So, our primary goal should be to test the system at its expected user load and see if it meets our SLA. Then, as another aspect of that testing, we will test it with increasing user loads to see when, and more importantly how, the application breaks down. In doing so, we often fall into another pitfall of performance testing.
Apples to Apples
When performance testing is ongoing, there are multiple competing goals. One of those goals is to ensure that the farm is sized properly, and another is to identify trouble areas and improve them (optimization). I talked about this process in How SharePoint is like a Chinese Buffet. These goals have competing testing methodologies. As we test for optimization, we will tweak settings and configurations and run tests with varying loads to see how our improvements work. This will not work for our stress tests. For that we need to ensure that we compare Apples to Apples. This means that we run the exact same tests without changing the configuration of the farm. Theoretically this testing is done after all optimization is completed, but experience tells me that it often happens concurrently so beware.
My strategy for this type of testing is to first run a baseline test that has minimal load on the system (say 1 vUser per test rig or load generator). This test becomes our baseline against which we will compare future tests, and it tells us immediately if there is a problem with our farm (there should not be, since we have already optimized the farm). We should see very fast page load times and very low CPU utilization. After the first test with very low user load, I run another test with about half the user load that I expect to see on the system, then another test with the expected user load, and then I increase by 50% until I run a test with at least double the expected user load. So, looking at Company 1 and Company 2 from our prior examples, Company 1 needs about 41 RPS and Company 2 needs about 15. Their user load test plans would look something like this, with 4 PCs being used as test rigs (a small sketch after the table shows how these steps fall out of the target vUser count):
Company | Baseline | 50% | 100% | 150% | 200% |
---|---|---|---|---|---|
Company 1 | 4 users | 21 users | 42 users | 63 users | 84 users |
Company 2 | 4 users | 8 users | 16 users | 24 users | 32 users |
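The stepped plan above can be derived mechanically from the target vUser count and the number of test rigs. Here is a minimal sketch of that calculation, assuming 4 rigs and a simple 50/100/150/200% progression (real plans are usually rounded or tweaked by hand):

```python
def load_plan(target_vusers: int, rigs: int = 4, steps=(0.5, 1.0, 1.5, 2.0)) -> dict:
    """Build a stepped load plan: a 1-vUser-per-rig baseline, then 50% to 200% of the target."""
    plan = {"Baseline": rigs}  # one vUser per test rig
    for factor in steps:
        plan[f"{int(factor * 100)}%"] = round(target_vusers * factor)
    return plan

# Company 1 targets ~42 concurrent vUsers, Company 2 ~16
print(load_plan(42))  # {'Baseline': 4, '50%': 21, '100%': 42, '150%': 63, '200%': 84}
print(load_plan(16))  # {'Baseline': 4, '50%': 8, '100%': 16, '150%': 24, '200%': 32}
```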
Obviously this is more work than just running a test at 42 or 16 users, seeing if the page load time was within our SLA, and saying that we are done, but the benefits, as we shall see, are quite meaningful.
Data Overload
As we run our tests with the test package (in our case Visual Studio Team Foundation, or LoadRunner) we are the proud recipients of a ton of information about the status of our servers, as you can see from this LoadRunner screenshot.
[Screenshot: LoadRunner results view showing the full set of server performance counters]
If we are trying to spot a specific performance issue, knowing Disk Queue Lengths, and Processor Queue Length can be very helpful, but when we are trying to show that our farm is sized correctly, all that data is too much and we need to pick out just the important bits. Assuming that we have optimized for performance, and everything is running well, then, when it all boils down, we really want to know the following things:
- Requests Per Second
- Page Load Time
- CPU Utilization
RPS, as we saw in Part II, is really a measure of whether the system can handle a concurrent user load without starting to queue up transactions. It should closely mirror or exceed our vUser load until we exceed the ability of the farm to serve transactions.
Page Load Time as we talked about in Part I is really the foundation of our SLA. We will likely see that as RPS gets queued up and backlogged that the Page Load Time will increase dramatically. We want to run our baseline test to see what optimal Page Load Time is so that we know what the best case scenario will be for our users. This is useful in comparisons to our later tests with increased load.
CPU utilization (and Memory utilization) will show us when our servers are starting to show strain under the user load. As we increase the load, we will see those values climb and this will actually be the first indicator that our system is approaching its limit.
What About Load Balanced Servers?
A number of SharePoint farms will use load-balanced front ends to help distribute load across the servers and improve user performance. So, should the performance tests be run against the full farm or against a failover state?
The answer (at least in my opinion) is that it should be both. Why? We want to test against the full farm because we want to see what performance will be like in the normal state of use. We also want to run our tests against the farm in a failed-over state (number of WFEs minus 1), because that tells us what performance will be like if the farm ever loses a component server. When we provide this information (farm performance in a failed-over state) to the business units, they are able to decide whether or not to overbuild the farm if they are not satisfied with the performance when running N-1. They may choose to accept that lower performance, but sometimes it is a business requirement that performance not degrade even when a server is down, and then this test is crucial.
What Next?
So, we have run our battery of tests. We have run against the full server load and the N-1 state. All that is left is to put it all together, which we do in Part IV of the series.
References
Myths of Performance Testing (Performance Testing Zone) -- http://performancetestingzone.wordpress.com/2009/08/22/performance-testing-myths/
A Guide to SharePoint Performance Testing (Part IV) -- Putting it all Together
This is part four in a continuing series; you should read Part I, Part II, and Part III first.
To recap our SharePoint testing adventure, we have determined what our users are going to do, we have established load criteria, and we have tested our farm, changing only the number of users hitting the system to see when it breaks. The end result is a ton of test data that tells us a lot of information. What it doesn't do well is show us our results in a clear and defined manner. Sure, we could take our raw data to the business users and say something to the effect of "Look here, as we added the 85th vUser, the % of Committed Bytes In Use went from 24 to 30. Just look at this table of raw data that is 100 pages long". Yeah, that is going to work.
What we need is to pick out the relevant information that anyone can look at and see when the farm starts to break down and we need to add more hardware. In part III we boiled it down to three main measures:
- Requests Per Second
- Page Load Time (for a given transaction or a single page, usually the Landing Page)
- CPU Utilization
Now, what we need to do is extract those measures from our raw data and put them into an easy-to-consume tabular format.
[Table: one row per test run -- # of vUsers, Hits Per Second, Page Load Time for the Landing Page, CPU Utilization on the WFEs]
Here I have set up a row for each test run and tracked the relevant data for each run: # of vUsers, Hits Per Second, Page Load Time for the Landing Page, CPU Utilization on the WFEs, etc. Just looking at this data for these runs, it is fairly clear that Hits Per Second tracks closely to the number of vUsers until the farm hits about 150 vUsers (which, using the mapping from Part II, corresponds to an expected user load of about 150,000 users). The HPS curve gets even flatter when we push to 200,000 users. We can also see that the load time for default.aspx rises from a 0.58 second average up to a 3.65 second average.
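As a rough, tool-agnostic illustration of that boiling-down step, here is a sketch that collapses a raw per-sample counter export into one summary row per test run. The CSV file name and column names are made up for the example and would need to match whatever your load tool actually exports:

```python
import csv
from collections import defaultdict
from statistics import mean

def summarize_runs(csv_path: str) -> list:
    """Collapse per-sample counter rows into one summary row per test run."""
    samples = defaultdict(list)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            samples[row["run"]].append(row)

    summary = []
    for run, rows in samples.items():
        summary.append({
            "run": run,
            "vusers": max(int(r["vusers"]) for r in rows),
            "avg_rps": round(mean(float(r["requests_per_sec"]) for r in rows), 1),
            "avg_page_load_s": round(mean(float(r["page_load_s"]) for r in rows), 2),
            "avg_wfe_cpu_pct": round(mean(float(r["wfe_cpu_pct"]) for r in rows), 1),
        })
    return summary

if __name__ == "__main__":
    # Hypothetical export with columns: run, vusers, requests_per_sec, page_load_s, wfe_cpu_pct
    for row in summarize_runs("loadtest_samples.csv"):
        print(row)
```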
All of that data is good, but even this is not great for a presentation to management on farm performance. So we turn it into a graph. Here is an example of this graph for Hits Per Second. It shows the farm being run with 2 WFEs and with 1 WFE. The 2-WFE runs are there to simulate a 3-WFE farm (remember that pesky N-1 standard from Part III). This graph gives us a quick and easy way to show management what performance of the farm looks like as we ramp up users.
[Graph: Hits Per Second as user load ramps up, for 1 WFE and 2 WFEs]
From this graph it is easy to see that with 1 WFE (a 2-WFE farm) we cap out in these tests at about 50,000 users (50 HPS). 2 WFEs (a 3-WFE farm) is fine at 100,000 users and starts to tail off at 150,000 users.
Here is a similar graph that shows CPU utilization for 1WFE and 2WFEs
[Graph: CPU utilization as user load ramps up, for 1 WFE and 2 WFEs]
Here we can easily see that the 1WFE farm is starting to show CPU strain at 50,000 users and that the 2WFE farm shows the same level at 100,000 users. This graph makes it easy to see that each WFE (in this usage scenario) will give us about 50 Hits Per Second or 50,000 users.
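If the roughly 50 Hits Per Second (or 50,000 users) per WFE seen in these runs holds for your workload, farm sizing becomes a back-of-the-envelope calculation. The sketch below assumes that per-WFE figure and adds one extra server so the N-1 state still meets the target; both are assumptions to validate with your own tests, not rules:

```python
import math

def wfes_needed(target_rps: float, rps_per_wfe: float = 50.0, survive_one_failure: bool = True) -> int:
    """Estimate the number of Web Front Ends needed to sustain a target RPS."""
    wfes = math.ceil(target_rps / rps_per_wfe)
    if survive_one_failure:
        wfes += 1  # keep full capacity with one WFE down (the N-1 scenario)
    return wfes

print(wfes_needed(41))   # Company 1 from Part II: 1 WFE for the load, +1 for failover = 2
print(wfes_needed(150))  # a ~150 RPS farm: 3 WFEs for the load, +1 for failover = 4
```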
Other Considerations
Try to remember that the goal of our testing was to show how well the farm performs. This meant that we first established what good performance meant, then we established metrics and criteria, then we ensured that our testing was consistent, and lastly we put it into an easy to use graph.
It's also important to remember that these numbers were for one specific farm and one specific testing scenario. If this farm changes in topology, and more importantly in usage or implementation, then the performance tests will likely need to be run again.
It's interesting that in other tests that I saw from Microsoft, the load was about 50 RPS per WFE in a collaborative environment and 100 RPS per WFE in a publishing environment. I would highly recommend extensive testing in this manner in your own environment.