LTP – Real World Example: Calculating Work Load From Historical Data
Recently, a question on an internal alias spawned a very useful dialog that I want to share. It shows how I approach calculating the workload and load pattern when I have historical data and also need to project future growth. Below is the relevant part of the email thread:
teammate: Do we have any real examples that you can share of the max concurrent users that you tested for the various external-facing applications, along with the user base for each application (e.g., you tested for 500 concurrent users to support a website to be used by 1.5 million customers)?
me: This sounds like a VERY slippery slope. What type of app (Xbox Live login, eCommerce, medical records app, social media, etc.)? I would suspect that every one of these would have vastly different profiles even if the data within each group was consistent, which I do not think it is. Also, why do you want to get this number? Is it for a specific customer or for general use?
teammate: The application is a government website used by all the citizens of the country for managing their retirement funds and the insurance policies offered by the government. I also learned that they currently get 50 million page views per year. I know the breakdown of the page views by month. We are expecting 10% YoY growth for 5 years and need to consider that too. I am trying to arrive at the concurrent user load that we should test for. We don’t know how long each user spends on the site.
me: I am not sure it matters how long each user spends on the site, as long as the ratio of work that each area receives is proper. For example (I am making up numbers, but you can substitute values from the web logs as needed):
- 5% – Login
- 40% – list current values
- 10% – change values
- 25% – estimate future worth
- 20% – all other functionality
Then I could break down the tests to have a distribution like that and ramp up and down the user load accordingly.
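As a rough illustration of that breakdown, here is a minimal Python sketch that picks tests at random in proportion to the made-up percentages above. The names `TEST_MIX` and `pick_tests` are hypothetical and exist only for this example; a load-test tool would normally handle the weighting itself.

```python
import random
from collections import Counter

# Made-up percentages from the thread; substitute values from the web logs.
TEST_MIX = {
    "login": 5,
    "list current values": 40,
    "change values": 10,
    "estimate future worth": 25,
    "all other functionality": 20,
}


def pick_tests(count, seed=None):
    """Pick `count` test names at random, weighted by the mix percentages."""
    rng = random.Random(seed)
    names = list(TEST_MIX)
    weights = list(TEST_MIX.values())
    return rng.choices(names, weights=weights, k=count)


# Draw a large sample and compare the observed share with the target share.
sample = Counter(pick_tests(10_000, seed=1))
for name, pct in TEST_MIX.items():
    print(f"{name:25s} target {pct:3d}%  got {sample[name] / 100:5.1f}%")
```

Over a large enough sample, the observed shares should land close to the target percentages, which is all the mix needs to guarantee.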
teammate: I will get this user mix. Now to arrive at max concurrent users, can I do something like:
1. Jan has 4.5 million page views, which is the highest for any month
2. I extrapolate, say, 10% YoY growth for 5 years, and let's say the page views would become 7 million
3. I can assume page views per day = 7 million / 30 days
4. Page views per hour = 7 million / 30 days / 12 hours (assuming people will use this site only for 12 hours in a day)
5. Page views per min = #4 / 60 min
6. Page views per sec = #5 / 60 sec
Can the outcome of #6 be considered the concurrent user load, and then be spread per the user mix, where different users are doing different operations?
me: Here’s how I would approach the layout of the test. (NOTE: I switched from pages to tests, but for argument's sake, assume that each test makes only a single page call to the page type in question.)
First, figure out your rate (as you did above), except stop at the hourly rate (no need for minute or second):
| Metric | Value | Excel formula used |
|---|---|---|
| 10% YoY – 5 years | 7,247,295 | =FV(10%,5,0,-4500000,0) |
| pages/day | 241,577 | =7247295 / 30 |
| pages/hour (12 hour day) | 20,131 | =241577 / 12 |
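The same arithmetic can be reproduced outside Excel. The following is a minimal Python sketch, assuming plain compound growth (which is what `FV` computes when the payment argument is 0), a 30-day month, and a 12-hour usage day. The helper names `future_value` and `whole` are made up for this example, and `Decimal` is used so that the half-way value for pages/day rounds up the same way the table does.

```python
from decimal import Decimal, ROUND_HALF_UP


def future_value(present, rate, years):
    """Compound growth; matches Excel's FV(rate, years, 0, -present, 0)."""
    return present * (1 + rate) ** years


def whole(value):
    """Round half up to a whole number, as the displayed table values are."""
    return int(value.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


peak_month_views = Decimal("4500000")  # January, the busiest month
projected = future_value(peak_month_views, Decimal("0.10"), 5)
pages_per_day = projected / 30         # assumes a 30-day month
pages_per_hour = pages_per_day / 12    # assumes a 12-hour usage day

print(whole(projected), whole(pages_per_day), whole(pages_per_hour))
# prints: 7247295 241577 20131
```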
Second, use the percentage breakdown from earlier in the thread to build another table showing the hourly rate for each page type:
| Type | Qty per hour |
|---|---|
| 5% – Login | 1,007 |
| 40% – list current values | 8,053 |
| 10% – change values | 2,013 |
| 25% – estimate future worth | 5,033 |
| 20% – all other functionality | 4,026 |
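The per-type quantities are just the hourly rate multiplied by each percentage. Here is a minimal Python sketch, assuming the unrounded hourly rate (7,247,295 / 30 / 12 = 20,131.375) and round-half-up for display; the variable names are illustrative only.

```python
from decimal import Decimal, ROUND_HALF_UP

# Unrounded projected hourly rate: 7,247,295 pages / 30 days / 12 hours.
pages_per_hour = Decimal("7247295") / 30 / 12

# Made-up percentages from the thread; substitute values from the web logs.
mix_percent = {
    "Login": 5,
    "list current values": 40,
    "change values": 10,
    "estimate future worth": 25,
    "all other functionality": 20,
}

qty_per_hour = {
    name: int((pages_per_hour * pct / 100).quantize(Decimal("1"), rounding=ROUND_HALF_UP))
    for name, pct in mix_percent.items()
}

for name, qty in qty_per_hour.items():
    print(f"{mix_percent[name]:3d}%  {name:25s} {qty:6,d}")
```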
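Third, use the login count as an indication of the number of unique visitors during the hour (about 1,007 here). It is not precise, but it is as good a starting point as I can think of. Dividing each page type's hourly quantity by that user count gives the number of tests each user runs per hour (for example, 8,053 / 1,007 ≈ 8). Once you do that, you can structure a test as follows: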
- Test Mix Model: Based on User Pace
- Initialization Test: Login
- % New Users: 0
- Test Mix:
| Test | Tests per user per hour |
|---|---|
| list current values | 8 |
| estimate future worth | 5 |
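The pace values in the mix appear to come from dividing each hourly quantity by the login count. The sketch below makes that assumption explicit: the figures for "change values" and "all other functionality" are computed the same way and are not quoted from the thread, and the variable names are illustrative.

```python
# Hourly quantities from the rate table; Login is the initialization test,
# so it is used only as the estimate of unique users in the hour.
qty_per_hour = {
    "list current values": 8053,
    "change values": 2013,
    "estimate future worth": 5033,
    "all other functionality": 4026,
}
logins_per_hour = 1007  # assumed to approximate the user count

# Tests each virtual user must run per hour to reproduce the hourly rate.
tests_per_user_per_hour = {
    name: round(qty / logins_per_hour) for name, qty in qty_per_hour.items()
}

for name, pace in tests_per_user_per_hour.items():
    print(f"{name:25s} {pace}")
```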
teammate: I thought about it and I get what you are saying. The point you are making is that if 1,000 concurrent users run these tests over 1 hour, then we will end up hitting 20K pages. Since these tests will run really fast (say every test is done in 10 sec), won't I end up having a lot of wait / idle time? In other words, could I have achieved the same thing with, say, 100 concurrent users by increasing the number of tests each user runs per hour by a factor of 10?
So we can say that the number of concurrent users alone doesn’t matter. What matters is “are we able to generate the expected page views per hour?”, either by having more concurrent users or by running more tests per hour with fewer concurrent users.
me: EXACTLY. It took me a couple of years to grasp this concept because I kept getting pushback from customers who insisted that the user count was what mattered. There are times when the number of concurrent users is important (if you are trying to max out a thread pool or a connection pool, etc.), but the important part is to mimic the perceived LOAD that the server receives, which is USERS * WORK PER USER ITERATION * # ITERATIONS/USER/HOUR.
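To make the trade-off concrete, here is a minimal Python sketch of that formula using the rounded figures from the exchange above: 1,000 users versus 100 users, one page per test, and the teammate's guess of 10 seconds per test. The pace of 20 tests per user per hour is an illustrative rounding, and the function names are hypothetical.

```python
def pages_per_hour(users, pages_per_test, tests_per_user_per_hour):
    """Load seen by the server: users * work per iteration * iterations/user/hour."""
    return users * pages_per_test * tests_per_user_per_hour


def busy_fraction(tests_per_user_per_hour, seconds_per_test):
    """Share of each hour a virtual user spends running tests rather than idling."""
    return tests_per_user_per_hour * seconds_per_test / 3600


# Two hypothetical configurations that should produce the same server load.
many_users = pages_per_hour(users=1000, pages_per_test=1, tests_per_user_per_hour=20)
few_users = pages_per_hour(users=100, pages_per_test=1, tests_per_user_per_hour=200)

print(many_users, few_users)  # both 20000 pages per hour
print(f"{busy_fraction(20, 10):.1%} busy vs {busy_fraction(200, 10):.1%} busy")
```

Both configurations deliver the same pages per hour; the smaller user count simply spends far less of each hour idle. The user count itself only becomes the deciding factor when a per-user resource such as a thread pool or connection pool is what is being tested.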