Christian Bilien’s Oracle performance and tuning blog

January 28, 2008

An upper bound on transaction throughput

Filed under: Models and Methods — christianbilien @ 9:46 pm

The fundamental laws of capacity planning are seldom used to identify benchmark flaws, even though some of these laws are almost trivial. Worse, some performance assessments comment on each performance figure individually without realizing that physical laws bind them together.

Perhaps the simplest of them all is the Utilization law, which states that the utilization of a resource is equal to the product of its throughput and its average service time. The utilization is the fraction of time the resource is busy serving requests. CPU utilizations are given by sar -u, and the individual disk utilizations in a storage array by the storage vendor’s proprietary tools; sar -d or iostat can be used to collect data for internal disks.
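
In symbols, and anticipating the notation used further down, the law for a single resource reads as follows, where U is the utilization expressed as a fraction of time busy, X the throughput in requests per second, and S the average service time in seconds (these symbols are simply a restatement of the definition above):

U = X \cdot S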

Take a disk serving a fairly steady load, which I picked up on an HP-UX test system with no storage array attached. sar -d gave the following data:

Time     Utilization (%)   Service time (ms)   Reads/s   Writes/s
15:07    70.8              10                  4         67
15:12    67.3              12.3                30.6      24.1
15:17    67.8              12                  33.7      22.7

The Utilization law can be verified for the three samples I picked up:

Utilization = (Reads/s + Writes/s) × Service time
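
As a quick check, here is a minimal sketch (not part of the original post) that recomputes the predicted utilization from the sar -d samples above, using only the service times and I/O rates quoted in the table:

    # Check the Utilization law U = X * S against the sar -d samples above.
    samples = [
        # (time, measured utilization %, service time ms, reads/s, writes/s)
        ("15:07", 70.8, 10.0,  4.0, 67.0),
        ("15:12", 67.3, 12.3, 30.6, 24.1),
        ("15:17", 67.8, 12.0, 33.7, 22.7),
    ]
    for t, util_pct, svc_ms, reads, writes in samples:
        throughput = reads + writes                 # X, I/Os per second
        predicted = throughput * svc_ms / 1000.0    # U = X * S, as a fraction
        print(f"{t}: measured {util_pct:.1f}%, predicted {predicted * 100:.1f}%")
    # Prints roughly 71.0%, 67.3% and 67.7%, in line with the measured values.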

This law can be used to define an asymptotic bound, an optimistic bound since it indicates the best possible performance. If each application transaction spends D_k seconds on disk k (its service demand at that disk), and X is the application throughput, the Utilization law can be rewritten for disk k as U_k = X \cdot D_k. An increase in the arrival rate can be accommodated as long as none of the disks is saturated (i.e. none has a utilization of 100%). The throughput bound X_max is therefore the arrival rate at which one of the disks saturates. If D_max is the largest of the per-disk service demands, the upper bound on the transaction throughput is reached when that disk has a utilization of 1 (100%):


D_{max} \cdot X_{max} = 1

therefore

X_{max} = \frac{1}{D_{max}}

Let’s replace our disk with a single volume that encompasses a whole RAID group inside an array, and consider that this RAID group is dedicated to a single batch. Other RAID groups participate in the transactions, but we’ll focus on the most accessed one. If each transaction needs to make 10 synchronous visits (meaning each visit has to wait for the previous one to complete) to the most accessed volume in the storage array, and each visit “costs” 10ms, we have D_max = 10 x 10ms = 100ms = 0.1s. The best possible throughput we can get is therefore X_max = 1/0.1s = 10 transactions per second.
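
To make the arithmetic explicit, here is a small sketch (again not from the original post) that computes the bound for this example. Only the 10 x 10ms = 0.1s demand on the most accessed volume comes from the text; the demands on the other RAID groups are made-up figures for illustration:

    # Asymptotic throughput bound X_max = 1 / D_max.
    def throughput_bound(demands_s):
        """Best possible transaction throughput (tx/s) given the per-disk
        service demands in seconds: the most demanded disk saturates first."""
        return 1.0 / max(demands_s)

    # Most accessed volume: 10 synchronous visits of 10ms each = 0.1s per
    # transaction; the other two demands are hypothetical smaller ones.
    demands = [10 * 0.010, 0.020, 0.015]
    print(throughput_bound(demands))    # -> 10.0 transactions per second at best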


4 Comments »

  1. Christian,

    somehow I have the feeling you were not finished with this posting. After reading the first paragraph I expected to see an example of the “individual commenting” you mentioned or some other conclusion. Is this just me or did you really plan to take this further?

    Cheers

    Comment by Robert — January 29, 2008 @ 7:59 am

  2. Hi Robert,

    I wrote this post after reading yet another performance report in which the analyst commented on utilizations, service times, queue lengths and I/O rates without realizing that they are all linked by simple maths.

    The sar -d figures illustrate the Utilization law: it would not make much sense to say that the utilization is “too high” without realizing that it is the product of an I/O rate and a service time. For a given utilization, the conclusion would be very different if the service time were 80ms (RAID 5 write access) or 4ms (sequential read).

    However, I plan to write another post on Little’s law and another asymptotic bound, which will give a second bound on throughput.

    Christian

    Comment by christianbilien — January 29, 2008 @ 8:54 am

  3. Christian,

    Good little post. I too have seen a similar lack of realisation on the part of IT people of the link between the high-level performance goals they want and the low-level actions that must happen to achieve them. Most people just do not seem to realise that a target of 1,000 transactions per second corresponds to each transaction taking only 1 millisecond (assuming no parallelism for the moment), and that if a transaction involves any disk I/Os then they need to happen really, really fast, or just not at all.

    In my case the 1,000 transactions per second were spread over 16 CPUs, but we were only getting a 10x throughput increase not 16x. So the 1,000 tx/sec was equivalent to 100 tx/sec on a single CPU. Which equates to 10 milliseconds for each transaction. In my view this is not enough time to do any real disk I/Os, which are around 5 to 10 ms anyway (varies by disk array).

    I had a hard job explaining to the customer what achieving 1,000 transactions per second actually meant in real underlying hardware performance, and how it was a very tough target to achieve. They were also annoyed that they had bought the latest and greatest computer system with the latest and fastest CPUs that were 4 x faster than those of 5 years ago, but it was all for nothing. The ultimate throughput was determined by the disk access speed, which has only improved marginally during the same period of time. And the solution to improving performance was to buy more and faster disks, and not just throw more CPUs at it.

    John

    Comment by John — January 29, 2008 @ 10:11 am

    John, I’m afraid the story is a bit more complex: 10ms is just an average. It /might/ be enough, considering the buffer cache and the fact that updates are often written asynchronously. Just a simple example: if ten changes go into a single block, there are just one or two write operations for them in the best case. This gives us 10 x 10ms = 100ms, which is already quite a bit of time to do the writes, especially when data and the TX log (or undo / rollback) go to physically different devices.

    To determine the max throughput for a constant load, one would have to know how many blocks on average are updated per change and what the IO subsystem can sustain in terms of continuous write performance (i.e. not considering caches, because caches will fill up after some time). So, if the disks can only sustain 10MB/s of writes under permanent load, a system that requires 50MB/s of writes for its load is bound to fail…

    Comment by Robert — January 29, 2008 @ 2:16 pm

