Christian Bilien’s Oracle performance and tuning blog

June 26, 2007

Log file write time and the physics of distance

Filed under: HP-UX,Oracle,Solaris,Storage — christianbilien @ 7:46 pm

I have already written a couple of notes about the replication options available when a production environment spans several storage arrays (see “Spotlight on Oracle replication options within a SAN (1/2)” and “Spotlight on Oracle replication options within a SAN (2/2)”).

These posts came from a real life experience, where both storage arrays were “intuitively” close enough to each other to ignore the distance factor. But what if the distance is increased? The trade-off seems obvious: the greater the distance, the lower the maximum performance. But what is the REAL distance factor? Not so bad in theory.

I am still primarily interested in synchronous writes, namely log file writes and the associated “log file sync” waits. I want to know how distance influences the log file write time when mirroring through a volume manager (HP-UX LVM, Symantec VxVM, Solaris VM or ASM). Synchronous writes with EMC SRDF and HP’s Continuous Access (XP or EVA) could also be considered, but their protocols seem to need 2 round trips per host I/O. I’ll leave them aside pending further investigation.

The remote cache must in both cases acknowledge the I/O to the local site to allow the LGWR’s I/O to complete.

1. Load time and the zero distance I/O completion time.

Load time:

Light propagates in fiber at about 5 microseconds per kilometer, which means that 200 km costs 1 ms one way. The load time is the time a packet takes to completely pass any given point in a SAN. A wider pipe delivers a packet faster than a narrow one.

The load time can also be thought of as the length of the packet in kilometers: the greater the bandwidth, the shorter the packet and the smaller its load time. At 2Gb/s, a 2KB packet (the typical log write size) is about 2 km long, but it would be 2600 km long on a slow 1.5Mb/s link.
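The packet length idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not in the original text: 2KB is taken as 2048 bytes, the link rate is the raw bit rate, and no encoding or protocol overhead is counted. The function names are made up for the example. Under these assumptions the result is about 1.6 km at 2Gb/s and about 2,200 km at 1.5Mb/s, the same order of magnitude as the figures quoted above.

```python
# Propagation delay in fiber quoted in the text: about 5 microseconds per km.
PROPAGATION_S_PER_KM = 5e-6


def load_time_s(size_bytes: int, bandwidth_bps: float) -> float:
    """Time for the whole packet to pass a given point (raw bits only)."""
    return size_bytes * 8 / bandwidth_bps


def packet_length_km(size_bytes: int, bandwidth_bps: float) -> float:
    """Distance covered by the first bit by the time the last bit is sent."""
    return load_time_s(size_bytes, bandwidth_bps) / PROPAGATION_S_PER_KM


if __name__ == "__main__":
    # Assumption: a 2KB log write is 2048 bytes, with no framing overhead.
    for label, bps in (("2 Gb/s", 2e9), ("1.5 Mb/s", 1.5e6)):
        print(f"{label}: load time {load_time_s(2048, bps) * 1e3:.4f} ms, "
              f"length {packet_length_km(2048, bps):.1f} km")
```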

Zero distance I/O completion time

The zero distance I/O completion time is made of two components:

  • A fixed overhead, commonly around 0.5 ms. This represents storage array processor time and any delay on the host ports for the smallest packet. The tests made in “Spotlight on Oracle replication options within a SAN (2/2)”, reproduced below in fig. 1, corroborate this: the I/O time on a local device only increases by about 10% when the packet size more than doubles.
  • The load time, a linear function of the packet size.

At the end of the day, the zero distance I/O completion time is :

Slope x Packet size + overhead

Here is one of the measurements I reported in the “Spotlight on Oracle replication post” :

Figure 1: Measured I/O time as a function of the write size for log file writes

Write size (KB)    I/O time (ms)
2                  0.66
5                  0.74

A basic calculation gives:

Slope = (0.74 - 0.66) / (5 - 2) = 0.027 ms per KB
Overhead = 0.66 - 2 x 0.027 ≈ 0.6 ms
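The same calculation can be written as a straight line through the two measurements of Figure 1. This is a minimal sketch: `fit_line` is a helper name invented for the example, and two points are of course not a statistical fit, just the slope and intercept they define.

```python
def fit_line(p1, p2):
    """Return (slope, intercept) of the line through two (size_kb, time_ms) points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)      # ms per KB
    intercept = y1 - slope * x1        # ms at zero size, i.e. the fixed overhead
    return slope, intercept


# The two measurements of Figure 1: (write size in KB, I/O time in ms).
slope, overhead = fit_line((2, 0.66), (5, 0.74))
print(f"slope = {slope:.3f} ms/KB, overhead = {overhead:.1f} ms")
```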

Figure 2: Effect of the frame size on the zero distance I/O completion time

Frame size (KB)    Time to load (ms)
2                  0.65
16                 1.03
32                 1.46
64                 2.33
128                4.06

A small frame such as a log write depends heavily on the overhead, while the slope term (which is itself determined by the throughput) dominates for large frames.
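The values of Figure 2 follow from the linear model with the slope (0.027 ms per KB) and overhead (0.6 ms) derived earlier. The sketch below recomputes them and also prints the share of the overhead in the total, which is one way to see the point about small versus large frames. The function names are illustrative only.

```python
SLOPE_MS_PER_KB = 0.027   # slope derived from the Figure 1 measurements
OVERHEAD_MS = 0.6         # fixed overhead derived from the same measurements


def zero_distance_time_ms(size_kb: float) -> float:
    """Zero distance I/O completion time: slope x packet size + overhead."""
    return SLOPE_MS_PER_KB * size_kb + OVERHEAD_MS


def overhead_share(size_kb: float) -> float:
    """Fraction of the zero distance time that is fixed overhead."""
    return OVERHEAD_MS / zero_distance_time_ms(size_kb)


if __name__ == "__main__":
    for size_kb in (2, 16, 32, 64, 128):
        print(f"{size_kb:>4} KB: {zero_distance_time_ms(size_kb):.2f} ms, "
              f"overhead share {overhead_share(size_kb):.0%}")
```

With these constants the overhead is more than 90% of a 2KB write but less than 15% of a 128KB write.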

2. Synchronous I/O time

The transfer round trip (latency) is the last component of the time to complete a single I/O write over distance. It is equal to

2 x Distance (km) x 5 µs/km

Figure 3: Time to complete a 2K synchronous write (in ms)

km     Round trip latency    Time to load    Overhead    Time to complete the log write
10     0.1                   0.654           0.6         1.354
20     0.2                   0.654           0.6         1.454
30     0.3                   0.654           0.6         1.554
40     0.4                   0.654           0.6         1.654
50     0.5                   0.654           0.6         1.754
60     0.6                   0.654           0.6         1.854
70     0.7                   0.654           0.6         1.954
80     0.8                   0.654           0.6         2.054
90     0.9                   0.654           0.6         2.154
100    1.0                   0.654           0.6         2.254
110    1.1                   0.654           0.6         2.354
120    1.2                   0.654           0.6         2.454
130    1.3                   0.654           0.6         2.554
140    1.4                   0.654           0.6         2.654
150    1.5                   0.654           0.6         2.754

This is quite interesting: the log writes are only about twice as slow (2.754 ms against 1.354 ms) when the distance is multiplied by 15, from 10 km to 150 km.
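The figures of Figure 3 can be regenerated with a short script. This sketch simply sums the three columns as tabulated: the round trip latency at 5 µs per km each way, plus the constant “time to load” (0.654 ms, the 2K value of Figure 2) and “overhead” (0.6 ms) columns. The function names are invented for the example.

```python
PROPAGATION_MS_PER_KM = 0.005   # 5 microseconds per km, one way
TIME_TO_LOAD_MS = 0.654         # "Time to load" column of Figure 3, as tabulated
OVERHEAD_MS = 0.6               # "Overhead" column of Figure 3, as tabulated


def round_trip_ms(distance_km: float) -> float:
    """Round trip latency: 2 x distance x 5 microseconds per km."""
    return 2 * distance_km * PROPAGATION_MS_PER_KM


def sync_write_ms(distance_km: float) -> float:
    """Time to complete the log write, summing the Figure 3 columns."""
    return round_trip_ms(distance_km) + TIME_TO_LOAD_MS + OVERHEAD_MS


if __name__ == "__main__":
    for km in range(10, 151, 10):
        print(f"{km:>4} km: round trip {round_trip_ms(km):.1f} ms, "
              f"total {sync_write_ms(km):.3f} ms")
    ratio = sync_write_ms(150) / sync_write_ms(10)
    print(f"150 km versus 10 km: {ratio:.2f} times slower")
```

Because the distance-independent terms dominate at short range, the ratio between the 150 km and 10 km write times stays close to 2.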


