Christian Bilien’s Oracle performance and tuning blog

December 24, 2007

11g ASM preferred reads for RAC extended clusters

Filed under: Oracle,RAC — christianbilien @ 10:14 am

This summer I expressed my delight over the 11g fast resync option for extended (geographical) RAC clusters, in which the mirrored failure groups have to be on different sites: for example, one failure group on array 1 at site 1 and its mirrored copy on array 2 at site 2. The major 10g limitations in this area are twofold:

  • The loss of an array means the mandatory reconstruction of the whole database.
  • Failgroups have to be reconstructed serially (no more than one rebalance activity per node, per ASM group).

This is really painful as any scheduled shutdown of one array while the other one is still active is analogous to an outage. Imagine what it takes to rebuild 100TB.

Normal redundancy (mirrored) reconstruction is much slower than either the SAN or the storage layout allows. There is room for “research” here: the best I have seen so far is 150 GB/hour, that is about 43 MB/s, on 64 disks at 15,000 rpm on the highest-end DMX storage array. The FC ports were 2 x 4 Gb/s.
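To put the 100 TB rebuild and the observed rate side by side, here is a back-of-the-envelope estimate. It assumes binary units (1 GB = 1024 MB, 1 TB = 1024 GB) and that the best observed rate is sustained for the whole rebuild, which a real rebalance may well not manage:

```latex
\[
\frac{150\ \text{GB/h} \times 1024\ \text{MB/GB}}{3600\ \text{s/h}} \approx 42.7\ \text{MB/s}
\]
\[
\frac{100\ \text{TB} \times 1024\ \text{GB/TB}}{150\ \text{GB/h}} \approx 683\ \text{h} \approx 28\ \text{days}
\]
```

Under these assumptions, a single array rebuild would run for roughly four weeks.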

The second problem with 10g RAC extended clusters is linked to the way the ASM storage allocator creates primary and mirrored extents. In a disk group made of two failgroups, there is no notion of a “primary failgroup” and a “secondary failgroup”: each failgroup contains an equal number of primary and mirrored copies. Let’s take the example of a diskgroup made of two failgroups, each on a different storage array, and count the primary and mirrored copies on each storage array for a given group (it is made of just one datafile here).

We’ll use the X$KFFXP table. The columns of interest here are:

GROUP_KFFXP  : disk group number
LXN_KFFXP    : 0 = primary extent, 1 = mirrored extent
DISK_KFFXP   : disk where the extent has been created
NUMBER_KFFXP : ASM file number

select disk_kffxp, lxn_kffxp, count(*)
from x$kffxp x, v$asm_file v
where x.group_kffxp=1 and v.group_number=x.group_kffxp
and x.number_kffxp=v.file_number and x.xnum_kffxp!=65534
group by disk_kffxp, lxn_kffxp
order by disk_kffxp, lxn_kffxp;

 

DISK_KFFXP  LXN_KFFXP   COUNT(*)
---------- ---------- ----------
         1          0       1050
         1          1       1050
         2          0       1050
         2          1       1050

The number of primary and mirrored extents is balanced on each of the disks of group 1. The logic is that whilst writes hit both types of extents equally, reads only access the primary extents. It makes sense to spread primary extents over the two failure groups to maximize throughput when the two failure groups are on an equal footing: same throughput and same I/O response time.
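With more than one disk per array, the same check is easier to read per failure group than per disk. The query below is a sketch rather than a tested script. It assumes a SYS connection to the ASM instance (X$ tables are not visible otherwise) and joins X$KFFXP to V$ASM_DISK on group and disk number. Unlike the query above, it counts the extents of every file in group 1:

```sql
-- Sketch: primary (0) and mirrored (1) extent counts per failure group.
-- Assumes disk group number 1, as in the example above.
select d.failgroup, x.lxn_kffxp, count(*)
from x$kffxp x, v$asm_disk d
where x.group_kffxp = 1
  and d.group_number = x.group_kffxp
  and d.disk_number = x.disk_kffxp
  and x.xnum_kffxp != 65534
group by d.failgroup, x.lxn_kffxp
order by d.failgroup, x.lxn_kffxp;
```

If the allocator behaves as described, each failure group should show about as many rows with LXN_KFFXP = 0 as with LXN_KFFXP = 1.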

But what happens when the boxes are in different locations? As each instance reads from both arrays, distance introduces delays on half of the reads. The delay is a function of the distance, but we also need to take into account that the number of storage hops the reads must traverse increases when remote sites are to be reached. Even worse, the inter-site links are prone to congestion, as they must accommodate oversubscription and link contention.
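As a rough illustration of the distance factor alone, here is a worked estimate under stated assumptions that are not measurements from this setup: a signal speed in fiber of about 200,000 km/s, a read costing a single round trip, and a hypothetical site separation of d = 50 km. Switch hops and congestion are ignored, so this is a lower bound:

```latex
\[
t_{\text{rtt}} = \frac{2d}{v} = \frac{2 \times 50\ \text{km}}{200\,000\ \text{km/s}} = 0.5\ \text{ms}
\]
\[
\text{average added latency per read} = \tfrac{1}{2} \times t_{\text{rtt}} = 0.25\ \text{ms}
\]
```

The factor of one half reflects the fact that half of the primary extents, and therefore about half of the reads, sit on the remote array.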

Whenever possible, a clever use of RAC data locality will reduce the RAC interconnect burden and reduce reliance on cache fusion. In 11g, it can also reduce the inter-site link throughput requirements and possibly speed up reads by reading data locally, assuming the local array is not itself overloaded.

The preferred read failure group is set through an instance-wide initialization parameter:

ASM_PREFERRED_READ_FAILURE_GROUPS = diskgroup_name.failure_group_name,…
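As a minimal sketch of how this could look on a two-site cluster, each ASM instance is pointed at the failure group on its own site. The disk group name DATA, the failure group names SITE1 and SITE2, and the instance names +ASM1 and +ASM2 are hypothetical, so substitute your own:

```sql
-- Hypothetical names: disk group DATA, failure groups SITE1 and SITE2,
-- ASM instance +ASM1 at site 1 and +ASM2 at site 2.
-- Run against the ASM instances, not the database instances.
alter system set asm_preferred_read_failure_groups = 'DATA.SITE1' sid='+ASM1';
alter system set asm_preferred_read_failure_groups = 'DATA.SITE2' sid='+ASM2';

-- Check how the current ASM instance sees each disk:
-- Y = preferred read disk, N = not preferred, U = no preferred read failure group.
select group_number, failgroup, name, preferred_read
from v$asm_disk
order by group_number, failgroup, name;
```

Because the parameter is set per instance, each site can favour its local array while writes still go to both failure groups.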

ASM extent layout reference: Take a look at Luca Canali’s work: a peek inside Oracle ASM metadata

— Updated: check the comment below for Luca’s updated links – very interesting indeed —


5 Comments »

  1. Hello Christina,

    I have recently updated the document you referenced on ASM internals. The latest version and a recent presentation are:

    http://twiki.cern.ch/twiki/pub/PSSGroup/Presentations2007/Inside_Oracle_ASM_LC_CERN_UKOUG07.ppt

    and

    http://twiki.cern.ch/twiki/bin/view/PSSGroup/ASM_Internals

    Cheers,
    Luca

    Comment by Luca — December 28, 2007 @ 6:02 pm

  2. PS: sorry Christian I seem to have misspelled your name in my previous comment

    Comment by Luca — December 28, 2007 @ 6:04 pm

  3. Hi Luca,

    I updated the post to draw the reader’s attention to your new links.

    Christian

    Comment by christianbilien — December 29, 2007 @ 11:36 am

  4. [...] Christian Bilien:  Post on Extended RAC [...]

    Pingback by Extended RAC on Pure Oracle Stack Summary | Oracle — July 24, 2008 @ 7:18 am

  5. [...] The purpose of this post is not to explain the ASM Preferred Read feature or the way to put it in place (for such purpose you can have a look to this oracle-base post or Christian Bilien’s one). [...]

    Pingback by ASM Preferred Read: Collect performance metrics « bdt's oracle blog — February 18, 2013 @ 7:04 am

