Christian Bilien’s Oracle performance and tuning blog

May 14, 2007

Oracle ISM and DISM: more than a no paging scheme (1/2)

Filed under: Oracle,Solaris — christianbilien @ 12:54 pm

This post only deals with ISM. I’ll write a second one about Dynamic ISM (DISM).

A long-standing problem on any platform has been the risk that part of the Oracle memory segment gets swapped out, turning what is a relatively fast memory access into a horrid bottleneck. Oracle 9i on Solaris made use of an interesting feature named Intimate Shared Memory (ISM), which in fact does a lot more than one might initially think.

The very first benefit of ISM (not DISM for the time being) is that the shared memory is locked by the kernel when the segment is created: the memory cannot be paged out. The small price to pay for the locking mechanism is that sufficient unlocked memory must be available for the allocation to succeed.

Setting the SHM_SHARE_MMU flag in the shmat system call, which sets up the shared segment as ISM, also brings lesser-known benefits that may matter more than the no-paging scheme on CPU-bound systems.

 

Shared kernel virtual-to-physical translation

The virtual-to-physical mapping is one of the most costly tasks any modern operating system has to perform. The hardware Translation Lookaside Buffer (TLB) is a cache in front of the slower in-memory translation tables. The Translation Storage Buffer (TSB) is a further, in-memory cache of translations. Even in Solaris 10, the standard System V scheme still gives each process a private virtual address space, so aliasing occurs: several virtual addresses map to the same physical address, and each process attached to a shared segment needs its own translations.

ISM allows the processes that attach to the shared memory to share the kernel virtual-to-physical translations, saving a considerable number of translation slots in the hardware TLB. On Solaris 10, TLB and TSB misses can be monitored with trapstat:

# trapstat -T
cpu m size| itlb-miss %tim itsb-miss %tim | dtlb-miss %tim dtsb-miss %tim |%tim
----------+-------------------------------+-------------------------------+----
512 u   8k|      1761  0.1      2841  0.2 |      2594  0.1      2648  0.2 | 0.5
512 u  64k|         0  0.0         0  0.0 |         8  0.0         0  0.0 | 0.0
512 u 512k|         0  0.0         0  0.0 |         0  0.0         0  0.0 | 0.0
512 u   4m|        20  0.0         1  0.0 |         4  0.0         0  0.0 | 0.0
512 u  32m|         0  0.0         0  0.0 |        11  0.0         0  0.0 | 0.0
512 u 256m|         0  0.0         0  0.0 |         0  0.0         0  0.0 | 0.0

trapstat shows both instruction and data misses in both the TLB and the TSB.

Solaris 8 does not have trapstat, so the trick is to use cpustat:

On a non-idle Oracle system using ISM as seen below,

mpstat 5 5
CPU minf mjf xcal intr ithr  csw icsw migr smtx srw syscl usr sys wt idl
  0    0   0  282  728  547 1842  283  329   62  10  3257  40   8 25  27
  1    0   0  122  227    2 1954  284  327   55   9  3639  39   6 29  26
  2    0   0  257 1578 1399 1887  288  330   58   9  3287  35  11 27  27
  3    1   0  313 1758 1501 1933  285  328   70  12  3437  36   8 29  27

cpustat -c pic0=Cycle_cnt,pic1=DTLB_miss 1
   time cpu event      pic0      pic1
  1.010   3  tick 192523799     29658
  1.010   2  tick 270995815     28499
  1.010   0  tick 225156772     29621
  1.010   1  tick 234603152     29034

psrinfo -v
Status of processor 3 as of: 05/14/07 12:48:53
  Processor has been on-line since 03/11/07 10:35:22.
  The sparcv9 processor operates at 1062 MHz,
  and has a sparcv9 floating point processor.

cpustat shows 29658 dTLB misses on processor 3 in this one-second sample. An UltraSPARC III spends somewhere between 50 cycles (most favourable case: no TLB entry miss) and 300 cycles (worst case: a memory load has to be performed to compute the translation) handling a dTLB miss. Handling the misses therefore costs about 1.5 million cycles per second in the best scenario and 8.9 million in the worst. At 1062 MHz, the time spent handling dTLB misses is only between 0.14% and 0.84%!
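The arithmetic behind these percentages can be checked with a few lines of Python. This is only a sketch: the miss count, the 50 and 300 cycle costs and the clock rate are the figures quoted above, the 1.010 s sample is treated as one second, and the function name is made up for the illustration.

```python
# Figures taken from the cpustat and psrinfo output quoted in the post.
CLOCK_HZ = 1062e6             # psrinfo -v: 1062 MHz
DTLB_MISSES_PER_SEC = 29658   # cpustat pic1 on processor 3 (sample treated as 1 s)
BEST_CYCLES_PER_MISS = 50     # most favourable cost per miss quoted in the post
WORST_CYCLES_PER_MISS = 300   # worst-case cost per miss quoted in the post


def miss_overhead(misses_per_sec, cycles_per_miss, clock_hz):
    """Return (cycles spent per second, fraction of CPU time) handling misses."""
    cycles = misses_per_sec * cycles_per_miss
    return cycles, cycles / clock_hz


for label, cost in (("best", BEST_CYCLES_PER_MISS), ("worst", WORST_CYCLES_PER_MISS)):
    cycles, fraction = miss_overhead(DTLB_MISSES_PER_SEC, cost, CLOCK_HZ)
    print(f"{label}: {cycles / 1e6:.1f} million cycles/s, {fraction:.2%} of CPU time")
```

With the numbers above this prints 1.5 million cycles/s (0.14%) for the best case and 8.9 million cycles/s (0.84%) for the worst.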

Large pages

From Solaris 2.6 through Solaris 8, large pages are only available through ISM (using SHM_SHARE_MMU).

Solaris 8

pagesize
8192

Solaris 10: default page size

pagesize
8192

Supported page sizes:

pagesize -a
8192
65536
524288
4194304

ISM page size on Solaris 10 (look at the Pgsz column). It looks like Oracle is using the largest page size available:

 

pmap -sx 25921
25921: oracleSID1 (LOCAL=NO)
Address Kbytes RSS Anon Locked Pgsz Mode Mapped File
00000001064D2000 24 24 24 8K rwx-- [ heap ]
00000001064D8000 32 8 - rwx-- [ heap ]
0000000380000000 1048576 1048576 1048576 4M rwxsR [ ism shmid=0x6f000078 ]
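To give a rough idea of why the page size matters: every mapped page needs its own translation, so the number of translations required to cover the 1 GB ISM segment shrinks as the page size grows. The Python sketch below only does that division; the segment size is the Kbytes column of the pmap output, the page sizes are those listed by pagesize -a on Solaris 10, and the function name is made up for the illustration.

```python
SEGMENT_KB = 1048576  # Kbytes column of the ISM segment in the pmap output


def pages_needed(segment_kb, page_kb):
    """Number of pages (hence translations) needed to map the whole segment."""
    return -(-segment_kb // page_kb)  # ceiling division


# Page sizes reported by pagesize -a on Solaris 10, expressed in Kbytes.
for page_kb in (8, 64, 512, 4096):
    print(f"{page_kb:>5}K pages: {pages_needed(SEGMENT_KB, page_kb):>7} translations")
```

Mapping the segment with 8K pages takes 131072 translations, against 256 with 4M pages.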

AMD64/x64. The AMD Opteron processor supports both 4 Kbyte and 2 Mbyte page sizes:

pagesize -a
4096
2097152

x86. The implementation of Solaris on x86 processors provides support for 4Kbyte pages only.

 

This post will be followed by a discussion of DISM, its differences from ISM, and a word of caution about using DISM on Solaris 8:
Oracle ISM and DISM: more than a no paging scheme…but be careful with Solaris 8 (2/2)
