Christian Bilien’s Oracle performance and tuning blog

April 2, 2007

No age scheduling for Oracle on HP-UX: HPUX_SCHED_NOAGE (2/2)

Filed under: HP-UX,Oracle — christianbilien @ 8:31 pm

The first post on this topic was presenting the normal HP-UX timeshare scheduling, under which all processes will run. Its main objective was to give the rules for context switching (threads releasing CPUs and getting a CPU back), before comparing it to the timeshare no age scheduling.

The decaying priority scheme has not been altered since the very early HP-UX workstations, at a time when I assume data bases were a remote if not non existent consideration to the HP-UX designers.

As a thread gets weaker because of a decreasing priority, its probability of being switched out will increase. If this process holds a critical resource (an Oracle latch, not even mentioning the log writer), this context switch will cause resource contention should other threads wait for the same resource.

A context switch (you can see them with in the pswch/s column of sar -w) is a rather expensive task: the system scheduler makes a copy of all the switched out process registries and copy into a private area known as the UAREA. The reverse operation needs to be performed for the incoming thread: its UAREA will be copied in the CPU registries. Another problem will of course occurs when CPU switches also happen on top of thread switches: modified cache lines need to be invalidated and read lines reloaded on the new CPU. I once read that a context switch would cost around 10 microseconds (I did not verify it myself), which is far for negligible. The HP-UX internals manuals mentions 10000 CPU cycle, which would indeed be translated into 10 microseconds with a 1Ghz CPU.

So when does a context switch occurs? HP-UX considers two cases: forced and voluntary context switches. Within the Oracle context, voluntary context switches are most of the time disk I/Os. A forced context switch will occur at the end of a timeslice when a thread with an equal or higher priority is runnable, or when a process returns from a system call or trap and a higher-priority thread is runnable.

Oradebug helps diagnosing the LGWR forced and volontary switches:

SQL> oradebug setospid 2022
Oracle pid: 6, Unix process pid: 2022, image: oracle@mymachine12 (LGWR)
SQL> oradebug unlimit
Statement processed.
SQL> oradebug procstat
Statement processed.
SQL> oradebug tracefile_name
SQL> !more /oracle/product/10.2.0/admin/MYDB/bdump/mydb_lgwr_2022.trc

Voluntary context switches = 272
Involuntary context switches = 167

The sched no age policy was added in HP-UX11i: it is part of the 178-225 range (the same range as the normal time share scheduler user priorities), but a thread running in this scheduler will not experience any priority decay: its priority will remain constant over time although the thread may be timesliced. Some expected benefits:

  • Less context switches overhead (lower cpu utilization)
  • The cpu hungry transactions may complete faster in a CPU busy system (the less hungry transactions may run slower, but as they do not spend much time on the CPUs, it may not be noticeable).
  • May help resolve an LGWR problem: the log writer inherently experiences a lot of voluntary context switches, which helps keeping a high priority. However, it may not be enough on CPU-busy systems.

As the root user, give the RTSCHED and RTPRIO privileges to the dba group setprivgrp dba RTSCHED RTPRIO. Create the /etc/privgroup file, if it does not exist, and add the following line to it: dba RTSCHED RTPRIO. Finally update the spfile/init with hpux_sched_noage=178 to 255. See rtsched(1).


Blog at