UNIX Consulting and Expertise
Golden Apple Enterprises Ltd. » Archive of 'Nov, 2009'

Playing with Solaris processor sets

Looking for UNIX and IT expertise? Why not get in touch and see how we can help?

The idea behind processor sets has been around for a decade or so in the HPC arena. You’ve got certain jobs that require a certain amount of CPU resources, or a certain I/O profile, so you want to dedicate some CPUs just to them. Solaris has had processor controls since the dark days of 2.6.

*Note:* I’m going to talk freely about CPUs as the processing unit. This is all on T2000s, and I know that they’re not *real* CPUs – call them thread processing units or something – but for simplicity this document will just call them CPUs and be done with it.

The actual management of processor sets is very straightforward, and I’ll be playing about with them on one of my favourite bits of kit – the Sun T2000.

First of all we use the psrinfo command to view the status of our processors:

bash-3.00# psrinfo
0       on-line   since 11/21/2006 20:24:57
1       on-line   since 11/21/2006 20:24:58
2       on-line   since 11/21/2006 20:24:58
3       on-line   since 11/21/2006 20:24:58
4       on-line   since 11/21/2006 20:24:58
5       on-line   since 11/21/2006 20:24:58
6       on-line   since 11/21/2006 20:24:58
7       on-line   since 11/21/2006 20:24:58
8       on-line   since 11/21/2006 20:24:58
9       on-line   since 11/21/2006 20:24:58
10      on-line   since 11/21/2006 20:24:58
11      on-line   since 11/21/2006 20:24:58
12      on-line   since 11/21/2006 20:24:58
13      on-line   since 11/21/2006 20:24:58
14      on-line   since 11/21/2006 20:24:58
15      on-line   since 11/21/2006 20:24:58
16      on-line   since 11/21/2006 20:24:58
17      on-line   since 11/21/2006 20:24:58
18      on-line   since 11/21/2006 20:24:58
19      on-line   since 11/21/2006 20:24:58
20      on-line   since 11/21/2006 20:24:58
21      on-line   since 11/21/2006 20:24:58
22      on-line   since 11/21/2006 20:24:58
23      on-line   since 11/21/2006 20:24:58
24      on-line   since 11/21/2006 20:24:58
25      on-line   since 11/21/2006 20:24:58
26      on-line   since 11/21/2006 20:24:58
27      on-line   since 11/21/2006 20:24:58
28      on-line   since 11/21/2006 20:24:58
29      on-line   since 11/21/2006 20:24:58
30      on-line   since 11/21/2006 20:24:58
31      on-line   since 11/21/2006 20:24:58

Let’s do a quick network performance test with iperf to see what sort of throughput we can get when all processing units are able to process network IO:

bash-3.00# ./iperf --client np1unx0006 --time 60 --dualtest
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 48.0 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to np1unx0006, TCP port 5001
TCP window size: 48.0 KByte (default)
------------------------------------------------------------
[  5] local 192.168.105.62 port 37438 connected with 192.168.105.59 port 5001
[  4] local 192.168.105.62 port 5001 connected with 192.168.105.59 port 63459
[  5]  0.0-60.0 sec  3.77 GBytes    540 Mbits/sec
[  4]  0.0-60.0 sec  3.62 GBytes    518 Mbits/sec

At the same time, let’s have a look with mpstat to get an idea of what the processors are dealing with while this is going on.

The important column here is intr, showing the number of interrupts each CPU is handling. We also need to keep an eye on the number of system calls each CPU is fielding (syscl), and on the context switches and involuntary context switches (csw and icsw respectively), to make sure jobs are completing before the scheduler kicks them off the CPU.

CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0   96   0  248  3481    0 7126    4   34  439    0  6041    2  25   0  74
  1  122   0  171  1332    0 2796    2   24  340    0  2369    2  14   0  85
  2   79   0  216   646    0 1472    0   18  226    0   202    0   5   0  95
  3   30   0  143   356    0  829    0   16  137    0    23    0   2   0  98
  4   47   0  260   618    0 1514    0   18  163    0    74    0   3   0  97
  5   56   0  257   714    0 1662    1   19  234    0   311    1   6   0  94
  6   67   0  466  1085    0 2593    1   19  588    0  1234    0  17   0  82
  7   26   0  268   894    0 2031    0   18  202    0   136    0   4   0  96
  8  241   0  341   993    0 2286    0   22  258    0   358    1   7   0  91
  9  190   0  292  1431    0 3102    1   21  257    0  1551    1   9   0  90
 10  114   0  336  1155    0 2580    0   18  286    0   429    0   6   0  94
 11   28   0  283   837    0 1883    1   18  551    0  1283    1  15   0  84
 12    0   0    1     2    0    3    0    1    0    0     0    0   0   0 100
 13    0   0    1     3    0    4    0    1    1    0     1    0   0   0 100
 14    3   0    2     5    0    9    0    1    4    0     3    0   0   0 100
 15    0   0    0     9    0    0    8    0    4    0 534955   75  25   0   0
 16   64   0  423  1299  110 2418    0   18  286    0    59    0   5   0  95
 17   89   0  454  1473    0 3233    0   19  319    0   793    1   7   0  92
 18   46   0  397   960    1 2217    0   18  290    0    39    0   4   0  96
 19   79   0  321  1048    2 2340    2   19  494    0  2073    2  15   0  83
 20   79   0  205   852    1 1773    1   21  313    0  1493    1  14   0  85
 21   27   0 19965 41259 41036  635   15   28 2862    0   415    0  47   0  53
 22   65   0  129  1069    0 2274    1   21  139    0  1053    1   7   0  92
 23   62   0  134   681    0 1446    1   20  370    0   931    1  14   0  85
 24  115   0  260   799    0 1986    0   22  212    0   313    0   4   0  95
 25  113   0  273   962    1 2225    1   22  266    0   684    1   7   0  93
 26   73   0  312  1241    0 2862    0   23  271    0   663    0   6   0  94
 27  115   0  270   862    0 2017    0   22  201    0   209    1   5   0  95
 28  179   0  225   689    0 1548    0   17  213    0   302    1   5   0  94
 29   42   0  224   656    0 1507    0   15  163    0   134    0   3   0  97
 30   40   0  298   774    0 1821    1   14  459    0  1316    1  17   0  83
 31   27   0  227   649    0 1544    1   15  644    0  1418    1  18   0  82

From this we can see we’re getting fairly decent throughput over GigE, and that the interrupts are spread across all the CPUs.
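
With 32 rows of mpstat output it’s easy to miss the outliers, so here’s a rough sketch of one way to pull out the busiest CPU for a given column. The mpstat_top function and the mpstat_sample variable are my own names for illustration, and the sample is just a handful of rows copied from the output above. On a live system you’d pipe mpstat into the function instead.

```shell
# A few rows copied from the mpstat output above.
mpstat_sample='CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0   96   0  248  3481    0 7126    4   34  439    0  6041    2  25   0  74
 15    0   0    0     9    0    0    8    0    4    0 534955   75  25   0   0
 20   79   0  205   852    1 1773    1   21  313    0  1493    1  14   0  85
 21   27   0 19965 41259 41036  635   15   28 2862    0   415    0  47   0  53'

# Print the CPU with the highest value in a named mpstat column.
# Reads mpstat-style text on stdin. The column is looked up by its header
# name, so nothing here depends on the column order.
mpstat_top() {
    awk -v col="$1" '
        # Header row: remember which field holds the requested column.
        $1 == "CPU" { for (i = 1; i <= NF; i++) if ($i == col) c = i; next }
        # Data row: keep the CPU id with the largest value seen so far.
        c && (cpu == "" || $c + 0 > max) { max = $c + 0; cpu = $1 }
        END { if (cpu != "") print cpu, max }
    '
}

printf '%s\n' "$mpstat_sample" | mpstat_top intr
printf '%s\n' "$mpstat_sample" | mpstat_top syscl
```

On the sample rows this picks out CPU 21 for intr and CPU 15 for syscl, which matches what you can see by eye in the full listing.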

Now let’s create a processor set, and stick half our CPUs in it.

The command is psrset with the -c option to create a set. As this is the first processor set it will be processor set 1 – the next would be 2, etc. etc.

Remember we can get the number of our CPUs from the psrinfo command.

bash-3.00# psrset -c 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
created processor set 1
processor 0: was not assigned, now 1
processor 1: was not assigned, now 1
processor 2: was not assigned, now 1
processor 3: was not assigned, now 1
processor 4: was not assigned, now 1
processor 5: was not assigned, now 1
processor 6: was not assigned, now 1
processor 7: was not assigned, now 1
processor 8: was not assigned, now 1
processor 9: was not assigned, now 1
processor 10: was not assigned, now 1
processor 11: was not assigned, now 1
processor 12: was not assigned, now 1
processor 13: was not assigned, now 1
processor 14: was not assigned, now 1
processor 15: was not assigned, now 1
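
Typing out sixteen processor IDs by hand gets old quickly. As a minimal sketch, a small helper can build the list for you. cpu_range is a name I’ve made up for this example, and the final line only echoes the psrset command so it’s safe to try anywhere. Drop the echo on a Solaris host to actually create the set.

```shell
# Print the integers from $1 to $2 inclusive, separated by single spaces.
cpu_range() {
    i=$1
    list=""
    while [ "$i" -le "$2" ]; do
        # Add a separating space only once the list has something in it.
        list="$list${list:+ }$i"
        i=$((i + 1))
    done
    printf '%s\n' "$list"
}

# Show the command rather than running it.
echo psrset -c $(cpu_range 0 15)
```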

Now that we’ve assigned half our CPUs to processor set 1, we want to disable interrupt handling for them. We could use the psradm command to do it on a per CPU basis, but it’s much easier to just apply the setting to the entire processor set.

bash-3.00# psrset -f 1

The -f option disables interrupt handling, and the 1 is the processor set we want to apply this to.

We can check the effect by calling psrinfo again:

bash-3.00# psrinfo
0       no-intr   since 12/19/2006 18:15:15
1       no-intr   since 12/19/2006 18:15:15
2       no-intr   since 12/19/2006 18:15:15
3       no-intr   since 12/19/2006 18:15:15
4       no-intr   since 12/19/2006 18:15:15
5       no-intr   since 12/19/2006 18:15:15
6       no-intr   since 12/19/2006 18:15:15
7       no-intr   since 12/19/2006 18:15:15
8       no-intr   since 12/19/2006 18:15:15
9       no-intr   since 12/19/2006 18:15:15
10      no-intr   since 12/19/2006 18:15:15
11      no-intr   since 12/19/2006 18:15:15
12      no-intr   since 12/19/2006 18:15:15
13      no-intr   since 12/19/2006 18:15:15
14      no-intr   since 12/19/2006 18:15:15
15      no-intr   since 12/19/2006 18:15:15
16      on-line   since 11/21/2006 20:24:58
17      on-line   since 11/21/2006 20:24:58
18      on-line   since 11/21/2006 20:24:58
19      on-line   since 11/21/2006 20:24:58
20      on-line   since 11/21/2006 20:24:58
21      on-line   since 11/21/2006 20:24:58
22      on-line   since 11/21/2006 20:24:58
23      on-line   since 11/21/2006 20:24:58
24      on-line   since 11/21/2006 20:24:58
25      on-line   since 11/21/2006 20:24:58
26      on-line   since 11/21/2006 20:24:58
27      on-line   since 11/21/2006 20:24:58
28      on-line   since 11/21/2006 20:24:58
29      on-line   since 11/21/2006 20:24:58
30      on-line   since 11/21/2006 20:24:58
31      on-line   since 11/21/2006 20:24:58
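
If you’d rather not eyeball 32 lines, here’s a small sketch that lists the processors in a given state. cpus_in_state and psrinfo_sample are hypothetical names for this example, and the sample is four rows copied from the psrinfo output above. On a real box you’d pipe psrinfo straight in.

```shell
# Four rows copied from the psrinfo output above.
psrinfo_sample='14      no-intr   since 12/19/2006 18:15:15
15      no-intr   since 12/19/2006 18:15:15
16      on-line   since 11/21/2006 20:24:58
17      on-line   since 11/21/2006 20:24:58'

# List the IDs of processors in the given state, from psrinfo-style text on stdin.
cpus_in_state() {
    awk -v state="$1" '
        # Field 1 is the processor ID, field 2 is its state.
        $2 == state { ids = ids (ids == "" ? "" : " ") $1 }
        END { print ids }
    '
}

printf '%s\n' "$psrinfo_sample" | cpus_in_state no-intr
printf '%s\n' "$psrinfo_sample" | cpus_in_state on-line
```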

Rock on! psrinfo clearly shows that half our CPUs will no longer handle interrupts. Let’s kick off another iperf throughput test and see what happens:

bash-3.00# ./iperf --client np1unx0006 --time 60 --dualtest
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 48.0 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to np1unx0006, TCP port 5001
TCP window size: 48.0 KByte (default)
------------------------------------------------------------
[  4] local 192.168.105.62 port 37419 connected with 192.168.105.59 port 5001
[  5] local 192.168.105.62 port 5001 connected with 192.168.105.59 port 63457
[  4]  0.0-60.0 sec  3.36 GBytes    481 Mbits/sec
[  5]  0.0-60.0 sec  3.05 GBytes    436 Mbits/sec
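
To put a number on the difference between the two runs, here’s a quick bit of arithmetic as a sketch. pct_drop is my own helper name, and the figures are the Mbits/sec values reported by iperf in the two tests.

```shell
# Percentage drop from a baseline rate ($1) to a new rate ($2), to one decimal place.
pct_drop() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) * 100 / before }'
}

pct_drop 540 481     # stream connected to port 5001 on the remote host
pct_drop 518 436     # stream connected to port 5001 on the local host
pct_drop 1058 917    # both streams added together
```

That works out to a drop of roughly 11% on one stream and 16% on the other, or about 13% combined. With only a single run of each test, some of that could simply be run-to-run variation.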

Looking at mpstat we can clearly see the effects:

CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
  1    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
  2    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
  3    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
  4    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
  5    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
  6    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
  7    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
  8    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
  9    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
 10    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
 11    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
 12    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
 13    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
 14    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
 15    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
 16  336   0  276  1854  112 3372   12   49  844    0 135008   24  37   0  39
 17  261   0  115  2236    1 4594    6   51 1050    0 27470    7  37   0  56
 18  104   0  105  1699    3 3506    9   36 1071    0 103500   17  38   0  44
 19  106   0  148   855    1 1800    2   28  286    0 10881    2   8   0  89
 20  290   0 19102 42492 42200  838   28   42 2676    0 74629   17  60   0  23
 21  256   0  801  1952    0 4397    5   39 1272    0  2196    2  29   0  68
 22  209   0  475  1191    0 2663    2   38  552    0   776    1  12   0  87
 23  260   0  500  1134    4 2540    2   38  597    0 13071    4  13   0  84
 24  455   0  752  2213    1 5038    4   41  916    0 10316    4  20   0  77
 25  500   0  803  2485    0 5499    4   45 1352    0 17171    5  31   0  64
 26  654   0  683  1773    0 4009    5   45  933    0  2119    8  19   0  73
 27  503   0  516  1812    0 3952    5   45  748    0 21682    6  16   0  79
 28  552   0  860  2332    0 5093    7   40 1065    0 12217   16  21   0  63
 29  480   0  688  2292    0 4996    4   47  924    0  1395    3  17   0  80
 30  663   0  476  1553    0 3357    5   45  658    0  2680    9  16   0  75
 31  485   0  445  1520    0 3297    4   47  716    0  1167    2  16   0  82

We can see the non-interrupt handling CPUs in processor set 1 are totally idle – they’re just sitting there, twiddling their thumbs, and laughing at the other 16 CPUs working their socks off.

Involuntary context switches aren’t causing us an issue, so we can see that even with the reduced number of CPUs handling the interrupts, they’ve still managed to deal with the load.

Now let’s see what happens when we execute the single-thread iperf process inside processor set 1. We can control this by using the psrset command to launch our app.

bash-3.00# psrset -e 1 ./iperf --client np1unx0006 --time 60 --dualtest
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 48.0 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to np1unx0006, TCP port 5001
TCP window size: 48.0 KByte (default)
------------------------------------------------------------
[  4] local 192.168.105.62 port 37419 connected with 192.168.105.59 port 5001
[  5] local 192.168.105.62 port 5001 connected with 192.168.105.59 port 63457
[  4]  0.0-60.0 sec  3.36 GBytes    481 Mbits/sec
[  5]  0.0-60.0 sec  3.05 GBytes    436 Mbits/sec

And mpstat should give us an idea of what’s happening:

CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0    0   0    0  7751    0 16048    3    0  179    0 15235    3  42   0  55
  1    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
  2    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
  3    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
  4    0   0    0   201    0  403    6    0 2478    0  8254    3  81   0  16
  5    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
  6    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
  7    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
  8    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
  9    0   0    1     3    0    4    0    0    2    0     0    0   0   0 100
 10    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
 11    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
 12    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
 13    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
 14    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
 15    0   0    0     8    0    0    7    0    4    0 532794   75  25   0   0
 16  313   0  942  1346  114 2533    1   20  459    0   419    1  13   0  86
 17  306   0  450   687    1 1529    0   13  297    0   558    1  13   0  86
 18  399   0  467   653    3 1442    1   14  255    0  3559   13  13   0  74
 19  221   0  326   509    1 1164    0   11  196    0   401    1   7   0  92
 20  646   0 10825 47201 47171   90    7   18 2405    0  1210    4  55   0  40
 21  156   0  673   757    0 1769    0   15  346    0   201    1   9   0  91
 22  220   0  959  1055    0 2467    0   15  488    0   397    2  12   0  86
 23  204   0  844   791    2 1844    0   14  443    0   223    1  12   0  88
 24  341   0  718  1033    1 2439    1   13  401    0  2571   10  14   0  76
 25  205   0  570   804    0 1945    0   13  314    0   376    1   9   0  90
 26  262   0  369   584    0 1379    0   12  204    0   422    1   7   0  92
 27  199   0  348   519    0 1226    0   13  180    0   533    2   6   0  92
 28  356   0  434   726    0 1730    0   17  247    0   515    2   9   0  89
 29  393   0  267   428    0 1043    1   18  197    0   999    3  11   0  86
 30  367   0  491   812    0 1829    0   17  298    0   449    1  10   0  89
 31  302   0  424   675    0 1538    0   15  245    0   339    1   8   0  91

Well, that’s broken things. How come the processors in the set are now handling interrupts?

It looks like executing the binary inside the processor set still generates interrupts – but these are unlikely to be network I/O. Check out the number of syscalls being generated! It’s likely an artefact of my poor choice of application – iperf generates a huge number of interrupts and can really cane your Ethernet interfaces.
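
To see exactly which CPUs inside the set woke up, here’s a rough sketch that filters mpstat-style output down to the CPUs in a given ID range that aren’t 100% idle. busy_cpus and set_sample are names invented for this example, and the sample is a few rows copied from the mpstat output above.

```shell
# A few rows copied from the mpstat output taken while iperf ran inside the set.
set_sample='CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0    0   0    0  7751    0 16048    3    0  179    0 15235    3  42   0  55
  1    0   0    0     1    0    0    0    0    0    0     0    0   0   0 100
  4    0   0    0   201    0  403    6    0 2478    0  8254    3  81   0  16
  9    0   0    1     3    0    4    0    0    2    0     0    0   0   0 100
 15    0   0    0     8    0    0    7    0    4    0 532794   75  25   0   0
 20  646   0 10825 47201 47171   90    7   18 2405    0  1210    4  55   0  40'

# List CPUs with IDs between $1 and $2 (inclusive) whose idl column is below 100.
busy_cpus() {
    awk -v lo="$1" -v hi="$2" '
        # Header row: find the idl column by name.
        $1 == "CPU" { for (i = 1; i <= NF; i++) if ($i == "idl") c = i; next }
        # Data row: keep CPUs in range that show any activity at all.
        c && $1 + 0 >= lo + 0 && $1 + 0 <= hi + 0 && $c + 0 < 100 {
            out = out (out == "" ? "" : " ") $1
        }
        END { print out }
    '
}

printf '%s\n' "$set_sample" | busy_cpus 0 15
```

On these rows it reports CPUs 0, 4 and 15, so only three of the sixteen CPUs in the set are doing any work.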

We could use dtrace to have a real poke around, but I think that should be the topic for another day.

Now we’ve finished playing around, we need to re-enable interrupt handling on those CPUs. As the -f flag to psrset disabled interrupt handling, -n is the option we need to re-enable interrupt handling on a processor set.

bash-3.00# psrset -n 1

Now the CPUs are handling interrupts again, we need to delete the processor set. We do this by passing the psrset command the -d option, and giving it the processor set number:

bash-3.00# psrset -d 1
removed processor set 1
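
The two clean-up steps always go together, so as a small sketch they can be wrapped in one function. teardown_pset and the PSRSET variable are my own invention for this example. By default it only prints the commands, so it can be tried on any machine.

```shell
# PSRSET defaults to "echo psrset" so this sketch only prints the commands.
# Set PSRSET=psrset on a Solaris host to run them for real.
PSRSET=${PSRSET:-echo psrset}

# Re-enable interrupt handling on processor set $1, then delete the set.
# The delete only runs if the re-enable succeeded.
teardown_pset() {
    $PSRSET -n "$1" && $PSRSET -d "$1"
}

teardown_pset 1
```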

Finally let’s run psrinfo and double check the state of our CPUs:

bash-3.00# psrinfo
0       on-line   since 12/19/2006 18:23:42
1       on-line   since 12/19/2006 18:23:42
2       on-line   since 12/19/2006 18:23:42
3       on-line   since 12/19/2006 18:23:42
4       on-line   since 12/19/2006 18:23:42
5       on-line   since 12/19/2006 18:23:42
6       on-line   since 12/19/2006 18:23:42
7       on-line   since 12/19/2006 18:23:42
8       on-line   since 12/19/2006 18:23:42
9       on-line   since 12/19/2006 18:23:42
10      on-line   since 12/19/2006 18:23:42
11      on-line   since 12/19/2006 18:23:42
12      on-line   since 12/19/2006 18:23:42
13      on-line   since 12/19/2006 18:23:42
14      on-line   since 12/19/2006 18:23:42
15      on-line   since 12/19/2006 18:23:42
16      on-line   since 11/21/2006 20:24:58
17      on-line   since 11/21/2006 20:24:58
18      on-line   since 11/21/2006 20:24:58
19      on-line   since 11/21/2006 20:24:58
20      on-line   since 11/21/2006 20:24:58
21      on-line   since 11/21/2006 20:24:58
22      on-line   since 11/21/2006 20:24:58
23      on-line   since 11/21/2006 20:24:58
24      on-line   since 11/21/2006 20:24:58
25      on-line   since 11/21/2006 20:24:58
26      on-line   since 11/21/2006 20:24:58
27      on-line   since 11/21/2006 20:24:58
28      on-line   since 11/21/2006 20:24:58
29      on-line   since 11/21/2006 20:24:58
30      on-line   since 11/21/2006 20:24:58
31      on-line   since 11/21/2006 20:24:58

Solaris processor sets are the easiest to use of all the resource controls built into the OS. We can peg things like zones, individual applications, or even specific processes, to their own processor sets to control and manage resource usage. This gives us some really fine grained control over how the system is used, and with a machine like the T2000 it allows us to really scale performance.

Interview with LANL researcher about using GPUs

Over on their nTersect blog, NVidia have posted an interesting interview with Pat McCormick, a Research Computer Scientist at Los Alamos National Lab (LANL). If you’ve ever wondered exactly how using GPUs for computation would work, or how much of a performance improvement it could bring to your workloads, you should watch this interview.



According to Pat, “Our research challenge is dealing with massive amounts of data, not only from the high performance computing aspect but how to analyze the data from simulations.”

This isn’t just an HPC problem, it’s an issue that affects every business today. As storage expands and business needs grow, faster and more efficient methods of data analysis are needed – and GPUs seem to be offering the most cost-efficient way to solve this at the moment.

Timelapse video of Sandia’s Sun Constellation build

Sandia’s Sun Constellation system, Red Sky, has been placed at number 10 on the latest Top 500 supercomputer list. It’s a monster cluster system – 70TB of memory, 47,232 cores, and built up of Sun x6275 blade systems hooked up in a 3D torus with Infiniband interconnects.

Marc Hamilton has posted up a timelapse video on his blog over at Sun showing the system being installed.



Red Sky has replaced Sandia’s existing Thunderbird system, and is actually built in the same place as ASCI Red used to live. The x6275 blades use Intel Nehalem EP processors running at 2.93GHz, with no local disk fitted to the blades, allowing a much greater density.

Red Sky also features Sun’s new Cooling Door System which pumps cooled water through the cabinet doors. Sandia’s calculations reckon that this will save over 5 million gallons of water a year, compared to traditional air-cooled systems.

Free Solaris 10 security training

Over on his blog at Sun, Glenn Brunette has announced he’s published a new version of the Solaris 10 Deep Dive security training. He’s updated it to cover new features and tools available in the latest 10/09 release of Solaris 10.

The updated Deep Dive includes things like nss_LDAP support for shadowAccount, ZFS quotas, and an example of using the Solaris Trusted Extensions. As usual it’s well written and aims to expose a huge amount of technology very quickly – so grab a copy and have a read through.

You can grab the PDF here or the OpenOffice version here.

Glenn’s blog at Sun is well worth subscribing to if you want to keep on top of general security issues and discussions, and if you like the latest Deep Dive update be sure to drop him a line.

Sun’s research report on Hardware Transactional Memory

Sun have released a technical report on Transactional Memory, based on their experiences with the (now sadly canned) ROCK processor. “Early Experience with a Commercial Hardware Transactional Memory Implementation” is available as a free download from Sun’s research website – you can grab it at http://research.sun.com/techrep/2009/abstract-180.html

From the abstract:

We report on our experience with the hardware transactional memory (HTM) feature of two revisions of a prototype multicore processor. Our experience includes a number of promising results using HTM to improve performance in a variety of contexts, and also identifies some ways in which the feature could be improved to make it even better. We give detailed accounts of our experiences, sharing techniques we used to achieve the results we have, as well as describing challenges we faced in doing so. This technical report expands on our ASPLOS paper [9], providing more detail and reporting on additional work conducted since that paper was written.

Anyone who’s interested in High Performance Computing (HPC) or performance gains from Transactional Memory should have a read through this paper – it’s interesting stuff.
