Demonstrations of cpudist.

This program summarizes task on-CPU time as a histogram, showing how long
tasks spend on the CPU before being descheduled. This provides valuable
information that can indicate oversubscription (too many tasks for too few
processors), overhead due to excessive context switching (e.g. a common shared
lock for multiple threads), uneven workload distribution, too-granular tasks,
and more.

Alternatively, the same options are available for summarizing task off-CPU
time, which helps you understand how often threads are being descheduled and
how long they spend waiting for I/O, locks, timers, and other causes of
suspension.

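cpudist is implemented with BPF: it instruments context switches, timestamps
when a task goes on-CPU, and aggregates the elapsed time when that task is
switched out. As a rough illustration only -- not cpudist's actual
implementation -- a similar on-CPU histogram can be sketched with a bpftrace
one-liner on the sched_switch tracepoint:

# bpftrace -e 'tracepoint:sched:sched_switch {
    // the incoming task goes on-CPU now: remember the timestamp
    @start[args->next_pid] = nsecs;
    // the outgoing task is leaving the CPU: record how long it ran, in usecs
    if (@start[args->prev_pid]) {
        @usecs = hist((nsecs - @start[args->prev_pid]) / 1000);
        delete(@start[args->prev_pid]);
    }
}'

This sketch ignores details such as filtering the idle task and breaking the
histogram down per process or thread, which cpudist handles for you.
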
# ./cpudist.py
Tracing on-CPU time... Hit Ctrl-C to end.
^C
     usecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 1        |                                        |
         4 -> 7          : 1        |                                        |
         8 -> 15         : 13       |**                                      |
        16 -> 31         : 187      |****************************************|
        32 -> 63         : 89       |*******************                     |
        64 -> 127        : 26       |*****                                   |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 1        |                                        |

This is from a mostly idle system. Tasks wake up occasionally and run for only
a few dozen microseconds, and then get descheduled.

Here's some output from a system that is heavily loaded by threads that perform
computation but also compete for a lock:

# ./cpudist.py
Tracing on-CPU time... Hit Ctrl-C to end.
^C
     usecs               : count     distribution
         0 -> 1          : 51       |*                                       |
         2 -> 3          : 395      |***********                             |
         4 -> 7          : 259      |*******                                 |
         8 -> 15         : 61       |*                                       |
        16 -> 31         : 75       |**                                      |
        32 -> 63         : 31       |                                        |
        64 -> 127        : 7        |                                        |
       128 -> 255        : 5        |                                        |
       256 -> 511        : 3        |                                        |
       512 -> 1023       : 5        |                                        |
      1024 -> 2047       : 6        |                                        |
      2048 -> 4095       : 4        |                                        |
      4096 -> 8191       : 1361     |****************************************|
      8192 -> 16383      : 523      |***************                         |
     16384 -> 32767      : 3        |                                        |

A bimodal distribution is now clearly visible. Most of the time, tasks were
able to run for 4-16ms before being descheduled (this is likely the quantum
length). Occasionally, tasks had to be descheduled a lot earlier -- possibly
because they competed for a shared lock.

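If you want to sanity-check the quantum guess, the scheduler's targeted
latency and minimum granularity can be inspected directly. Where they live
depends on the kernel version (an assumption to verify on your system): older
kernels expose them as sysctls, newer kernels place them under debugfs. Both
report values in nanoseconds:

# cat /proc/sys/kernel/sched_latency_ns /proc/sys/kernel/sched_min_granularity_ns
# cat /sys/kernel/debug/sched/latency_ns /sys/kernel/debug/sched/min_granularity_ns

The effective timeslice also depends on the number of runnable tasks, so treat
this only as a rough cross-check against the histogram.
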
If necessary, you can restrict the output to include only threads from a
particular process -- this helps reduce noise:

# ./cpudist.py -p $(pidof parprimes)
Tracing on-CPU time... Hit Ctrl-C to end.
^C
     usecs               : count     distribution
         0 -> 1          : 3        |                                        |
         2 -> 3          : 17       |                                        |
         4 -> 7          : 39       |                                        |
         8 -> 15         : 52       |*                                       |
        16 -> 31         : 43       |                                        |
        32 -> 63         : 12       |                                        |
        64 -> 127        : 13       |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 1        |                                        |
       512 -> 1023       : 11       |                                        |
      1024 -> 2047       : 15       |                                        |
      2048 -> 4095       : 41       |                                        |
      4096 -> 8191       : 1134     |************************                |
      8192 -> 16383      : 1883     |****************************************|
     16384 -> 32767      : 65       |*                                       |

You can also ask for output at predefined intervals, and include timestamps for
easier interpretation. While we're at it, the -P switch will print a histogram
separately for each process:

# ./cpudist.py -TP 5 3
Tracing on-CPU time... Hit Ctrl-C to end.

03:46:51

pid = 0
     usecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 1        |**                                      |
         4 -> 7          : 17       |**********************************      |
         8 -> 15         : 11       |**********************                  |
        16 -> 31         : 20       |****************************************|
        32 -> 63         : 15       |******************************          |
        64 -> 127        : 9        |******************                      |
       128 -> 255        : 6        |************                            |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 0        |                                        |
      2048 -> 4095       : 1        |**                                      |

pid = 5068
     usecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 1        |*************                           |
         4 -> 7          : 3        |****************************************|
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 1        |*************                           |

03:46:56

pid = 0
     usecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 1        |**                                      |
         4 -> 7          : 19       |****************************************|
         8 -> 15         : 11       |***********************                 |
        16 -> 31         : 9        |******************                      |
        32 -> 63         : 3        |******                                  |
        64 -> 127        : 1        |**                                      |
       128 -> 255        : 3        |******                                  |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 1        |**                                      |

pid = 5068
     usecs               : count     distribution
         0 -> 1          : 1        |********************                    |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 2        |****************************************|

03:47:01

pid = 0
     usecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 12       |********************************        |
         8 -> 15         : 15       |****************************************|
        16 -> 31         : 15       |****************************************|
        32 -> 63         : 0        |                                        |
        64 -> 127        : 3        |********                                |
       128 -> 255        : 1        |**                                      |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 1        |**                                      |

pid = 5068
     usecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 1        |******                                  |
         4 -> 7          : 6        |****************************************|
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 2        |*************                           |

The output above was obtained while executing `dd if=/dev/zero of=/dev/null`
with fairly large block sizes.

You could also ask for an off-CPU report using the -O switch. Here's a
histogram of task block times while the system is heavily loaded:

# ./cpudist.py -O -p $(pidof parprimes)
Tracing off-CPU time... Hit Ctrl-C to end.
^C
     usecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 1        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 3        |                                        |
        64 -> 127        : 1        |                                        |
       128 -> 255        : 1        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 2        |                                        |
      1024 -> 2047       : 4        |                                        |
      2048 -> 4095       : 3        |                                        |
      4096 -> 8191       : 70       |***                                     |
      8192 -> 16383      : 867      |****************************************|
     16384 -> 32767      : 141      |******                                  |
     32768 -> 65535      : 8        |                                        |
     65536 -> 131071     : 0        |                                        |
    131072 -> 262143     : 1        |                                        |
    262144 -> 524287     : 2        |                                        |
    524288 -> 1048575    : 3        |                                        |

As you can see, threads are switching out for relatively long intervals, even
though we know the workload doesn't have any significant blocking. This can be
a result of oversubscription -- too many threads contending for too few CPUs.
Indeed, there are four available CPUs and more than four runnable threads:

# nproc
4
# cat /proc/loadavg
0.04 0.11 0.06 9/147 7494

(This shows we have 9 threads runnable out of 147 total. This is more than 4,
the number of available CPUs.)

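If you want to script this comparison, the runnable count is the part of the
fourth /proc/loadavg field before the slash; the awk invocation below is just
one way to extract it (the 9 matches the output above):

# awk '{split($4, a, "/"); print a[1]}' /proc/loadavg
9
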
Finally, let's ask for a per-thread report and values in milliseconds instead
of microseconds:

# ./cpudist.py -p $(pidof parprimes) -mL
Tracing on-CPU time... Hit Ctrl-C to end.


tid = 5092
     msecs               : count     distribution
         0 -> 1          : 3        |                                        |
         2 -> 3          : 4        |                                        |
         4 -> 7          : 4        |                                        |
         8 -> 15         : 535      |****************************************|
        16 -> 31         : 14       |*                                       |

tid = 5093
     msecs               : count     distribution
         0 -> 1          : 8        |                                        |
         2 -> 3          : 6        |                                        |
         4 -> 7          : 4        |                                        |
         8 -> 15         : 534      |****************************************|
        16 -> 31         : 12       |                                        |

tid = 5094
     msecs               : count     distribution
         0 -> 1          : 38       |***                                     |
         2 -> 3          : 5        |                                        |
         4 -> 7          : 5        |                                        |
         8 -> 15         : 476      |****************************************|
        16 -> 31         : 25       |**                                      |

tid = 5095
     msecs               : count     distribution
         0 -> 1          : 31       |**                                      |
         2 -> 3          : 6        |                                        |
         4 -> 7          : 10       |                                        |
         8 -> 15         : 478      |****************************************|
        16 -> 31         : 20       |*                                       |

tid = 5096
     msecs               : count     distribution
         0 -> 1          : 21       |*                                       |
         2 -> 3          : 5        |                                        |
         4 -> 7          : 4        |                                        |
         8 -> 15         : 523      |****************************************|
        16 -> 31         : 16       |*                                       |

tid = 5097
     msecs               : count     distribution
         0 -> 1          : 11       |                                        |
         2 -> 3          : 7        |                                        |
         4 -> 7          : 7        |                                        |
         8 -> 15         : 502      |****************************************|
        16 -> 31         : 23       |*                                       |

tid = 5098
     msecs               : count     distribution
         0 -> 1          : 21       |*                                       |
         2 -> 3          : 5        |                                        |
         4 -> 7          : 3        |                                        |
         8 -> 15         : 494      |****************************************|
        16 -> 31         : 28       |**                                      |

tid = 5099
     msecs               : count     distribution
         0 -> 1          : 15       |*                                       |
         2 -> 3          : 4        |                                        |
         4 -> 7          : 6        |                                        |
         8 -> 15         : 521      |****************************************|
        16 -> 31         : 12       |                                        |

It looks like all threads are more-or-less equally busy, and are typically
switched out after running for 8-15 milliseconds (again, this is the typical
quantum length).


USAGE message:

# ./cpudist.py -h

usage: cpudist.py [-h] [-O] [-T] [-m] [-P] [-L] [-p PID] [interval] [count]

Summarize on-CPU time per task as a histogram.

positional arguments:
  interval            output interval, in seconds
  count               number of outputs

optional arguments:
  -h, --help          show this help message and exit
  -O, --offcpu        measure off-CPU time
  -T, --timestamp     include timestamp on output
  -m, --milliseconds  millisecond histogram
  -P, --pids          print a histogram per process ID
  -L, --tids          print a histogram per thread ID
  -p PID, --pid PID   trace this PID only

examples:
    cpudist              # summarize on-CPU time as a histogram
    cpudist -O           # summarize off-CPU time as a histogram
    cpudist 1 10         # print 1 second summaries, 10 times
    cpudist -mT 1        # 1s summaries, milliseconds, and timestamps
    cpudist -P           # show each PID separately
    cpudist -p 185       # trace PID 185 only
    307