Home | History | Annotate | Download | only in tools
      1 Demonstrations of cpuunclaimed, the Linux eBPF/bcc version.
      2 
      3 
      4 This tool samples the length of the CPU run queues and determine when there are
      5 idle CPUs, yet queued threads waiting their turn. It reports the amount of idle
      6 (yet unclaimed by waiting threads) CPU as a system-wide percentage. For
      7 example:
      8 
      9 # ./cpuunclaimed.py
     10 Sampling run queues... Output every 1 seconds. Hit Ctrl-C to end.
     11 %CPU  83.00%, unclaimed idle 0.12%
     12 %CPU  87.25%, unclaimed idle 0.38%
     13 %CPU  85.00%, unclaimed idle 0.25%
     14 %CPU  85.00%, unclaimed idle 0.25%
     15 %CPU  80.88%, unclaimed idle 0.00%
     16 %CPU  82.25%, unclaimed idle 0.00%
     17 %CPU  83.50%, unclaimed idle 0.12%
     18 %CPU  81.50%, unclaimed idle 0.00%
     19 %CPU  81.38%, unclaimed idle 0.00%
     20 [...]
     21 
     22 This shows a system running at over 80% CPU utilization, and with less than
     23 0.5% unclaimed idle CPUs.
     24 
     25 Unclaimed idle CPUs can happen for a number of reasons:
     26 
     27 - An application has been bound to some, but not all, CPUs, and has runnable
     28   threads that cannot migrate to other CPUs due to this configuration.
     29 - CPU affinity: an optimization that leaves threads on CPUs where the CPU
     30   caches are warm, even if this means short periods of waiting while other
     31   CPUs are idle. The wait period is tunale (see sysctl, kernel.sched*).
     32 - Scheduler bugs.
     33 
     34 An unclaimed idle of < 1% is likely to be CPU affinity, and not usually a
     35 cause for concern. By leaving the CPU idle, overall throughput of the system
     36 may be improved. This tool is best for identifying larger issues, > 2%, due
     37 to the coarseness of its 99 Hertz samples.
     38 
     39 
     40 This is an 8 CPU system, with an 8 CPU-bound threaded application running that
     41 has been bound to one CPU (via taskset):
     42 
     43 # ./cpuunclaimed.py 
     44 Sampling run queues... Output every 1 seconds. Hit Ctrl-C to end.
     45 %CPU  12.63%, unclaimed idle 86.36%
     46 %CPU  12.50%, unclaimed idle 87.50%
     47 %CPU  12.63%, unclaimed idle 87.37%
     48 %CPU  12.75%, unclaimed idle 87.25%
     49 %CPU  12.50%, unclaimed idle 87.50%
     50 %CPU  12.63%, unclaimed idle 87.37%
     51 %CPU  12.50%, unclaimed idle 87.50%
     52 %CPU  12.50%, unclaimed idle 87.50%
     53 [...]
     54 
     55 It shows that 7 of the 8 CPUs (87.5%) are idle at the same time there are
     56 queued threads waiting to run on CPU. This is an artificial situation caused
     57 by binding threads to the same CPU, to demonstrate how the tool works.
     58 
     59 
     60 This is an 8 CPU system running a Linux kernel build with "make -j8", and -T
     61 to print timestamps:
     62 
     63 # ./cpuunclaimed.py -T
     64 Sampling run queues... Output every 1 seconds. Hit Ctrl-C to end.
     65 22:25:55 %CPU  98.88%, unclaimed idle 0.12%
     66 22:25:56 %CPU  99.75%, unclaimed idle 0.25%
     67 22:25:57 %CPU  99.50%, unclaimed idle 0.50%
     68 22:25:58 %CPU  99.25%, unclaimed idle 0.75%
     69 22:25:59 %CPU  99.75%, unclaimed idle 0.25%
     70 22:26:00 %CPU  99.50%, unclaimed idle 0.50%
     71 22:26:01 %CPU  99.25%, unclaimed idle 0.75%
     72 22:26:02 %CPU  99.25%, unclaimed idle 0.75%
     73 22:26:03 %CPU  99.01%, unclaimed idle 0.87%
     74 22:26:04 %CPU  99.88%, unclaimed idle 0.12%
     75 22:26:05 %CPU  99.38%, unclaimed idle 0.62%
     76 
     77 There's now a consistent, yet small, amount of unclaimed idle CPU. This is
     78 expected to be deliberate: CPU affinity, as mentioned earlier.
     79 
     80 
     81 The -j option will print raw samples: around one hundred lines of output
     82 every second. For the same system with a Linux kernel build of "make -j8":
     83 
     84 # ./cpuunclaimed.py -j
     85 TIMESTAMP_ns,CPU0,CPU1,CPU2,CPU3,CPU4,CPU5,CPU6,CPU7
     86 514606928954752,1,1,1,1,1,1,1,1
     87 514606939054312,1,1,1,1,1,1,1,2
     88 514606949156518,1,1,1,1,1,1,1,1
     89 514606959256596,2,2,1,1,1,1,1,1
     90 514606969357989,1,1,1,1,1,2,1,1
     91 514606979459700,1,2,1,1,1,2,1,1
     92 514606989560481,1,1,1,1,1,1,1,1
     93 514606999661396,1,1,1,1,1,1,2,1
     94 514607009795601,1,1,1,1,1,1,1,2
     95 514607019862711,1,1,1,1,1,1,1,1
     96 514607029963734,1,1,1,1,1,1,1,1
     97 514607040062372,1,1,1,1,1,1,1,1
     98 514607050197735,1,1,1,2,1,1,1,1
     99 514607060266464,1,1,1,1,1,1,1,2
    100 514607070368025,1,1,1,1,1,2,1,1
    101 514607080468375,1,1,1,1,1,1,1,2
    102 514607090570292,3,2,1,1,1,1,1,1
    103 514607100670725,1,1,1,1,1,2,1,1
    104 514607110771946,1,2,1,1,1,1,1,1
    105 514607120873489,1,1,1,1,2,1,2,1
    106 514607130973857,2,1,1,1,3,1,1,1
    107 514607141080056,0,1,1,1,1,2,1,3
    108 514607151176312,1,1,1,2,1,1,1,1
    109 514607161277753,1,1,1,1,1,1,2,1
    110 514607171379095,1,1,1,1,1,1,1,1
    111 514607181479262,1,1,1,1,1,1,1,1
    112 514607191580794,3,1,1,1,1,1,1,1
    113 514607201680952,1,1,1,1,1,1,2,1
    114 514607211783683,1,1,1,1,1,1,1,1
    115 514607221883274,1,1,1,1,1,1,0,1
    116 514607231984244,1,1,1,1,1,1,1,1
    117 514607242085698,1,1,1,1,1,1,1,1
    118 514607252216898,1,2,1,1,1,1,1,1
    119 514607262289420,1,1,1,1,1,2,1,1
    120 514607272389922,1,1,1,1,1,1,1,1
    121 514607282489413,1,1,1,1,1,1,1,1
    122 514607292589950,1,3,1,1,1,1,1,1
    123 514607302693367,1,1,1,1,2,1,1,1
    124 514607312793792,1,1,1,1,1,1,1,1
    125 514607322895249,1,1,1,3,1,1,3,1
    126 514607332994278,1,0,1,1,1,2,1,2
    127 514607343095836,1,1,1,1,1,2,1,1
    128 514607353196533,1,1,1,1,2,1,1,1
    129 514607363297749,1,1,1,1,1,1,1,2
    130 514607373399011,1,1,1,1,1,1,1,2
    131 514607383499730,1,1,1,1,1,1,1,2
    132 514607393601510,1,1,1,1,1,1,1,2
    133 514607403704117,2,1,1,1,1,1,1,2
    134 514607413802700,1,1,1,1,2,1,0,1
    135 514607423904559,1,1,1,1,1,1,1,1
    136 [...]
    137 
    138 The output is verbose: printing out a timestamp, and then the length of each
    139 CPU's run queue. The second last line, of timestamp 514607413802700, is an
    140 example of what this tool detects: CPU 4 has a run queue length of 4, which
    141 means one thread running and one thread queued, while CPU 6 has a run queue
    142 length of 0: idle. The very next sample shows all CPUs busy.
    143 
    144 
    145 The -J option prints raw samples with time offsets showing when the samples
    146 were collected on each CPU. It's mostly useful for debugging the tool itself.
    147 For example, during a Linux kernel build:
    148 
    149 # ./cpuunclaimed.py -J
    150 TIMESTAMP_ns,CPU0,CPU1,CPU2,CPU3,CPU4,CPU5,CPU6,CPU7,OFFSET_ns_CPU0,OFFSET_ns_CPU1,OFFSET_ns_CPU2,OFFSET_ns_CPU3,OFFSET_ns_CPU4,OFFSET_ns_CPU5,OFFSET_ns_CPU6,OFFSET_ns_CPU7
    151 514722625198188,1,1,1,1,1,1,1,2,0,28321,51655,73396,89654,111172,132803,159792
    152 514722635299034,1,1,1,1,1,2,1,1,0,28809,51999,74183,89552,110011,131995,153519
    153 514722645400274,1,1,1,1,1,1,1,2,0,28024,51333,73652,88964,110075,131973,153568
    154 514722655501816,1,2,1,1,1,1,1,1,0,28893,51671,75233,89496,109430,131945,153694
    155 514722665602594,1,1,2,1,1,2,1,1,0,28623,50988,73866,89383,109186,131786,154555
    156 514722675703498,1,1,1,1,1,1,1,1,0,27379,51031,73175,89625,110380,131482,104811
    157 514722685804942,1,1,1,1,1,2,1,1,0,27213,50501,72183,88797,108780,130659,152153
    158 514722695906294,1,1,1,1,1,1,1,1,0,27036,51182,73420,87861,109585,130364,155089
    159 514722706005778,1,1,1,1,1,1,1,1,0,28492,51851,74138,89744,110208,132462,154060
    160 514722716060705,1,1,1,1,1,1,1,1,0,154499,152528,155232,155046,154502,178746,200001
    161 514722726209615,1,1,1,1,1,1,1,1,0,28170,49580,72605,87741,108144,130723,152138
    162 514722736309475,1,2,1,1,1,1,1,1,0,27421,51386,73061,89358,109457,131273,153005
    163 514722746410845,1,2,1,1,1,2,1,1,0,27788,50840,72720,88920,109111,131143,152979
    164 514722756511363,1,1,1,1,1,1,2,1,0,28280,50977,73559,89848,109659,131579,152693
    165 514722766613044,1,1,1,1,1,1,1,1,0,28046,50812,72754,89160,110108,130735,152948
    166 514722776712932,1,1,1,2,1,1,1,1,0,28586,51177,73974,89588,109947,132376,154162
    167 514722786815477,1,1,1,1,1,1,1,1,0,27973,71104,72539,88302,108896,130414,152236
    168 514722796914955,1,1,1,1,1,1,1,1,0,29054,52354,74214,89592,110615,132586,153925
    169 514722807044060,1,1,1,1,1,1,1,1,1587130,0,24079,46633,61787,82325,104706,125278
    170 514722817117432,2,1,2,1,1,1,1,1,0,27628,51038,75138,89724,109340,132426,155348
    171 514722827218254,1,1,1,1,1,1,2,1,0,29111,51868,74347,88904,109911,132764,153851
    172 514722837340158,1,1,1,1,1,1,1,1,0,7366,30760,53528,68622,89317,111095,132319
    173 514722847421305,1,1,1,1,1,1,1,1,0,28257,51105,73841,89037,110820,131605,153368
    174 514722857521112,1,1,1,1,1,1,1,1,0,28544,51441,73857,89530,110497,131915,153513
    175 514722867626129,0,2,1,1,1,1,1,1,0,24621,47917,70568,85391,106670,128081,150329
    176 514722877727183,2,1,1,1,1,1,1,1,0,24869,47630,71547,84761,106048,128444,149285
    177 514722887824589,1,1,1,1,1,1,2,1,0,28793,51212,73863,89584,109773,132348,153194
    178 514722897925481,1,1,1,1,1,1,2,1,0,29278,51163,73961,89774,109592,132029,153715
    179 514722908026097,1,1,1,1,1,1,1,1,0,30630,35595,36210,188001,190815,190072,190732
    180 514722918127439,1,1,1,1,1,1,1,1,0,28544,51885,73948,89987,109763,132632,154083
    181 514722928227399,1,1,1,1,1,1,1,1,0,31882,51574,74769,89939,110578,132951,154356
    182 514722938329471,1,1,1,1,1,1,1,1,0,28498,51304,74101,89670,110278,132653,153176
    183 514722948430589,1,1,1,1,1,1,1,1,0,27868,50925,73477,89676,109583,132360,153014
    184 514722958531802,1,1,1,1,1,1,1,1,0,28505,50886,73729,89919,109618,131988,152896
    185 514722968632181,1,1,1,1,1,1,1,1,0,28492,51749,73977,90334,109816,132897,152890
    186 514722978733584,1,1,1,1,1,1,1,1,0,28847,50957,74121,90014,110019,132377,152978
    187 514722988834321,1,1,1,1,1,1,1,1,0,28601,51437,74021,89968,110252,132233,153623
    188 514722998937170,1,1,2,1,1,1,1,1,0,27007,50044,73259,87725,108663,132194,152459
    189 514723009036821,1,2,1,2,1,1,1,1,0,28226,50937,73983,89110,110476,131740,153663
    190 514723019137577,1,1,1,1,1,1,1,1,0,30261,52357,75657,87803,61823,131850,153585
    191 514723029238745,1,1,1,1,1,1,1,1,0,28030,50752,74452,89240,110791,132187,153327
    192 514723039339069,1,1,1,1,1,1,1,1,0,29791,52636,75996,90475,110414,132232,154714
    193 514723049439822,1,1,1,1,2,1,1,1,0,29133,56662,74153,89520,110683,132740,154708
    194 514723059541617,1,1,1,1,1,1,1,1,0,27932,51480,74644,89656,109176,131499,153732
    195 514723069642500,1,1,2,1,1,1,2,1,0,27678,51509,73984,90136,110124,131554,153459
    196 514723079743525,2,1,1,1,1,1,1,1,0,28029,51424,74394,90056,110087,132383,152963
    197 514723089844091,2,1,1,2,1,1,1,1,0,28944,51692,74440,90339,110402,132722,154083
    198 514723099945957,1,1,2,1,1,1,1,1,0,28425,51267,73164,89322,115048,114630,115187
    199 514723110047020,1,1,2,0,1,1,1,2,0,28192,50811,76814,89835,109370,131265,153511
    200 514723120216662,1,1,2,1,1,2,1,1,29,34,0,4514,19268,40293,62674,84009
    201 [...]
    202 
    203 This is a Xen guest system, and it shows that CPU 0 usually completes first (an
    204 offset of 0), followed by CPU 1 around 28000 nanoseconds later, and so on.
    205 The spread of offsets is triggered by the bcc Python library that initializes
    206 the timers, which steps through the CPUs in sequence, with a small delay
    207 between them merely from executing its own loop code.
    208 
    209 Here's more output during a Linux kernel build:
    210 
    211 # ./cpuunclaimed.py -J
    212 TIMESTAMP_ns,CPU0,CPU1,CPU2,CPU3,CPU4,CPU5,CPU6,CPU7,OFFSET_ns_CPU0,OFFSET_ns_CPU1,OFFSET_ns_CPU2,OFFSET_ns_CPU3,OFFSET_ns_CPU4,OFFSET_ns_CPU5,OFFSET_ns_CPU6,OFFSET_ns_CPU7
    213 514722625198188,1,1,1,1,1,1,1,2,0,28321,51655,73396,89654,111172,132803,159792
    214 515700745758947,2,1,1,1,1,1,1,1,0,19835,34891,49397,59364,71988,87571,102769
    215 515700755860451,2,1,1,1,1,1,1,2,0,19946,34323,49855,59844,72741,87925,102891
    216 515700765960560,1,1,1,1,1,1,1,1,0,20805,35339,50436,59677,73557,88661,104796
    217 515700776061744,1,1,1,1,1,1,1,1,1626,77,0,190,153452,154665,178218,154116
    218 515700786162017,1,1,1,1,1,1,1,1,0,20497,35361,51552,59787,74451,147789,104545
    219 515700796262811,1,1,1,1,1,1,1,2,0,20910,35657,50805,60175,73953,88492,103527
    220 515700806364951,1,1,1,1,1,1,1,1,0,20140,35023,50074,59726,72757,88040,102421
    221 515700816465253,1,1,1,1,1,1,2,1,0,20952,34899,50262,60048,72890,88067,103545
    222 515700826566573,1,1,1,1,1,1,1,1,0,20898,35490,50609,59805,74060,88550,103354
    223 515700836667480,1,1,1,1,1,1,2,1,0,20548,34760,50959,59490,73059,87820,103006
    224 515700846768182,1,1,1,1,1,1,2,1,0,20571,35113,50777,59962,74139,88543,103192
    225 515700856869468,1,1,2,1,1,2,2,1,0,20932,35382,50510,60106,73739,91818,103684
    226 515700866971905,1,1,1,2,1,1,1,1,0,19780,33018,49075,58375,71949,86537,102136
    227 515700877073459,2,1,1,1,1,1,1,1,0,20065,73966,48989,58832,71408,85714,101067
    228 515700887172772,1,1,1,1,1,1,1,1,0,20909,34608,51493,59890,73564,88668,103454
    229 515700897273292,1,2,1,1,1,1,1,1,0,20353,35292,50114,59773,73948,88615,103383
    230 515700907374341,1,1,2,1,1,1,1,1,0,20816,35206,50915,60062,73878,88857,103794
    231 515700917475331,1,1,6,1,1,2,1,1,0,20752,34931,50280,59764,73781,88329,103234
    232 515700927576958,1,1,1,1,1,1,1,1,0,19929,34703,50181,59364,73004,88053,103127
    233 515700937677298,1,1,2,2,1,1,1,1,0,21178,34724,50740,61193,73452,89030,103390
    234 515700947778409,2,1,1,1,1,1,1,1,0,21059,35604,50853,60098,73919,88675,103506
    235 515700957879196,2,1,1,1,1,1,1,1,0,21326,35939,51492,60083,74249,89474,103761
    236 [...]
    237 
    238 Notice the tighter range of offsets? I began executing cpuunclaimed when the
    239 system was idle, and it initialized the CPU timers more quickly, and then I
    240 began the Linux kernel build.
    241 
    242 Here's some different output, this time from a physical system with 4 CPUs,
    243 also doing a kernel build,
    244 
    245 # ./cpuunclaimed.py -J
    246 TIMESTAMP_ns,CPU0,CPU1,CPU2,CPU3,OFFSET_ns_CPU0,OFFSET_ns_CPU1,OFFSET_ns_CPU2,OFFSET_ns_CPU3
    247 4429382557480,1,1,1,1,0,6011,10895,16018
    248 4429392655042,2,1,1,1,0,8217,13661,19378
    249 4429402757604,1,1,1,1,0,6879,12433,18000
    250 4429412857809,1,1,1,1,0,8303,13190,18719
    251 4429422960709,2,1,1,1,0,6095,11234,17079
    252 4429433060391,1,1,1,2,0,6747,12480,18070
    253 4429443161699,1,1,1,1,0,6560,12264,17945
    254 4429453262002,1,2,1,1,0,6992,12644,18341
    255 4429463363706,1,2,1,1,0,6211,12071,17853
    256 4429473465571,1,1,1,1,0,5766,11495,17638
    257 4429483566920,1,1,1,1,0,5223,11736,16358
    258 4429493666279,1,1,1,1,0,6964,12653,18410
    259 4429503769113,1,1,1,1,0,5161,11399,16612
    260 4429513870744,1,1,1,1,0,5943,10583,15768
    261 4429523969826,1,1,1,1,0,6533,12336,18189
    262 4429534070311,1,1,1,1,0,6834,12816,18488
    263 4429544170456,1,1,1,1,0,7284,13401,19129
    264 4429554274467,1,2,1,1,0,5941,11160,16594
    265 4429564372365,1,2,1,1,0,7514,13618,19190
    266 4429574474406,1,2,1,1,0,6687,12650,18248
    267 4429584574220,1,2,1,1,0,7912,13705,19136
    268 [...]
    269 
    270 If the offset range becomes too great, we can no longer conclude about when
    271 some CPUs were idle and others had queued work. The tool will detect this,
    272 and print an error message and exit.
    273 
    274 
    275 Some systems can power down CPUs when idle, and when they wake up again the
    276 timed samples may resume from different offsets. If this happens, this tool
    277 can no longer draw conclusions about when some CPUs were idle and others
    278 had queued work, so it prints an error, and exits. Eg:
    279 
    280 # ./cpuunclaimed.py 1
    281 Sampling run queues... Output every 1 seconds. Hit Ctrl-C to end.
    282 %CPU   0.25%, unclaimed idle 0.00%
    283 %CPU   0.75%, unclaimed idle 0.00%
    284 %CPU   0.25%, unclaimed idle 0.00%
    285 %CPU   0.00%, unclaimed idle 0.00%
    286 %CPU   0.00%, unclaimed idle 0.00%
    287 %CPU   0.12%, unclaimed idle 0.00%
    288 %CPU   0.00%, unclaimed idle 0.00%
    289 %CPU   0.25%, unclaimed idle 0.00%
    290 %CPU   0.00%, unclaimed idle 0.00%
    291 %CPU   0.12%, unclaimed idle 0.00%
    292 %CPU   0.13%, unclaimed idle 0.00%
    293 %CPU   0.12%, unclaimed idle 0.00%
    294 %CPU   0.00%, unclaimed idle 0.00%
    295 %CPU   0.00%, unclaimed idle 0.00%
    296 %CPU   0.00%, unclaimed idle 0.00%
    297 %CPU   0.00%, unclaimed idle 0.00%
    298 ERROR: CPU samples arrived at skewed offsets (CPUs may have powered down when idle), spanning 4328176 ns (expected < 4040404 ns). Debug with -J, and see the man page. As output may begin to be unreliable, exiting.
    299 
    300 It's expected that this will only really occur on idle systems.
    301 
    302 USAGE:
    303 
    304 # ./cpuunclaimed.py -h
    305 usage: cpuunclaimed.py [-h] [-j] [-J] [-T] [interval] [count]
    306 
    307 Sample CPU run queues and calculate unclaimed idle CPU
    308 
    309 positional arguments:
    310   interval         output interval, in seconds
    311   count            number of outputs
    312 
    313 optional arguments:
    314   -h, --help       show this help message and exit
    315   -j, --csv        print sample summaries (verbose) as comma-separated values
    316   -J, --fullcsv    print sample summaries with extra fields: CPU sample
    317                    offsets
    318   -T, --timestamp  include timestamp on output
    319 
    320 examples:
    321     ./cpuunclaimed            # sample and calculate unclaimed idle CPUs,
    322                               # output every 1 second (default)
    323     ./cpuunclaimed 5 10       # print 5 second summaries, 10 times
    324     ./cpuunclaimed -T 1       # 1s summaries and timestamps
    325     ./cpuunclaimed -j         # raw dump of all samples (verbose), CSV
    326