1 Demonstrations of cpuunclaimed, the Linux eBPF/bcc version. 2 3 4 This tool samples the length of the CPU run queues and determine when there are 5 idle CPUs, yet queued threads waiting their turn. It reports the amount of idle 6 (yet unclaimed by waiting threads) CPU as a system-wide percentage. For 7 example: 8 9 # ./cpuunclaimed.py 10 Sampling run queues... Output every 1 seconds. Hit Ctrl-C to end. 11 %CPU 83.00%, unclaimed idle 0.12% 12 %CPU 87.25%, unclaimed idle 0.38% 13 %CPU 85.00%, unclaimed idle 0.25% 14 %CPU 85.00%, unclaimed idle 0.25% 15 %CPU 80.88%, unclaimed idle 0.00% 16 %CPU 82.25%, unclaimed idle 0.00% 17 %CPU 83.50%, unclaimed idle 0.12% 18 %CPU 81.50%, unclaimed idle 0.00% 19 %CPU 81.38%, unclaimed idle 0.00% 20 [...] 21 22 This shows a system running at over 80% CPU utilization, and with less than 23 0.5% unclaimed idle CPUs. 24 25 Unclaimed idle CPUs can happen for a number of reasons: 26 27 - An application has been bound to some, but not all, CPUs, and has runnable 28 threads that cannot migrate to other CPUs due to this configuration. 29 - CPU affinity: an optimization that leaves threads on CPUs where the CPU 30 caches are warm, even if this means short periods of waiting while other 31 CPUs are idle. The wait period is tunale (see sysctl, kernel.sched*). 32 - Scheduler bugs. 33 34 An unclaimed idle of < 1% is likely to be CPU affinity, and not usually a 35 cause for concern. By leaving the CPU idle, overall throughput of the system 36 may be improved. This tool is best for identifying larger issues, > 2%, due 37 to the coarseness of its 99 Hertz samples. 38 39 40 This is an 8 CPU system, with an 8 CPU-bound threaded application running that 41 has been bound to one CPU (via taskset): 42 43 # ./cpuunclaimed.py 44 Sampling run queues... Output every 1 seconds. Hit Ctrl-C to end. 45 %CPU 12.63%, unclaimed idle 86.36% 46 %CPU 12.50%, unclaimed idle 87.50% 47 %CPU 12.63%, unclaimed idle 87.37% 48 %CPU 12.75%, unclaimed idle 87.25% 49 %CPU 12.50%, unclaimed idle 87.50% 50 %CPU 12.63%, unclaimed idle 87.37% 51 %CPU 12.50%, unclaimed idle 87.50% 52 %CPU 12.50%, unclaimed idle 87.50% 53 [...] 54 55 It shows that 7 of the 8 CPUs (87.5%) are idle at the same time there are 56 queued threads waiting to run on CPU. This is an artificial situation caused 57 by binding threads to the same CPU, to demonstrate how the tool works. 58 59 60 This is an 8 CPU system running a Linux kernel build with "make -j8", and -T 61 to print timestamps: 62 63 # ./cpuunclaimed.py -T 64 Sampling run queues... Output every 1 seconds. Hit Ctrl-C to end. 65 22:25:55 %CPU 98.88%, unclaimed idle 0.12% 66 22:25:56 %CPU 99.75%, unclaimed idle 0.25% 67 22:25:57 %CPU 99.50%, unclaimed idle 0.50% 68 22:25:58 %CPU 99.25%, unclaimed idle 0.75% 69 22:25:59 %CPU 99.75%, unclaimed idle 0.25% 70 22:26:00 %CPU 99.50%, unclaimed idle 0.50% 71 22:26:01 %CPU 99.25%, unclaimed idle 0.75% 72 22:26:02 %CPU 99.25%, unclaimed idle 0.75% 73 22:26:03 %CPU 99.01%, unclaimed idle 0.87% 74 22:26:04 %CPU 99.88%, unclaimed idle 0.12% 75 22:26:05 %CPU 99.38%, unclaimed idle 0.62% 76 77 There's now a consistent, yet small, amount of unclaimed idle CPU. This is 78 expected to be deliberate: CPU affinity, as mentioned earlier. 79 80 81 The -j option will print raw samples: around one hundred lines of output 82 every second. For the same system with a Linux kernel build of "make -j8": 83 84 # ./cpuunclaimed.py -j 85 TIMESTAMP_ns,CPU0,CPU1,CPU2,CPU3,CPU4,CPU5,CPU6,CPU7 86 514606928954752,1,1,1,1,1,1,1,1 87 514606939054312,1,1,1,1,1,1,1,2 88 514606949156518,1,1,1,1,1,1,1,1 89 514606959256596,2,2,1,1,1,1,1,1 90 514606969357989,1,1,1,1,1,2,1,1 91 514606979459700,1,2,1,1,1,2,1,1 92 514606989560481,1,1,1,1,1,1,1,1 93 514606999661396,1,1,1,1,1,1,2,1 94 514607009795601,1,1,1,1,1,1,1,2 95 514607019862711,1,1,1,1,1,1,1,1 96 514607029963734,1,1,1,1,1,1,1,1 97 514607040062372,1,1,1,1,1,1,1,1 98 514607050197735,1,1,1,2,1,1,1,1 99 514607060266464,1,1,1,1,1,1,1,2 100 514607070368025,1,1,1,1,1,2,1,1 101 514607080468375,1,1,1,1,1,1,1,2 102 514607090570292,3,2,1,1,1,1,1,1 103 514607100670725,1,1,1,1,1,2,1,1 104 514607110771946,1,2,1,1,1,1,1,1 105 514607120873489,1,1,1,1,2,1,2,1 106 514607130973857,2,1,1,1,3,1,1,1 107 514607141080056,0,1,1,1,1,2,1,3 108 514607151176312,1,1,1,2,1,1,1,1 109 514607161277753,1,1,1,1,1,1,2,1 110 514607171379095,1,1,1,1,1,1,1,1 111 514607181479262,1,1,1,1,1,1,1,1 112 514607191580794,3,1,1,1,1,1,1,1 113 514607201680952,1,1,1,1,1,1,2,1 114 514607211783683,1,1,1,1,1,1,1,1 115 514607221883274,1,1,1,1,1,1,0,1 116 514607231984244,1,1,1,1,1,1,1,1 117 514607242085698,1,1,1,1,1,1,1,1 118 514607252216898,1,2,1,1,1,1,1,1 119 514607262289420,1,1,1,1,1,2,1,1 120 514607272389922,1,1,1,1,1,1,1,1 121 514607282489413,1,1,1,1,1,1,1,1 122 514607292589950,1,3,1,1,1,1,1,1 123 514607302693367,1,1,1,1,2,1,1,1 124 514607312793792,1,1,1,1,1,1,1,1 125 514607322895249,1,1,1,3,1,1,3,1 126 514607332994278,1,0,1,1,1,2,1,2 127 514607343095836,1,1,1,1,1,2,1,1 128 514607353196533,1,1,1,1,2,1,1,1 129 514607363297749,1,1,1,1,1,1,1,2 130 514607373399011,1,1,1,1,1,1,1,2 131 514607383499730,1,1,1,1,1,1,1,2 132 514607393601510,1,1,1,1,1,1,1,2 133 514607403704117,2,1,1,1,1,1,1,2 134 514607413802700,1,1,1,1,2,1,0,1 135 514607423904559,1,1,1,1,1,1,1,1 136 [...] 137 138 The output is verbose: printing out a timestamp, and then the length of each 139 CPU's run queue. The second last line, of timestamp 514607413802700, is an 140 example of what this tool detects: CPU 4 has a run queue length of 4, which 141 means one thread running and one thread queued, while CPU 6 has a run queue 142 length of 0: idle. The very next sample shows all CPUs busy. 143 144 145 The -J option prints raw samples with time offsets showing when the samples 146 were collected on each CPU. It's mostly useful for debugging the tool itself. 147 For example, during a Linux kernel build: 148 149 # ./cpuunclaimed.py -J 150 TIMESTAMP_ns,CPU0,CPU1,CPU2,CPU3,CPU4,CPU5,CPU6,CPU7,OFFSET_ns_CPU0,OFFSET_ns_CPU1,OFFSET_ns_CPU2,OFFSET_ns_CPU3,OFFSET_ns_CPU4,OFFSET_ns_CPU5,OFFSET_ns_CPU6,OFFSET_ns_CPU7 151 514722625198188,1,1,1,1,1,1,1,2,0,28321,51655,73396,89654,111172,132803,159792 152 514722635299034,1,1,1,1,1,2,1,1,0,28809,51999,74183,89552,110011,131995,153519 153 514722645400274,1,1,1,1,1,1,1,2,0,28024,51333,73652,88964,110075,131973,153568 154 514722655501816,1,2,1,1,1,1,1,1,0,28893,51671,75233,89496,109430,131945,153694 155 514722665602594,1,1,2,1,1,2,1,1,0,28623,50988,73866,89383,109186,131786,154555 156 514722675703498,1,1,1,1,1,1,1,1,0,27379,51031,73175,89625,110380,131482,104811 157 514722685804942,1,1,1,1,1,2,1,1,0,27213,50501,72183,88797,108780,130659,152153 158 514722695906294,1,1,1,1,1,1,1,1,0,27036,51182,73420,87861,109585,130364,155089 159 514722706005778,1,1,1,1,1,1,1,1,0,28492,51851,74138,89744,110208,132462,154060 160 514722716060705,1,1,1,1,1,1,1,1,0,154499,152528,155232,155046,154502,178746,200001 161 514722726209615,1,1,1,1,1,1,1,1,0,28170,49580,72605,87741,108144,130723,152138 162 514722736309475,1,2,1,1,1,1,1,1,0,27421,51386,73061,89358,109457,131273,153005 163 514722746410845,1,2,1,1,1,2,1,1,0,27788,50840,72720,88920,109111,131143,152979 164 514722756511363,1,1,1,1,1,1,2,1,0,28280,50977,73559,89848,109659,131579,152693 165 514722766613044,1,1,1,1,1,1,1,1,0,28046,50812,72754,89160,110108,130735,152948 166 514722776712932,1,1,1,2,1,1,1,1,0,28586,51177,73974,89588,109947,132376,154162 167 514722786815477,1,1,1,1,1,1,1,1,0,27973,71104,72539,88302,108896,130414,152236 168 514722796914955,1,1,1,1,1,1,1,1,0,29054,52354,74214,89592,110615,132586,153925 169 514722807044060,1,1,1,1,1,1,1,1,1587130,0,24079,46633,61787,82325,104706,125278 170 514722817117432,2,1,2,1,1,1,1,1,0,27628,51038,75138,89724,109340,132426,155348 171 514722827218254,1,1,1,1,1,1,2,1,0,29111,51868,74347,88904,109911,132764,153851 172 514722837340158,1,1,1,1,1,1,1,1,0,7366,30760,53528,68622,89317,111095,132319 173 514722847421305,1,1,1,1,1,1,1,1,0,28257,51105,73841,89037,110820,131605,153368 174 514722857521112,1,1,1,1,1,1,1,1,0,28544,51441,73857,89530,110497,131915,153513 175 514722867626129,0,2,1,1,1,1,1,1,0,24621,47917,70568,85391,106670,128081,150329 176 514722877727183,2,1,1,1,1,1,1,1,0,24869,47630,71547,84761,106048,128444,149285 177 514722887824589,1,1,1,1,1,1,2,1,0,28793,51212,73863,89584,109773,132348,153194 178 514722897925481,1,1,1,1,1,1,2,1,0,29278,51163,73961,89774,109592,132029,153715 179 514722908026097,1,1,1,1,1,1,1,1,0,30630,35595,36210,188001,190815,190072,190732 180 514722918127439,1,1,1,1,1,1,1,1,0,28544,51885,73948,89987,109763,132632,154083 181 514722928227399,1,1,1,1,1,1,1,1,0,31882,51574,74769,89939,110578,132951,154356 182 514722938329471,1,1,1,1,1,1,1,1,0,28498,51304,74101,89670,110278,132653,153176 183 514722948430589,1,1,1,1,1,1,1,1,0,27868,50925,73477,89676,109583,132360,153014 184 514722958531802,1,1,1,1,1,1,1,1,0,28505,50886,73729,89919,109618,131988,152896 185 514722968632181,1,1,1,1,1,1,1,1,0,28492,51749,73977,90334,109816,132897,152890 186 514722978733584,1,1,1,1,1,1,1,1,0,28847,50957,74121,90014,110019,132377,152978 187 514722988834321,1,1,1,1,1,1,1,1,0,28601,51437,74021,89968,110252,132233,153623 188 514722998937170,1,1,2,1,1,1,1,1,0,27007,50044,73259,87725,108663,132194,152459 189 514723009036821,1,2,1,2,1,1,1,1,0,28226,50937,73983,89110,110476,131740,153663 190 514723019137577,1,1,1,1,1,1,1,1,0,30261,52357,75657,87803,61823,131850,153585 191 514723029238745,1,1,1,1,1,1,1,1,0,28030,50752,74452,89240,110791,132187,153327 192 514723039339069,1,1,1,1,1,1,1,1,0,29791,52636,75996,90475,110414,132232,154714 193 514723049439822,1,1,1,1,2,1,1,1,0,29133,56662,74153,89520,110683,132740,154708 194 514723059541617,1,1,1,1,1,1,1,1,0,27932,51480,74644,89656,109176,131499,153732 195 514723069642500,1,1,2,1,1,1,2,1,0,27678,51509,73984,90136,110124,131554,153459 196 514723079743525,2,1,1,1,1,1,1,1,0,28029,51424,74394,90056,110087,132383,152963 197 514723089844091,2,1,1,2,1,1,1,1,0,28944,51692,74440,90339,110402,132722,154083 198 514723099945957,1,1,2,1,1,1,1,1,0,28425,51267,73164,89322,115048,114630,115187 199 514723110047020,1,1,2,0,1,1,1,2,0,28192,50811,76814,89835,109370,131265,153511 200 514723120216662,1,1,2,1,1,2,1,1,29,34,0,4514,19268,40293,62674,84009 201 [...] 202 203 This is a Xen guest system, and it shows that CPU 0 usually completes first (an 204 offset of 0), followed by CPU 1 around 28000 nanoseconds later, and so on. 205 The spread of offsets is triggered by the bcc Python library that initializes 206 the timers, which steps through the CPUs in sequence, with a small delay 207 between them merely from executing its own loop code. 208 209 Here's more output during a Linux kernel build: 210 211 # ./cpuunclaimed.py -J 212 TIMESTAMP_ns,CPU0,CPU1,CPU2,CPU3,CPU4,CPU5,CPU6,CPU7,OFFSET_ns_CPU0,OFFSET_ns_CPU1,OFFSET_ns_CPU2,OFFSET_ns_CPU3,OFFSET_ns_CPU4,OFFSET_ns_CPU5,OFFSET_ns_CPU6,OFFSET_ns_CPU7 213 514722625198188,1,1,1,1,1,1,1,2,0,28321,51655,73396,89654,111172,132803,159792 214 515700745758947,2,1,1,1,1,1,1,1,0,19835,34891,49397,59364,71988,87571,102769 215 515700755860451,2,1,1,1,1,1,1,2,0,19946,34323,49855,59844,72741,87925,102891 216 515700765960560,1,1,1,1,1,1,1,1,0,20805,35339,50436,59677,73557,88661,104796 217 515700776061744,1,1,1,1,1,1,1,1,1626,77,0,190,153452,154665,178218,154116 218 515700786162017,1,1,1,1,1,1,1,1,0,20497,35361,51552,59787,74451,147789,104545 219 515700796262811,1,1,1,1,1,1,1,2,0,20910,35657,50805,60175,73953,88492,103527 220 515700806364951,1,1,1,1,1,1,1,1,0,20140,35023,50074,59726,72757,88040,102421 221 515700816465253,1,1,1,1,1,1,2,1,0,20952,34899,50262,60048,72890,88067,103545 222 515700826566573,1,1,1,1,1,1,1,1,0,20898,35490,50609,59805,74060,88550,103354 223 515700836667480,1,1,1,1,1,1,2,1,0,20548,34760,50959,59490,73059,87820,103006 224 515700846768182,1,1,1,1,1,1,2,1,0,20571,35113,50777,59962,74139,88543,103192 225 515700856869468,1,1,2,1,1,2,2,1,0,20932,35382,50510,60106,73739,91818,103684 226 515700866971905,1,1,1,2,1,1,1,1,0,19780,33018,49075,58375,71949,86537,102136 227 515700877073459,2,1,1,1,1,1,1,1,0,20065,73966,48989,58832,71408,85714,101067 228 515700887172772,1,1,1,1,1,1,1,1,0,20909,34608,51493,59890,73564,88668,103454 229 515700897273292,1,2,1,1,1,1,1,1,0,20353,35292,50114,59773,73948,88615,103383 230 515700907374341,1,1,2,1,1,1,1,1,0,20816,35206,50915,60062,73878,88857,103794 231 515700917475331,1,1,6,1,1,2,1,1,0,20752,34931,50280,59764,73781,88329,103234 232 515700927576958,1,1,1,1,1,1,1,1,0,19929,34703,50181,59364,73004,88053,103127 233 515700937677298,1,1,2,2,1,1,1,1,0,21178,34724,50740,61193,73452,89030,103390 234 515700947778409,2,1,1,1,1,1,1,1,0,21059,35604,50853,60098,73919,88675,103506 235 515700957879196,2,1,1,1,1,1,1,1,0,21326,35939,51492,60083,74249,89474,103761 236 [...] 237 238 Notice the tighter range of offsets? I began executing cpuunclaimed when the 239 system was idle, and it initialized the CPU timers more quickly, and then I 240 began the Linux kernel build. 241 242 Here's some different output, this time from a physical system with 4 CPUs, 243 also doing a kernel build, 244 245 # ./cpuunclaimed.py -J 246 TIMESTAMP_ns,CPU0,CPU1,CPU2,CPU3,OFFSET_ns_CPU0,OFFSET_ns_CPU1,OFFSET_ns_CPU2,OFFSET_ns_CPU3 247 4429382557480,1,1,1,1,0,6011,10895,16018 248 4429392655042,2,1,1,1,0,8217,13661,19378 249 4429402757604,1,1,1,1,0,6879,12433,18000 250 4429412857809,1,1,1,1,0,8303,13190,18719 251 4429422960709,2,1,1,1,0,6095,11234,17079 252 4429433060391,1,1,1,2,0,6747,12480,18070 253 4429443161699,1,1,1,1,0,6560,12264,17945 254 4429453262002,1,2,1,1,0,6992,12644,18341 255 4429463363706,1,2,1,1,0,6211,12071,17853 256 4429473465571,1,1,1,1,0,5766,11495,17638 257 4429483566920,1,1,1,1,0,5223,11736,16358 258 4429493666279,1,1,1,1,0,6964,12653,18410 259 4429503769113,1,1,1,1,0,5161,11399,16612 260 4429513870744,1,1,1,1,0,5943,10583,15768 261 4429523969826,1,1,1,1,0,6533,12336,18189 262 4429534070311,1,1,1,1,0,6834,12816,18488 263 4429544170456,1,1,1,1,0,7284,13401,19129 264 4429554274467,1,2,1,1,0,5941,11160,16594 265 4429564372365,1,2,1,1,0,7514,13618,19190 266 4429574474406,1,2,1,1,0,6687,12650,18248 267 4429584574220,1,2,1,1,0,7912,13705,19136 268 [...] 269 270 If the offset range becomes too great, we can no longer conclude about when 271 some CPUs were idle and others had queued work. The tool will detect this, 272 and print an error message and exit. 273 274 275 Some systems can power down CPUs when idle, and when they wake up again the 276 timed samples may resume from different offsets. If this happens, this tool 277 can no longer draw conclusions about when some CPUs were idle and others 278 had queued work, so it prints an error, and exits. Eg: 279 280 # ./cpuunclaimed.py 1 281 Sampling run queues... Output every 1 seconds. Hit Ctrl-C to end. 282 %CPU 0.25%, unclaimed idle 0.00% 283 %CPU 0.75%, unclaimed idle 0.00% 284 %CPU 0.25%, unclaimed idle 0.00% 285 %CPU 0.00%, unclaimed idle 0.00% 286 %CPU 0.00%, unclaimed idle 0.00% 287 %CPU 0.12%, unclaimed idle 0.00% 288 %CPU 0.00%, unclaimed idle 0.00% 289 %CPU 0.25%, unclaimed idle 0.00% 290 %CPU 0.00%, unclaimed idle 0.00% 291 %CPU 0.12%, unclaimed idle 0.00% 292 %CPU 0.13%, unclaimed idle 0.00% 293 %CPU 0.12%, unclaimed idle 0.00% 294 %CPU 0.00%, unclaimed idle 0.00% 295 %CPU 0.00%, unclaimed idle 0.00% 296 %CPU 0.00%, unclaimed idle 0.00% 297 %CPU 0.00%, unclaimed idle 0.00% 298 ERROR: CPU samples arrived at skewed offsets (CPUs may have powered down when idle), spanning 4328176 ns (expected < 4040404 ns). Debug with -J, and see the man page. As output may begin to be unreliable, exiting. 299 300 It's expected that this will only really occur on idle systems. 301 302 USAGE: 303 304 # ./cpuunclaimed.py -h 305 usage: cpuunclaimed.py [-h] [-j] [-J] [-T] [interval] [count] 306 307 Sample CPU run queues and calculate unclaimed idle CPU 308 309 positional arguments: 310 interval output interval, in seconds 311 count number of outputs 312 313 optional arguments: 314 -h, --help show this help message and exit 315 -j, --csv print sample summaries (verbose) as comma-separated values 316 -J, --fullcsv print sample summaries with extra fields: CPU sample 317 offsets 318 -T, --timestamp include timestamp on output 319 320 examples: 321 ./cpuunclaimed # sample and calculate unclaimed idle CPUs, 322 # output every 1 second (default) 323 ./cpuunclaimed 5 10 # print 5 second summaries, 10 times 324 ./cpuunclaimed -T 1 # 1s summaries and timestamps 325 ./cpuunclaimed -j # raw dump of all samples (verbose), CSV 326