1 Demonstrations of runqlen, the Linux eBPF/bcc version. 2 3 4 This program summarizes scheduler queue length as a histogram, and can also 5 show run queue occupancy. It works by sampling the run queue length on all 6 CPUs at 99 Hertz. 7 8 As an example, here is an idle system: 9 10 # ./runqlen.py 11 Sampling run queue length... Hit Ctrl-C to end. 12 ^C 13 runqlen : count distribution 14 0 : 1776 |****************************************| 15 16 This shows a zero run queue length each time it was sampled. 17 18 And now a heavily loaded system: 19 20 # ./runqlen.py 21 Sampling run queue length... Hit Ctrl-C to end. 22 ^C 23 runqlen : count distribution 24 0 : 1068 |****************************************| 25 1 : 642 |************************ | 26 2 : 369 |************* | 27 3 : 183 |****** | 28 4 : 104 |*** | 29 5 : 42 |* | 30 6 : 13 | | 31 7 : 2 | | 32 8 : 1 | | 33 34 Now there is often threads queued, with one sample reaching a queue length 35 of 8. This will cause run queue latency, which can be measured by the bcc 36 runqlat tool. 37 38 39 Here's an example of an issue that runqlen can indentify. Starting with the 40 system-wide summary: 41 42 # ./runqlen.py 43 Sampling run queue length... Hit Ctrl-C to end. 44 ^C 45 runqlen : count distribution 46 0 : 1209 |****************************************| 47 1 : 372 |************ | 48 2 : 73 |** | 49 3 : 3 | | 50 4 : 1 | | 51 5 : 0 | | 52 6 : 0 | | 53 7 : 237 |******* | 54 55 This shows there is often a run queue length of 7. Now using the -C option to 56 see per-CPU histograms: 57 58 # ./runqlen.py -C 59 Sampling run queue length... Hit Ctrl-C to end. 60 ^C 61 62 cpu = 0 63 runqlen : count distribution 64 0 : 257 |****************************************| 65 1 : 64 |********* | 66 2 : 5 | | 67 3 : 0 | | 68 4 : 0 | | 69 5 : 0 | | 70 6 : 1 | | 71 72 cpu = 1 73 runqlen : count distribution 74 0 : 226 |****************************************| 75 1 : 90 |*************** | 76 2 : 11 |* | 77 78 cpu = 2 79 runqlen : count distribution 80 0 : 264 |****************************************| 81 1 : 52 |******* | 82 2 : 8 |* | 83 3 : 1 | | 84 4 : 0 | | 85 5 : 0 | | 86 6 : 1 | | 87 7 : 0 | | 88 8 : 1 | | 89 90 cpu = 3 91 runqlen : count distribution 92 0 : 0 | | 93 1 : 0 | | 94 2 : 0 | | 95 3 : 0 | | 96 4 : 0 | | 97 5 : 0 | | 98 6 : 0 | | 99 7 : 327 |****************************************| 100 101 cpu = 4 102 runqlen : count distribution 103 0 : 255 |****************************************| 104 1 : 63 |********* | 105 2 : 9 |* | 106 107 cpu = 5 108 runqlen : count distribution 109 0 : 244 |****************************************| 110 1 : 78 |************ | 111 2 : 3 | | 112 3 : 2 | | 113 114 cpu = 6 115 runqlen : count distribution 116 0 : 253 |****************************************| 117 1 : 66 |********** | 118 2 : 6 | | 119 3 : 1 | | 120 4 : 1 | | 121 122 cpu = 7 123 runqlen : count distribution 124 0 : 243 |****************************************| 125 1 : 74 |************ | 126 2 : 6 | | 127 3 : 1 | | 128 4 : 0 | | 129 5 : 1 | | 130 6 : 2 | | 131 132 The run queue length of 7 is isolated to CPU 3. It was caused by CPU binding 133 (taskset). This can sometimes happen by applications that try to auto-bind 134 to CPUs, leaving other CPUs idle while work is queued. 135 136 137 runqlat accepts an interval and a count. For example, with -T for timestamps: 138 139 # ./runqlen.py -T 1 5 140 Sampling run queue length... Hit Ctrl-C to end. 141 142 19:51:34 143 runqlen : count distribution 144 0 : 635 |****************************************| 145 1 : 142 |******** | 146 2 : 13 | | 147 3 : 0 | | 148 4 : 1 | | 149 150 19:51:35 151 runqlen : count distribution 152 0 : 640 |****************************************| 153 1 : 136 |******** | 154 2 : 13 | | 155 3 : 1 | | 156 4 : 0 | | 157 5 : 0 | | 158 6 : 0 | | 159 7 : 0 | | 160 8 : 0 | | 161 9 : 0 | | 162 10 : 1 | | 163 164 19:51:36 165 runqlen : count distribution 166 0 : 603 |****************************************| 167 1 : 170 |*********** | 168 2 : 16 |* | 169 3 : 1 | | 170 4 : 0 | | 171 5 : 0 | | 172 6 : 0 | | 173 7 : 0 | | 174 8 : 0 | | 175 9 : 1 | | 176 177 19:51:37 178 runqlen : count distribution 179 0 : 617 |****************************************| 180 1 : 154 |********* | 181 2 : 20 |* | 182 3 : 0 | | 183 4 : 0 | | 184 5 : 0 | | 185 6 : 0 | | 186 7 : 0 | | 187 8 : 0 | | 188 9 : 0 | | 189 10 : 0 | | 190 11 : 1 | | 191 192 19:51:38 193 runqlen : count distribution 194 0 : 603 |****************************************| 195 1 : 161 |********** | 196 2 : 24 |* | 197 3 : 4 | | 198 199 The spikes in run queue length of 11 are likely threads waking up at the same 200 time (a thundering herd), and then are scheduled and complete their execution 201 quickly. 202 203 204 The -O option prints run queue occupancy: the percentage of time that there 205 was work queued waiting its turn. Eg: 206 207 # ./runqlen.py -OT 1 208 Sampling run queue length... Hit Ctrl-C to end. 209 210 19:54:53 211 runqocc: 41.09% 212 213 19:54:54 214 runqocc: 41.85% 215 216 19:54:55 217 runqocc: 41.47% 218 219 19:54:56 220 runqocc: 42.35% 221 222 19:54:57 223 runqocc: 40.83% 224 [...] 225 226 This can also be examined per-CPU: 227 228 # ./runqlen.py -COT 1 229 Sampling run queue length... Hit Ctrl-C to end. 230 231 19:55:03 232 runqocc, CPU 0 32.32% 233 runqocc, CPU 1 26.26% 234 runqocc, CPU 2 38.38% 235 runqocc, CPU 3 100.00% 236 runqocc, CPU 4 26.26% 237 runqocc, CPU 5 32.32% 238 runqocc, CPU 6 39.39% 239 runqocc, CPU 7 46.46% 240 241 19:55:04 242 runqocc, CPU 0 35.00% 243 runqocc, CPU 1 32.32% 244 runqocc, CPU 2 37.00% 245 runqocc, CPU 3 100.00% 246 runqocc, CPU 4 43.43% 247 runqocc, CPU 5 31.31% 248 runqocc, CPU 6 28.00% 249 runqocc, CPU 7 31.31% 250 251 19:55:05 252 runqocc, CPU 0 43.43% 253 runqocc, CPU 1 32.32% 254 runqocc, CPU 2 45.45% 255 runqocc, CPU 3 100.00% 256 runqocc, CPU 4 29.29% 257 runqocc, CPU 5 36.36% 258 runqocc, CPU 6 36.36% 259 runqocc, CPU 7 30.30% 260 261 19:55:06 262 runqocc, CPU 0 40.00% 263 runqocc, CPU 1 38.00% 264 runqocc, CPU 2 31.31% 265 runqocc, CPU 3 100.00% 266 runqocc, CPU 4 31.31% 267 runqocc, CPU 5 28.28% 268 runqocc, CPU 6 31.00% 269 runqocc, CPU 7 29.29% 270 [...] 271 272 273 USAGE message: 274 275 # ./runqlen -h 276 usage: runqlen [-h] [-T] [-O] [-C] [interval] [count] 277 278 Summarize scheduler run queue length as a histogram 279 280 positional arguments: 281 interval output interval, in seconds 282 count number of outputs 283 284 optional arguments: 285 -h, --help show this help message and exit 286 -T, --timestamp include timestamp on output 287 -O, --runqocc report run queue occupancy 288 -C, --cpus print output for each CPU separately 289 290 examples: 291 ./runqlen # summarize run queue length as a histogram 292 ./runqlen 1 10 # print 1 second summaries, 10 times 293 ./runqlen -T 1 # 1s summaries and timestamps 294 ./runqlen -O # report run queue occupancy 295 ./runqlen -C # show each CPU separately 296