1 Demonstrations of ext4slower, the Linux eBPF/bcc version. 2 3 4 ext4slower shows ext4 reads, writes, opens, and fsyncs, slower than a threshold. 5 For example: 6 7 # ./ext4slower 8 Tracing ext4 operations slower than 10 ms 9 TIME COMM PID T BYTES OFF_KB LAT(ms) FILENAME 10 06:35:01 cron 16464 R 1249 0 16.05 common-auth 11 06:35:01 cron 16463 R 1249 0 16.04 common-auth 12 06:35:01 cron 16465 R 1249 0 16.03 common-auth 13 06:35:01 cron 16465 R 4096 0 10.62 login.defs 14 06:35:01 cron 16464 R 4096 0 10.61 login.defs 15 06:35:01 cron 16463 R 4096 0 10.63 login.defs 16 06:35:01 cron 16465 R 2972 0 18.52 pam_env.conf 17 06:35:01 cron 16464 R 2972 0 18.51 pam_env.conf 18 06:35:01 cron 16463 R 2972 0 18.49 pam_env.conf 19 06:35:01 dumpsystemstat 16473 R 128 0 12.58 date 20 06:35:01 debian-sa1 16474 R 283 0 12.66 sysstat 21 06:35:01 debian-sa1 16474 R 128 0 10.39 sa1 22 06:35:01 dumpsystemstat 16491 R 128 0 13.22 ifconfig 23 06:35:01 DumpThreads 16534 R 128 0 12.78 cut 24 06:35:01 cron 16545 R 128 0 14.76 sendmail 25 06:35:01 sendmail 16545 R 274 0 10.88 dynamicmaps.cf 26 06:35:02 postdrop 16546 R 118 0 32.94 Universal 27 06:35:02 pickup 9574 R 118 0 21.02 localtime 28 [...] 29 30 This shows various system tasks reading from ext4. The high latency here is 31 due to disk I/O, as I had just evicted the file system cache for this example. 32 33 This "latency" is measured from when the operation was issued from the VFS 34 interface to the file system, to when it completed. This spans everything: 35 block device I/O (disk I/O), file system CPU cycles, file system locks, run 36 queue latency, etc. This is a better measure of the latency suffered by 37 applications reading from the file system than measuring this down at the 38 block device interface. 39 40 Note that this only traces the common file system operations previously 41 listed: other file system operations (eg, inode operations including 42 getattr()) are not traced. 43 44 45 The threshold can be provided as an argument. Eg, I/O slower than 1 ms: 46 47 # ./ext4slower 1 48 Tracing ext4 operations slower than 1 ms 49 TIME COMM PID T BYTES OFF_KB LAT(ms) FILENAME 50 06:49:17 bash 3616 R 128 0 7.75 cksum 51 06:49:17 cksum 3616 R 39552 0 1.34 [ 52 06:49:17 cksum 3616 R 96 0 5.36 2to3-2.7 53 06:49:17 cksum 3616 R 96 0 14.94 2to3-3.4 54 06:49:17 cksum 3616 R 10320 0 6.82 411toppm 55 06:49:17 cksum 3616 R 65536 0 4.01 a2p 56 06:49:17 cksum 3616 R 55400 0 8.77 ab 57 06:49:17 cksum 3616 R 36792 0 16.34 aclocal-1.14 58 06:49:17 cksum 3616 R 15008 0 19.31 acpi_listen 59 06:49:17 cksum 3616 R 6123 0 17.23 add-apt-repository 60 06:49:17 cksum 3616 R 6280 0 18.40 addpart 61 06:49:17 cksum 3616 R 27696 0 2.16 addr2line 62 06:49:17 cksum 3616 R 58080 0 10.11 ag 63 06:49:17 cksum 3616 R 906 0 6.30 ec2-meta-data 64 06:49:17 cksum 3616 R 6320 0 10.00 animate.im6 65 06:49:17 cksum 3616 R 5680 0 18.69 anytopnm 66 06:49:17 cksum 3616 R 2671 0 20.27 apport-bug 67 06:49:17 cksum 3616 R 12566 0 16.72 apport-cli 68 06:49:17 cksum 3616 R 1622 0 7.95 apport-unpack 69 06:49:17 cksum 3616 R 10440 0 2.37 appres 70 06:49:17 cksum 3616 R 48112 0 5.42 whatis 71 06:49:17 cksum 3616 R 14832 0 6.24 apt 72 06:49:17 cksum 3616 R 65536 0 24.74 apt-cache 73 06:49:17 cksum 3616 R 27264 0 1.68 apt-cdrom 74 06:49:17 cksum 3616 R 23224 0 5.31 apt-extracttemplates 75 06:49:17 cksum 3616 R 65536 0 8.08 apt-ftparchive 76 06:49:17 cksum 3616 R 65536 128 2.92 apt-ftparchive 77 06:49:17 cksum 3616 R 65536 0 9.58 aptitude-curses 78 06:49:17 cksum 3616 R 65536 128 44.25 aptitude-curses 79 06:49:17 cksum 3616 R 65536 384 1.69 aptitude-curses 80 [...] 81 82 This time a cksum(1) command can be seen reading various files (from /usr/bin). 83 84 85 A threshold of 0 will trace all operations. Warning: the output will be 86 verbose, as it will include all file system cache hits. 87 88 # ./ext4slower 0 89 Tracing ext4 operations 90 TIME COMM PID T BYTES OFF_KB LAT(ms) FILENAME 91 06:58:05 supervise 1884 O 0 0 0.00 status.new 92 06:58:05 supervise 1884 W 18 0 0.02 status.new 93 06:58:05 supervise 1884 O 0 0 0.00 status.new 94 06:58:05 supervise 1884 W 18 0 0.01 status.new 95 06:58:05 supervise 15817 O 0 0 0.00 run 96 06:58:05 supervise 15817 R 92 0 0.00 run 97 06:58:05 supervise 15817 O 0 0 0.00 bash 98 06:58:05 supervise 15817 R 128 0 0.00 bash 99 06:58:05 supervise 15817 R 504 0 0.00 bash 100 06:58:05 supervise 15817 R 28 0 0.00 bash 101 06:58:05 supervise 15817 O 0 0 0.00 ld-2.19.so 102 06:58:05 supervise 15817 R 64 0 0.00 ld-2.19.so 103 06:58:05 supervise 15817 R 392 0 0.00 ld-2.19.so 104 06:58:05 run 15817 O 0 0 0.00 ld.so.cache 105 06:58:05 run 15817 O 0 0 0.00 libtinfo.so.5.9 106 06:58:05 run 15817 R 832 0 0.00 libtinfo.so.5.9 107 06:58:05 run 15817 O 0 0 0.00 libdl-2.19.so 108 06:58:05 run 15817 R 832 0 0.00 libdl-2.19.so 109 06:58:05 run 15817 O 0 0 0.00 libc-2.19.so 110 06:58:05 run 15817 R 832 0 0.00 libc-2.19.so 111 06:58:05 supervise 1876 O 0 0 0.00 status.new 112 06:58:05 supervise 1876 W 18 0 0.01 status.new 113 06:58:05 supervise 1895 O 0 0 0.00 status.new 114 06:58:05 supervise 1895 W 18 0 0.02 status.new 115 06:58:05 supervise 1876 O 0 0 0.00 status.new 116 06:58:05 supervise 1876 W 18 0 0.01 status.new 117 06:58:05 supervise 1872 O 0 0 0.00 status.new 118 06:58:05 supervise 1872 W 18 0 0.02 status.new 119 06:58:05 supervise 1895 O 0 0 0.00 status.new 120 06:58:05 supervise 1895 W 18 0 0.01 status.new 121 06:58:05 supervise 15818 R 92 0 0.00 run 122 06:58:05 supervise 15818 O 0 0 0.00 bash 123 06:58:05 supervise 15818 R 128 0 0.00 bash 124 06:58:05 supervise 15818 R 504 0 0.00 bash 125 06:58:05 supervise 15818 R 28 0 0.00 bash 126 06:58:05 supervise 15818 O 0 0 0.00 ld-2.19.so 127 06:58:05 supervise 15818 R 64 0 0.00 ld-2.19.so 128 06:58:05 supervise 15818 R 392 0 0.00 ld-2.19.so 129 06:58:05 supervise 15818 O 0 0 0.00 run 130 06:58:05 supervise 1888 O 0 0 0.00 status.new 131 06:58:05 supervise 1888 W 18 0 0.01 status.new 132 06:58:05 supervise 1888 O 0 0 0.00 status.new 133 06:58:05 supervise 1888 W 18 0 0.02 status.new 134 06:58:05 supervise 15822 R 119 0 0.00 run 135 06:58:05 supervise 15822 O 0 0 0.00 bash 136 06:58:05 supervise 15822 R 128 0 0.00 bash 137 06:58:05 supervise 15822 R 504 0 0.00 bash 138 06:58:05 supervise 15822 R 28 0 0.00 bash 139 06:58:05 supervise 15822 O 0 0 0.00 ld-2.19.so 140 06:58:05 supervise 15822 R 64 0 0.00 ld-2.19.so 141 06:58:05 supervise 15822 R 392 0 0.00 ld-2.19.so 142 06:58:05 supervise 1892 O 0 0 0.00 status.new 143 06:58:05 supervise 1892 W 18 0 0.02 status.new 144 06:58:05 supervise 1892 O 0 0 0.00 status.new 145 06:58:05 supervise 1892 W 18 0 0.02 status.new 146 06:58:05 supervise 15820 O 0 0 0.00 run 147 [...] 148 149 The output now includes open operations ("O"), and writes ("W"). 150 151 152 A -j option will print just the fields (parsable output, csv): 153 154 # ./ext4slower -j 1 155 ENDTIME_us,TASK,PID,TYPE,BYTES,OFFSET_b,LATENCY_us,FILE 156 127200712278,bash,17225,R,128,0,14329,cksum 157 127200722986,cksum,17225,R,3274,0,8368,command-not-found 158 127200735581,cksum,17225,R,65536,0,10903,libbfd-2.24-system.so 159 127200738482,cksum,17225,R,65536,131072,2419,libbfd-2.24-system.so 160 127200749226,cksum,17225,R,65536,655360,8995,libbfd-2.24-system.so 161 127200776273,cksum,17225,R,55080,0,25297,libbind9.so.90.0.9 162 127200784688,cksum,17225,R,65536,0,7873,libblas.so.3.0 163 127200787551,cksum,17225,R,65536,131072,2386,libblas.so.3.0 164 127200795524,cksum,17225,R,18624,0,4947,libcpupower.so.3.13.0-49 165 127200802073,cksum,17225,R,65536,0,6410,libcwidget.so.3.0.0 166 127200808718,cksum,17225,R,65536,131072,6181,libcwidget.so.3.0.0 167 127200829518,cksum,17225,R,65536,0,14213,libdns.so.100.2.2 168 127200832916,cksum,17225,R,65536,131072,2911,libdns.so.100.2.2 169 127200841044,cksum,17225,R,65536,655360,6376,libdns.so.100.2.2 170 127200853646,cksum,17225,R,956,0,1022,libdumbnet.la 171 127200857814,cksum,17225,R,61096,0,4111,libdumbnet.so.1.0.1 172 127200869655,cksum,17225,R,65536,0,11252,libgettextlib-0.18.3.so 173 127200872985,cksum,17225,R,65536,131072,2882,libgettextlib-0.18.3.so 174 127200883063,cksum,17225,R,65536,0,9661,libgettextsrc-0.18.3.so 175 127200884767,cksum,17225,R,65536,131072,1251,libgettextsrc-0.18.3.so 176 127200904830,cksum,17225,R,65536,0,19571,libgirepository-1.0.so.1.0.0 177 127200906354,cksum,17225,R,65536,131072,1080,libgirepository-1.0.so.1.0.0 178 127200936047,cksum,17225,R,65536,0,28674,libGraphicsMagick.a 179 127200939091,cksum,17225,R,65536,131072,2576,libGraphicsMagick.a 180 127200947295,cksum,17225,R,65536,655360,6463,libGraphicsMagick.a 181 127200958793,cksum,17225,R,65536,1966080,7034,libGraphicsMagick.a 182 [...] 183 184 This may be useful for visualizing with another tool, for example, for 185 producing a scatter plot of ENDTIME vs LATENCY, to look for time-based 186 patterns. 187 188 189 USAGE message: 190 191 # ./ext4slower -h 192 usage: ext4slower [-h] [-j] [-p PID] [min_ms] 193 194 Trace common ext4 file operations slower than a threshold 195 196 positional arguments: 197 min_ms minimum I/O duration to trace, in ms (default 10) 198 199 optional arguments: 200 -h, --help show this help message and exit 201 -j, --csv just print fields: comma-separated values 202 -p PID, --pid PID trace this PID only 203 204 examples: 205 ./ext4slower # trace operations slower than 10 ms (default) 206 ./ext4slower 1 # trace operations slower than 1 ms 207 ./ext4slower -j 1 # ... 1 ms, parsable output (csv) 208 ./ext4slower 0 # trace all operations (warning: verbose) 209 ./ext4slower -p 185 # trace PID 185 only 210