Home | History | Annotate | Download | only in tools
      1 Demonstrations of nfsslower, the Linux eBPF/bcc version.
      2 
      3 nfsslower show NFS reads, writes, opens and getattrs, slower than a
      4 threshold. For example:
      5 
      6 ./nfsslower.py
      7 Tracing NFS operations that are slower than 10 ms
      8 TIME     COMM           PID    T BYTES   OFF_KB   LAT(ms) FILENAME
      9 11:25:16 dd             21295  W 1048576 15360      14.84 1.test
     10 11:25:16 dd             21295  W 1048576 16384      12.73 1.test
     11 11:25:16 dd             21295  W 1048576 17408      24.27 1.test
     12 11:25:16 dd             21295  W 1048576 18432      22.93 1.test
     13 11:25:16 dd             21295  W 1048576 19456      14.65 1.test
     14 11:25:16 dd             21295  W 1048576 20480      12.58 1.test
     15 11:25:16 dd             21297  W 1048576 6144       10.50 1.test.w
     16 11:25:16 dd             21297  W 1048576 7168       16.65 1.test.w
     17 11:25:16 dd             21297  W 1048576 8192       13.01 1.test.w
     18 11:25:16 dd             21297  W 1048576 9216       14.06 1.test.w
     19 
     20 This shows NFS writes from dd each 1MB in size to 2 different files. The
     21 writes all had latency higher than 10ms.
     22 
     23 This "latency" is measured from when the operation was issued from the VFS
     24 interface to the file system, to when it completed. This spans everything:
     25 RPC latency, network latency, file system CPU cycles, file system locks, run
     26 queue latency, etc. This is a better measure of the latency suffered by
     27 applications reading from a NFS share and can better expose problems
     28 experienced by NFS clients.
     29 
     30 Note that this only traces the common NFS operations (read,write,open and
     31 getattr). I chose to include getattr as a significant percentage of NFS
     32 traffic end up being getattr calls and are a good indicator of problems
     33 with an NFS server.
     34 
     35 The threshold can be provided as an argument. E.g. I/O slower than 1 ms:
     36 
     37 ./nfsslower.py 1
     38 Tracing NFS operations that are slower than 1 ms
     39 TIME     COMM           PID    T BYTES   OFF_KB   LAT(ms) FILENAME
     40 11:40:16 cp             21583  R 131072  0           4.35 1.test
     41 11:40:16 cp             21583  R 131072  256         1.87 1.test
     42 11:40:16 cp             21583  R 131072  384         2.99 1.test
     43 11:40:16 cp             21583  R 131072  512         4.19 1.test
     44 11:40:16 cp             21583  R 131072  640         4.25 1.test
     45 11:40:16 cp             21583  R 131072  768         4.65 1.test
     46 11:40:16 cp             21583  R 131072  1280        1.08 1.test
     47 11:40:16 cp             21583  R 131072  1408        3.29 1.test
     48 11:40:16 cp             21583  R 131072  1792        3.12 1.test
     49 11:40:16 cp             21583  R 131072  3712        3.55 1.test
     50 11:40:16 cp             21583  R 131072  3840        1.12 1.test
     51 11:40:16 cp             21583  R 131072  4096        3.23 1.test
     52 11:40:16 cp             21583  R 131072  4224        2.73 1.test
     53 11:40:16 cp             21583  R 131072  4352        2.73 1.test
     54 11:40:16 cp             21583  R 131072  4480        6.09 1.test
     55 11:40:16 cp             21583  R 131072  5120        4.40 1.test
     56 [...]
     57 
     58 This shows all NFS_READS that were more than 1ms. Depending on your
     59 latency to your fileserver, you might need to tweak this value to
     60 remove 
     61 
     62 A threshold of 0 will trace all operations. Warning: the output will be
     63 verbose, as it will include all file system cache hits.
     64 
     65 ./nfsslower.py 0
     66 Tracing NFS operations
     67 11:56:50 dd             21852  W 1048576 0           0.42 1.test
     68 11:56:50 dd             21852  W 1048576 1024        0.46 1.test
     69 11:56:50 dd             21852  W 1048576 2048        0.36 1.test
     70 11:56:50 cp             21854  G 0       0           0.35 1.test
     71 11:56:50 cp             21854  O 0       0           0.33 1.test
     72 11:56:50 cp             21854  G 0       0           0.00 1.test
     73 11:56:50 cp             21854  R 131072  0           0.07 1.test
     74 11:56:50 cp             21854  R 131072  128         0.02 1.test
     75 11:56:50 cp             21854  R 131072  256         0.02 1.test
     76 11:56:50 cp             21854  R 131072  384         0.02 1.test
     77 11:56:50 cp             21854  R 131072  512         0.02 1.test
     78 11:56:50 cp             21854  R 131072  640         0.02 1.test
     79 11:56:50 cp             21854  R 131072  768         0.02 1.test
     80 11:56:50 cp             21854  R 131072  896         0.02 1.test
     81 11:56:50 cp             21854  R 131072  1024        0.02 1.test
     82 11:56:50 cp             21854  R 131072  1152        0.02 1.test
     83 11:56:50 cp             21854  R 131072  1280        0.02 1.test
     84 11:56:50 cp             21854  R 131072  1408        0.02 1.test
     85 11:56:50 cp             21854  R 131072  1536        0.02 1.test
     86 11:56:50 cp             21854  R 131072  1664        0.02 1.test
     87 11:56:50 cp             21854  R 131072  1792        0.02 1.test
     88 11:56:50 cp             21854  R 131072  1920        0.02 1.test
     89 11:56:50 cp             21854  R 131072  2048        0.02 1.test
     90 11:56:50 cp             21854  R 131072  2176        0.04 1.test
     91 11:56:50 cp             21854  R 131072  2304        0.02 1.test
     92 11:56:50 cp             21854  R 131072  2432        0.03 1.test
     93 11:56:50 cp             21854  R 131072  2560        0.03 1.test
     94 11:56:50 cp             21854  R 131072  2688        0.02 1.test
     95 11:56:50 cp             21854  R 131072  2816        0.03 1.test
     96 11:56:50 cp             21854  R 131072  2944        0.02 1.test
     97 11:56:50 cp             21854  R 0       3072        0.00 1.test
     98 11:56:50 ls             21855  G 0       0           0.00 1.test
     99 11:56:50 ls             21856  G 0       0           0.36 music
    100 11:56:50 ls             21856  G 0       0           0.00 music
    101 11:56:50 ls             21856  G 0       0           0.00 test
    102 11:56:50 ls             21856  G 0       0           0.00 ff
    103 11:56:50 ls             21856  G 0       0           0.00 34.log
    104 11:56:50 ls             21856  G 0       0           0.00 vmlinuz-linux
    105 11:56:50 ls             21856  G 0       0           0.00 2.test
    106 11:56:50 ls             21856  G 0       0           0.00 rt.log
    107 11:56:50 ls             21856  G 0       0           0.00 1.lod
    108 11:56:50 ls             21856  G 0       0           0.00 COPYRIGHT.txt
    109 11:56:50 ls             21856  G 0       0           0.00 gg
    110 11:56:50 ls             21856  G 0       0           0.00 qw.log
    111 11:56:50 ls             21856  G 0       0           0.00 README.md
    112 11:56:50 ls             21856  G 0       0           0.00 1.log
    113 
    114 The output now includes open operations ("O"), and reads ("R") wand getattrs ("G").
    115 A cp operation
    116 
    117 
    118 A -j option will print just the fields (parsable output, csv):
    119 
    120 ./nfsslower.py -j 0
    121 ENDTIME_us,TASK,PID,TYPE,BYTES,OFFSET_b,LATENCY_us,FILE
    122 87054476520,dd,22754,W,1048576,0,425,1.test
    123 87054482916,dd,22754,W,1048576,1048576,320,1.test
    124 87054488179,dd,22754,W,1048576,2097152,389,1.test
    125 87054511340,cp,22756,G,0,0,371,1.test
    126 87054511685,cp,22756,O,0,0,306,1.test
    127 87054511700,cp,22756,G,0,0,2,1.test
    128 87054512325,cp,22756,R,131072,0,56,1.test
    129 87054512432,cp,22756,R,131072,131072,22,1.test
    130 87054512520,cp,22756,R,131072,262144,32,1.test
    131 87054512600,cp,22756,R,131072,393216,21,1.test
    132 87054512678,cp,22756,R,131072,524288,21,1.test
    133 87054512803,cp,22756,R,131072,655360,56,1.test
    134 
    135 This may be useful for visualizing with another tool, for example, for
    136 producing a scatter plot of ENDTIME vs LATENCY, to look for time-based
    137 patterns.
    138 
    139 USAGE message:
    140 
    141 usage: nfsslower.py [-h] [-j] [-p PID] [min_ms]
    142 
    143 Trace READ, WRITE, OPEN and GETATTR NFS calls slower than a threshold,supports NFSv{3,4}
    144 
    145 positional arguments:
    146   min_ms             Minimum IO duration to trace in ms (default=10ms)
    147 
    148 optional arguments:
    149   -h, --help         show this help message and exit
    150   -j, --csv          just print fields: comma-separated values
    151   -p PID, --pid PID  Trace this pid only
    152 
    153  ./nfsslower         # trace operations slower than 10ms
    154  ./nfsslower 1       # trace operations slower than 1ms
    155  ./nfsslower -j 1    # ... 1 ms, parsable output (csv)
    156  ./nfsslower 0       # trace all nfs operations
    157  ./nfsslower -p 121  # trace pid 121 only
    158 
    159