Demonstrations of syscount, the Linux/eBPF version.


syscount summarizes syscall counts across the system or a specific process,
with optional latency information. It is very useful for general workload
characterization, for example:

# syscount
Tracing syscalls, printing top 10... Ctrl+C to quit.
[09:39:04]
SYSCALL                   COUNT
write                     10739
read                      10584
wait4                      1460
nanosleep                  1457
select                      795
rt_sigprocmask              689
clock_gettime               653
rt_sigaction                128
futex                        86
ioctl                        83
^C

These are the top 10 entries; you can get more by using the -T switch. Here,
the output indicates that the write and read syscalls were very common, followed
immediately by wait4, nanosleep, and so on. By default, syscount counts across
the entire system, but we can point it to a specific process of interest:

# syscount -p $(pidof dd)
Tracing syscalls, printing top 10... Ctrl+C to quit.
[09:40:21]
SYSCALL                   COUNT
read                    7878397
write                   7878397
^C

Indeed, dd's workload is a bit easier to characterize. Occasionally, the count
of syscalls is not enough, and you'd also want an aggregate latency:

# syscount -L
Tracing syscalls, printing top 10... Ctrl+C to quit.
[09:41:32]
SYSCALL                   COUNT        TIME (us)
select                       16      3415860.022
nanosleep                   291        12038.707
ftruncate                     1          122.939
write                         4           63.389
stat                          1           23.431
fstat                         1            5.088
[unknown: 321]               32            4.965
timerfd_settime               1            4.830
ioctl                         3            4.802
kill                          1            4.342
^C

The select and nanosleep calls are responsible for a lot of time, but remember
these are blocking calls. This output was taken from a mostly idle system. Note
the "unknown" entry -- syscall 321 is the bpf() syscall, which is not in the
table used by this tool (borrowed from strace sources).

Another direction would be to understand which processes are making a lot of
syscalls, thus responsible for a lot of activity.
This is what the -P switch does:

# syscount -P
Tracing syscalls, printing top 10... Ctrl+C to quit.
[09:58:13]
PID    COMM               COUNT
13820  vim                  548
30216  sshd                 149
29633  bash                  72
25188  screen                70
25776  mysqld                30
31285  python                10
529    systemd-udevd          9
1      systemd                8
494    systemd-journal        5
^C

This is again from a mostly idle system over an interval of a few seconds.

Sometimes, you'd only care about failed syscalls -- these are the ones that
might be worth investigating with follow-up tools like opensnoop, execsnoop,
or trace. Use the -x switch for this; the following example also demonstrates
the -i switch, for printing at predefined intervals:

# syscount -x -i 5
Tracing failed syscalls, printing top 10... Ctrl+C to quit.
[09:44:16]
SYSCALL                   COUNT
futex                        13
getxattr                     10
stat                          8
open                          6
wait4                         3
access                        2
[unknown: 321]                1

[09:44:21]
SYSCALL                   COUNT
futex                        12
getxattr                     10
[unknown: 321]                2
wait4                         1
access                        1
pause                         1
^C

Similar to -x/--failures, sometimes you only care about certain syscall
errors like EPERM or ENOENT -- these are the ones that might be worth
investigating with follow-up tools like opensnoop, execsnoop, or
trace. Use the -e/--errno switch for this; the following example combines
it with the -i switch to print ENOENT failures at predefined intervals:

# syscount -e ENOENT -i 5
Tracing syscalls, printing top 10... Ctrl+C to quit.
[13:15:57]
SYSCALL                   COUNT
stat                       4669
open                       1951
access                      561
lstat                        62
openat                       42
readlink                      8
execve                        4
newfstatat                    1

[13:16:02]
SYSCALL                   COUNT
lstat                     18506
stat                      13087
open                       2907
access                      412
openat                       19
readlink                     12
execve                        7
connect                       6
unlink                        1
rmdir                         1
^C

USAGE:
# syscount -h
usage: syscount.py [-h] [-p PID] [-i INTERVAL] [-T TOP] [-x] [-e ERRNO] [-L]
                   [-m] [-P] [-l]

Summarize syscall counts and latencies.

optional arguments:
  -h, --help            show this help message and exit
  -p PID, --pid PID     trace only this pid
  -i INTERVAL, --interval INTERVAL
                        print summary at this interval (seconds)
  -d DURATION, --duration DURATION
                        total duration of trace, in seconds
  -T TOP, --top TOP     print only the top syscalls by count or latency
  -x, --failures        trace only failed syscalls (return < 0)
  -e ERRNO, --errno ERRNO
                        trace only syscalls that return this error (numeric or
                        EPERM, etc.)
  -L, --latency         collect syscall latency
  -m, --milliseconds    display latency in milliseconds (default:
                        microseconds)
  -P, --process         count by process and not by syscall
  -l, --list            print list of recognized syscalls and exit
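
The -L example earlier reports aggregate time per syscall, so a call that
blocks for a long time can dominate the table even with a small count. One way
to put the rows on a comparable footing is to divide TIME by COUNT. The
following is a minimal sketch of that post-processing, not something syscount
does itself: the helper name mean_latency_us and the SAMPLE string are
hypothetical, and the sample rows are copied from the -L output shown above.

```python
# Hypothetical post-processing helper (not part of syscount): derive the mean
# per-call latency from the "SYSCALL COUNT TIME (us)" rows of `syscount -L`.

# A few data rows copied from the -L example output (header line omitted).
SAMPLE = """\
select                       16      3415860.022
nanosleep                   291        12038.707
write                         4           63.389
[unknown: 321]               32            4.965
"""


def mean_latency_us(rows):
    """Return {syscall_name: total_time_us / count} for each data row."""
    result = {}
    for line in rows.splitlines():
        if not line.strip():
            continue  # skip blank separator lines
        # Split from the right so names such as "[unknown: 321]" stay intact.
        name, count, total_us = line.rsplit(None, 2)
        result[name.strip()] = float(total_us) / int(count)
    return result


if __name__ == "__main__":
    averages = mean_latency_us(SAMPLE)
    # Print the slowest calls (by mean latency) first.
    for name, avg in sorted(averages.items(), key=lambda kv: -kv[1]):
        print(f"{name:<20} {avg:12.3f} us/call")
```

On the sample rows, select averages roughly 213 ms per call while nanosleep
averages roughly 41 us per call, which fits the earlier remark that these are
blocking calls on a mostly idle system.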