      1 Demonstrations of stackcount, the Linux eBPF/bcc version.
      2 
      3 
      4 This program traces functions and frequency counts them with their entire
      5 stack trace, summarized in-kernel for efficiency. For example, counting
      6 stack traces that led to the submit_bio() kernel function, which creates
      7 block device I/O:
      8 
      9 # ./stackcount submit_bio
     10 Tracing 1 functions for "submit_bio"... Hit Ctrl-C to end.
     11 ^C
     12   submit_bio
     13   ext4_writepages
     14   do_writepages
     15   __filemap_fdatawrite_range
     16   filemap_flush
     17   ext4_alloc_da_blocks
     18   ext4_release_file
     19   __fput
     20   ____fput
     21   task_work_run
     22   exit_to_usermode_loop
     23   syscall_return_slowpath
     24   entry_SYSCALL_64_fastpath
     25   [unknown]
     26   [unknown]
     27     tar [15069]
     28     5
     29 
     30   submit_bio
     31   ext4_bio_write_page
     32   mpage_submit_page
     33   mpage_map_and_submit_buffers
     34   ext4_writepages
     35   do_writepages
     36   __filemap_fdatawrite_range
     37   filemap_flush
     38   ext4_alloc_da_blocks
     39   ext4_release_file
     40   __fput
     41   ____fput
     42   task_work_run
     43   exit_to_usermode_loop
     44   syscall_return_slowpath
     45   entry_SYSCALL_64_fastpath
     46   [unknown]
     47   [unknown]
     48     tar [15069]
     49     15
     50 
     51   submit_bio
     52   ext4_readpages
     53   __do_page_cache_readahead
     54   ondemand_readahead
     55   page_cache_async_readahead
     56   generic_file_read_iter
     57   __vfs_read
     58   vfs_read
     59   sys_read
     60   entry_SYSCALL_64_fastpath
     61   [unknown]
     62     tar [15069]
     63     113
     64 
     65 Detaching...
     66 
The output shows unique stack traces, in order from leaf (on-CPU) to root,
followed by their occurrence count. The last stack trace in the above output
shows syscall handling, sys_read(), vfs_read(), and then "readahead" functions:
it looks like an application-issued file read has triggered readahead. The
application can be seen after the stack trace, in this case, "tar [15069]"
for the "tar" command, PID 15069.
     73 
The order of printed stack traces is from least to most frequent. The most
frequent in this case, the ext4_readpages() stack, was taken 113 times during
tracing.
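
If you want to post-process this output (for example, to pull out the most
frequent stack programmatically), the layout is regular enough to parse.
Here's a minimal Python sketch. It is not part of stackcount: the name
parse_stacks() and the process-line regular expression are my own assumptions,
based only on the output shown above, and the sample text is abbreviated from
that output.

```python
import re

# Assumed shape of the process line, e.g. "tar [15069]".
PROC_RE = re.compile(r"^(?P<comm>.+) \[(?P<pid>\d+)\]$")

def parse_stacks(text):
    """Parse default stackcount output into (frames, comm, pid, count) tuples."""
    records = []
    # Records are separated by blank lines.
    for block in re.split(r"\n\s*\n", text.strip("\n")):
        lines = [ln.strip() for ln in block.splitlines()]
        # Drop blanks, the echoed ^C, and the start/end banners.
        lines = [ln for ln in lines
                 if ln and ln != "^C"
                 and not ln.startswith(("Tracing ", "Detaching"))]
        if len(lines) < 2 or not lines[-1].isdigit():
            continue  # not a stack record
        count = int(lines.pop())
        comm, pid = None, None
        m = PROC_RE.match(lines[-1])
        if m:  # the process line is absent with -K
            comm, pid = m.group("comm"), int(m.group("pid"))
            lines.pop()
        frames = [ln for ln in lines if ln != "--"]  # ignore the -d delimiter
        records.append((frames, comm, pid, count))
    return records

# Abbreviated sample: only the first two frames of two stacks from above.
sample = """\
Tracing 1 functions for "submit_bio"... Hit Ctrl-C to end.
^C
  submit_bio
  ext4_writepages
    tar [15069]
    5

  submit_bio
  ext4_readpages
    tar [15069]
    113

Detaching...
"""

records = parse_stacks(sample)
busiest = max(records, key=lambda r: r[3])
print(busiest[0][1], busiest[3])  # prints: ext4_readpages 113
```

Since stackcount already prints stacks from least to most frequent, the last
record is the busiest one; max() is used here only so the sketch does not
depend on that ordering.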
     77 
The "[unknown]" frames are from user-level, since this simple workload is
the tar command, which apparently has been compiled without frame pointers.
It's a common compiler optimization, but it breaks frame pointer-based stack
walkers. Similar broken stacks will be seen by other profilers and debuggers
that use frame pointers. Hopefully your application preserves them so that
the user-level stack trace is visible. So how does one get frame pointers, if
your application doesn't have them to start with? For the current bcc (until
it supports other stack walkers), you need to be running application binaries
that preserve frame pointers, eg, built using gcc's -fno-omit-frame-pointer.
That's about all I'll say here: this is a big topic that is not bcc/BPF
specific.
     88 
     89 It can be useful to trace the path to submit_bio to explain unusual rates of
     90 disk IOPS. These could have in-kernel origins (eg, background scrub).
     91 
     92 
     93 Now adding the -d option to delimit kernel and user stacks:
     94 
     95 # ./stackcount -d submit_bio
     96 Tracing 1 functions for "submit_bio"... Hit Ctrl-C to end.
     97 ^C
     98   submit_bio
     99   submit_bh
    100   journal_submit_commit_record
    101   jbd2_journal_commit_transaction
    102   kjournald2
    103   kthread
    104   ret_from_fork
    105     --
    106     jbd2/xvda1-8 [405]
    107     1
    108 
    109   submit_bio
    110   submit_bh
    111   jbd2_journal_commit_transaction
    112   kjournald2
    113   kthread
    114   ret_from_fork
    115     --
    116     jbd2/xvda1-8 [405]
    117     2
    118 
    119   submit_bio
    120   ext4_writepages
    121   do_writepages
    122   __filemap_fdatawrite_range
    123   filemap_flush
    124   ext4_alloc_da_blocks
    125   ext4_release_file
    126   __fput
    127   ____fput
    128   task_work_run
    129   exit_to_usermode_loop
    130   syscall_return_slowpath
    131   entry_SYSCALL_64_fastpath
    132     --
    133   [unknown]
    134   [unknown]
    135     tar [15187]
    136     5
    137 
    138   submit_bio
    139   ext4_bio_write_page
    140   mpage_submit_page
    141   mpage_map_and_submit_buffers
    142   ext4_writepages
    143   do_writepages
    144   __filemap_fdatawrite_range
    145   filemap_flush
    146   ext4_alloc_da_blocks
    147   ext4_release_file
    148   __fput
    149   ____fput
    150   task_work_run
    151   exit_to_usermode_loop
    152   syscall_return_slowpath
    153   entry_SYSCALL_64_fastpath
    154     --
    155   [unknown]
    156   [unknown]
    157     tar [15187]
    158     15
    159 
    160   submit_bio
    161   ext4_readpages
    162   __do_page_cache_readahead
    163   ondemand_readahead
    164   page_cache_async_readahead
    165   generic_file_read_iter
    166   __vfs_read
    167   vfs_read
    168   sys_read
    169   entry_SYSCALL_64_fastpath
    170     --
    171   [unknown]
    172   [unknown]
    173   [unknown]
    174     tar [15187]
    175     171
    176 
    177 Detaching...
    178 
    179 A "--" is printed between the kernel and user stacks.
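
The delimiter also makes it easy to separate the two halves of a stack in a
script. A minimal Python sketch, assuming the frames have already been read
into a leaf-to-root list (split_stack() is a hypothetical helper name, and the
frame list is shortened from the output above):

```python
def split_stack(frames, delimiter="--"):
    """Split a leaf-to-root frame list into (kernel, user) at the delimiter."""
    if delimiter in frames:
        i = frames.index(delimiter)
        return frames[:i], frames[i + 1:]
    # Without -d there is no marker, so the two parts cannot be told apart.
    return frames, []

frames = ["submit_bio", "ext4_readpages", "sys_read",
          "entry_SYSCALL_64_fastpath", "--", "[unknown]", "[unknown]"]
kernel, user = split_stack(frames)
print(len(kernel), len(user))  # prints: 4 2
```

For kernel threads such as jbd2/xvda1-8 above, the user part comes back
empty, since nothing follows the "--".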
    180 
    181 
    182 As a different example, here is the kernel function hrtimer_init_sleeper():
    183 
    184 # ./stackcount.py -d hrtimer_init_sleeper
    185 Tracing 1 functions for "hrtimer_init_sleeper"... Hit Ctrl-C to end.
    186 ^C
    187   hrtimer_init_sleeper
    188   do_futex
    189   SyS_futex
    190   entry_SYSCALL_64_fastpath
    191     --
    192   [unknown]
    193     containerd [16020]
    194     1
    195 
    196   hrtimer_init_sleeper
    197   do_futex
    198   SyS_futex
    199   entry_SYSCALL_64_fastpath
    200     --
    201   __pthread_cond_timedwait
    202   Monitor::IWait(Thread*, long)
    203   Monitor::wait(bool, long, bool)
    204   CompileQueue::get()
    205   CompileBroker::compiler_thread_loop()
    206   JavaThread::thread_main_inner()
    207   JavaThread::run()
    208   java_start(Thread*)
    209   start_thread
    210     java [4996]
    211     1
    212 
    213   hrtimer_init_sleeper
    214   do_futex
    215   SyS_futex
    216   entry_SYSCALL_64_fastpath
    217     --
    218   [unknown]
    219   [unknown]
    220     containerd [16020]
    221     1
    222 
    223   hrtimer_init_sleeper
    224   do_futex
    225   SyS_futex
    226   entry_SYSCALL_64_fastpath
    227     --
    228   __pthread_cond_timedwait
    229   VMThread::loop()
    230   VMThread::run()
    231   java_start(Thread*)
    232   start_thread
    233     java [4996]
    234     3
    235 
    236   hrtimer_init_sleeper
    237   do_futex
    238   SyS_futex
    239   entry_SYSCALL_64_fastpath
    240     --
    241   [unknown]
    242     dockerd [16008]
    243     4
    244 
    245   hrtimer_init_sleeper
    246   do_futex
    247   SyS_futex
    248   entry_SYSCALL_64_fastpath
    249     --
    250   [unknown]
    251   [unknown]
    252     dockerd [16008]
    253     4
    254 
    255   hrtimer_init_sleeper
    256   do_futex
    257   SyS_futex
    258   entry_SYSCALL_64_fastpath
    259     --
    260   __pthread_cond_timedwait
    261   Lio/netty/util/ThreadDeathWatcher$Watcher;::run
    262   Interpreter
    263   Interpreter
    264   call_stub
    265   JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)
    266   JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*)
    267   JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, Symbol*, Symbol*, Thread*)
    268   thread_entry(JavaThread*, Thread*)
    269   JavaThread::thread_main_inner()
    270   JavaThread::run()
    271   java_start(Thread*)
    272   start_thread
    273     java [4996]
    274     4
    275 
    276   hrtimer_init_sleeper
    277   do_futex
    278   SyS_futex
    279   entry_SYSCALL_64_fastpath
    280     --
    281   __pthread_cond_timedwait
    282   clock_gettime
    283   [unknown]
    284     java [4996]
    285     79
    286 
    287 Detaching...
    288 
    289 I was just trying to find a more interesting example. This output includes
    290 some Java stacks where user-level has been walked correctly (even includes a
    291 JIT symbol translation). dockerd and containerd don't have frame pointers
    292 (grumble), but Java does (which is running with -XX:+PreserveFramePointer).
    293 
    294 
    295 Here's another kernel function, ip_output():
    296 
    297 # ./stackcount.py -d ip_output
    298 Tracing 1 functions for "ip_output"... Hit Ctrl-C to end.
    299 ^C
    300   ip_output
    301   ip_queue_xmit
    302   tcp_transmit_skb
    303   tcp_write_xmit
    304   __tcp_push_pending_frames
    305   tcp_push
    306   tcp_sendmsg
    307   inet_sendmsg
    308   sock_sendmsg
    309   sock_write_iter
    310   __vfs_write
    311   vfs_write
    312   SyS_write
    313   entry_SYSCALL_64_fastpath
    314     --
    315   __write_nocancel
    316   [unknown]
    317     sshd [15015]
    318     5
    319 
    320   ip_output
    321   ip_queue_xmit
    322   tcp_transmit_skb
    323   tcp_write_xmit
    324   __tcp_push_pending_frames
    325   tcp_push
    326   tcp_sendmsg
    327   inet_sendmsg
    328   sock_sendmsg
    329   sock_write_iter
    330   __vfs_write
    331   vfs_write
    332   SyS_write
    333   entry_SYSCALL_64_fastpath
    334     --
    335   __write_nocancel
    336   [unknown]
    337   [unknown]
    338     sshd [8234]
    339     5
    340 
    341   ip_output
    342   ip_queue_xmit
    343   tcp_transmit_skb
    344   tcp_write_xmit
    345   __tcp_push_pending_frames
    346   tcp_push
    347   tcp_sendmsg
    348   inet_sendmsg
    349   sock_sendmsg
    350   sock_write_iter
    351   __vfs_write
    352   vfs_write
    353   SyS_write
    354   entry_SYSCALL_64_fastpath
    355     --
    356   __write_nocancel
    357     sshd [15015]
    358     7
    359 
    360 Detaching...
    361 
    362 This time just sshd is triggering ip_output() calls.
    363 
    364 
    365 Watch what happens if I filter on kernel stacks only (-K) for ip_output():
    366 
    367 # ./stackcount.py -K ip_output
    368 Tracing 1 functions for "ip_output"... Hit Ctrl-C to end.
    369 ^C
    370   ip_output
    371   ip_queue_xmit
    372   tcp_transmit_skb
    373   tcp_write_xmit
    374   __tcp_push_pending_frames
    375   tcp_push
    376   tcp_sendmsg
    377   inet_sendmsg
    378   sock_sendmsg
    379   sock_write_iter
    380   __vfs_write
    381   vfs_write
    382   SyS_write
    383   entry_SYSCALL_64_fastpath
    384     13
    385 
    386 Detaching...
    387 
    388 They have grouped together as a single unique stack, since the kernel part
    389 was the same.
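
The same grouping can be reproduced offline from -d style records by keying
on the kernel frames only. This is a Python sketch of the idea, not how
stackcount does it in-kernel. The records are hypothetical and shortened, with
counts borrowed from the -d run above; the -K run was a separate trace, which
is why its total (13) differs.

```python
from collections import Counter

def group_by_kernel(records, delimiter="--"):
    """Sum counts of stacks that share the same kernel frames."""
    totals = Counter()
    for frames, count in records:
        if delimiter in frames:
            kernel = frames[:frames.index(delimiter)]
        else:
            kernel = frames
        totals[tuple(kernel)] += count
    return totals

# Hypothetical, shortened records: same kernel path, different user stacks.
records = [
    (["ip_output", "ip_queue_xmit", "--", "__write_nocancel", "[unknown]"], 5),
    (["ip_output", "ip_queue_xmit", "--", "__write_nocancel", "[unknown]",
      "[unknown]"], 5),
    (["ip_output", "ip_queue_xmit", "--", "__write_nocancel"], 7),
]

totals = group_by_kernel(records)
for kernel, count in totals.items():
    print(";".join(kernel), count)  # prints: ip_output;ip_queue_xmit 17
```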
    390 
    391 
Here are just the user stacks, fetched during the kernel function ip_output():
    393 
    394 # ./stackcount.py -U ip_output
    395 Tracing 1 functions for "ip_output"... Hit Ctrl-C to end.
    396 ^C
    397   [unknown]
    398     snmpd [1645]
    399     1
    400 
    401   __write_nocancel
    402   [unknown]
    403   [unknown]
    404     sshd [8234]
    405     3
    406 
    407   __write_nocancel
    408     sshd [15015]
    409     4
    410 
    411 I should really run a custom sshd with frame pointers so we can see its
    412 stack trace...
    413 
    414 
    415 User-space functions can also be traced if a library name is provided. For
    416 example, to quickly identify code locations that allocate heap memory for
    417 PID 4902 (using -p), by tracing malloc from libc ("c:malloc"):
    418 
    419 # ./stackcount -p 4902 c:malloc
    420 Tracing 1 functions for "malloc"... Hit Ctrl-C to end.
    421 ^C
    422   malloc
    423   rbtree_new
    424   main
    425   [unknown]
    426     12
    427 
    428   malloc
    429   _rbtree_node_new_internal
    430   _rbtree_node_insert
    431   rbtree_insert
    432   main
    433   [unknown]
    434     1189
    435 
    436 Detaching...
    437 
    438 Kernel stacks are absent as this didn't enter kernel code.
    439 
    440 Note that user-space uses of stackcount can be somewhat more limited because
    441 a lot of user-space libraries and binaries are compiled without frame-pointers
    442 as discussed earlier (the -fomit-frame-pointer compiler default) or are used
    443 without debuginfo.
    444 
    445 
    446 In addition to kernel and user-space functions, kernel tracepoints and USDT
    447 tracepoints are also supported. 
    448 
    449 For example, to determine where threads are being created in a particular 
    450 process, use the pthread_create USDT tracepoint:
    451 
    452 # ./stackcount -p $(pidof parprimes) u:pthread:pthread_create
    453 Tracing 1 functions for "u:pthread:pthread_create"... Hit Ctrl-C to end.
    454 ^C
    455 
    456     parprimes [11923]
    457   pthread_create@@GLIBC_2.2.5
    458   main
    459   __libc_start_main
    460   [unknown]
    461     7
    462 
    463 You can use "readelf -n file" to see if it has USDT tracepoints.
    464 
    465 
    466 Similarly, to determine where context switching is happening in the kernel, 
    467 use the sched:sched_switch kernel tracepoint:
    468 
    469 # ./stackcount t:sched:sched_switch
    470   __schedule
    471   schedule
    472   worker_thread
    473   kthread
    474   ret_from_fork
    475     kworker/0:2 [25482]
    476     1
    477 
    478   __schedule
    479   schedule
    480   schedule_hrtimeout_range_clock
    481   schedule_hrtimeout_range
    482   ep_poll
    483   SyS_epoll_wait
    484   entry_SYSCALL_64_fastpath
    485   epoll_wait
    486   Lsun/nio/ch/SelectorImpl;::lockAndDoSelect
    487   Lsun/nio/ch/SelectorImpl;::select
    488   Lio/netty/channel/nio/NioEventLoop;::select
    489   Lio/netty/channel/nio/NioEventLoop;::run
    490   Interpreter
    491   Interpreter
    492   call_stub
    493   JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)
    494   JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*)
    495   JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, Symbol*, Symbol*, Thread*)
    496   thread_entry(JavaThread*, Thread*)
    497   JavaThread::thread_main_inner()
    498   JavaThread::run()
    499   java_start(Thread*)
    500   start_thread
    501     java [4996]
    502     1
    503 
    504 ... (omitted for brevity)
    505 
    506   __schedule
    507   schedule
    508   schedule_preempt_disabled
    509   cpu_startup_entry
    510   xen_play_dead
    511   arch_cpu_idle_dead
    512   cpu_startup_entry
    513   cpu_bringup_and_idle
    514     swapper/1 [0]
    515     289
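
To summarize the pattern forms used so far: a bare name is a kernel function,
"lib:func" is a user-space function, "u:lib:probe" is a USDT probe, and
"t:category:event" is a kernel tracepoint. The Python sketch below only
classifies these four forms as they appear in this document. parse_pattern()
is a hypothetical helper, and stackcount's own parser may accept more forms
or behave differently.

```python
def parse_pattern(pattern):
    """Classify a stackcount-style pattern (forms from this document only)."""
    parts = pattern.split(":")
    if len(parts) == 1:
        return ("kernel function", None, parts[0])
    if len(parts) == 2:
        return ("user function", parts[0], parts[1])
    if len(parts) == 3 and parts[0] == "u":
        return ("USDT probe", parts[1], parts[2])
    if len(parts) == 3 and parts[0] == "t":
        return ("kernel tracepoint", parts[1], parts[2])
    raise ValueError("unrecognized pattern: %r" % pattern)

for p in ["submit_bio", "c:malloc", "u:pthread:pthread_create",
          "t:sched:sched_switch"]:
    print(p, "->", parse_pattern(p))
```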
    516 
    517 
    518 A -i option can be used to set an output interval, and -T to include a
    519 timestamp. For example:
    520 
    521 # ./stackcount.py -Tdi 1 submit_bio
    522 Tracing 1 functions for "submit_bio"... Hit Ctrl-C to end.
    523 
    524 06:05:13
    525 
    526 06:05:14
    527   submit_bio
    528   xfs_do_writepage
    529   write_cache_pages
    530   xfs_vm_writepages
    531   do_writepages
    532   __writeback_single_inode
    533   writeback_sb_inodes
    534   __writeback_inodes_wb
    535   wb_writeback
    536   wb_workfn
    537   process_one_work
    538   worker_thread
    539   kthread
    540   ret_from_fork
    541     --
    542     kworker/u16:1 [15686]
    543     1
    544 
    545   submit_bio
    546   process_one_work
    547   worker_thread
    548   kthread
    549   ret_from_fork
    550     --
    551     kworker/u16:0 [16007]
    552     1
    553 
    554   submit_bio
    555   xfs_buf_submit
    556   xlog_bdstrat
    557   xlog_sync
    558   xlog_state_release_iclog
    559   _xfs_log_force
    560   xfs_log_force
    561   xfs_fs_sync_fs
    562   sync_fs_one_sb
    563   iterate_supers
    564   sys_sync
    565   entry_SYSCALL_64_fastpath
    566     --
    567   [unknown]
    568     sync [16039]
    569     1
    570 
    571   submit_bio
    572   submit_bh
    573   journal_submit_commit_record
    574   jbd2_journal_commit_transaction
    575   kjournald2
    576   kthread
    577   ret_from_fork
    578     --
    579     jbd2/xvda1-8 [405]
    580     1
    581 
    582   submit_bio
    583   process_one_work
    584   worker_thread
    585   kthread
    586   ret_from_fork
    587     --
    588     kworker/0:2 [25482]
    589     2
    590 
    591   submit_bio
    592   ext4_writepages
    593   do_writepages
    594   __writeback_single_inode
    595   writeback_sb_inodes
    596   __writeback_inodes_wb
    597   wb_writeback
    598   wb_workfn
    599   process_one_work
    600   worker_thread
    601   kthread
    602   ret_from_fork
    603     --
    604     kworker/u16:0 [16007]
    605     4
    606 
    607   submit_bio
    608   xfs_vm_writepages
    609   do_writepages
    610   __writeback_single_inode
    611   writeback_sb_inodes
    612   __writeback_inodes_wb
    613   wb_writeback
    614   wb_workfn
    615   process_one_work
    616   worker_thread
    617   kthread
    618   ret_from_fork
    619     --
    620     kworker/u16:1 [15686]
    621     5
    622 
    623   submit_bio
    624   __block_write_full_page
    625   block_write_full_page
    626   blkdev_writepage
    627   __writepage
    628   write_cache_pages
    629   generic_writepages
    630   blkdev_writepages
    631   do_writepages
    632   __filemap_fdatawrite_range
    633   filemap_fdatawrite
    634   fdatawrite_one_bdev
    635   iterate_bdevs
    636   sys_sync
    637   entry_SYSCALL_64_fastpath
    638     --
    639   [unknown]
    640     sync [16039]
    641     7
    642 
    643   submit_bio
    644   submit_bh
    645   jbd2_journal_commit_transaction
    646   kjournald2
    647   kthread
    648   ret_from_fork
    649     --
    650     jbd2/xvda1-8 [405]
    651     8
    652 
    653   submit_bio
    654   ext4_bio_write_page
    655   mpage_submit_page
    656   mpage_map_and_submit_buffers
    657   ext4_writepages
    658   do_writepages
    659   __writeback_single_inode
    660   writeback_sb_inodes
    661   __writeback_inodes_wb
    662   wb_writeback
    663   wb_workfn
    664   process_one_work
    665   worker_thread
    666   kthread
    667   ret_from_fork
    668     --
    669     kworker/u16:0 [16007]
    670     8
    671 
    672   submit_bio
    673   __block_write_full_page
    674   block_write_full_page
    675   blkdev_writepage
    676   __writepage
    677   write_cache_pages
    678   generic_writepages
    679   blkdev_writepages
    680   do_writepages
    681   __writeback_single_inode
    682   writeback_sb_inodes
    683   __writeback_inodes_wb
    684   wb_writeback
    685   wb_workfn
    686   process_one_work
    687   worker_thread
    688   kthread
    689   ret_from_fork
    690     --
    691     kworker/u16:0 [16007]
    692     60
    693 
    694 
    695 06:05:15
    696 
    697 06:05:16
    698 
    699 Detaching...
    700 
This only included output for the 06:05:14 interval. The other intervals
did not span block device I/O.
    703 
    704 
    705 The -s output prints the return instruction offset for each function (aka
    706 symbol offset). Eg:
    707 
    708 # ./stackcount.py -s tcp_sendmsg
    709 Tracing 1 functions for "tcp_sendmsg"... Hit Ctrl-C to end.
    710 ^C
    711   tcp_sendmsg+0x1
    712   sock_sendmsg+0x38
    713   sock_write_iter+0x85
    714   __vfs_write+0xe3
    715   vfs_write+0xb8
    716   SyS_write+0x55
    717   entry_SYSCALL_64_fastpath+0x1e
    718   __write_nocancel+0x7
    719     sshd [15015]
    720     3
    721 
    722   tcp_sendmsg+0x1
    723   sock_sendmsg+0x38
    724   sock_write_iter+0x85
    725   __vfs_write+0xe3
    726   vfs_write+0xb8
    727   SyS_write+0x55
    728   entry_SYSCALL_64_fastpath+0x1e
    729   __write_nocancel+0x7
    730     sshd [8234]
    731     3
    732 
    733 Detaching...
    734 
    735 If it wasn't clear how one function called another, knowing the instruction
    736 offset can help you locate the lines of code from a disassembly dump.
    737 
    738 
    739 The -v output is verbose, and shows raw addresses:
    740 
# ./stackcount.py -v tcp_sendmsg
    742 Tracing 1 functions for "tcp_sendmsg"... Hit Ctrl-C to end.
    743 ^C
    744   ffffffff817b05c1 tcp_sendmsg
    745   ffffffff8173ea48 sock_sendmsg
    746   ffffffff8173eae5 sock_write_iter
    747   ffffffff81232b33 __vfs_write
    748   ffffffff812331b8 vfs_write
    749   ffffffff81234625 SyS_write
    750   ffffffff818739bb entry_SYSCALL_64_fastpath
    751   7f120511e6e0     __write_nocancel
    752     sshd [8234]
    753     3
    754 
    755   ffffffff817b05c1 tcp_sendmsg
    756   ffffffff8173ea48 sock_sendmsg
    757   ffffffff8173eae5 sock_write_iter
    758   ffffffff81232b33 __vfs_write
    759   ffffffff812331b8 vfs_write
    760   ffffffff81234625 SyS_write
    761   ffffffff818739bb entry_SYSCALL_64_fastpath
    762   7f919c5a26e0     __write_nocancel
    763     sshd [15015]
    764     11
    765 
    766 Detaching...
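
The -s offsets and the -v raw addresses describe the same return addresses,
so one can be used to sanity-check the other: subtracting the offset from the
raw address should give the start of the function. Here's a Python sketch of
that arithmetic, using the sock_sendmsg frame from the two listings above. It
assumes both listings were captured on the same booted kernel; parse_frame()
and symbol_start() are hypothetical helper names.

```python
def parse_frame(frame):
    """Split 'symbol+0xOFFSET' (the -s format) into (symbol, offset)."""
    symbol, sep, offset = frame.rpartition("+0x")
    if not sep:
        return frame, 0  # no offset present
    return symbol, int(offset, 16)

def symbol_start(address, offset):
    """Address where the function begins, given an address inside it."""
    return address - offset

sym, off = parse_frame("sock_sendmsg+0x38")        # from the -s listing
start = symbol_start(0xffffffff8173ea48, off)      # from the -v listing
print(sym, hex(start))  # prints: sock_sendmsg 0xffffffff8173ea10
```

The resulting start address is what you would look for in a disassembly or
symbol table to find the function, before counting forward by the offset.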
    767 
    768 
    769 A wildcard can also be used. Eg, all functions beginning with "tcp_send",
    770 kernel stacks only (-K) with offsets (-s):
    771 
    772 # ./stackcount -Ks 'tcp_send*'
    773 Tracing 14 functions for "tcp_send*"... Hit Ctrl-C to end.
    774 ^C
  tcp_send_delayed_ack+0x1
  tcp_rcv_established+0x3b1
  tcp_v4_do_rcv+0x130
  tcp_v4_rcv+0x8e0
  ip_local_deliver_finish+0x9f
  ip_local_deliver+0x51
  ip_rcv_finish+0x8a
  ip_rcv+0x29d
  __netif_receive_skb_core+0x637
  __netif_receive_skb+0x18
  netif_receive_skb_internal+0x23
    1

  tcp_send_delayed_ack+0x1
  tcp_rcv_established+0x222
  tcp_v4_do_rcv+0x130
  tcp_v4_rcv+0x8e0
  ip_local_deliver_finish+0x9f
  ip_local_deliver+0x51
  ip_rcv_finish+0x8a
  ip_rcv+0x29d
  __netif_receive_skb_core+0x637
  __netif_receive_skb+0x18
  netif_receive_skb_internal+0x23
    4

  tcp_send_mss+0x1
  inet_sendmsg+0x67
  sock_sendmsg+0x38
  sock_write_iter+0x78
  __vfs_write+0xaa
  vfs_write+0xa9
  sys_write+0x46
  entry_SYSCALL_64_fastpath+0x16
    7

  tcp_sendmsg+0x1
  sock_sendmsg+0x38
  sock_write_iter+0x78
  __vfs_write+0xaa
  vfs_write+0xa9
  sys_write+0x46
  entry_SYSCALL_64_fastpath+0x16
    7
    819 
    820 Detaching...
    821 
    822 Use -r to allow regular expressions.
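
A "*" wildcard pattern is equivalent to an anchored regular expression in
which each "*" becomes ".*". The Python sketch below shows that translation so
you can preview which names a pattern would match. wildcard_to_regex() is a
hypothetical helper and the list of function names is made up for
illustration; I'm not claiming this is exactly how stackcount converts
patterns internally.

```python
import re

def wildcard_to_regex(pattern):
    """Translate a '*' wildcard pattern into an anchored regular expression."""
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return "^" + ".*".join(literal_parts) + "$"

# Made-up candidate names, for illustration only.
names = ["tcp_sendmsg", "tcp_send_mss", "tcp_recvmsg", "do_tcp_sendpages"]

regex = wildcard_to_regex("tcp_send*")
matched = [n for n in names if re.match(regex, n)]
print(regex, matched)
```

Because the expression is anchored, "tcp_send*" does not match
do_tcp_sendpages; you would need a leading "*" (or -r with an unanchored
expression) for that.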
    823 
    824 
    825 The -f option will emit folded output, which can be used as input to other
    826 tools including flame graphs. For example, with delimiters as well:
    827 
    828 # ./stackcount.py -df t:sched:sched_switch
    829 ^Csnmp-pass;[unknown];[unknown];[unknown];[unknown];[unknown];-;entry_SYSCALL_64_fastpath;SyS_select;core_sys_select;do_select;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule 1
    830 kworker/7:0;-;ret_from_fork;kthread;worker_thread;schedule;__schedule 1
    831 watchdog/0;-;ret_from_fork;kthread;smpboot_thread_fn;schedule;__schedule 1
    832 snmp-pass;[unknown];[unknown];[unknown];[unknown];[unknown];-;entry_SYSCALL_64_fastpath;SyS_select;core_sys_select;do_select;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule 1
    833 svscan;[unknown];-;entry_SYSCALL_64_fastpath;SyS_nanosleep;hrtimer_nanosleep;do_nanosleep;schedule;__schedule 1
    834 python;[unknown];__select_nocancel;-;entry_SYSCALL_64_fastpath;SyS_select;core_sys_select;do_select;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule 1
    835 kworker/2:0;-;ret_from_fork;kthread;worker_thread;schedule;__schedule 1
    836 [...]
    837 
    838 Flame graphs visualize stack traces. For information about them and links
    839 to open source software, see http://www.brendangregg.com/flamegraphs.html .
    840 This folded output can be piped directly into flamegraph.pl (the Perl version).
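
Folded output is also easy to process yourself: each line is a
semicolon-separated stack followed by a space and a count. Note that in the
sample above the folded frames run from root to leaf, the reverse of the
default output. Here's a Python sketch that merges identical stacks and sums
their counts, which is handy when combining several runs. parse_folded() is a
hypothetical helper, and the sample lines are copied from the output above.

```python
from collections import Counter

def parse_folded(lines):
    """Sum counts per folded stack ('frame;frame;... COUNT' per line)."""
    totals = Counter()
    for line in lines:
        line = line.strip()
        if line.startswith("^C"):
            line = line[2:]  # terminal echo of Ctrl-C fused onto a line
        stack, _, count = line.rpartition(" ")
        if not stack or not count.isdigit():
            continue  # skip blanks and markers such as "[...]"
        totals[stack] += int(count)
    return totals

sample = [
    "kworker/7:0;-;ret_from_fork;kthread;worker_thread;schedule;__schedule 1",
    "watchdog/0;-;ret_from_fork;kthread;smpboot_thread_fn;schedule;__schedule 1",
    "kworker/7:0;-;ret_from_fork;kthread;worker_thread;schedule;__schedule 1",
    "[...]",
]

totals = parse_folded(sample)
for stack, count in totals.most_common():
    print(count, stack.split(";")[0])
```

The first field of each folded line is the process name, so splitting on ";"
and taking element 0 gives a quick per-process summary.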
    841 
    842 
    843 USAGE message:
    844 
    845 # ./stackcount -h
    846 usage: stackcount [-h] [-p PID] [-i INTERVAL] [-D DURATION] [-T] [-r] [-s]
    847                   [-P] [-K] [-U] [-v] [-d] [-f] [--debug]
    848                   pattern
    849 
    850 Count events and their stack traces
    851 
    852 positional arguments:
    853   pattern               search expression for events
    854 
    855 optional arguments:
    856   -h, --help            show this help message and exit
    857   -p PID, --pid PID     trace this PID only
    858   -i INTERVAL, --interval INTERVAL
    859                         summary interval, seconds
    860   -D DURATION, --duration DURATION
    861                         total duration of trace, seconds
    862   -T, --timestamp       include timestamp on output
    863   -r, --regexp          use regular expressions. Default is "*" wildcards
    864                         only.
    865   -s, --offset          show address offsets
    866   -P, --perpid          display stacks separately for each process
    867   -K, --kernel-stacks-only
    868                         kernel stack only
    869   -U, --user-stacks-only
    870                         user stack only
    871   -v, --verbose         show raw addresses
    872   -d, --delimited       insert delimiter between kernel/user stacks
    873   -f, --folded          output folded format
    874   --debug               print BPF program before starting (for debugging
    875                         purposes)
    876 
    877 examples:
    878     ./stackcount submit_bio         # count kernel stack traces for submit_bio
    879     ./stackcount -d ip_output       # include a user/kernel stack delimiter
    880     ./stackcount -s ip_output       # show symbol offsets
    881     ./stackcount -sv ip_output      # show offsets and raw addresses (verbose)
    882     ./stackcount 'tcp_send*'        # count stacks for funcs matching tcp_send*
    883     ./stackcount -r '^tcp_send.*'   # same as above, using regular expressions
    884     ./stackcount -Ti 5 ip_output    # output every 5 seconds, with timestamps
    885     ./stackcount -p 185 ip_output   # count ip_output stacks for PID 185 only
    886     ./stackcount -p 185 c:malloc    # count stacks for malloc in PID 185
    887     ./stackcount t:sched:sched_fork # count stacks for sched_fork tracepoint
    888     ./stackcount -p 185 u:node:*    # count stacks for all USDT probes in node
    889     ./stackcount -K t:sched:sched_switch   # kernel stacks only
    890     ./stackcount -U t:sched:sched_switch   # user stacks only
    891