1 Demonstrations of stackcount, the Linux eBPF/bcc version. 2 3 4 This program traces functions and frequency counts them with their entire 5 stack trace, summarized in-kernel for efficiency. For example, counting 6 stack traces that led to the submit_bio() kernel function, which creates 7 block device I/O: 8 9 # ./stackcount submit_bio 10 Tracing 1 functions for "submit_bio"... Hit Ctrl-C to end. 11 ^C 12 submit_bio 13 ext4_writepages 14 do_writepages 15 __filemap_fdatawrite_range 16 filemap_flush 17 ext4_alloc_da_blocks 18 ext4_release_file 19 __fput 20 ____fput 21 task_work_run 22 exit_to_usermode_loop 23 syscall_return_slowpath 24 entry_SYSCALL_64_fastpath 25 [unknown] 26 [unknown] 27 tar [15069] 28 5 29 30 submit_bio 31 ext4_bio_write_page 32 mpage_submit_page 33 mpage_map_and_submit_buffers 34 ext4_writepages 35 do_writepages 36 __filemap_fdatawrite_range 37 filemap_flush 38 ext4_alloc_da_blocks 39 ext4_release_file 40 __fput 41 ____fput 42 task_work_run 43 exit_to_usermode_loop 44 syscall_return_slowpath 45 entry_SYSCALL_64_fastpath 46 [unknown] 47 [unknown] 48 tar [15069] 49 15 50 51 submit_bio 52 ext4_readpages 53 __do_page_cache_readahead 54 ondemand_readahead 55 page_cache_async_readahead 56 generic_file_read_iter 57 __vfs_read 58 vfs_read 59 sys_read 60 entry_SYSCALL_64_fastpath 61 [unknown] 62 tar [15069] 63 113 64 65 Detaching... 66 67 The output shows unique stack traces, in order from leaf (on-CPU) to root, 68 followed by their occurrence count. The last stack trace in the above output 69 shows syscall handling, sys_read(), vfs_read(), and then "readahead" functions: 70 looks like an application issued file read has triggered read ahead. The 71 application can be seen after the stack trace, in this case, "tar [15069]" 72 for the "tar" command, PID 15069. 73 74 The order of printed stack traces is from least to most frequent. The most 75 frequent in this case, the ext4_rename() stack, was taken 113 times during 76 tracing. 77 78 The "[unknown]" frames are from user-level, since this simple workload is 79 the tar command, which apparently has been compiled without frame pointers. 80 It's a common compiler optimization, but it breaks frame pointer-based stack 81 walkers. Similar broken stacks will be seen by other profilers and debuggers 82 that use frame pointers. Hopefully your application preserves them so that 83 the user-level stack trace is visible. So how does one get frame pointers, if 84 your application doesn't have them to start with? For the current bcc (until 85 it supports other stack walkers), you need to be running a application binaries 86 that preserves frame pointers, eg, using gcc's -fno-omit-frame-pointer. That's 87 about all I'll say here: this is a big topic that is not bcc/BPF specific. 88 89 It can be useful to trace the path to submit_bio to explain unusual rates of 90 disk IOPS. These could have in-kernel origins (eg, background scrub). 91 92 93 Now adding the -d option to delimit kernel and user stacks: 94 95 # ./stackcount -d submit_bio 96 Tracing 1 functions for "submit_bio"... Hit Ctrl-C to end. 97 ^C 98 submit_bio 99 submit_bh 100 journal_submit_commit_record 101 jbd2_journal_commit_transaction 102 kjournald2 103 kthread 104 ret_from_fork 105 -- 106 jbd2/xvda1-8 [405] 107 1 108 109 submit_bio 110 submit_bh 111 jbd2_journal_commit_transaction 112 kjournald2 113 kthread 114 ret_from_fork 115 -- 116 jbd2/xvda1-8 [405] 117 2 118 119 submit_bio 120 ext4_writepages 121 do_writepages 122 __filemap_fdatawrite_range 123 filemap_flush 124 ext4_alloc_da_blocks 125 ext4_release_file 126 __fput 127 ____fput 128 task_work_run 129 exit_to_usermode_loop 130 syscall_return_slowpath 131 entry_SYSCALL_64_fastpath 132 -- 133 [unknown] 134 [unknown] 135 tar [15187] 136 5 137 138 submit_bio 139 ext4_bio_write_page 140 mpage_submit_page 141 mpage_map_and_submit_buffers 142 ext4_writepages 143 do_writepages 144 __filemap_fdatawrite_range 145 filemap_flush 146 ext4_alloc_da_blocks 147 ext4_release_file 148 __fput 149 ____fput 150 task_work_run 151 exit_to_usermode_loop 152 syscall_return_slowpath 153 entry_SYSCALL_64_fastpath 154 -- 155 [unknown] 156 [unknown] 157 tar [15187] 158 15 159 160 submit_bio 161 ext4_readpages 162 __do_page_cache_readahead 163 ondemand_readahead 164 page_cache_async_readahead 165 generic_file_read_iter 166 __vfs_read 167 vfs_read 168 sys_read 169 entry_SYSCALL_64_fastpath 170 -- 171 [unknown] 172 [unknown] 173 [unknown] 174 tar [15187] 175 171 176 177 Detaching... 178 179 A "--" is printed between the kernel and user stacks. 180 181 182 As a different example, here is the kernel function hrtimer_init_sleeper(): 183 184 # ./stackcount.py -d hrtimer_init_sleeper 185 Tracing 1 functions for "hrtimer_init_sleeper"... Hit Ctrl-C to end. 186 ^C 187 hrtimer_init_sleeper 188 do_futex 189 SyS_futex 190 entry_SYSCALL_64_fastpath 191 -- 192 [unknown] 193 containerd [16020] 194 1 195 196 hrtimer_init_sleeper 197 do_futex 198 SyS_futex 199 entry_SYSCALL_64_fastpath 200 -- 201 __pthread_cond_timedwait 202 Monitor::IWait(Thread*, long) 203 Monitor::wait(bool, long, bool) 204 CompileQueue::get() 205 CompileBroker::compiler_thread_loop() 206 JavaThread::thread_main_inner() 207 JavaThread::run() 208 java_start(Thread*) 209 start_thread 210 java [4996] 211 1 212 213 hrtimer_init_sleeper 214 do_futex 215 SyS_futex 216 entry_SYSCALL_64_fastpath 217 -- 218 [unknown] 219 [unknown] 220 containerd [16020] 221 1 222 223 hrtimer_init_sleeper 224 do_futex 225 SyS_futex 226 entry_SYSCALL_64_fastpath 227 -- 228 __pthread_cond_timedwait 229 VMThread::loop() 230 VMThread::run() 231 java_start(Thread*) 232 start_thread 233 java [4996] 234 3 235 236 hrtimer_init_sleeper 237 do_futex 238 SyS_futex 239 entry_SYSCALL_64_fastpath 240 -- 241 [unknown] 242 dockerd [16008] 243 4 244 245 hrtimer_init_sleeper 246 do_futex 247 SyS_futex 248 entry_SYSCALL_64_fastpath 249 -- 250 [unknown] 251 [unknown] 252 dockerd [16008] 253 4 254 255 hrtimer_init_sleeper 256 do_futex 257 SyS_futex 258 entry_SYSCALL_64_fastpath 259 -- 260 __pthread_cond_timedwait 261 Lio/netty/util/ThreadDeathWatcher$Watcher;::run 262 Interpreter 263 Interpreter 264 call_stub 265 JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*) 266 JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*) 267 JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, Symbol*, Symbol*, Thread*) 268 thread_entry(JavaThread*, Thread*) 269 JavaThread::thread_main_inner() 270 JavaThread::run() 271 java_start(Thread*) 272 start_thread 273 java [4996] 274 4 275 276 hrtimer_init_sleeper 277 do_futex 278 SyS_futex 279 entry_SYSCALL_64_fastpath 280 -- 281 __pthread_cond_timedwait 282 clock_gettime 283 [unknown] 284 java [4996] 285 79 286 287 Detaching... 288 289 I was just trying to find a more interesting example. This output includes 290 some Java stacks where user-level has been walked correctly (even includes a 291 JIT symbol translation). dockerd and containerd don't have frame pointers 292 (grumble), but Java does (which is running with -XX:+PreserveFramePointer). 293 294 295 Here's another kernel function, ip_output(): 296 297 # ./stackcount.py -d ip_output 298 Tracing 1 functions for "ip_output"... Hit Ctrl-C to end. 299 ^C 300 ip_output 301 ip_queue_xmit 302 tcp_transmit_skb 303 tcp_write_xmit 304 __tcp_push_pending_frames 305 tcp_push 306 tcp_sendmsg 307 inet_sendmsg 308 sock_sendmsg 309 sock_write_iter 310 __vfs_write 311 vfs_write 312 SyS_write 313 entry_SYSCALL_64_fastpath 314 -- 315 __write_nocancel 316 [unknown] 317 sshd [15015] 318 5 319 320 ip_output 321 ip_queue_xmit 322 tcp_transmit_skb 323 tcp_write_xmit 324 __tcp_push_pending_frames 325 tcp_push 326 tcp_sendmsg 327 inet_sendmsg 328 sock_sendmsg 329 sock_write_iter 330 __vfs_write 331 vfs_write 332 SyS_write 333 entry_SYSCALL_64_fastpath 334 -- 335 __write_nocancel 336 [unknown] 337 [unknown] 338 sshd [8234] 339 5 340 341 ip_output 342 ip_queue_xmit 343 tcp_transmit_skb 344 tcp_write_xmit 345 __tcp_push_pending_frames 346 tcp_push 347 tcp_sendmsg 348 inet_sendmsg 349 sock_sendmsg 350 sock_write_iter 351 __vfs_write 352 vfs_write 353 SyS_write 354 entry_SYSCALL_64_fastpath 355 -- 356 __write_nocancel 357 sshd [15015] 358 7 359 360 Detaching... 361 362 This time just sshd is triggering ip_output() calls. 363 364 365 Watch what happens if I filter on kernel stacks only (-K) for ip_output(): 366 367 # ./stackcount.py -K ip_output 368 Tracing 1 functions for "ip_output"... Hit Ctrl-C to end. 369 ^C 370 ip_output 371 ip_queue_xmit 372 tcp_transmit_skb 373 tcp_write_xmit 374 __tcp_push_pending_frames 375 tcp_push 376 tcp_sendmsg 377 inet_sendmsg 378 sock_sendmsg 379 sock_write_iter 380 __vfs_write 381 vfs_write 382 SyS_write 383 entry_SYSCALL_64_fastpath 384 13 385 386 Detaching... 387 388 They have grouped together as a single unique stack, since the kernel part 389 was the same. 390 391 392 Here is just the user stacks, fetched during the kernel function ip_output(): 393 394 # ./stackcount.py -U ip_output 395 Tracing 1 functions for "ip_output"... Hit Ctrl-C to end. 396 ^C 397 [unknown] 398 snmpd [1645] 399 1 400 401 __write_nocancel 402 [unknown] 403 [unknown] 404 sshd [8234] 405 3 406 407 __write_nocancel 408 sshd [15015] 409 4 410 411 I should really run a custom sshd with frame pointers so we can see its 412 stack trace... 413 414 415 User-space functions can also be traced if a library name is provided. For 416 example, to quickly identify code locations that allocate heap memory for 417 PID 4902 (using -p), by tracing malloc from libc ("c:malloc"): 418 419 # ./stackcount -p 4902 c:malloc 420 Tracing 1 functions for "malloc"... Hit Ctrl-C to end. 421 ^C 422 malloc 423 rbtree_new 424 main 425 [unknown] 426 12 427 428 malloc 429 _rbtree_node_new_internal 430 _rbtree_node_insert 431 rbtree_insert 432 main 433 [unknown] 434 1189 435 436 Detaching... 437 438 Kernel stacks are absent as this didn't enter kernel code. 439 440 Note that user-space uses of stackcount can be somewhat more limited because 441 a lot of user-space libraries and binaries are compiled without frame-pointers 442 as discussed earlier (the -fomit-frame-pointer compiler default) or are used 443 without debuginfo. 444 445 446 In addition to kernel and user-space functions, kernel tracepoints and USDT 447 tracepoints are also supported. 448 449 For example, to determine where threads are being created in a particular 450 process, use the pthread_create USDT tracepoint: 451 452 # ./stackcount -p $(pidof parprimes) u:pthread:pthread_create 453 Tracing 1 functions for "u:pthread:pthread_create"... Hit Ctrl-C to end. 454 ^C 455 456 parprimes [11923] 457 pthread_create@@GLIBC_2.2.5 458 main 459 __libc_start_main 460 [unknown] 461 7 462 463 You can use "readelf -n file" to see if it has USDT tracepoints. 464 465 466 Similarly, to determine where context switching is happening in the kernel, 467 use the sched:sched_switch kernel tracepoint: 468 469 # ./stackcount t:sched:sched_switch 470 __schedule 471 schedule 472 worker_thread 473 kthread 474 ret_from_fork 475 kworker/0:2 [25482] 476 1 477 478 __schedule 479 schedule 480 schedule_hrtimeout_range_clock 481 schedule_hrtimeout_range 482 ep_poll 483 SyS_epoll_wait 484 entry_SYSCALL_64_fastpath 485 epoll_wait 486 Lsun/nio/ch/SelectorImpl;::lockAndDoSelect 487 Lsun/nio/ch/SelectorImpl;::select 488 Lio/netty/channel/nio/NioEventLoop;::select 489 Lio/netty/channel/nio/NioEventLoop;::run 490 Interpreter 491 Interpreter 492 call_stub 493 JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*) 494 JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*) 495 JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, Symbol*, Symbol*, Thread*) 496 thread_entry(JavaThread*, Thread*) 497 JavaThread::thread_main_inner() 498 JavaThread::run() 499 java_start(Thread*) 500 start_thread 501 java [4996] 502 1 503 504 ... (omitted for brevity) 505 506 __schedule 507 schedule 508 schedule_preempt_disabled 509 cpu_startup_entry 510 xen_play_dead 511 arch_cpu_idle_dead 512 cpu_startup_entry 513 cpu_bringup_and_idle 514 swapper/1 [0] 515 289 516 517 518 A -i option can be used to set an output interval, and -T to include a 519 timestamp. For example: 520 521 # ./stackcount.py -Tdi 1 submit_bio 522 Tracing 1 functions for "submit_bio"... Hit Ctrl-C to end. 523 524 06:05:13 525 526 06:05:14 527 submit_bio 528 xfs_do_writepage 529 write_cache_pages 530 xfs_vm_writepages 531 do_writepages 532 __writeback_single_inode 533 writeback_sb_inodes 534 __writeback_inodes_wb 535 wb_writeback 536 wb_workfn 537 process_one_work 538 worker_thread 539 kthread 540 ret_from_fork 541 -- 542 kworker/u16:1 [15686] 543 1 544 545 submit_bio 546 process_one_work 547 worker_thread 548 kthread 549 ret_from_fork 550 -- 551 kworker/u16:0 [16007] 552 1 553 554 submit_bio 555 xfs_buf_submit 556 xlog_bdstrat 557 xlog_sync 558 xlog_state_release_iclog 559 _xfs_log_force 560 xfs_log_force 561 xfs_fs_sync_fs 562 sync_fs_one_sb 563 iterate_supers 564 sys_sync 565 entry_SYSCALL_64_fastpath 566 -- 567 [unknown] 568 sync [16039] 569 1 570 571 submit_bio 572 submit_bh 573 journal_submit_commit_record 574 jbd2_journal_commit_transaction 575 kjournald2 576 kthread 577 ret_from_fork 578 -- 579 jbd2/xvda1-8 [405] 580 1 581 582 submit_bio 583 process_one_work 584 worker_thread 585 kthread 586 ret_from_fork 587 -- 588 kworker/0:2 [25482] 589 2 590 591 submit_bio 592 ext4_writepages 593 do_writepages 594 __writeback_single_inode 595 writeback_sb_inodes 596 __writeback_inodes_wb 597 wb_writeback 598 wb_workfn 599 process_one_work 600 worker_thread 601 kthread 602 ret_from_fork 603 -- 604 kworker/u16:0 [16007] 605 4 606 607 submit_bio 608 xfs_vm_writepages 609 do_writepages 610 __writeback_single_inode 611 writeback_sb_inodes 612 __writeback_inodes_wb 613 wb_writeback 614 wb_workfn 615 process_one_work 616 worker_thread 617 kthread 618 ret_from_fork 619 -- 620 kworker/u16:1 [15686] 621 5 622 623 submit_bio 624 __block_write_full_page 625 block_write_full_page 626 blkdev_writepage 627 __writepage 628 write_cache_pages 629 generic_writepages 630 blkdev_writepages 631 do_writepages 632 __filemap_fdatawrite_range 633 filemap_fdatawrite 634 fdatawrite_one_bdev 635 iterate_bdevs 636 sys_sync 637 entry_SYSCALL_64_fastpath 638 -- 639 [unknown] 640 sync [16039] 641 7 642 643 submit_bio 644 submit_bh 645 jbd2_journal_commit_transaction 646 kjournald2 647 kthread 648 ret_from_fork 649 -- 650 jbd2/xvda1-8 [405] 651 8 652 653 submit_bio 654 ext4_bio_write_page 655 mpage_submit_page 656 mpage_map_and_submit_buffers 657 ext4_writepages 658 do_writepages 659 __writeback_single_inode 660 writeback_sb_inodes 661 __writeback_inodes_wb 662 wb_writeback 663 wb_workfn 664 process_one_work 665 worker_thread 666 kthread 667 ret_from_fork 668 -- 669 kworker/u16:0 [16007] 670 8 671 672 submit_bio 673 __block_write_full_page 674 block_write_full_page 675 blkdev_writepage 676 __writepage 677 write_cache_pages 678 generic_writepages 679 blkdev_writepages 680 do_writepages 681 __writeback_single_inode 682 writeback_sb_inodes 683 __writeback_inodes_wb 684 wb_writeback 685 wb_workfn 686 process_one_work 687 worker_thread 688 kthread 689 ret_from_fork 690 -- 691 kworker/u16:0 [16007] 692 60 693 694 695 06:05:15 696 697 06:05:16 698 699 Detaching... 700 701 This only included output for the 06:05:14 interval. The other internals 702 did not span block device I/O. 703 704 705 The -s output prints the return instruction offset for each function (aka 706 symbol offset). Eg: 707 708 # ./stackcount.py -s tcp_sendmsg 709 Tracing 1 functions for "tcp_sendmsg"... Hit Ctrl-C to end. 710 ^C 711 tcp_sendmsg+0x1 712 sock_sendmsg+0x38 713 sock_write_iter+0x85 714 __vfs_write+0xe3 715 vfs_write+0xb8 716 SyS_write+0x55 717 entry_SYSCALL_64_fastpath+0x1e 718 __write_nocancel+0x7 719 sshd [15015] 720 3 721 722 tcp_sendmsg+0x1 723 sock_sendmsg+0x38 724 sock_write_iter+0x85 725 __vfs_write+0xe3 726 vfs_write+0xb8 727 SyS_write+0x55 728 entry_SYSCALL_64_fastpath+0x1e 729 __write_nocancel+0x7 730 sshd [8234] 731 3 732 733 Detaching... 734 735 If it wasn't clear how one function called another, knowing the instruction 736 offset can help you locate the lines of code from a disassembly dump. 737 738 739 The -v output is verbose, and shows raw addresses: 740 741 ./stackcount.py -v tcp_sendmsg 742 Tracing 1 functions for "tcp_sendmsg"... Hit Ctrl-C to end. 743 ^C 744 ffffffff817b05c1 tcp_sendmsg 745 ffffffff8173ea48 sock_sendmsg 746 ffffffff8173eae5 sock_write_iter 747 ffffffff81232b33 __vfs_write 748 ffffffff812331b8 vfs_write 749 ffffffff81234625 SyS_write 750 ffffffff818739bb entry_SYSCALL_64_fastpath 751 7f120511e6e0 __write_nocancel 752 sshd [8234] 753 3 754 755 ffffffff817b05c1 tcp_sendmsg 756 ffffffff8173ea48 sock_sendmsg 757 ffffffff8173eae5 sock_write_iter 758 ffffffff81232b33 __vfs_write 759 ffffffff812331b8 vfs_write 760 ffffffff81234625 SyS_write 761 ffffffff818739bb entry_SYSCALL_64_fastpath 762 7f919c5a26e0 __write_nocancel 763 sshd [15015] 764 11 765 766 Detaching... 767 768 769 A wildcard can also be used. Eg, all functions beginning with "tcp_send", 770 kernel stacks only (-K) with offsets (-s): 771 772 # ./stackcount -Ks 'tcp_send*' 773 Tracing 14 functions for "tcp_send*"... Hit Ctrl-C to end. 774 ^C 775 tcp_send_delayed_ack0x1 776 tcp_rcv_established0x3b1 777 tcp_v4_do_rcv0x130 778 tcp_v4_rcv0x8e0 779 ip_local_deliver_finish0x9f 780 ip_local_deliver0x51 781 ip_rcv_finish0x8a 782 ip_rcv0x29d 783 __netif_receive_skb_core0x637 784 __netif_receive_skb0x18 785 netif_receive_skb_internal0x23 786 1 787 788 tcp_send_delayed_ack0x1 789 tcp_rcv_established0x222 790 tcp_v4_do_rcv0x130 791 tcp_v4_rcv0x8e0 792 ip_local_deliver_finish0x9f 793 ip_local_deliver0x51 794 ip_rcv_finish0x8a 795 ip_rcv0x29d 796 __netif_receive_skb_core0x637 797 __netif_receive_skb0x18 798 netif_receive_skb_internal0x23 799 4 800 801 tcp_send_mss0x1 802 inet_sendmsg0x67 803 sock_sendmsg0x38 804 sock_write_iter0x78 805 __vfs_write0xaa 806 vfs_write0xa9 807 sys_write0x46 808 entry_SYSCALL_64_fastpath0x16 809 7 810 811 tcp_sendmsg0x1 812 sock_sendmsg0x38 813 sock_write_iter0x78 814 __vfs_write0xaa 815 vfs_write0xa9 816 sys_write0x46 817 entry_SYSCALL_64_fastpath0x16 818 7 819 820 Detaching... 821 822 Use -r to allow regular expressions. 823 824 825 The -f option will emit folded output, which can be used as input to other 826 tools including flame graphs. For example, with delimiters as well: 827 828 # ./stackcount.py -df t:sched:sched_switch 829 ^Csnmp-pass;[unknown];[unknown];[unknown];[unknown];[unknown];-;entry_SYSCALL_64_fastpath;SyS_select;core_sys_select;do_select;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule 1 830 kworker/7:0;-;ret_from_fork;kthread;worker_thread;schedule;__schedule 1 831 watchdog/0;-;ret_from_fork;kthread;smpboot_thread_fn;schedule;__schedule 1 832 snmp-pass;[unknown];[unknown];[unknown];[unknown];[unknown];-;entry_SYSCALL_64_fastpath;SyS_select;core_sys_select;do_select;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule 1 833 svscan;[unknown];-;entry_SYSCALL_64_fastpath;SyS_nanosleep;hrtimer_nanosleep;do_nanosleep;schedule;__schedule 1 834 python;[unknown];__select_nocancel;-;entry_SYSCALL_64_fastpath;SyS_select;core_sys_select;do_select;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule 1 835 kworker/2:0;-;ret_from_fork;kthread;worker_thread;schedule;__schedule 1 836 [...] 837 838 Flame graphs visualize stack traces. For information about them and links 839 to open source software, see http://www.brendangregg.com/flamegraphs.html . 840 This folded output can be piped directly into flamegraph.pl (the Perl version). 841 842 843 USAGE message: 844 845 # ./stackcount -h 846 usage: stackcount [-h] [-p PID] [-i INTERVAL] [-D DURATION] [-T] [-r] [-s] 847 [-P] [-K] [-U] [-v] [-d] [-f] [--debug] 848 pattern 849 850 Count events and their stack traces 851 852 positional arguments: 853 pattern search expression for events 854 855 optional arguments: 856 -h, --help show this help message and exit 857 -p PID, --pid PID trace this PID only 858 -i INTERVAL, --interval INTERVAL 859 summary interval, seconds 860 -D DURATION, --duration DURATION 861 total duration of trace, seconds 862 -T, --timestamp include timestamp on output 863 -r, --regexp use regular expressions. Default is "*" wildcards 864 only. 865 -s, --offset show address offsets 866 -P, --perpid display stacks separately for each process 867 -K, --kernel-stacks-only 868 kernel stack only 869 -U, --user-stacks-only 870 user stack only 871 -v, --verbose show raw addresses 872 -d, --delimited insert delimiter between kernel/user stacks 873 -f, --folded output folded format 874 --debug print BPF program before starting (for debugging 875 purposes) 876 877 examples: 878 ./stackcount submit_bio # count kernel stack traces for submit_bio 879 ./stackcount -d ip_output # include a user/kernel stack delimiter 880 ./stackcount -s ip_output # show symbol offsets 881 ./stackcount -sv ip_output # show offsets and raw addresses (verbose) 882 ./stackcount 'tcp_send*' # count stacks for funcs matching tcp_send* 883 ./stackcount -r '^tcp_send.*' # same as above, using regular expressions 884 ./stackcount -Ti 5 ip_output # output every 5 seconds, with timestamps 885 ./stackcount -p 185 ip_output # count ip_output stacks for PID 185 only 886 ./stackcount -p 185 c:malloc # count stacks for malloc in PID 185 887 ./stackcount t:sched:sched_fork # count stacks for sched_fork tracepoint 888 ./stackcount -p 185 u:node:* # count stacks for all USDT probes in node 889 ./stackcount -K t:sched:sched_switch # kernel stacks only 890 ./stackcount -U t:sched:sched_switch # user stacks only 891