<!doctype linuxdoc system>

<article>

<title>SS Utility: Quick Intro
<author>Alexey Kuznetsov, <tt/kuznet@ms2.inr.ac.ru/
<date>some_negative_number, 20 Sep 2001
<abstract>
<tt/ss/ is yet another utility to investigate sockets.
Functionally it is NOT better than <tt/netstat/ combined
with some perl/awk scripts, and though it is surely faster,
that is not enough to make it much better. :-)
So, stop reading this now and do not waste your time.
Well, certainly, it offers some functionality which current
netstat is still not able to provide, but surely will soon.
</abstract>

<sect>Why?

<p> The <tt>/proc</tt> interface is inadequate, unfortunately.
When the number of sockets is large enough, <tt/netstat/ or even
plain <tt>cat /proc/net/tcp/</tt> causes nothing but pain and curses.
In linux-2.4 the disease became worse: even if the number
of sockets is small, reading <tt>/proc/net/tcp/</tt> is rather slow.

This utility presents a new approach, which is supposed to scale
well. I am not going to describe technical details here and
will concentrate on describing the command.
The only important thing to say is that it is not a bad idea
to load the module <tt/tcp_diag/, which can be found in directory
<tt/Modules/ of <tt/iproute2/. If you do not do this, <tt/ss/
will still work, but it falls back to <tt>/proc</tt> and becomes slow
like <tt/netstat/, though still a bit faster (see section "Some numbers").

<sect>Old news

<p>
In its simplest form <tt/ss/ is equivalent to netstat
with some small deviations.

<itemize>
<item><tt/ss -t -a/ dumps all TCP sockets
<item><tt/ss -u -a/ dumps all UDP sockets
<item><tt/ss -w -a/ dumps all RAW sockets
<item><tt/ss -x -a/ dumps all UNIX sockets
</itemize>

<p>
Option <tt/-o/ shows TCP timer state.
Option <tt/-e/ shows some extended information.
Etc. etc. etc. It seems that all the netstat options related to sockets
are supported, though not AX.25 and other bizarre ones. :-)
If someone wants to, they can add support for decnet and ipx.
Some rudimentary support for them is already present in iproute2 libutils,
and I will be glad to see these new members.

<p>
However, the standard functionality is a bit different:

<p>
The first: without option <tt/-a/, sockets in states
<tt/TIME-WAIT/ and <tt/SYN-RECV/ are skipped too.
It is a more reasonable default, I think.

<p>
The second: the format of UNIX sockets is different. It coincides
with tcp/udp. Though the standard kernel still does not allow one to
see write/read queues and the peer address of connected UNIX sockets,
a patch doing this exists.

<p>
The third: the default is to dump only TCP sockets, rather than all of the types.

<p>
The next: by default it does not resolve numeric host addresses (like <tt/ip/)!
Resolving is enabled with option <tt/-r/. Service names, usually stored
in local files, are resolved by default. Also, if the service database
does not contain references to a port, <tt/ss/ queries the system
<tt/rpcbind/. RPC services are prefixed with <tt/rpc./
Resolution of services may be suppressed with option <tt/-n/.

<p>
It does not accept "long" options (I dislike them, sorry).
So, the address family is given with a family identifier following
option <tt/-f/, to be aligned with iproute2 conventions.
Mostly, this is to allow the option parser to parse
addresses correctly, but as a side effect it really limits dumping
to sockets supporting only the given family. Option <tt/-A/ followed
by a list of socket tables to dump is also supported.
Logically, the id of a socket table is different from an _address_ family, which is
another point of incompatibility. So, the id is one of
<tt/all/, <tt/tcp/, <tt/udp/,
<tt/raw/, <tt/inet/, <tt/unix/, <tt/packet/, <tt/netlink/. See?
Well, <tt/inet/ is just an abbreviation for <tt/tcp|udp|raw/
and it is not difficult to guess that <tt/packet/ allows you
to look at packet sockets. Actually, there are also some other abbreviations,
e.g. <tt/unix_dgram/ selects only datagram UNIX sockets.
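
<p>
As a rough illustration of how <tt/-A/ takes its argument, the sketch below
only builds and prints an <tt/ss/ command line from a list of table ids;
it does not run <tt/ss/. The helper name <tt/ss_tables_cmd/ is made up
for this example, and the comma separator is the one given in the option
reference of this document.

```shell
# ss_tables_cmd TABLE... : print an ss command line dumping the given tables.
# Hypothetical helper for illustration; it only prints, it does not run ss.
ss_tables_cmd() {
    # In a subshell, set IFS so that "$*" joins the arguments with commas,
    # which is the separator -A expects.
    ( IFS=,; printf 'ss -a -A %s\n' "$*" )
}

ss_tables_cmd tcp udp
ss_tables_cmd unix_dgram
```

The first call prints <tt/ss -a -A tcp,udp/ and the second prints
<tt/ss -a -A unix_dgram/.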

<p>
The next: well, I still do not know. :-)

<sect>Time to talk about new functionality.

<p>It is builtin filtering of socket lists.

<sect1> Filtering by state.

<p>
<tt/ss/ allows filtering by socket state, using the keywords
<tt/state/ and <tt/exclude/, each followed by a state
identifier.

<p>
State identifiers are the standard TCP state names (not listed here;
they are useless to you if you do not already know them)
or abbreviations:

<itemize>
<item><tt/all/        - for all the states
<item><tt/bucket/     - for TCP minisockets (<tt/TIME-WAIT|SYN-RECV/)
<item><tt/big/        - all except for minisockets
<item><tt/connected/  - not closed and not listening
<item><tt/synchronized/ - connected and not <tt/SYN-SENT/
</itemize>

<p>
   E.g. to dump all tcp sockets except <tt/SYN-RECV/:

<tscreen><verb>
   ss exclude SYN-RECV
</verb></tscreen>

<p>
   If neither <tt/state/ nor <tt/exclude/ directives
   are present, the state filter defaults to <tt/all/ when option <tt/-a/
   is given; otherwise it defaults to <tt/all/
   excluding listening, syn-recv, time-wait and closed sockets.
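
<p>
The abbreviations are easier to keep apart when written as sets.
The sketch below models each TCP state as one bit and derives the
abbreviations and the default filter from the descriptions above.
It is only a model for the reader: the bit positions and the helper
<tt/in_set/ are assumptions of this example and say nothing about how
<tt/ss/ represents states internally.

```shell
# One bit per TCP state. The bit positions are arbitrary, chosen for this sketch only.
ESTABLISHED=1; SYN_SENT=2; SYN_RECV=4; FIN_WAIT_1=8; FIN_WAIT_2=16; TIME_WAIT=32
CLOSED=64; CLOSE_WAIT=128; LAST_ACK=256; LISTEN=512; CLOSING=1024

# Abbreviations from the list above, written as sets of states.
ALL=2047                                  # all eleven bits set
BUCKET=$((TIME_WAIT | SYN_RECV))          # minisockets
BIG=$((ALL & ~BUCKET))                    # everything except minisockets
CONNECTED=$((ALL & ~(LISTEN | CLOSED)))   # not closed and not listening
SYNCHRONIZED=$((CONNECTED & ~SYN_SENT))   # connected and not SYN-SENT

# Default filter without -a: all, minus listening, syn-recv, time-wait and closed.
DEFAULT=$((ALL & ~(LISTEN | SYN_RECV | TIME_WAIT | CLOSED)))

# in_set SET STATE : succeed if STATE is a member of SET (hypothetical helper)
in_set() {
    [ $(( $1 & $2 )) -ne 0 ]
}

if in_set "$DEFAULT" "$ESTABLISHED"; then echo "established: shown by default"; fi
if ! in_set "$DEFAULT" "$TIME_WAIT"; then echo "time-wait: needs -a"; fi
```

In this model, <tt/state connected exclude syn-sent/ selects the same set
as <tt/state synchronized/.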

<sect1> Filtering by addresses and ports.

<p>
The option list may contain an address/port filter.
It is a boolean expression which consists of the boolean operations
<tt/or/, <tt/and/, <tt/not/ and predicates.
Actually, all the flavors of names for boolean operations are eaten:
<tt/&amp/, <tt/&amp&amp/, <tt/|/, <tt/||/, <tt/!/, but do not forget
about the special meaning given to these symbols by unix shells, and escape
them correctly when they are used on the command line.
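
<p>
To see what escaping does, it can help to print the words a program
would actually receive. In the sketch below <tt/show_args/ is a made-up
stand-in for <tt/ss/ that prints each argument on its own line;
nothing here runs <tt/ss/ itself.

```shell
# show_args: print each argument on its own line, as the called program
# would receive it. Hypothetical stand-in for ss, for illustration only.
show_args() {
    printf '%s\n' "$@"
}

# Quoted or backslash-escaped operators reach the program unchanged.
show_args sport = :http '||' sport = :https
show_args \( not dst 10.0.0.1 \)
show_args '!' dst 10.0.0.1
```

Without the quotes or backslashes the shell would treat the double bar as
its own "or" between two commands and the parentheses as a subshell.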

<p>
Predicates may be of the following kinds:

<itemize>
<item>A. Address/port match, where the address is checked against a mask
      and the port is either a wildcard or exact. It is one of:

<tscreen><verb>
	dst prefix:port
	src prefix:port
	src unix:STRING
	src link:protocol:ifindex
	src nl:channel:pid
</verb></tscreen>

      Both prefix and port may be absent or replaced with <tt/*/,
      which means wildcard. UNIX sockets use a more powerful scheme,
      matching socket names by shell wildcards. Also, the prefixes
      unix: and link: may be omitted if the address family is evident
      from context (with option <tt/-x/ or with <tt/-f unix/
      or with the <tt/unix/ keyword).

<p>
      E.g.

<tscreen><verb>
	dst 10.0.0.1
	dst 10.0.0.1:
	dst 10.0.0.1/32:
	dst 10.0.0.1:*
</verb></tscreen>
   are equivalent and mean sockets connected to
   any port on host 10.0.0.1.

<tscreen><verb>
	dst 10.0.0.0/24:22
</verb></tscreen>
   means sockets connected to port 22 on network
   10.0.0.0...255.

<p>
      Note that the port is separated from the address with a colon, which creates
      trouble with IPv6 addresses. Generally, we interpret the last
      colon as splitting off the port. To allow IPv6 addresses to be given,
      a trick like the one used in IPv6 HTTP URLs may be used:

<tscreen><verb>
      dst [::1]
</verb></tscreen>
       means sockets connected to ::1 on any port.

<p>
      Another way is <tt>dst ::1/128</tt>. The slash helps to understand that
      the colon is part of the IPv6 address.

<p>
      Now we can add another alias for <tt/dst 10.0.0.1/:
      <tt/dst [10.0.0.1]/. :-)

<p>   The address may be a DNS name. In this case all the addresses are looked
      up (in all the address families, unless limited by option <tt/-f/
      or a special address prefix <tt/inet:/, <tt/inet6:/) and the resulting
      expression is an <tt/or/ over all of them.

<item>   B. Port expressions:
<tscreen><verb>
      dport &gt= :1024
      dport != :22
      sport &lt :32000
</verb></tscreen>
      etc.

      All the relations are accepted: <tt/&lt/, <tt/&gt/, <tt/&lt=/, <tt/&gt=/, <tt/=/, <tt/==/,
      <tt/!=/, <tt/eq/, <tt/ge/, <tt/lt/, <tt/ne/...
      Use the variant you like more, but do not forget to escape special
      characters when typing them on the command line. :-)

      Note that the port number syntactically coincides with case A!
      You may even add an IP address, but it will not participate
      in the comparison, except for <tt/==/ and <tt/!=/, which are equivalent
      to the corresponding predicates of type A. E.g.
<p>
<tt/dst 10.0.0.1:22/
    is equivalent to  <tt/dport eq 10.0.0.1:22/
      and
      <tt/not dst 10.0.0.1:22/     is equivalent to
 <tt/dport neq 10.0.0.1:22/

<item>C. Keyword <tt/autobound/. It matches sockets bound automatically
      on the local system.

</itemize>
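
<p>
The colon rules for type A predicates can be summarized in a small
shell function. This is a rough model of the rules as described above,
not the parser that <tt/ss/ actually uses; the name <tt/split_addr/ and
the output format are invented for the example. It prints the address
part and the port part, with <tt/*/ standing for a wildcard.

```shell
# split_addr PATTERN : print "ADDRESS PORT" for an inet-style pattern.
# Rough model of the rules in the text, for illustration only.
split_addr() {
    pat=$1 addr= port=
    case $pat in
        \[*\]*)                     # [address] or [address]:port
            addr=${pat#\[}; addr=${addr%%\]*}
            port=${pat##*\]}; port=${port#:} ;;
        */*)                        # a mask is present: only a colon after it splits
            mask=${pat#*/}
            case $mask in
                *:*) port=${mask##*:}; addr=${pat%:*} ;;
                *)   addr=$pat ;;
            esac ;;
        *:*)                        # otherwise the last colon splits off the port
            addr=${pat%:*}; port=${pat##*:} ;;
        *)
            addr=$pat ;;
    esac
    # An absent part means wildcard.
    [ -n "$addr" ] || addr='*'
    [ -n "$port" ] || port='*'
    printf '%s %s\n' "$addr" "$port"
}

split_addr 10.0.0.1
split_addr '10.0.0.1:*'
split_addr 10.0.0.0/24:22
split_addr '[::1]'
split_addr ::1/128
```

The first two calls give the same result, a wildcard port on 10.0.0.1,
and the last two both leave the colons of ::1 inside the address part.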

<sect> Examples

<p>
<itemize>
<item>1. List all the tcp sockets in state <tt/FIN-WAIT-1/ for our apache
   to network 193.233.7/24 and look at their timers:

<tscreen><verb>
   ss -o state fin-wait-1 \( sport = :http or sport = :https \) \
                          dst 193.233.7/24
</verb></tscreen>

   Oops, forgot to say that a missing logical operation is
   equivalent to <tt/and/.

<item> 2. Well, now look at the rest...

<tscreen><verb>
   ss -o excl fin-wait-1
   ss state fin-wait-1 \( sport neq :http and sport neq :https \) \
                       or not dst 193.233.7/24
</verb></tscreen>

   Note that we have to do _two_ calls of ss to do this.
   The state match is always anded with the address/port match.
   The reason for this is purely technical: ss quickly skips
   non-matching states before parsing addresses, and I consider the
   ability to quickly skip gobs of time-wait and syn-recv sockets
   more important than logical generality.

<item> 3. So, let's look at all our sockets using autobound ports:

<tscreen><verb>
   ss -a -A all autobound
</verb></tscreen>

<item> 4. And eventually find all the local processes connected
   to local X servers:

<tscreen><verb>
   ss -xp dst "/tmp/.X11-unix/*"
</verb></tscreen>

   Pardon, this does not work with the current kernel; patching is required.
   But we can still look at the server side:

<tscreen><verb>
   ss -x src "/tmp/.X11-unix/*"
</verb></tscreen>

</itemize>

<sect> Returning to ground: real manual

<p>
<sect1> Command arguments

<p> The general format of arguments to <tt/ss/ is:

<tscreen><verb>
       ss [ OPTIONS ] [ STATE-FILTER ] [ ADDRESS-FILTER ]
</verb></tscreen>

<sect2><tt/OPTIONS/
<p> <tt/OPTIONS/ is a list of single-letter options, using common unix
conventions.

<itemize>
<item><tt/-h/  - show help page
<item><tt/-?/  - the same, of course
<item><tt/-v/, <tt/-V/  - print version of <tt/ss/ and exit
<item><tt/-s/  - print summary statistics. This option does not parse
socket lists; it obtains the summary from various other sources. It is useful
when the number of sockets is so huge that parsing <tt>/proc/net/tcp</tt>
is painful.
<item><tt/-D FILE/  - do not display anything, just dump raw information
about TCP sockets to <tt/FILE/ after applying filters. If <tt/FILE/ is <tt/-/,
<tt/stdout/ is used.
<item><tt/-F FILE/  - read continuation of filter from <tt/FILE/.
Each line of <tt/FILE/ is interpreted like a single command line option.
If <tt/FILE/ is <tt/-/, <tt/stdin/ is used.
<item><tt/-r/  - try to resolve numeric addresses/ports
<item><tt/-n/  - do not try to resolve ports
<item><tt/-o/  - show some optional information, e.g. TCP timers
<item><tt/-i/  - show some information specific to TCP (RTO, congestion
window, slow start threshold etc.)
<item><tt/-e/  - show even more optional information
<item><tt/-m/  - show extended information on memory used by the socket.
It is available only with <tt/tcp_diag/ enabled.
<item><tt/-p/  - show list of processes owning the socket
<item><tt/-f FAMILY/ - default address family used for parsing addresses.
                 This option also limits the listing to sockets supporting
                 the given address family. Currently the following families
                 are supported: <tt/unix/, <tt/inet/, <tt/inet6/, <tt/link/,
                 <tt/netlink/.
<item><tt/-4/ - alias for <tt/-f inet/
<item><tt/-6/ - alias for <tt/-f inet6/
<item><tt/-0/ - alias for <tt/-f link/
<item><tt/-A LIST-OF-TABLES/ - list of socket tables to dump, separated
                 by commas. The following identifiers are understood:
                 <tt/all/, <tt/inet/, <tt/tcp/, <tt/udp/, <tt/raw/,
                 <tt/unix/, <tt/packet/, <tt/netlink/, <tt/unix_dgram/,
                 <tt/unix_stream/, <tt/packet_raw/, <tt/packet_dgram/.
<item><tt/-x/ - alias for <tt/-A unix/
<item><tt/-t/ - alias for <tt/-A tcp/
<item><tt/-u/ - alias for <tt/-A udp/
<item><tt/-w/ - alias for <tt/-A raw/
<item><tt/-a/ - show sockets of all the states. By default sockets
                in states <tt/LISTEN/, <tt/TIME-WAIT/, <tt/SYN-RECV/
                and <tt/CLOSE/ are skipped.
<item><tt/-l/ - show only sockets in state <tt/LISTEN/
</itemize>
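
<p>
Since option <tt/-F/ reads one command line word per line, a filter
file can be produced with a single <tt/printf/. The sketch below writes
the apache filter from the Examples section to a temporary file and
counts the words stored. The variable names are made up for this
example, and the <tt/ss/ invocation is left as a comment so that the
sketch runs without <tt/ss/ installed.

```shell
# Write a filter to a temporary file, one command line word per line.
filter_file=$(mktemp)
printf '%s\n' '(' sport = :http or sport = :https ')' > "$filter_file"

# Number of words stored in the file.
filter_words=$(wc -l < "$filter_file" | tr -d ' ')
echo "$filter_words"

# With ss available, the filter could then be read back, for example:
#   ss -o -F "$filter_file" state fin-wait-1
# Remove the temporary file when it is no longer needed:
#   rm -f "$filter_file"
```

No shell escaping is needed inside the file, because the shell never
sees its contents as a command line.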

<sect2><tt/STATE-FILTER/

<p><tt/STATE-FILTER/ allows an arbitrary set of states to match
to be constructed. Its syntax is a sequence of keywords <tt/state/
and <tt/exclude/, each followed by a state identifier.
Available identifiers are:

<p>
<itemize>
<item> All standard TCP states: <tt/established/, <tt/syn-sent/,
<tt/syn-recv/, <tt/fin-wait-1/, <tt/fin-wait-2/, <tt/time-wait/,
<tt/closed/, <tt/close-wait/, <tt/last-ack/, <tt/listen/ and <tt/closing/.

<item><tt/all/ - for all the states
<item><tt/connected/ - all the states except for <tt/listen/ and <tt/closed/
<item><tt/synchronized/ - all the <tt/connected/ states except for
<tt/syn-sent/
<item><tt/bucket/ - states which are maintained as minisockets, i.e.
<tt/time-wait/ and <tt/syn-recv/.
<item><tt/big/ - opposite to <tt/bucket/
</itemize>

<sect2><tt/ADDRESS_FILTER/

<p><tt/ADDRESS_FILTER/ is a boolean expression with operations <tt/and/, <tt/or/
and <tt/not/, which can be abbreviated in C style, e.g. as <tt/&amp/,
<tt/&amp&amp/.

<p>
Predicates check socket addresses, both local and remote.
There are the following kinds of predicates:

<itemize>
<item> <tt/dst ADDRESS_PATTERN/ - matches remote address and port
<item> <tt/src ADDRESS_PATTERN/ - matches local address and port
<item> <tt/dport RELOP PORT/    - compares remote port to a number
<item> <tt/sport RELOP PORT/    - compares local port to a number
<item> <tt/autobound/           - checks that socket is bound to an ephemeral
                                  port
</itemize>

<p><tt/RELOP/ is one of <tt/&lt=/, <tt/&gt=/, <tt/==/ etc.
To make this more convenient for use in a unix shell, alphabetic
FORTRAN-like notations <tt/le/, <tt/gt/ etc. are accepted as well.

<p>The format and semantics of <tt/ADDRESS_PATTERN/ depend on the address
family.

<itemize>
<item><tt/inet/ - <tt/ADDRESS_PATTERN/ consists of an IP prefix, optionally
followed by a colon and port. If the prefix or port part is absent or replaced
with <tt/*/, this means wildcard match.
<item><tt/inet6/ - The same as <tt/inet/, only the prefix refers to an IPv6
address. Unlike <tt/inet/, the colon becomes ambiguous, so <tt/ss/ allows
the scheme used in URLs, where the address is surrounded with
<tt/[/ ... <tt/]/.
<item><tt/unix/ - <tt/ADDRESS_PATTERN/ is a shell-style wildcard.
<item><tt/packet/ - the format looks like <tt/inet/, only the interface index
stands in place of the port and the link layer protocol id in place of the address.
<item><tt/netlink/ - the format looks like <tt/inet/, only the socket pid
stands in place of the port and the netlink channel in place of the address.
</itemize>

<p><tt/PORT/ is syntactically an <tt/ADDRESS_PATTERN/ with a wildcard
address part. Naturally, it is undefined for UNIX sockets.

<sect1> Environment variables

<p>
<tt/ss/ allows the source of information to be changed using various
environment variables:

<p>
<itemize>
<item> <tt/PROC_SLABINFO/  to override <tt>/proc/slabinfo</tt>
<item> <tt/PROC_NET_TCP/  to override <tt>/proc/net/tcp</tt>
<item> <tt/PROC_NET_UDP/  to override <tt>/proc/net/udp</tt>
<item> etc.
</itemize>

<p>
Variable <tt/PROC_ROOT/ allows the root of the whole <tt>/proc/</tt>
hierarchy to be changed.

<p>
Variable <tt/TCPDIAG_FILE/ tells <tt/ss/ to open a file instead of
requesting the kernel to dump information about TCP sockets.

<p> This facility is used mainly to investigate bug reports,
when dumps of files usually found in <tt>/proc/</tt> are received
by e-mail.

<sect1> Output format

<p>Six columns. The first is <tt/Netid/; it denotes the socket type and
transport protocol when they are ambiguous: <tt/tcp/, <tt/udp/, <tt/raw/,
<tt/u_str/ (an abbreviation for <tt/unix_stream/), <tt/u_dgr/ for UNIX
datagram sockets, <tt/nl/ for netlink, <tt/p_raw/ and <tt/p_dgr/ for
raw and datagram packet sockets. This column is optional; it will
be hidden if the filter selects a unique netid.

<p>
The second column is <tt/State/. The socket state is displayed here.
The names are the standard TCP names, except for <tt/UNCONN/, which
cannot happen for TCP but is normal for unconnected sockets
of other types. Again, this column can be hidden.

<p>
Then come two columns (<tt/Recv-Q/ and <tt/Send-Q/) showing the amount of data
queued for receive and transmit.

<p>
And the last two columns display the local address and port of the socket
and its peer address, if the socket is connected.

<p>
If options <tt/-o/, <tt/-e/ or <tt/-p/ were given, the extra information is
displayed not in fixed positions but as space-separated pairs
<tt/option:value/. If the value is not a single number, it is presented
as a list of values, enclosed in <tt/(/ ... <tt/)/ and separated with
commas. E.g.

<tscreen><verb>
   timer:(keepalive,111min,0)
</verb></tscreen>
is the typical format for a TCP timer (option <tt/-o/).

<tscreen><verb>
   users:((X,113,3))
</verb></tscreen>
is typical for a list of users (option <tt/-p/).
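
<p>
The <tt/option:value/ pairs are easy to take apart in a script.
The sketch below splits one pair into its name and its comma-separated
values using only shell parameter expansion and <tt/tr/. The function
name <tt/parse_opt/ is made up for this example, and it handles only a
single level of parentheses, so the nested list printed by <tt/-p/
would need more work.

```shell
# parse_opt 'name:(a,b,c)' : print the name, then each value on its own line.
# Hypothetical helper; handles one level of parentheses only.
parse_opt() {
    name=${1%%:*}                        # text before the first colon
    vals=${1#*:}                         # text after the first colon
    vals=${vals#\(}; vals=${vals%\)}     # strip one enclosing ( ... )
    printf '%s\n' "$name"
    printf '%s\n' "$vals" | tr ',' '\n'
}

parse_opt 'timer:(keepalive,111min,0)'
```

For the timer example this prints four lines: <tt/timer/,
<tt/keepalive/, <tt/111min/ and <tt/0/.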

<sect>Some numbers

<p>
Well, let us use <tt/pidentd/ and a tool <tt/ibench/ to measure
its performance. It is 30 requests per second here. Nothing to test,
it is too slow. OK, let us patch pidentd with the patch from directory
Patches. After this it handles about 4300 requests per second
and becomes a handy tool to pollute socket tables with lots of timewait
buckets.

<p>
So, each test starts by polluting the tables with 30000 sockets,
then does a full dump of the table piped to wc, measuring
timings with time:

<p>Results:

<itemize>
<item> <tt/netstat -at/ - 15.6 seconds
<item> <tt/ss -atr/, but without <tt/tcp_diag/     - 5.4 seconds
<item> <tt/ss -atr/ with <tt/tcp_diag/     - 0.47 seconds
</itemize>

No comment. Though one comment is necessary: most of the time
without <tt/tcp_diag/ is wasted inside the kernel with networking completely
blocked. More than 10 seconds, yes. <tt/tcp_diag/
does the same work in 100 milliseconds of system time.
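
<p>
For those who prefer ratios, the timings listed above can be divided
out with <tt/awk/. The helper name <tt/speedup/ is made up for this
example; the input numbers are just the three results from the list.

```shell
# speedup A B : print A divided by B with one decimal place.
# Hypothetical helper for illustration.
speedup() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

speedup 15.6 5.4     # netstat versus ss without tcp_diag
speedup 15.6 0.47    # netstat versus ss with tcp_diag
speedup 5.4 0.47     # ss without tcp_diag versus ss with tcp_diag
```

This gives factors of about 2.9, 33.2 and 11.5 respectively.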

</article>