<!doctype linuxdoc system>

<article>

<title>SS Utility: Quick Intro
<author>Alexey Kuznetsov, <tt/kuznet@ms2.inr.ac.ru/
<date>some_negative_number, 20 Sep 2001
<abstract>
<tt/ss/ is yet another utility to investigate sockets.
Functionally it is NOT better than <tt/netstat/ combined
with some perl/awk scripts, and though it is surely faster,
that is not enough to make it much better. :-)
So, stop reading this now and do not waste your time.
Well, certainly, it offers some functionality which the current
netstat is still not able to provide, but surely will soon.
</abstract>

<sect>Why?

<p> The <tt>/proc</tt> interface is inadequate, unfortunately.
When the number of sockets is large enough, <tt/netstat/ or even
plain <tt>cat /proc/net/tcp</tt> causes nothing but pain and curses.
In linux-2.4 the disease became worse: even if the number
of sockets is small, reading <tt>/proc/net/tcp</tt> is rather slow.

This utility presents a new approach, which is supposed to scale
well. I am not going to describe technical details here and
will concentrate on the description of the command.
The only important thing to say is that it is a good idea
to load the module <tt/tcp_diag/, which can be found in the directory
<tt/Modules/ of <tt/iproute2/. If you do not do this, <tt/ss/
will work, but it falls back to <tt>/proc</tt> and becomes slow
like <tt/netstat/; well, still a bit faster (see section "Some numbers").

<sect>Old news

<p>
In the simplest form <tt/ss/ is equivalent to netstat
with some small deviations.

<itemize>
<item><tt/ss -t -a/ dumps all TCP sockets
<item><tt/ss -u -a/ dumps all UDP sockets
<item><tt/ss -w -a/ dumps all RAW sockets
<item><tt/ss -x -a/ dumps all UNIX sockets
</itemize>

<p>
Option <tt/-o/ shows TCP timers state.
Option <tt/-e/ shows some extended information.
Etc. etc. etc.
It seems that all the socket-related options of netstat
are supported. Though not AX.25 and other oddities. :-)
If someone wants to, they can add support for decnet and ipx.
Some rudimentary support for them is already present in iproute2 libutils,
and I will be glad to see these new members.

<p>
However, standard functionality is a bit different:

<p>
The first: without option <tt/-a/, sockets in states
<tt/TIME-WAIT/ and <tt/SYN-RECV/ are skipped too.
It is a more reasonable default, I think.

<p>
The second: the format of UNIX sockets is different. It coincides
with tcp/udp. Though the standard kernel still does not allow
seeing write/read queues and the peer address of connected UNIX sockets,
a patch doing this exists.

<p>
The third: the default is to dump only TCP sockets, rather than all of the types.

<p>
The next: by default it does not resolve numeric host addresses (like <tt/ip/)!
Resolving is enabled with option <tt/-r/. Service names, usually stored
in local files, are resolved by default. Also, if the service database
does not contain a reference to a port, <tt/ss/ queries the system
<tt/rpcbind/. RPC services are prefixed with <tt/rpc./
Resolution of services may be suppressed with option <tt/-n/.

<p>
It does not accept "long" options (I dislike them, sorry).
So, the address family is given with a family identifier following
option <tt/-f/, to be aligned with iproute2 conventions.
Mostly, this is to allow the option parser to parse
addresses correctly, but as a side effect it really limits dumping
to sockets supporting only the given family. Option <tt/-A/ followed
by a list of socket tables to dump is also supported.
Logically, the id of a socket table is different from an _address_ family, which is
another point of incompatibility. So, the id is one of
<tt/all/, <tt/tcp/, <tt/udp/,
<tt/raw/, <tt/inet/, <tt/unix/, <tt/packet/, <tt/netlink/. See?
Well, <tt/inet/ is just an abbreviation for <tt/tcp|udp|raw/
and it is not difficult to guess that <tt/packet/ allows
looking at packet sockets. Actually, there are also some other abbreviations,
e.g. <tt/unix_dgram/ selects only datagram UNIX sockets.

<p>
The next: well, I still do not know. :-)


<sect>Time to talk about new functionality.

<p>It is built-in filtering of socket lists.

<sect1> Filtering by state.

<p>
<tt/ss/ allows filtering by socket state, using the keywords
<tt/state/ and <tt/exclude/, followed by a state
identifier.

<p>
State identifiers are the standard TCP state names (not listed here;
they are useless to you if you do not already know them)
or abbreviations:

<itemize>
<item><tt/all/ - for all the states
<item><tt/bucket/ - for TCP minisockets (<tt/TIME-WAIT|SYN-RECV/)
<item><tt/big/ - all except for minisockets
<item><tt/connected/ - not closed and not listening
<item><tt/synchronized/ - connected and not <tt/SYN-SENT/
</itemize>

<p>
For example, to dump all tcp sockets except <tt/SYN-RECV/:

<tscreen><verb>
   ss exclude SYN-RECV
</verb></tscreen>

<p>
If neither <tt/state/ nor <tt/exclude/ directives
are present,
the state filter defaults to <tt/all/ with option <tt/-a/,
or otherwise to <tt/all/
excluding listening, syn-recv, time-wait and closed sockets.

<sect1> Filtering by addresses and ports.

<p>
The option list may contain an address/port filter.
It is a boolean expression which consists of the boolean operations
<tt/or/, <tt/and/, <tt/not/ and predicates.
Actually, all the flavors of names for boolean operations are eaten:
<tt/&/, <tt/&&/, <tt/|/, <tt/||/, <tt/!/, but do not forget
about the special meaning given to these symbols by unix shells, and escape
them correctly when they are used on the command line.
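
<p>
As a sketch of the escaping advice above (the port numbers are illustrative
assumptions, and <tt/dport/ is one of the predicates described in the list
that follows), these invocations are meant to express the same filter when
typed at a Bourne-style shell. The last form assumes that <tt/ss/ itself
splits a quoted argument into tokens, which is worth checking against
your version:

<tscreen><verb>
   ss -t dport = :22 or dport = :80
   ss -t dport = :22 \| dport = :80
   ss -t 'dport = :22 || dport = :80'
</verb></tscreen>
Without the backslash or the quotes, the shell would treat <tt/|/ as a pipe
and <tt/ss/ would never see it.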
<p>
Predicates may be of the following kinds:

<itemize>
<item>A. Address/port match, where the address is checked against a mask
and the port is either a wildcard or exact. It is one of:

<tscreen><verb>
	dst prefix:port
	src prefix:port
	src unix:STRING
	src link:protocol:ifindex
	src nl:channel:pid
</verb></tscreen>

Both prefix and port may be absent or replaced with <tt/*/,
which means wildcard. UNIX sockets use a more powerful scheme,
matching socket names against shell wildcards. Also, the prefixes
unix: and link: may be omitted if the address family is evident
from context (with option <tt/-x/ or with <tt/-f unix/
or with the <tt/unix/ keyword).

<p>
For example,

<tscreen><verb>
	dst 10.0.0.1
	dst 10.0.0.1:
	dst 10.0.0.1/32:
	dst 10.0.0.1:*
</verb></tscreen>
are equivalent and mean sockets connected to
any port on host 10.0.0.1, while

<tscreen><verb>
	dst 10.0.0.0/24:22
</verb></tscreen>
means sockets connected to port 22 on network
10.0.0.0...255.

<p>
Note that the port is separated from the address by a colon, which causes
trouble with IPv6 addresses. Generally, we interpret the last
colon as splitting off the port. To allow IPv6 addresses to be given,
a trick like the one used in IPv6 HTTP URLs may be used:

<tscreen><verb>
      dst [::1]
</verb></tscreen>
means sockets connected to ::1 on any port.

<p>
Another way is <tt>dst ::1/128</tt>. The <tt>/</tt> helps to understand that
the colon is part of the IPv6 address.

<p>
Now we can add another alias for <tt/dst 10.0.0.1/:
<tt/dst [10.0.0.1]/. :-)

<p> The address may be a DNS name. In this case all the addresses are looked
up (in all the address families, unless limited by option <tt/-f/
or a special address prefix <tt/inet:/, <tt/inet6:/) and the resulting
expression is <tt/or/ over all of them.

<item> B.
Port expressions:
<tscreen><verb>
      dport >= :1024
      dport != :22
      sport < :32000
</verb></tscreen>
etc.

All the relations are accepted: <tt/</, <tt/>/, <tt/<=/, <tt/>=/, <tt/=/, <tt/==/,
<tt/!=/, <tt/eq/, <tt/ge/, <tt/lt/, <tt/ne/...
Use the variant you like more, but do not forget to escape special
characters when typing them on the command line. :-)

Note that the port number syntactically coincides with case A!
You may even add an IP address, but it will not participate
in the comparison, except for <tt/==/ and <tt/!=/, which are equivalent
to the corresponding predicates of type A. For example,
<p>
<tt/dst 10.0.0.1:22/
is equivalent to <tt/dport eq 10.0.0.1:22/
and
<tt/not dst 10.0.0.1:22/ is equivalent to
<tt/dport neq 10.0.0.1:22/

<item>C. Keyword <tt/autobound/. It matches sockets bound automatically
on the local system.

</itemize>


<sect> Examples

<p>
<itemize>
<item>1. List all the tcp sockets in state <tt/FIN-WAIT-1/ for our apache
to network 193.233.7/24 and look at their timers:

<tscreen><verb>
ss -o state fin-wait-1 \( sport = :http or sport = :https \) \
                          dst 193.233.7/24
</verb></tscreen>

Oops, forgot to say that a missing logical operation is
equivalent to <tt/and/.

<item> 2. Well, now look at the rest...

<tscreen><verb>
ss -o excl fin-wait-1
ss state fin-wait-1 \( sport neq :http and sport neq :https \) \
                       or not dst 193.233.7/24
</verb></tscreen>

Note that we have to do _two_ calls of ss to do this.
The state match is always ANDed with the address/port match.
The reason for this is purely technical: ss quickly skips
non-matching states before parsing addresses, and I consider the
ability to quickly skip gobs of time-wait and syn-recv sockets
more important than logical generality.

<item> 3.
So, let's look at all our sockets using autobound ports:

<tscreen><verb>
ss -a -A all autobound
</verb></tscreen>


<item> 4. And finally, find all the local processes connected
to local X servers:

<tscreen><verb>
ss -xp dst "/tmp/.X11-unix/*"
</verb></tscreen>

Pardon, this does not work with the current kernel; patching is required.
But we can still look at the server side:

<tscreen><verb>
ss -x src "/tmp/.X11-unix/*"
</verb></tscreen>

</itemize>


<sect> Returning to ground: real manual

<p>
<sect1> Command arguments

<p> The general format of arguments to <tt/ss/ is:

<tscreen><verb>
       ss [ OPTIONS ] [ STATE-FILTER ] [ ADDRESS-FILTER ]
</verb></tscreen>

<sect2><tt/OPTIONS/
<p> <tt/OPTIONS/ is a list of single-letter options, using common unix
conventions.

<itemize>
<item><tt/-h/ - show help page
<item><tt/-?/ - the same, of course
<item><tt/-v/, <tt/-V/ - print version of <tt/ss/ and exit
<item><tt/-s/ - print summary statistics. This option does not parse
socket lists; it obtains the summary from various sources. It is useful
when the number of sockets is so huge that parsing <tt>/proc/net/tcp</tt>
is painful.
<item><tt/-D FILE/ - do not display anything, just dump raw information
about TCP sockets to <tt/FILE/ after applying filters. If <tt/FILE/ is <tt/-/,
<tt/stdout/ is used.
<item><tt/-F FILE/ - read continuation of filter from <tt/FILE/.
Each line of <tt/FILE/ is interpreted like a single command line option.
If <tt/FILE/ is <tt/-/, <tt/stdin/ is used.
<item><tt/-r/ - try to resolve numeric addresses/ports
<item><tt/-n/ - do not try to resolve ports
<item><tt/-o/ - show some optional information, e.g. TCP timers
<item><tt/-i/ - show some information specific to TCP (RTO, congestion
window, slow start threshold etc.)
<item><tt/-e/ - show even more optional information
<item><tt/-m/ - show extended information on memory used by the socket.
It is available only with <tt/tcp_diag/ enabled.
<item><tt/-p/ - show list of processes owning the socket
<item><tt/-f FAMILY/ - default address family used for parsing addresses.
Also this option limits listing to sockets supporting
the given address family. Currently the following families
are supported: <tt/unix/, <tt/inet/, <tt/inet6/, <tt/link/,
<tt/netlink/.
<item><tt/-4/ - alias for <tt/-f inet/
<item><tt/-6/ - alias for <tt/-f inet6/
<item><tt/-0/ - alias for <tt/-f link/
<item><tt/-A LIST-OF-TABLES/ - list of socket tables to dump, separated
by commas. The following identifiers are understood:
<tt/all/, <tt/inet/, <tt/tcp/, <tt/udp/, <tt/raw/,
<tt/unix/, <tt/packet/, <tt/netlink/, <tt/unix_dgram/,
<tt/unix_stream/, <tt/packet_raw/, <tt/packet_dgram/.
<item><tt/-x/ - alias for <tt/-A unix/
<item><tt/-t/ - alias for <tt/-A tcp/
<item><tt/-u/ - alias for <tt/-A udp/
<item><tt/-w/ - alias for <tt/-A raw/
<item><tt/-a/ - show sockets of all the states. By default sockets
in states <tt/LISTEN/, <tt/TIME-WAIT/, <tt/SYN-RECV/
and <tt/CLOSE/ are skipped.
<item><tt/-l/ - show only sockets in state <tt/LISTEN/
</itemize>

<sect2><tt/STATE-FILTER/

<p><tt/STATE-FILTER/ allows constructing an arbitrary set of
states to match. Its syntax is a sequence of keywords <tt/state/
and <tt/exclude/, each followed by a state identifier.
Available identifiers are:

<p>
<itemize>
<item> All standard TCP states: <tt/established/, <tt/syn-sent/,
<tt/syn-recv/, <tt/fin-wait-1/, <tt/fin-wait-2/, <tt/time-wait/,
<tt/closed/, <tt/close-wait/, <tt/last-ack/, <tt/listen/ and <tt/closing/.
<item><tt/all/ - for all the states
<item><tt/connected/ - all the states except for <tt/listen/ and <tt/closed/
<item><tt/synchronized/ - all the <tt/connected/ states except for
<tt/syn-sent/
<item><tt/bucket/ - states which are maintained as minisockets, i.e.
<tt/time-wait/ and <tt/syn-recv/.
<item><tt/big/ - opposite of <tt/bucket/
</itemize>

<sect2><tt/ADDRESS-FILTER/

<p><tt/ADDRESS-FILTER/ is a boolean expression with operations <tt/and/, <tt/or/
and <tt/not/, which can be abbreviated in C style, e.g. as <tt/&/,
<tt/&&/.

<p>
Predicates check socket addresses, both local and remote.
There are the following kinds of predicates:

<itemize>
<item> <tt/dst ADDRESS_PATTERN/ - matches remote address and port
<item> <tt/src ADDRESS_PATTERN/ - matches local address and port
<item> <tt/dport RELOP PORT/ - compares remote port to a number
<item> <tt/sport RELOP PORT/ - compares local port to a number
<item> <tt/autobound/ - checks that socket is bound to an ephemeral
port
</itemize>

<p><tt/RELOP/ is one of <tt/<=/, <tt/>=/, <tt/==/ etc.
To make this more convenient for use in a unix shell, alphabetic
FORTRAN-like notations <tt/le/, <tt/gt/ etc. are accepted as well.

<p>The format and semantics of <tt/ADDRESS_PATTERN/ depend on the address
family.

<itemize>
<item><tt/inet/ - <tt/ADDRESS_PATTERN/ consists of an IP prefix, optionally
followed by a colon and port. If the prefix or port part is absent or replaced
with <tt/*/, this means wildcard match.
<item><tt/inet6/ - The same as <tt/inet/, only the prefix refers to an IPv6
address. Unlike <tt/inet/, the colon becomes ambiguous, so <tt/ss/ allows
a scheme like the one used in URLs, where the address is surrounded with
<tt/[/ ... <tt/]/.
<item><tt/unix/ - <tt/ADDRESS_PATTERN/ is a shell-style wildcard.
<item><tt/packet/ - format looks like <tt/inet/, only the interface index
stands in place of the port and the link layer protocol id in place of the address.
<item><tt/netlink/ - format looks like <tt/inet/, only the socket pid
stands in place of the port and the netlink channel in place of the address.
</itemize>

<p><tt/PORT/ is syntactically <tt/ADDRESS_PATTERN/ with a wildcard
address part. Certainly, it is undefined for UNIX sockets.

<sect1> Environment variables

<p>
<tt/ss/ allows changing the source of information using various
environment variables:

<p>
<itemize>
<item> <tt/PROC_SLABINFO/ to override <tt>/proc/slabinfo</tt>
<item> <tt/PROC_NET_TCP/ to override <tt>/proc/net/tcp</tt>
<item> <tt/PROC_NET_UDP/ to override <tt>/proc/net/udp</tt>
<item> etc.
</itemize>

<p>
Variable <tt/PROC_ROOT/ allows changing the root of the whole <tt>/proc/</tt>
hierarchy.

<p>
Variable <tt/TCPDIAG_FILE/ tells <tt/ss/ to open a file instead of
requesting the kernel to dump information about TCP sockets.


<p> This option is used mainly to investigate bug reports,
when dumps of files usually found in <tt>/proc/</tt> are received
by e-mail.

<sect1> Output format

<p>Six columns. The first is <tt/Netid/; it denotes the socket type and
transport protocol when it is ambiguous: <tt/tcp/, <tt/udp/, <tt/raw/,
<tt/u_str/ (an abbreviation for <tt/unix_stream/), <tt/u_dgr/ for UNIX
datagram sockets, <tt/nl/ for netlink, <tt/p_raw/ and <tt/p_dgr/ for
raw and datagram packet sockets. This column is optional; it will
be hidden if the filter selects a unique netid.

<p>
The second column is <tt/State/. The socket state is displayed here.
The names are the standard TCP names, except for <tt/UNCONN/, which
cannot happen for TCP but is normal for unconnected sockets
of other types. Again, this column can be hidden.
<p>
Then come two columns (<tt/Recv-Q/ and <tt/Send-Q/) showing the amount of data
queued for receive and transmit.

<p>
And the last two columns display the local address and port of the socket
and its peer address, if the socket is connected.

<p>
If options <tt/-o/, <tt/-e/ or <tt/-p/ were given, options are
displayed not in fixed positions but as space-separated pairs:
<tt/option:value/. If the value is not a single number, it is presented
as a list of values, enclosed in <tt/(/ ... <tt/)/ and separated with
commas. For example,

<tscreen><verb>
   timer:(keepalive,111min,0)
</verb></tscreen>
is the typical format for a TCP timer (option <tt/-o/).

<tscreen><verb>
   users:((X,113,3))
</verb></tscreen>
is typical for a list of users (option <tt/-p/).


<sect>Some numbers

<p>
Well, let us use <tt/pidentd/ and a tool <tt/ibench/ to measure
its performance. It is 30 requests per second here. Nothing to test,
it is too slow. OK, let us patch pidentd with the patch from the directory
Patches. After this it handles about 4300 requests per second
and becomes a handy tool to pollute socket tables with lots of timewait
buckets.

<p>
So, each test starts by polluting the tables with 30000 sockets
and then does a full dump of the table piped to wc, measuring
the timing with time:

<p>Results:

<itemize>
<item> <tt/netstat -at/ - 15.6 seconds
<item> <tt/ss -atr/, but without <tt/tcp_diag/ - 5.4 seconds
<item> <tt/ss -atr/ with <tt/tcp_diag/ - 0.47 seconds
</itemize>

No comments. Though one comment is necessary: most of the time
without <tt/tcp_diag/ is wasted inside the kernel with completely
blocked networking. More than 10 seconds, yes. <tt/tcp_diag/
does the same work in 100 milliseconds of system time.

</article>