This is netperf.info, produced by makeinfo version 4.13 from
netperf.texi.

   This is Rick Jones' feeble attempt at a Texinfo-based manual for
the netperf benchmark.

   Copyright (C) 2005-2012 Hewlett-Packard Company

   Permission is granted to copy, distribute and/or modify this
document per the terms of the netperf source license, a copy of which
can be found in the file `COPYING' of the basic netperf distribution.


File: netperf.info, Node: Top, Next: Introduction, Prev: (dir), Up: (dir)

Netperf Manual
**************

This is Rick Jones' feeble attempt at a Texinfo-based manual for the
netperf benchmark.

   Copyright (C) 2005-2012 Hewlett-Packard Company

   Permission is granted to copy, distribute and/or modify this
document per the terms of the netperf source license, a copy of which
can be found in the file `COPYING' of the basic netperf distribution.

* Menu:

* Introduction::                An introduction to netperf - what it
                                  is and what it is not.
* Installing Netperf::          How to go about installing netperf.
* The Design of Netperf::
* Global Command-line Options::
* Using Netperf to Measure Bulk Data Transfer::
* Using Netperf to Measure Request/Response::
* Using Netperf to Measure Aggregate Performance::
* Using Netperf to Measure Bidirectional Transfer::
* The Omni Tests::
* Other Netperf Tests::
* Address Resolution::
* Enhancing Netperf::
* Netperf4::
* Concept Index::
* Option Index::


File: netperf.info, Node: Introduction, Next: Installing Netperf, Prev: Top, Up: Top

1 Introduction
**************

Netperf is a benchmark that can be used to measure various aspects of
networking performance.  The primary foci are bulk (aka
unidirectional) data transfer and request/response performance using
either TCP or UDP and the Berkeley Sockets interface.
As of this writing, the tests available either unconditionally or
conditionally include:

   * TCP and UDP unidirectional transfer and request/response over
     IPv4 and IPv6 using the Sockets interface.

   * TCP and UDP unidirectional transfer and request/response over
     IPv4 using the XTI interface.

   * Link-level unidirectional transfer and request/response using
     the DLPI interface.

   * Unix domain sockets.

   * SCTP unidirectional transfer and request/response over IPv4 and
     IPv6 using the sockets interface.

   While not every revision of netperf will work on every platform
listed, the intention is that at least some version of netperf will
work on the following platforms:

   * Unix - at least all the major variants.

   * Linux

   * Windows

   * Others

   Netperf is maintained and informally supported primarily by Rick
Jones, who can perhaps be best described as Netperf Contributing
Editor.  Non-trivial and very much appreciated assistance comes from
others in the network performance community, who are too numerous to
mention here.  While it is often used by them, netperf is NOT
supported via any of the formal Hewlett-Packard support channels.
You should feel free to make enhancements and modifications to
netperf to suit your nefarious porpoises, so long as you stay within
the guidelines of the netperf copyright.  If you feel so inclined,
you can send your changes to netperf-feedback
<netperf-feedback@netperf.org> for possible inclusion into subsequent
versions of netperf.

   It is the Contributing Editor's belief that the netperf license
walks like open source and talks like open source.  However, the
license was never submitted for "certification" as an open source
license.  If you would prefer to make contributions to a networking
benchmark using a certified open source license, please consider
netperf4, which is distributed under the terms of the GPLv2.
The netperf-talk <netperf-talk@netperf.org> mailing list is available
to discuss the care and feeding of netperf with others who share your
interest in network performance benchmarking.  The netperf-talk
mailing list is a closed list (to deal with spam) and you must first
subscribe by sending email to netperf-talk-request
<netperf-talk-request@netperf.org>.

* Menu:

* Conventions::


File: netperf.info, Node: Conventions, Prev: Introduction, Up: Introduction

1.1 Conventions
===============

A "sizespec" is a one or two item, comma-separated list used as an
argument to a command-line option that can set one or two related
netperf parameters.  If you wish to set both parameters to separate
values, items should be separated by a comma:

     parameter1,parameter2

   If you wish to set the first parameter without altering the value
of the second from its default, you should follow the first item with
a comma:

     parameter1,

   Likewise, precede the item with a comma if you wish to set only
the second parameter:

     ,parameter2

   An item with no commas:

     parameter1and2

will set both parameters to the same value.  This last mode is one of
the most frequently used.

   There is another variant of the comma-separated, two-item list
called an "optionspec" which is like a sizespec with the exception
that a single item with no comma:

     parameter1

will only set the value of the first parameter and will leave the
second parameter at its default value.

   Netperf has two types of command-line options.  The first are
global command-line options.  They are essentially any option not
tied to a particular test or group of tests.  An example of a global
command-line option is the one which sets the test type - `-t'.

   The second type of options are test-specific options.
These are options which are only applicable to a particular test or
set of tests.  An example of a test-specific option would be the send
socket buffer size for a TCP_STREAM test.

   Global command-line options are specified first, with
test-specific options following after a `--' as in:

     netperf <global> -- <test-specific>


File: netperf.info, Node: Installing Netperf, Next: The Design of Netperf, Prev: Introduction, Up: Top

2 Installing Netperf
********************

Netperf's primary form of distribution is source code.  This allows
installation on systems other than those to which the authors have
ready access and thus the ability to create binaries.  There are two
styles of netperf installation.  The first runs the netperf server
program - netserver - as a child of inetd.  This requires the
installer to have sufficient privileges to edit the files
`/etc/services' and `/etc/inetd.conf' or their platform-specific
equivalents.

   The second style is to run netserver as a standalone daemon.  This
second method does not require edit privileges on `/etc/services' and
`/etc/inetd.conf' but does mean you must remember to run the
netserver program explicitly after every system reboot.

   This manual assumes that those wishing to measure networking
performance already know how to use anonymous FTP and/or a web
browser.  It is also expected that you have at least a passing
familiarity with the networking protocols and interfaces involved.
In all honesty, if you do not have such familiarity, likely as not
you have some experience to gain before attempting network
performance measurements.  The excellent texts by authors such as
Stevens, Fenner and Rudoff and/or Stallings would be good starting
points.  There are likely other excellent sources out there as well.
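   For the inetd style of installation, the edits to `/etc/services'
and `/etc/inetd.conf' amount to registering the netserver control
port and telling inetd how to launch the program.  A hypothetical
sketch of the two entries follows - the service name, binary path and
field layout are illustrative only (12865 is netserver's conventional
default port); consult your platform's man pages for the exact syntax:

```
# /etc/services - register the netserver control port (12865 is the default)
netperf         12865/tcp

# /etc/inetd.conf - have inetd launch netserver on demand
# (the path to the netserver binary is an assumption; use your install prefix)
netperf stream tcp nowait root /usr/local/bin/netserver netserver
```

After making the edits, inetd must be told to re-read its
configuration (commonly via SIGHUP) before the new service takes
effect.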
* Menu:

* Getting Netperf Bits::
* Installing Netperf Bits::
* Verifying Installation::


File: netperf.info, Node: Getting Netperf Bits, Next: Installing Netperf Bits, Prev: Installing Netperf, Up: Installing Netperf

2.1 Getting Netperf Bits
========================

Gzipped tar files of netperf sources can be retrieved via anonymous
FTP (ftp://ftp.netperf.org/netperf) for "released" versions of the
bits.  Pre-release versions of the bits can be retrieved via
anonymous FTP from the experimental
(ftp://ftp.netperf.org/netperf/experimental) subdirectory.

   For convenience and ease of remembering, a link to the download
site is provided via the NetperfPage (http://www.netperf.org/).

   The bits corresponding to each discrete release of netperf are
tagged (http://www.netperf.org/svn/netperf2/tags) for retrieval via
subversion.  For example, there is a tag for the first version
corresponding to this version of the manual - netperf 2.6.0
(http://www.netperf.org/svn/netperf2/tags/netperf-2.6.0).  Those
wishing to be on the bleeding edge of netperf development can use
subversion to grab the top of trunk
(http://www.netperf.org/svn/netperf2/trunk).  When fixing bugs or
making enhancements, patches against the top-of-trunk are preferred.

   There are likely other places around the Internet from which one
can download netperf bits.  These may be simple mirrors of the main
netperf site, or they may be local variants on netperf.  As with
anything one downloads from the Internet, take care to make sure it
is what you really wanted and isn't some malicious Trojan or whatnot.
Caveat downloader.

   As a general rule, binaries of netperf and netserver are not
distributed from ftp.netperf.org.  From time to time a kind soul or
souls has packaged netperf as a Debian package available via the
apt-get mechanism or as an RPM.
I would be most interested in learning how to enhance the makefiles
to make that easier for people.


File: netperf.info, Node: Installing Netperf Bits, Next: Verifying Installation, Prev: Getting Netperf Bits, Up: Installing Netperf

2.2 Installing Netperf
======================

Once you have downloaded the tar file of netperf sources onto your
system(s), it is necessary to unpack the tar file, cd to the netperf
directory, run configure and then make.  Most of the time it should
be sufficient to just:

     gzcat netperf-<version>.tar.gz | tar xf -
     cd netperf-<version>
     ./configure
     make
     make install

   Most of the "usual" configure script options should be present,
dealing with where to install binaries and whatnot.

     ./configure --help

should list all of those and more.  You may find the `--prefix'
option helpful in deciding where the binaries and such will be put
during the `make install'.

   If the netperf configure script does not know how to automagically
detect which CPU utilization mechanism to use on your platform you
may want to add a `--enable-cpuutil=mumble' option to the configure
command.  If you have knowledge and/or experience to contribute to
that area, feel free to contact <netperf-feedback@netperf.org>.

   Similarly, if you want tests using the XTI interface, Unix Domain
Sockets, DLPI or SCTP it will be necessary to add one or more
`--enable-[xti|unixdomain|dlpi|sctp]=yes' options to the configure
command.  As of this writing, the configure script will not include
those tests automagically.

   Starting with version 2.5.0, netperf began migrating most of the
"classic" netperf tests found in `src/nettest_bsd.c' to the so-called
"omni" tests (aka "two routines to run them all") found in
`src/nettest_omni.c'.
This migration enables a number of new features such as greater
control over what output is included, and new things to output.  The
"omni" test is enabled by default in 2.5.0 and a number of the
classic tests are migrated - you can tell if a test has been migrated
from the presence of `MIGRATED' in the test banner.  If you encounter
problems with either the omni or migrated tests, please first attempt
to obtain resolution via <netperf-talk@netperf.org> or
<netperf-feedback@netperf.org>.  If that is unsuccessful, you can add
a `--enable-omni=no' to the configure command and the omni tests will
not be compiled-in and the classic tests will not be migrated.

   Starting with version 2.5.0, netperf includes the "burst mode"
functionality in a default compilation of the bits.  If you encounter
problems with this, please first attempt to obtain help via
<netperf-talk@netperf.org> or <netperf-feedback@netperf.org>.  If
that is unsuccessful, you can add a `--enable-burst=no' to the
configure command and the burst mode functionality will not be
compiled-in.

   On some platforms, it may be necessary to precede the configure
command with a CFLAGS and/or LIBS variable as the netperf configure
script is not yet smart enough to set them itself.  Whenever
possible, these requirements will be found in `README.PLATFORM'
files.  Expertise and assistance in making that more automagic in the
configure script would be most welcome.

   Other optional configure-time settings include
`--enable-intervals=yes' to give netperf the ability to "pace" its
_STREAM tests and `--enable-histogram=yes' to have netperf keep a
histogram of interesting times.  Each of these will have some effect
on the measured result.  If your system supports `gethrtime()' the
effect of the histogram measurement should be minimized but probably
still measurable.
For example, the histogram of a netperf TCP_RR test will be of the
individual transaction times:

     netperf -t TCP_RR -H lag -v 2
     TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET : histogram
     Local /Remote
     Socket Size   Request  Resp.   Elapsed  Trans.
     Send   Recv   Size     Size    Time     Rate
     bytes  Bytes  bytes    bytes   secs.    per sec

     16384  87380  1        1       10.00    3538.82
     32768  32768
     Alignment      Offset
     Local  Remote  Local  Remote
     Send   Recv    Send   Recv
         8      0       0      0
     Histogram of request/response times
     UNIT_USEC     :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
     TEN_USEC      :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
     HUNDRED_USEC  :    0: 34480:  111:   13:   12:    6:    9:    3:    4:    7
     UNIT_MSEC     :    0:   60:   50:   51:   44:   44:   72:  119:  100:  101
     TEN_MSEC      :    0:  105:    0:    0:    0:    0:    0:    0:    0:    0
     HUNDRED_MSEC  :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
     UNIT_SEC      :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
     TEN_SEC       :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
     >100_SECS: 0
     HIST_TOTAL:      35391

   The histogram you see above is basically a base-10 log histogram
where we can see that most of the transaction times were on the order
of one hundred to one-hundred, ninety-nine microseconds, but they
were occasionally as long as ten to nineteen milliseconds.

   The `--enable-demo=yes' configure option will cause code to be
included to report interim results during a test run.  The rate at
which interim results are reported can then be controlled via the
global `-D' option.
Here is an example of `-D' output:

     $ src/netperf -D 1.35 -H tardy.hpl.hp.com -f M
     MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to tardy.hpl.hp.com (15.9.116.144) port 0 AF_INET : demo
     Interim result:    5.41 MBytes/s over 1.35 seconds ending at 1308789765.848
     Interim result:   11.07 MBytes/s over 1.36 seconds ending at 1308789767.206
     Interim result:   16.00 MBytes/s over 1.36 seconds ending at 1308789768.566
     Interim result:   20.66 MBytes/s over 1.36 seconds ending at 1308789769.922
     Interim result:   22.74 MBytes/s over 1.36 seconds ending at 1308789771.285
     Interim result:   23.07 MBytes/s over 1.36 seconds ending at 1308789772.647
     Interim result:   23.77 MBytes/s over 1.37 seconds ending at 1308789774.016
     Recv   Send    Send
     Socket Socket  Message  Elapsed
     Size   Size    Size     Time     Throughput
     bytes  bytes   bytes    secs.    MBytes/sec

      87380  16384  16384    10.06      17.81

   Notice how the units of the interim result track that requested by
the `-f' option.  Also notice that sometimes the interval will be
longer than the value specified in the `-D' option.  This is normal
and stems from how demo mode is implemented: not by relying on
interval timers or frequent calls to get the current time, but by
calculating how many units of work must be performed to take at least
the desired interval.

   Those familiar with this option in earlier versions of netperf
will note the addition of the "ending at" text.  This is the time as
reported by a `gettimeofday()' call (or its emulation) with a `NULL'
timezone pointer.  This addition is intended to make it easier to
insert interim results into an rrdtool
(http://oss.oetiker.ch/rrdtool/doc/rrdtool.en.html) Round-Robin
Database (RRD).  A likely bug-riddled example of doing so can be
found in `doc/examples/netperf_interim_to_rrd.sh'.
The time is reported out to milliseconds rather than microseconds
because that is the most rrdtool understands as of the time of this
writing.

   As of this writing, a `make install' will not actually update the
files `/etc/services' and/or `/etc/inetd.conf' or their
platform-specific equivalents.  It remains necessary to perform that
bit of installation magic by hand.  Patches to the makefile sources
to effect an automagic editing of the necessary files to have netperf
installed as a child of inetd would be most welcome.

   Starting the netserver as a standalone daemon should be as easy as:

     $ netserver
     Starting netserver at port 12865
     Starting netserver at hostname 0.0.0.0 port 12865 and family 0

   Over time the specifics of the messages netserver prints to the
screen may change but the gist will remain the same.

   If the compilation of netperf or netserver happens to fail, feel
free to contact <netperf-feedback@netperf.org> or join and ask in
<netperf-talk@netperf.org>.  However, it is quite important that you
include the actual compilation errors and perhaps even the configure
log in your email.  Otherwise, it will be that much more difficult
for someone to assist you.


File: netperf.info, Node: Verifying Installation, Prev: Installing Netperf Bits, Up: Installing Netperf

2.3 Verifying Installation
==========================

Basically, once netperf is installed and netserver is configured as a
child of inetd, or launched as a standalone daemon, simply typing:

     netperf

should result in output similar to the following:

     $ netperf
     TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET
     Recv   Send    Send
     Socket Socket  Message  Elapsed
     Size   Size    Size     Time     Throughput
     bytes  bytes   bytes    secs.    10^6bits/sec

      87380  16384  16384    10.00    2997.84


File: netperf.info, Node: The Design of Netperf, Next: Global Command-line Options, Prev: Installing Netperf, Up: Top

3 The Design of Netperf
***********************

Netperf is designed around a basic client-server model.  There are
two executables - netperf and netserver.  Generally you will only
execute the netperf program, with the netserver program being invoked
by the remote system's inetd or having been previously started as its
own standalone daemon.

   When you execute netperf it will establish a "control connection"
to the remote system.  This connection will be used to pass test
configuration information and results to and from the remote system.
Regardless of the type of test to be run, the control connection will
be a TCP connection using BSD sockets.  The control connection can
use either IPv4 or IPv6.

   Once the control connection is up and the configuration
information has been passed, a separate "data" connection will be
opened for the measurement itself using the APIs and protocols
appropriate for the specified test.  When the test is completed, the
data connection will be torn-down and results from the netserver will
be passed-back via the control connection and combined with netperf's
result for display to the user.

   Netperf places no traffic on the control connection while a test
is in progress.  Certain socket options, such as SO_KEEPALIVE, if set
as your systems' default, may put packets out on the control
connection while a test is in progress.  Generally speaking this will
have no effect on the results.
* Menu:

* CPU Utilization::


File: netperf.info, Node: CPU Utilization, Prev: The Design of Netperf, Up: The Design of Netperf

3.1 CPU Utilization
===================

CPU utilization is an important, and alas all-too-infrequently
reported, component of networking performance.  Unfortunately, it can
be one of the most difficult metrics to measure accurately and
portably.  Netperf will do its level best to report accurate CPU
utilization figures, but some combinations of processor, OS and
configuration may make that difficult.

   CPU utilization in netperf is reported as a value between 0 and
100% regardless of the number of CPUs involved.  In addition to CPU
utilization, netperf will report a metric called a "service demand".
The service demand is the normalization of CPU utilization and work
performed.  For a _STREAM test it is the microseconds of CPU time
consumed to transfer one KB (K == 1024) of data.  For a _RR test it
is the microseconds of CPU time consumed processing a single
transaction.  For both CPU utilization and service demand, lower is
better.

   Service demand can be particularly useful when trying to gauge the
effect of a performance change.  It is essentially a measure of
efficiency, with smaller values being more efficient and thus
"better."

   Netperf is coded to be able to use one of several, generally
platform-specific CPU utilization measurement mechanisms.  Single
letter codes will be included in the CPU portion of the test banner
to indicate which mechanism was used on each of the local (netperf)
and remote (netserver) systems.

   As of this writing those codes are:

`U'
     The CPU utilization measurement mechanism was unknown to netperf
     or netperf/netserver was not compiled to include CPU utilization
     measurements.
     The code for the null CPU utilization mechanism can be found in
     `src/netcpu_none.c'.

`I'
     An HP-UX-specific CPU utilization mechanism whereby the kernel
     incremented a per-CPU counter by one for each trip through the
     idle loop.  This mechanism was only available on
     specially-compiled HP-UX kernels prior to HP-UX 10 and is
     mentioned here only for the sake of historical completeness and
     perhaps as a suggestion to those who might be altering other
     operating systems.  While rather simple, perhaps even
     simplistic, this mechanism was quite robust and was not affected
     by the concerns of statistical methods, or methods attempting to
     track time in each of user, kernel, interrupt and idle modes
     which require quite careful accounting.  It can be thought-of as
     the in-kernel version of the looper `L' mechanism without the
     context switch overhead.  This mechanism required calibration.

`P'
     An HP-UX-specific CPU utilization mechanism whereby the kernel
     keeps track of time (in the form of CPU cycles) spent in the
     kernel idle loop (HP-UX 10.0 to 11.31 inclusive), or where the
     kernel keeps track of time spent in idle, user, kernel and
     interrupt processing (HP-UX 11.23 and later).  The former
     requires calibration, the latter does not.  Values in either
     case are retrieved via one of the pstat(2) family of calls,
     hence the use of the letter `P'.  The code for these mechanisms
     is found in `src/netcpu_pstat.c' and `src/netcpu_pstatnew.c'
     respectively.

`K'
     A Solaris-specific CPU utilization mechanism whereby the kernel
     keeps track of ticks (e.g. HZ) spent in the idle loop.  This
     method is statistical and is known to be inaccurate when the
     interrupt rate is above epsilon as time spent processing
     interrupts is not subtracted from idle.  The value is retrieved
     via a kstat() call - hence the use of the letter `K'.
     Since this mechanism uses units of ticks (HZ) the calibration
     value should invariably match HZ (e.g. 100).  The code for this
     mechanism is implemented in `src/netcpu_kstat.c'.

`M'
     A Solaris-specific mechanism available on Solaris 10 and later
     which uses the new microstate accounting mechanisms.  There are
     two, alas, overlapping mechanisms.  The first tracks nanoseconds
     spent in user, kernel, and idle modes.  The second mechanism
     tracks nanoseconds spent in interrupt.  Since the mechanisms
     overlap, netperf goes through some hand-waving to try to "fix"
     the problem.  Since the accuracy of the hand-waving cannot be
     completely determined, one must presume that while better than
     the `K' mechanism, this mechanism too is not without issues.
     The values are retrieved via kstat() calls, but the letter code
     is set to `M' to distinguish this mechanism from the even less
     accurate `K' mechanism.  The code for this mechanism is
     implemented in `src/netcpu_kstat10.c'.

`L'
     A mechanism based on "looper" or "soaker" processes which sit in
     tight loops counting as fast as they possibly can.  This
     mechanism starts a looper process for each known CPU on the
     system.  The effect of processor hyperthreading on the mechanism
     is not yet known.  This mechanism definitely requires
     calibration.  The code for the "looper" mechanism can be found
     in `src/netcpu_looper.c'.

`N'
     A Microsoft Windows-specific mechanism, the code for which can
     be found in `src/netcpu_ntperf.c'.  This mechanism too is based
     on what appears to be a form of micro-state accounting and
     requires no calibration.  On laptops, or other systems which may
     dynamically alter the CPU frequency to minimize power
     consumption, it has been suggested that this mechanism may
     become slightly confused, in which case using BIOS/uEFI settings
     to disable the power saving would be indicated.
`S'
     This mechanism uses `/proc/stat' on Linux to retrieve time
     (ticks) spent in idle mode.  It is thought, but not known, to be
     reasonably accurate.  The code for this mechanism can be found
     in `src/netcpu_procstat.c'.

`C'
     A mechanism somewhat similar to `S' but using the sysctl() call
     on BSD-like operating systems (*BSD and MacOS X).  The code for
     this mechanism can be found in `src/netcpu_sysctl.c'.

`Others'
     Other mechanisms included in netperf in the past have included
     using the times() and getrusage() calls.  These calls are
     actually rather poorly suited to the task of measuring CPU
     overhead for networking as they tend to be process-specific and
     much network-related processing can happen outside the context
     of a process, in places where it is not a given it will be
     charged to the correct process, or to any process at all.  They
     are mentioned here as a warning to anyone seeing those
     mechanisms used in other networking benchmarks.  These
     mechanisms are not available in netperf 2.4.0 and later.

   For many platforms, the configure script will choose the best
available CPU utilization mechanism.  However, some platforms have no
particularly good mechanisms.  On those platforms, it is probably
best to use the "LOOPER" mechanism which is basically some number of
processes (as many as there are processors) sitting in tight little
loops counting as fast as they can.  The rate at which the loopers
count when the system is believed to be idle is compared with the
rate when the system is running netperf and the ratio is used to
compute CPU utilization.

   In the past, netperf included some mechanisms that only reported
CPU time charged to the calling process.  Those mechanisms have been
removed from netperf versions 2.4.0 and later because they are
hopelessly inaccurate.
Networking can and often does result in CPU time being spent in
places - such as interrupt contexts - that do not get charged to the
correct process, if to any process at all.

   In fact, time spent in the processing of interrupts is a common
issue for many CPU utilization mechanisms.  In particular, the
"PSTAT" mechanism was eventually known to have problems accounting
for certain interrupt time prior to HP-UX 11.11 (11iv1).  HP-UX 11iv2
and later are known/presumed to be good.  The "KSTAT" mechanism is
known to have problems on all versions of Solaris up to and including
Solaris 10.  Even the microstate accounting available via kstat in
Solaris 10 has issues, though perhaps not as bad as those of prior
versions.

   The /proc/stat mechanism under Linux is in what the author would
consider an "uncertain" category as it appears to be statistical,
which may also have issues with time spent processing interrupts.

   In summary, be sure to "sanity-check" the CPU utilization figures
with other mechanisms.  However, platform tools such as top, vmstat
or mpstat are often based on the same mechanisms used by netperf.

* Menu:

* CPU Utilization in a Virtual Guest::


File: netperf.info, Node: CPU Utilization in a Virtual Guest, Prev: CPU Utilization, Up: CPU Utilization

3.1.1 CPU Utilization in a Virtual Guest
----------------------------------------

The CPU utilization mechanisms used by netperf are "inline" in that
they are run by the same netperf or netserver process as is running
the test itself.  This works just fine for "bare iron" tests but runs
into a problem when using virtual machines.

   The relationship between virtual guest and hypervisor can be
thought of as being similar to that between a process and kernel in a
bare iron system.
As such, (m)any CPU utilization mechanisms used in the virtual guest
are similar to "process-local" mechanisms in a bare iron situation.
However, just as with bare iron and process-local mechanisms, much
networking processing happens outside the context of the virtual
guest.  It takes place in the hypervisor, and is not visible to
mechanisms running in the guest(s).  For this reason, one should not
really trust CPU utilization figures reported by netperf or netserver
when running in a virtual guest.

   If one is looking to measure the added overhead of a
virtualization mechanism, rather than rely on CPU utilization, one
can rely instead on netperf _RR tests - path-lengths and overheads
can be a significant fraction of the latency, so increases in
overhead should appear as decreases in transaction rate.  Whatever
you do, DO NOT rely on the throughput of a _STREAM test.  Achieving
link-rate can be done via a multitude of options that mask overhead
rather than eliminate it.


File: netperf.info, Node: Global Command-line Options, Next: Using Netperf to Measure Bulk Data Transfer, Prev: The Design of Netperf, Up: Top

4 Global Command-line Options
*****************************

This section describes each of the global command-line options
available in the netperf and netserver binaries.  Essentially, it is
an expanded version of the usage information displayed by netperf or
netserver when invoked with the `-h' global command-line option.
* Menu:

* Command-line Options Syntax::
* Global Options::


File: netperf.info, Node: Command-line Options Syntax, Next: Global Options, Prev: Global Command-line Options, Up: Global Command-line Options

4.1 Command-line Options Syntax
===============================

Revision 1.8 of netperf introduced enough new functionality to
overrun the English alphabet for mnemonic command-line option names,
and the author was not, and is not, quite ready to switch to the
contemporary `--mumble' style of command-line options.  (Call him a
Luddite if you wish :).

   For this reason, the command-line options were split into two
parts - the first are the global command-line options.  They are
options that affect nearly any and every test type of netperf.  The
second type are the test-specific command-line options.  Both are
entered on the same command line, but they must be separated from one
another by a `--' for correct parsing.  Global command-line options
come first, followed by the `--' and then test-specific command-line
options.  If there are no test-specific options to be set, the `--'
may be omitted.  If there are no global command-line options to be
set, test-specific options must still be preceded by a `--'.  For
example:

     netperf <global> -- <test-specific>

sets both global and test-specific options:

     netperf <global>

sets just global options and:

     netperf -- <test-specific>

sets just test-specific options.


File: netperf.info, Node: Global Options, Prev: Command-line Options Syntax, Up: Global Command-line Options

4.2 Global Options
==================

`-a <sizespec>'
     This option allows you to alter the alignment of the buffers
     used in the sending and receiving calls on the local system.
     Changing the alignment of the buffers can force the system to use
     different copy schemes, which can have a measurable effect on
     performance.  If the page size for the system were 4096 bytes, and
     you want to pass page-aligned buffers beginning on page
     boundaries, you could use `-a 4096'.  By default the units are
     bytes, but a suffix of "G," "M," or "K" will specify the units to
     be 2^30 (GB), 2^20 (MB) or 2^10 (KB) respectively.  A suffix of
     "g," "m" or "k" will specify units of 10^9, 10^6 or 10^3 bytes
     respectively.  [Default: 8 bytes]

`-A <sizespec>'
     This option is identical to the `-a' option with the difference
     being it affects alignments for the remote system.

`-b <size>'
     This option is only present when netperf has been configured with
     --enable-intervals=yes prior to compilation.  It sets the size of
     the burst of send calls in a _STREAM test.  When used in
     conjunction with the `-w' option it can cause the rate at which
     data is sent to be "paced."

`-B <string>'
     This option will cause `<string>' to be appended to the brief (see
     -P) output of netperf.

`-c [rate]'
     This option will ask that CPU utilization and service demand be
     calculated for the local system.  For those CPU utilization
     mechanisms requiring calibration, the optional rate parameter may
     be specified to preclude running another calibration step, saving
     40 seconds of time.  For those CPU utilization mechanisms
     requiring no calibration, the optional rate parameter will be
     utterly and completely ignored.  [Default: no CPU measurements]

`-C [rate]'
     This option requests CPU utilization and service demand
     calculations for the remote system.  It is otherwise identical to
     the `-c' option.

`-d'
     Each instance of this option will increase the quantity of
     debugging output displayed during a test.
     If the debugging output level is set high enough, it may have a
     measurable effect on performance.  Debugging information for the
     local system is printed to stdout.  Debugging information for the
     remote system is sent by default to the file
     `/tmp/netperf.debug'.  [Default: no debugging output]

`-D [interval,units]'
     This option is only available when netperf is configured with
     --enable-demo=yes.  When set, it will cause netperf to emit
     periodic reports of performance during the run.  [INTERVAL,UNITS]
     follow the semantics of an optionspec.  If specified, INTERVAL
     gives the minimum interval in real seconds; it does not have to
     be whole seconds.  The UNITS value can be used for the first
     guess as to how many units of work (bytes or transactions) must
     be done to take at least INTERVAL seconds.  If omitted, INTERVAL
     defaults to one second and UNITS to values specific to each test
     type.

`-f G|M|K|g|m|k|x'
     This option can be used to change the reporting units for _STREAM
     tests.  Arguments of "G," "M," or "K" will set the units to 2^30,
     2^20 or 2^10 bytes/s respectively (EG power of two GB, MB or KB).
     Arguments of "g," "m" or "k" will set the units to 10^9, 10^6 or
     10^3 bits/s respectively.  An argument of "x" requests the units
     be transactions per second and is only meaningful for a
     request-response test.  [Default: "m" or 10^6 bits/s]

`-F <fillfile>'
     This option specifies the file from which the send buffers will
     be pre-filled.  While the buffers will contain data from the
     specified file, the file is not fully transferred to the remote
     system as the receiving end of the test will not write the
     contents of what it receives to a file.  This can be used to
     pre-fill the send buffers with data having different
     compressibility and so is useful when measuring performance over
     mechanisms which perform compression.
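     To make the compressibility point concrete, one can compare how
     well two candidate fill buffers compress.  This is an
     illustrative sketch, not part of netperf; the buffer contents and
     sizes are made up for the example:

     ```python
     import random
     import zlib

     # A -F fill file pre-fills the send buffers; how compressible its
     # contents are matters when the path compresses data.
     random.seed(42)  # deterministic "incompressible" bytes for the demo
     compressible = b"\x00" * 65536
     incompressible = bytes(random.getrandbits(8) for _ in range(65536))

     def ratio(data: bytes) -> float:
         """Compressed size as a fraction of the original size."""
         return len(zlib.compress(data)) / len(data)

     print(f"zeros:  {ratio(compressible):.4f}")   # tiny fraction
     print(f"random: {ratio(incompressible):.4f}") # near (or above) 1.0
     ```

     A throughput measured with an all-zeros fill file over a
     compressing link can thus look far better than one measured with
     incompressible data.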
     While previously required for a TCP_SENDFILE test, later versions
     of netperf removed that restriction, creating a temporary file as
     needed.  While the author cannot recall exactly when that took
     place, it is known to be unnecessary in version 2.5.0 and later.

`-h'
     This option causes netperf to display its "global" usage string
     and exit to the exclusion of all else.

`-H <optionspec>'
     This option will set the name of the remote system and/or the
     address family used for the control connection.  For example:
          -H linger,4
     will set the name of the remote system to "linger" and tell
     netperf to use IPv4 addressing only.
          -H ,6
     will leave the name of the remote system at its default, and
     request that only IPv6 addresses be used for the control
     connection.
          -H lag
     will set the name of the remote system to "lag" and leave the
     address family at AF_UNSPEC, which means selection of IPv4 vs
     IPv6 is left to the system's address resolution.

     A value of "inet" can be used in place of "4" to request IPv4-only
     addressing.  Similarly, a value of "inet6" can be used in place of
     "6" to request IPv6-only addressing.  A value of "0" can be used
     to request either IPv4 or IPv6 addressing as name resolution
     dictates.

     By default, the options set with the global `-H' option are
     inherited by the test for its data connection, unless a
     test-specific `-H' option is specified.

     If a `-H' option follows either the `-4' or `-6' options, the
     family setting specified with the `-H' option will override the
     `-4' or `-6' options for the remote address family.  If no address
     family is specified, settings from a previous `-4' or `-6' option
     will remain.  In a nutshell, the last explicit global command-line
     option wins.

     [Default: "localhost" for the remote name/IP address and "0" (eg
     AF_UNSPEC) for the remote address family.]
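     The "either side may be omitted" pattern of an optionspec can be
     sketched in a few lines.  This is a hypothetical helper written
     for illustration, not netperf's actual parsing code:

     ```python
     def parse_optionspec(spec, default_host="localhost", default_family="0"):
         """Split a netperf <optionspec> of the form "first,second",
         where either side may be omitted to keep its default."""
         first, sep, second = spec.partition(",")
         host = first if first else default_host
         family = second if (sep and second) else default_family
         return host, family

     print(parse_optionspec("linger,4"))  # ('linger', '4')
     print(parse_optionspec(",6"))        # ('localhost', '6')
     print(parse_optionspec("lag"))       # ('lag', '0')
     ```

     The same first,second convention recurs in the `-i', `-o', `-p'
     and `-T' options described elsewhere in this section.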
`-I <optionspec>'
     This option enables the calculation of confidence intervals and
     sets the confidence and width parameters, with the first half of
     the optionspec being either 99 or 95 for 99% or 95% confidence
     respectively.  The second value of the optionspec specifies the
     width of the desired confidence interval.  For example
          -I 99,5
     asks netperf to be 99% confident that the measured mean values for
     throughput and CPU utilization are within +/- 2.5% of the "real"
     mean values.  If the `-i' option is specified and the `-I' option
     is omitted, the confidence defaults to 99% and the width to 5%
     (giving +/- 2.5%).

     If a classic netperf test calculates that the desired confidence
     intervals have not been met, it emits a noticeable warning that
     cannot be suppressed with the `-P' or `-v' options:

     netperf -H tardy.cup -i 3 -I 99,5
     TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to tardy.cup.hp.com (15.244.44.58) port 0 AF_INET : +/-2.5% 99% conf.
     !!! WARNING
     !!! Desired confidence was not achieved within the specified iterations.
     !!! This implies that there was variability in the test environment that
     !!! must be investigated before going further.
     !!! Confidence intervals: Throughput      :  6.8%
     !!!                       Local CPU util  :  0.0%
     !!!                       Remote CPU util :  0.0%

     Recv   Send    Send
     Socket Socket  Message  Elapsed
     Size   Size    Size     Time     Throughput
     bytes  bytes   bytes    secs.    10^6bits/sec

      32768  16384  16384    10.01      40.23

     In the example above we see that netperf did not meet the desired
     confidence intervals.  Instead of being 99% confident it was
     within +/- 2.5% of the real mean value of throughput, it is only
     confident it was within +/- 3.4%.  In this example, increasing the
     `-i' option (described below) and/or increasing the iteration
     length with the `-l' option might resolve the situation.
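     The arithmetic connecting the warning to the +/- figure is
     simple: the warning reports the full width of the achieved
     interval, and the +/- value is half of it.  A small sketch, using
     the numbers from the example above:

     ```python
     def meets_confidence(measured_width_pct, requested_width_pct):
         """True when the full interval width netperf achieved is no
         wider than the width requested via -I's second parameter."""
         return measured_width_pct <= requested_width_pct

     requested = 5.0  # -I 99,5 requests a 5% width, i.e. +/- 2.5%
     measured = 6.8   # the Throughput width from the warning above

     print(f"+/- {measured / 2:.1f}%")             # +/- 3.4%
     print(meets_confidence(measured, requested))  # False
     ```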
     In an explicit "omni" test, failure to meet the confidence
     intervals will not result in netperf emitting a warning.  To
     verify whether or not the confidence intervals were hit, one will
     need to include them as part of an *note output selection: Omni
     Output Selection. in the test-specific `-o', `-O' or `-k' output
     selection options.  The warning about not hitting the confidence
     intervals will remain in a "migrated" classic netperf test.

`-i <sizespec>'
     This option enables the calculation of confidence intervals and
     sets the minimum and maximum number of iterations to run in
     attempting to achieve the desired confidence interval.  The first
     value sets the maximum number of iterations to run, the second,
     the minimum.  The maximum number of iterations is silently capped
     at 30 and the minimum is silently floored at 3.  Netperf repeats
     the measurement the minimum number of iterations and continues
     until it reaches either the desired confidence interval, or the
     maximum number of iterations, whichever comes first.  A classic or
     migrated netperf test will not display the actual number of
     iterations run.  An *note omni test: The Omni Tests. will emit the
     number of iterations run if the `CONFIDENCE_ITERATION' output
     selector is included in the *note output selection: Omni Output
     Selection.

     If the `-I' option is specified and the `-i' option omitted, the
     maximum number of iterations is set to 10 and the minimum to
     three.

     Output of a warning upon not hitting the desired confidence
     intervals follows the description provided for the `-I' option.

     The total test time will be somewhere between the minimum and
     maximum number of iterations multiplied by the test length
     supplied by the `-l' option.

`-j'
     This option instructs netperf to keep additional timing statistics
     when explicitly running an *note omni test: The Omni Tests.
     These can be output when the test-specific `-o', `-O' or `-k'
     *note output selectors: Omni Output Selectors. include one or
     more of:

        * MIN_LATENCY

        * MAX_LATENCY

        * P50_LATENCY

        * P90_LATENCY

        * P99_LATENCY

        * MEAN_LATENCY

        * STDDEV_LATENCY

     These statistics will be based on an expanded (100 buckets per row
     rather than 10) histogram of times rather than a terribly long
     list of individual times.  As such, there will be some slight
     error thanks to the bucketing.  However, the reduction in storage
     and processing overheads is well worth it.  When running a
     request/response test, one might get some idea of the error by
     comparing the *note `MEAN_LATENCY': Omni Output Selectors.
     calculated from the histogram with the `RT_LATENCY' calculated
     from the number of request/response transactions and the test run
     time.

     In the case of a request/response test the latencies will be
     transaction latencies.  In the case of a receive-only test they
     will be time spent in the receive call.  In the case of a
     send-only test they will be time spent in the send call.  The
     units will be microseconds.  Added in netperf 2.5.0.

`-l testlen'
     This option controls the length of any one iteration of the
     requested test.  A positive value for TESTLEN will run each
     iteration of the test for at least TESTLEN seconds.  A negative
     value for TESTLEN will run each iteration for the absolute value
     of TESTLEN transactions for a _RR test or bytes for a _STREAM
     test.  Certain tests, notably those using UDP, can only be timed;
     they cannot be limited by transaction or byte count.  This
     limitation may be relaxed in an *note omni: The Omni Tests. test.

     In some situations, individual iterations of a test may run for
     longer than the number of seconds specified by the `-l' option.
     In particular, this may occur for those tests where the socket
     buffer size(s) are significantly larger than the bandwidth-delay
     product of the link(s) over which the data connection passes, or
     those tests where there may be non-trivial numbers of
     retransmissions.

     If confidence intervals are enabled via either `-I' or `-i' the
     total length of the netperf test will be somewhere between the
     minimum and maximum iteration count multiplied by TESTLEN.

`-L <optionspec>'
     This option is identical to the `-H' option with the difference
     being it sets the _local_ hostname/IP and/or address family
     information.  This option is generally unnecessary, but can be
     useful when you wish to make sure that the netperf control and
     data connections go via different paths.  It can also come in
     handy if one is trying to run netperf through those evil,
     end-to-end breaking things known as firewalls.

     [Default: 0.0.0.0 (eg INADDR_ANY) for IPv4 and ::0 for IPv6 for
     the local name.  AF_UNSPEC for the local address family.]

`-n numcpus'
     This option tells netperf how many CPUs it should ass-u-me are
     active on the system running netperf.  In particular, this is used
     for the *note CPU utilization: CPU Utilization. and service demand
     calculations.  On certain systems, netperf is able to determine
     the number of CPUs automagically.  This option will override any
     number netperf might be able to determine on its own.

     Note that this option does _not_ set the number of CPUs on the
     system running netserver.  When netperf/netserver cannot
     automagically determine the number of CPUs, that can only be set
     for netserver via a netserver `-n' command-line option.
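     To see why the CPU count matters, consider service demand as CPU
     time consumed per unit of work.  Netperf's exact bookkeeping
     lives in its source; the sketch below uses one plausible
     formulation (CPU microseconds per KB transferred, with
     utilization averaged across all CPUs) purely to show how a wrong
     CPU count scales the result.  All numbers are made up:

     ```python
     def service_demand_usec_per_kb(util_fraction, num_cpus, seconds, bytes_xfer):
         """CPU microseconds consumed per KB transferred, assuming the
         utilization figure is an average across num_cpus CPUs."""
         cpu_usec = util_fraction * num_cpus * seconds * 1e6
         return cpu_usec / (bytes_xfer / 1024)

     # Hypothetical 10-second run moving 1 GB at 25% average utilization.
     bytes_xfer = 1024 ** 3
     print(service_demand_usec_per_kb(0.25, 4, 10.0, bytes_xfer))
     print(service_demand_usec_per_kb(0.25, 8, 10.0, bytes_xfer))
     ```

     Doubling the assumed CPU count doubles the computed service
     demand, which is why an incorrect `-n' value silently skews
     comparisons.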
     As it is almost universally possible for netperf/netserver to
     determine the number of CPUs on the system automagically, 99 times
     out of 10 this option should not be necessary and may be removed
     in a future release of netperf.

`-N'
     This option tells netperf to forgo establishing a control
     connection.  This makes it possible to run some limited netperf
     tests without a corresponding netserver on the remote system.

     With this option set, the test to be run will get all the
     addressing information it needs to establish its data connection
     from the command line or internal defaults.  If not otherwise
     specified by test-specific command line options, the data
     connection for a "STREAM" or "SENDFILE" test will be to the
     "discard" port, an "RR" test will be to the "echo" port, and a
     "MAERTS" test will be to the chargen port.

     The response size of an "RR" test will be silently set to be the
     same as the request size.  Otherwise the test would hang if the
     response size was larger than the request size, or would report an
     incorrect, inflated transaction rate if the response size was less
     than the request size.

     Since there is no control connection when this option is
     specified, it is not possible to set "remote" properties such as
     socket buffer size and the like via the netperf command line.  Nor
     is it possible to retrieve such interesting remote information as
     CPU utilization.  These items will be displayed as values which
     should make it immediately obvious that was the case.

     The only way to change remote characteristics such as socket
     buffer size or to obtain information such as CPU utilization is to
     employ platform-specific methods on the remote system.  Frankly,
     if one has access to the remote system to employ those methods one
     ought to be able to run a netserver there.
     However, that ability may not be present in certain "support"
     situations, hence the addition of this option.

     Added in netperf 2.4.3.

`-o <sizespec>'
     The value(s) passed-in with this option will be used as an offset
     added to the alignment specified with the `-a' option.  For
     example:
          -o 3 -a 4096
     will cause the buffers passed to the local (netperf) send and
     receive calls to begin three bytes past an address aligned to
     4096 bytes.  [Default: 0 bytes]

`-O <sizespec>'
     This option behaves just as the `-o' option but on the remote
     (netserver) system and in conjunction with the `-A' option.
     [Default: 0 bytes]

`-p <optionspec>'
     The first value of the optionspec passed-in with this option tells
     netperf the port number at which it should expect the remote
     netserver to be listening for control connections.  The second
     value of the optionspec will request netperf to bind to that local
     port number before establishing the control connection.  For
     example
          -p 12345
     tells netperf that the remote netserver is listening on port 12345
     and leaves selection of the local port number for the control
     connection up to the local TCP/IP stack whereas
          -p ,32109
     leaves the remote netserver port at the default value of 12865 and
     causes netperf to bind to the local port number 32109 before
     connecting to the remote netserver.

     In general, setting the local port number is only necessary when
     one is looking to run netperf through those evil, end-to-end
     breaking things known as firewalls.

`-P 0|1'
     A value of "1" for the `-P' option will enable display of the test
     banner.  A value of "0" will disable display of the test banner.
     One might want to disable display of the test banner when running
     the same basic test type (eg TCP_STREAM) multiple times in
     succession where the test banners would then simply be redundant
     and unnecessarily clutter the output.  [Default: 1 - display test
     banners]

`-s <seconds>'
     This option will cause netperf to sleep `<seconds>' before
     actually transferring data over the data connection.  This may be
     useful in situations where one wishes to start a great many
     netperf instances and does not want the earlier ones affecting
     the ability of the later ones to get established.

     Added somewhere between versions 2.4.3 and 2.5.0.

`-S'
     This option will cause an attempt to be made to set SO_KEEPALIVE
     on the data socket of a test using the BSD sockets interface.  The
     attempt will be made on the netperf side of all tests, and will be
     made on the netserver side of an *note omni: The Omni Tests. or
     *note migrated: Migrated Tests. test.  No indication of failure is
     given unless debug output is enabled with the global `-d' option.

     Added in version 2.5.0.

`-t testname'
     This option is used to tell netperf which test you wish to run.
     As of this writing, valid values for TESTNAME include:
        * *note TCP_STREAM::, *note TCP_MAERTS::, *note
          TCP_SENDFILE::, *note TCP_RR::, *note TCP_CRR::, *note
          TCP_CC::

        * *note UDP_STREAM::, *note UDP_RR::

        * *note XTI_TCP_STREAM::, *note XTI_TCP_RR::, *note
          XTI_TCP_CRR::, *note XTI_TCP_CC::

        * *note XTI_UDP_STREAM::, *note XTI_UDP_RR::

        * *note SCTP_STREAM::, *note SCTP_RR::

        * *note DLCO_STREAM::, *note DLCO_RR::, *note DLCL_STREAM::,
          *note DLCL_RR::

        * *note LOC_CPU: Other Netperf Tests, *note REM_CPU: Other
          Netperf Tests.

        * *note OMNI: The Omni Tests.
     Not all tests are always compiled into netperf.
     In particular, the "XTI," "SCTP," "UNIXDOMAIN," and "DL*" tests
     are only included in netperf when configured with
     `--enable-[xti|sctp|unixdomain|dlpi]=yes'.

     Netperf only runs one type of test no matter how many `-t' options
     may be present on the command-line.  The last `-t' global
     command-line option will determine the test to be run.  [Default:
     TCP_STREAM]

`-T <optionspec>'
     This option controls the CPU, and probably by extension memory,
     affinity of netperf and/or netserver.
          netperf -T 1
     will bind both netperf and netserver to "CPU 1" on their
     respective systems.
          netperf -T 1,
     will bind just netperf to "CPU 1" and will leave netserver
     unbound.
          netperf -T ,2
     will leave netperf unbound and will bind netserver to "CPU 2."
          netperf -T 1,2
     will bind netperf to "CPU 1" and netserver to "CPU 2."

     This can be particularly useful when investigating performance
     issues involving where processes run relative to where NIC
     interrupts are processed or where NICs allocate their DMA buffers.

`-v verbosity'
     This option controls how verbose netperf will be in its output,
     and is often used in conjunction with the `-P' option.  If the
     verbosity is set to a value of "0" then only the test's SFM
     (Single Figure of Merit) is displayed.  If local *note CPU
     utilization: CPU Utilization. is requested via the `-c' option
     then the SFM is the local service demand.  Otherwise, if remote
     CPU utilization is requested via the `-C' option then the SFM is
     the remote service demand.  If neither local nor remote CPU
     utilization are requested the SFM will be the measured throughput
     or transaction rate as implied by the test specified with the
     `-t' option.

     If the verbosity level is set to "1" then the "normal" netperf
     result output for each test is displayed.
     If the verbosity level is set to "2" then "extra" information will
     be displayed.  This may include, but is not limited to, the number
     of send or recv calls made and the average number of bytes per
     send or recv call, or a histogram of the time spent in each send()
     call or for each transaction if netperf was configured with
     `--enable-histogram=yes'.  [Default: 1 - normal verbosity]

     In an *note omni: The Omni Tests. test the verbosity setting is
     largely ignored, save for when asking for the time histogram to be
     displayed.  In version 2.5.0 and later there is no *note output
     selector: Omni Output Selectors. for the histogram and so it
     remains displayed only when the verbosity level is set to 2.

`-V'
     This option displays the netperf version and then exits.

     Added in netperf 2.4.4.

`-w time'
     If netperf was configured with `--enable-intervals=yes' then this
     value will set the inter-burst time to time milliseconds, and the
     `-b' option will set the number of sends per burst.  The actual
     inter-burst time may vary depending on the system's timer
     resolution.

`-W <sizespec>'
     This option controls the number of buffers in the send (first or
     only value) and/or receive (second or only value) buffer rings.
     Unlike some benchmarks, netperf does not continuously send or
     receive from a single buffer.  Instead it rotates through a ring
     of buffers.  [Default: One more than the size of the send or
     receive socket buffer sizes (`-s' and/or `-S' options) divided by
     the send `-m' or receive `-M' buffer size respectively]

`-4'
     Specifying this option will set both the local and remote address
     families to AF_INET - that is, use only IPv4 addresses on the
     control connection.  This can be overridden by a subsequent `-6',
     `-H' or `-L' option.
     Basically, the last option explicitly specifying an address
     family wins.  Unless overridden by a test-specific option, this
     will be inherited for the data connection as well.

`-6'
     Specifying this option will set both the local and remote address
     families to AF_INET6 - that is, use only IPv6 addresses on the
     control connection.  This can be overridden by a subsequent `-4',
     `-H' or `-L' option.  Basically, the last address family
     explicitly specified wins.  Unless overridden by a test-specific
     option, this will be inherited for the data connection as well.


File: netperf.info,  Node: Using Netperf to Measure Bulk Data Transfer,  Next: Using Netperf to Measure Request/Response,  Prev: Global Command-line Options,  Up: Top

5 Using Netperf to Measure Bulk Data Transfer
*********************************************

The most commonly measured aspect of networked system performance is
that of bulk or unidirectional transfer performance.  Everyone wants to
know how many bits or bytes per second they can push across the
network.  The classic netperf convention for a bulk data transfer test
name is to tack a "_STREAM" suffix onto a test name.

* Menu:

* Issues in Bulk Transfer::
* Options common to TCP UDP and SCTP tests::


File: netperf.info,  Node: Issues in Bulk Transfer,  Next: Options common to TCP UDP and SCTP tests,  Prev: Using Netperf to Measure Bulk Data Transfer,  Up: Using Netperf to Measure Bulk Data Transfer

5.1 Issues in Bulk Transfer
===========================

There are any number of things which can affect the performance of a
bulk transfer test.

   Certainly, absent compression, bulk-transfer tests can be limited by
the speed of the slowest link in the path from the source to the
destination.
If testing over a gigabit link, you will not see more than a gigabit
:)  Such situations can be described as being "network-limited" or
"NIC-limited".

   CPU utilization can also affect the results of a bulk-transfer test.
If the networking stack requires a certain number of instructions or
CPU cycles per KB of data transferred, and the CPU is limited in the
number of instructions or cycles it can provide, then the transfer can
be described as being "CPU-bound".

   A bulk-transfer test can be CPU bound even when netperf reports less
than 100% CPU utilization.  This can happen on an MP system where one
or more of the CPUs saturate at 100% but other CPUs remain idle.
Typically, a single flow of data, such as that from a single instance
of a netperf _STREAM test, cannot make use of much more than the power
of one CPU.  Exceptions to this generally occur when netperf and/or
netserver run on CPU(s) other than the CPU(s) taking interrupts from
the NIC(s).  In that case, one might see as much as two CPUs' worth of
processing being used to service the flow of data.

   Distance and the speed of light can affect performance for a
bulk transfer; often this can be mitigated by using larger windows.
One common limit to the performance of a transport using window-based
flow-control is:
     Throughput <= WindowSize/RoundTripTime
as the sender can only have a window's-worth of data outstanding on
the network at any one time, and the soonest the sender can receive a
window update from the receiver is one RoundTripTime (RTT).  TCP and
SCTP are examples of such protocols.

   Packet losses and their effects can be particularly bad for
performance.  This is especially true if the packet losses result in
retransmission timeouts for the protocol(s) involved.
By the time a retransmission timeout has happened, the flow or
connection has sat idle for a considerable length of time.

   On many platforms, some variant on the `netstat' command can be used
to retrieve statistics about packet loss and retransmission.  For
example:
     netstat -p tcp
will retrieve TCP statistics on the HP-UX Operating System.  On other
platforms, it may not be possible to retrieve statistics for a specific
protocol and something like:
     netstat -s
would be used instead.

   Many times, such network statistics are kept since the time the
stack started, and we are only really interested in statistics from
when netperf was running.  In such situations something along the
lines of:
     netstat -p tcp > before
     netperf -t TCP_mumble...
     netstat -p tcp > after
is indicated.  The beforeafter
(ftp://ftp.cup.hp.com/dist/networking/tools/) utility can be used to
subtract the statistics in `before' from the statistics in `after':
     beforeafter before after > delta
and then one can look at the statistics in `delta'.  Beforeafter is
distributed in source form so one can compile it on the platform(s) of
interest.

   If running a version 2.5.0 or later "omni" test under Linux one can
include either or both of:
   * LOCAL_TRANSPORT_RETRANS

   * REMOTE_TRANSPORT_RETRANS

in the values provided via a test-specific `-o', `-O', or `-k' output
selection option and netperf will report the retransmissions
experienced on the data connection, as reported via a
`getsockopt(TCP_INFO)' call.  If confidence intervals have been
requested via the global `-I' or `-i' options, the reported value(s)
will be for the last iteration.  If the test is over a protocol other
than TCP, or on a platform other than Linux, the results are undefined.
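   Where the beforeafter binary is not available, the same subtraction
can be approximated with a few lines of scripting.  This sketch
assumes simple "<count> <description>" counter lines, which is not the
exact output format of any particular netstat; the sample counters are
made up:

     ```python
     import re

     def parse_counters(text):
         """Extract "<number> <description>"-style counter lines into
         a dict keyed by the description."""
         counters = {}
         for line in text.splitlines():
             m = re.match(r"\s*(\d+)\s+(.*\S)", line)
             if m:
                 counters[m.group(2)] = int(m.group(1))
         return counters

     def delta(before, after):
         """Counter-by-counter difference, mimicking beforeafter."""
         return {k: after[k] - before.get(k, 0) for k in after}

     before = parse_counters("  10 segments retransmitted\n 500 segments sent\n")
     after = parse_counters("  14 segments retransmitted\n 900 segments sent\n")
     print(delta(before, after))
     ```

   As with beforeafter itself, only the counters that changed during
the netperf run are of interest.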
   While it was written with HP-UX's netstat in mind, the annotated
netstat
(ftp://ftp.cup.hp.com/dist/networking/briefs/annotated_netstat.txt)
writeup may be helpful with other platforms as well.


File: netperf.info,  Node: Options common to TCP UDP and SCTP tests,  Prev: Issues in Bulk Transfer,  Up: Using Netperf to Measure Bulk Data Transfer

5.2 Options common to TCP UDP and SCTP tests
============================================

Many "test-specific" options are actually common across the different
tests.  For those tests involving TCP, UDP and SCTP, whether using the
BSD Sockets or the XTI interface, those common options include:

`-h'
     Display the test-suite-specific usage string and exit.  For a TCP_
     or UDP_ test this will be the usage string from the source file
     nettest_bsd.c.  For an XTI_ test, this will be the usage string
     from the source file nettest_xti.c.  For an SCTP test, this will
     be the usage string from the source file nettest_sctp.c.

`-H <optionspec>'
     Normally, the remote hostname|IP and address family information is
     inherited from the settings for the control connection (eg global
     command-line `-H', `-4' and/or `-6' options).  The test-specific
     `-H' will override those settings for the data (aka test)
     connection only.  Settings for the control connection are left
     unchanged.

`-L <optionspec>'
     The test-specific `-L' option is identical to the test-specific
     `-H' option except it affects the local hostname|IP and address
     family information.  As with its global command-line counterpart,
     this is generally only useful when measuring through those evil,
     end-to-end breaking things called firewalls.

`-m bytes'
     Set the size of the buffer passed-in to the "send" calls of a
     _STREAM test.
     Note that this may have only an indirect effect on the size of the
     packets sent over the network, and certain Layer 4 protocols do
     _not_ preserve or enforce message boundaries, so setting `-m' for
     the send size does not necessarily mean the receiver will receive
     that many bytes at any one time.  By default the units are bytes,
     but a suffix of "G," "M," or "K" will specify the units to be 2^30
     (GB), 2^20 (MB) or 2^10 (KB) respectively.  A suffix of "g," "m"
     or "k" will specify units of 10^9, 10^6 or 10^3 bytes
     respectively.  For example:
          `-m 32K'
     will set the size to 32KB or 32768 bytes.  [Default: the local
     send socket buffer size for the connection - either the system's
     default or the value set via the `-s' option.]

`-M bytes'
     Set the size of the buffer passed-in to the "recv" calls of a
     _STREAM test.  This will be an upper bound on the number of bytes
     received per receive call.  By default the units are bytes, but a
     suffix of "G," "M," or "K" will specify the units to be 2^30
     (GB), 2^20 (MB) or 2^10 (KB) respectively.  A suffix of "g," "m"
     or "k" will specify units of 10^9, 10^6 or 10^3 bytes
     respectively.  For example:
          `-M 32K'
     will set the size to 32KB or 32768 bytes.  [Default: the remote
     receive socket buffer size for the data connection - either the
     system's default or the value set via the `-S' option.]

`-P <optionspec>'
     Set the local and/or remote port numbers for the data connection.

`-s <sizespec>'
     This option sets the local (netperf) send and receive socket
     buffer sizes for the data connection to the value(s) specified.
     Often, this will affect the advertised and/or effective TCP or
     other window, but on some platforms it may not.  By default the
     units are bytes, but a suffix of "G," "M," or "K" will specify the
     units to be 2^30 (GB), 2^20 (MB) or 2^10 (KB) respectively.
     A suffix of "g," "m" or "k" will specify units of 10^9, 10^6 or
     10^3 bytes respectively.  For example:
          `-s 128K'
     Will request the local send and receive socket buffer sizes to be
     128KB or 131072 bytes.

     While the historic expectation is that setting the socket buffer
     size has a direct effect on, say, the TCP window, today that may
     not hold true for all stacks.  Further, while the historic
     expectation is that the value specified in a `setsockopt()' call
     will be the value returned via a `getsockopt()' call, at least one
     stack is known to deliberately ignore history.  When running under
     Windows a value of 0 may be used, which will be an indication to
     the stack that the user wants to enable a form of copy avoidance.
     [Default: -1 - use the system's default socket buffer sizes]

`-S <sizespec>'
     This option sets the remote (netserver) send and/or receive socket
     buffer sizes for the data connection to the value(s) specified.
     Often, this will affect the advertised and/or effective TCP or
     other window, but on some platforms it may not.  By default the
     units are bytes, but a suffix of "G," "M," or "K" will specify the
     units to be 2^30 (GB), 2^20 (MB) or 2^10 (KB) respectively.  A
     suffix of "g," "m" or "k" will specify units of 10^9, 10^6 or 10^3
     bytes respectively.  For example:
          `-S 128K'
     Will request the remote send and receive socket buffer sizes to be
     128KB or 131072 bytes.

     While the historic expectation is that setting the socket buffer
     size has a direct effect on, say, the TCP window, today that may
     not hold true for all stacks.  Further, while the historic
     expectation is that the value specified in a `setsockopt()' call
     will be the value returned via a `getsockopt()' call, at least one
     stack is known to deliberately ignore history.
     When running under Windows a value of 0 may be used, which will be
     an indication to the stack that the user wants to enable a form of
     copy avoidance.  [Default: -1 - use the system's default socket
     buffer sizes]

`-4'
     Set the local and remote address family for the data connection to
     AF_INET - ie use IPv4 addressing only.  Just as with their global
     command-line counterparts, the last of the `-4', `-6', `-H' or
     `-L' options wins for their respective address families.

`-6'
     This option is identical to its `-4' cousin, but requests IPv6
     addresses for the local and remote ends of the data connection.


* Menu:

* TCP_STREAM::
* TCP_MAERTS::
* TCP_SENDFILE::
* UDP_STREAM::
* XTI_TCP_STREAM::
* XTI_UDP_STREAM::
* SCTP_STREAM::
* DLCO_STREAM::
* DLCL_STREAM::
* STREAM_STREAM::
* DG_STREAM::


File: netperf.info, Node: TCP_STREAM, Next: TCP_MAERTS, Prev: Options common to TCP UDP and SCTP tests, Up: Options common to TCP UDP and SCTP tests

5.2.1 TCP_STREAM
----------------

The TCP_STREAM test is the default test in netperf.  It is quite
simple, transferring some quantity of data from the system running
netperf to the system running netserver.  While time spent establishing
the connection is not included in the throughput calculation, time
spent flushing the last of the data to the remote at the end of the
test is.  This is how netperf knows that all the data it sent was
received by the remote.  In addition to the *note options common to
STREAM tests: Options common to TCP UDP and SCTP tests, the following
test-specific options can be included to possibly alter the behavior of
the test:

`-C'
     This option will set TCP_CORK mode on the data connection on those
     systems where TCP_CORK is defined (typically Linux).
     A full description of TCP_CORK is beyond the scope of this manual,
     but in a nutshell it forces sub-MSS sends to be buffered so every
     segment sent is a full Maximum Segment Size (MSS) unless the
     application performs an explicit flush operation or the connection
     is closed.  At present netperf does not perform any explicit flush
     operations.  Setting TCP_CORK may improve the bitrate of tests
     where the "send size" (`-m' option) is smaller than the MSS.  It
     should also improve (make smaller) the service demand.

     The Linux tcp(7) manpage states that TCP_CORK cannot be used in
     conjunction with TCP_NODELAY (set via the `-D' option); however,
     netperf does not validate command-line options to enforce that.

`-D'
     This option will set TCP_NODELAY on the data connection on those
     systems where TCP_NODELAY is defined.  This disables something
     known as the Nagle Algorithm, which is intended to make the
     segments TCP sends as large as reasonably possible.  Setting
     TCP_NODELAY for a TCP_STREAM test should either have no effect,
     when the send size (`-m' option) is larger than the MSS, or
     decrease reported bitrate and increase service demand, when the
     send size is smaller than the MSS.  This stems from TCP_NODELAY
     causing each sub-MSS send to be its own TCP segment rather than
     being aggregated with other small sends.  This means more trips up
     and down the protocol stack per KB of data transferred, which
     means greater CPU utilization.

     If setting TCP_NODELAY with `-D' affects throughput and/or service
     demand for tests where the send size (`-m') is larger than the
     MSS, it suggests the TCP/IP stack's implementation of the Nagle
     Algorithm _may_ be broken, perhaps interpreting the Nagle
     Algorithm on a segment-by-segment basis rather than the proper
     user-send-by-user-send basis.
     However, a better test of this can be achieved with the *note
     TCP_RR:: test.


Here is an example of a basic TCP_STREAM test, in this case from a
Debian Linux (2.6 kernel) system to an HP-UX 11iv2 (HP-UX 11.23)
system:

     $ netperf -H lag
     TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
     Recv   Send    Send
     Socket Socket  Message  Elapsed
     Size   Size    Size     Time     Throughput
     bytes  bytes   bytes    secs.    10^6bits/sec

      32768  16384  16384    10.00      80.42

We see that the default receive socket buffer size for the receiver
(lag - HP-UX 11.23) is 32768 bytes, and the default send socket buffer
size for the sender (Debian 2.6 kernel) is 16384 bytes.  However, Linux
does "auto tuning" of socket buffer and TCP window sizes, which means
the send socket buffer size may be different at the end of the test
than it was at the beginning.  This is addressed in the *note omni
tests: The Omni Tests. added in version 2.5.0 and *note output
selection: Omni Output Selection.  Throughput is expressed as 10^6 (aka
Mega) bits per second, and the test ran for 10 seconds.  IPv4 addresses
(AF_INET) were used.


File: netperf.info, Node: TCP_MAERTS, Next: TCP_SENDFILE, Prev: TCP_STREAM, Up: Options common to TCP UDP and SCTP tests

5.2.2 TCP_MAERTS
----------------

A TCP_MAERTS (MAERTS is STREAM backwards) test is "just like" a *note
TCP_STREAM:: test except the data flows from the netserver to netperf.
The global command-line `-F' option and the test-specific command-line
`-C' option are ignored for this test type.

Here is an example of a TCP_MAERTS test between the same two systems
as in the example for the *note TCP_STREAM:: test.
This time we request larger socket buffers with `-s' and `-S' options:

     $ netperf -H lag -t TCP_MAERTS -- -s 128K -S 128K
     TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
     Recv   Send    Send
     Socket Socket  Message  Elapsed
     Size   Size    Size     Time     Throughput
     bytes  bytes   bytes    secs.    10^6bits/sec

     221184 131072 131072    10.03      81.14

Here we see that Linux, unlike HP-UX, may not return the same value in
a `getsockopt()' as was requested in the prior `setsockopt()'.

This test is included more for benchmarking convenience than anything
else.


File: netperf.info, Node: TCP_SENDFILE, Next: UDP_STREAM, Prev: TCP_MAERTS, Up: Options common to TCP UDP and SCTP tests

5.2.3 TCP_SENDFILE
------------------

The TCP_SENDFILE test is "just like" a *note TCP_STREAM:: test except
netperf uses the platform's `sendfile()' call instead of calling
`send()'.  Often this results in a "zero-copy" operation where data is
sent directly from the filesystem buffer cache.  This _should_ result
in lower CPU utilization and possibly higher throughput.  If it does
not, then you may want to contact your vendor(s) because they have a
problem on their hands.

Zero-copy mechanisms may also alter the characteristics (size and
number of buffers per packet) of packets passed to the NIC.  In many
stacks, when a copy is performed, the stack can "reserve" space at the
beginning of the destination buffer for things like TCP, IP and Link
headers.  The packet is then contained in a single buffer, which can be
easier to DMA to the NIC.  When no copy is performed, there is no
opportunity to reserve space for headers and so a packet will be
contained in two or more buffers.

As of some time before version 2.5.0, the *note global `-F' option:
Global Options.
is no longer required for this test. If it is not 1595 specified, netperf will create a temporary file, which it will delete 1596 at the end of the test. If the `-F' option is specified it must 1597 reference a file of at least the size of the send ring (*Note the 1598 global `-W' option: Global Options.) multiplied by the send size (*Note 1599 the test-specific `-m' option: Options common to TCP UDP and SCTP 1600 tests.). All other TCP-specific options remain available and optional. 1601 1602 In this first example: 1603 $ netperf -H lag -F ../src/netperf -t TCP_SENDFILE -- -s 128K -S 128K 1604 TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET 1605 alloc_sendfile_buf_ring: specified file too small. 1606 file must be larger than send_width * send_size 1607 1608 we see what happens when the file is too small. Here: 1609 1610 $ netperf -H lag -F /boot/vmlinuz-2.6.8-1-686 -t TCP_SENDFILE -- -s 128K -S 128K 1611 TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET 1612 Recv Send Send 1613 Socket Socket Message Elapsed 1614 Size Size Size Time Throughput 1615 bytes bytes bytes secs. 10^6bits/sec 1616 1617 131072 221184 221184 10.02 81.83 1618 1619 we resolve that issue by selecting a larger file. 1620 1621 1622 File: netperf.info, Node: UDP_STREAM, Next: XTI_TCP_STREAM, Prev: TCP_SENDFILE, Up: Options common to TCP UDP and SCTP tests 1623 1624 5.2.4 UDP_STREAM 1625 ---------------- 1626 1627 A UDP_STREAM test is similar to a *note TCP_STREAM:: test except UDP is 1628 used as the transport rather than TCP. 1629 1630 A UDP_STREAM test has no end-to-end flow control - UDP provides none 1631 and neither does netperf. However, if you wish, you can configure 1632 netperf with `--enable-intervals=yes' to enable the global command-line 1633 `-b' and `-w' options to pace bursts of traffic onto the network. 1634 1635 This has a number of implications. 
The biggest of these implications is that data which is sent might not
be received by the remote.  For this reason, the output of a UDP_STREAM
test shows both the sending and receiving throughput.  On some
platforms, it may be possible for the sending throughput to be reported
as a value greater than the maximum rate of the link.  This is common
when the CPU(s) are faster than the network and there is no
"intra-stack" flow-control.

Here is an example of a UDP_STREAM test between two systems connected
by a 10 Gigabit Ethernet link:

     $ netperf -t UDP_STREAM -H 192.168.2.125 -- -m 32768
     UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
     Socket  Message  Elapsed      Messages
     Size    Size     Time         Okay Errors   Throughput
     bytes   bytes    secs            #      #   10^6bits/sec

     124928   32768   10.00      105672      0    2770.20
     135168           10.00      104844           2748.50

The first line of numbers shows statistics from the sending (netperf)
side.  The second line of numbers shows statistics from the receiving
(netserver) side.  In this case, 105672 - 104844 or 828 messages did
not make it all the way to the remote netserver process.

If the value of the `-m' option is larger than the local send socket
buffer size (`-s' option) netperf will likely abort with an error
message about how the send call failed:

     netperf -t UDP_STREAM -H 192.168.2.125
     UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
     udp_send: data send error: Message too long

If the value of the `-m' option is larger than the remote socket
receive buffer, the reported receive throughput will likely be zero as
the remote UDP will discard the messages as being too large to fit into
the socket buffer.
1673 1674 $ netperf -t UDP_STREAM -H 192.168.2.125 -- -m 65000 -S 32768 1675 UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET 1676 Socket Message Elapsed Messages 1677 Size Size Time Okay Errors Throughput 1678 bytes bytes secs # # 10^6bits/sec 1679 1680 124928 65000 10.00 53595 0 2786.99 1681 65536 10.00 0 0.00 1682 1683 The example above was between a pair of systems running a "Linux" 1684 kernel. Notice that the remote Linux system returned a value larger 1685 than that passed-in to the `-S' option. In fact, this value was larger 1686 than the message size set with the `-m' option. That the remote socket 1687 buffer size is reported as 65536 bytes would suggest to any sane person 1688 that a message of 65000 bytes would fit, but the socket isn't _really_ 1689 65536 bytes, even though Linux is telling us so. Go figure. 1690 1691 1692 File: netperf.info, Node: XTI_TCP_STREAM, Next: XTI_UDP_STREAM, Prev: UDP_STREAM, Up: Options common to TCP UDP and SCTP tests 1693 1694 5.2.5 XTI_TCP_STREAM 1695 -------------------- 1696 1697 An XTI_TCP_STREAM test is simply a *note TCP_STREAM:: test using the XTI 1698 rather than BSD Sockets interface. The test-specific `-X <devspec>' 1699 option can be used to specify the name of the local and/or remote XTI 1700 device files, which is required by the `t_open()' call made by netperf 1701 XTI tests. 1702 1703 The XTI_TCP_STREAM test is only present if netperf was configured 1704 with `--enable-xti=yes'. The remote netserver must have also been 1705 configured with `--enable-xti=yes'. 1706 1707 1708 File: netperf.info, Node: XTI_UDP_STREAM, Next: SCTP_STREAM, Prev: XTI_TCP_STREAM, Up: Options common to TCP UDP and SCTP tests 1709 1710 5.2.6 XTI_UDP_STREAM 1711 -------------------- 1712 1713 An XTI_UDP_STREAM test is simply a *note UDP_STREAM:: test using the XTI 1714 rather than BSD Sockets Interface. 
The test-specific `-X <devspec>' option can be used to specify the name
of the local and/or remote XTI device files, which is required by the
`t_open()' call made by netperf XTI tests.

The XTI_UDP_STREAM test is only present if netperf was configured with
`--enable-xti=yes'.  The remote netserver must have also been
configured with `--enable-xti=yes'.


File: netperf.info, Node: SCTP_STREAM, Next: DLCO_STREAM, Prev: XTI_UDP_STREAM, Up: Options common to TCP UDP and SCTP tests

5.2.7 SCTP_STREAM
-----------------

An SCTP_STREAM test is essentially a *note TCP_STREAM:: test using
SCTP rather than TCP.  The `-D' option will set SCTP_NODELAY, which is
much like the TCP_NODELAY option for TCP.  The `-C' option is not
applicable to an SCTP test as there is no corresponding SCTP_CORK
option.  The author is still figuring out what the test-specific `-N'
option does :)

The SCTP_STREAM test is only present if netperf was configured with
`--enable-sctp=yes'.  The remote netserver must have also been
configured with `--enable-sctp=yes'.


File: netperf.info, Node: DLCO_STREAM, Next: DLCL_STREAM, Prev: SCTP_STREAM, Up: Options common to TCP UDP and SCTP tests

5.2.8 DLCO_STREAM
-----------------

A DLPI Connection Oriented Stream (DLCO_STREAM) test is very similar in
concept to a *note TCP_STREAM:: test.  Both use reliable,
connection-oriented protocols.  The DLPI test differs from the TCP test
in that its protocol operates only at the link level and does not
include TCP-style segmentation and reassembly.  This last difference
means that the value passed-in with the `-m' option must be less than
the interface MTU.  Otherwise, the `-m' and `-M' options are just like
their TCP/UDP/SCTP counterparts.
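Several of the options above and below take a netperf "sizespec" - one
or two comma-separated values, each with an optional unit suffix.  A
rough illustration of that convention in Python (these helper names are
the editor's sketch, not part of netperf):

```python
# Illustrative helpers (NOT netperf code) for the "sizespec" and unit
# suffix conventions: upper-case suffixes are powers of two, lower-case
# suffixes are powers of ten.
UNITS = {"G": 2**30, "M": 2**20, "K": 2**10,
         "g": 10**9, "m": 10**6, "k": 10**3}

def parse_size(text):
    """Parse a single value such as '32K' (32768) or '32k' (32000)."""
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)

def parse_sizespec(text):
    """Parse 'first[,second]'; a lone value applies to both."""
    parts = text.split(",")
    first = parse_size(parts[0])
    second = parse_size(parts[1]) if len(parts) > 1 else first
    return first, second

print(parse_size("32K"))          # 32768
print(parse_sizespec("128K"))     # (131072, 131072)
print(parse_sizespec("128,16K"))  # (128, 16384)
```

So `-s 128K' asks for 131072-byte buffers at both ends of the local
socket, while `-r 128,16K' sets a 128-byte request and a 16384-byte
response.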
Other DLPI-specific options include:

`-D <devspec>'
     This option is used to provide the fully-qualified names for the
     local and/or remote DLPI device files.  The syntax is otherwise
     identical to that of a "sizespec".

`-p <ppaspec>'
     This option is used to specify the local and/or remote DLPI
     PPA(s).  The PPA is used to identify the interface over which
     traffic is to be sent/received.  The syntax of a "ppaspec" is
     otherwise the same as a "sizespec".

`-s sap'
     This option specifies the 802.2 SAP for the test.  A SAP is
     somewhat like either the port field of a TCP or UDP header or the
     protocol field of an IP header.  The specified SAP should not
     conflict with any other active SAPs on the specified PPAs (`-p'
     option).

`-w <sizespec>'
     This option specifies the local send and receive window sizes in
     units of frames on those platforms which support setting such
     things.

`-W <sizespec>'
     This option specifies the remote send and receive window sizes in
     units of frames on those platforms which support setting such
     things.

The DLCO_STREAM test is only present if netperf was configured with
`--enable-dlpi=yes'.  The remote netserver must have also been
configured with `--enable-dlpi=yes'.


File: netperf.info, Node: DLCL_STREAM, Next: STREAM_STREAM, Prev: DLCO_STREAM, Up: Options common to TCP UDP and SCTP tests

5.2.9 DLCL_STREAM
-----------------

A DLPI ConnectionLess Stream (DLCL_STREAM) test is analogous to a *note
UDP_STREAM:: test in that both make use of unreliable/best-effort,
connection-less transports.  The DLCL_STREAM test differs from the
*note UDP_STREAM:: test in that the message size (`-m' option) must
always be less than the link MTU, as there is no IP-like fragmentation
and reassembly available and netperf does not presume to provide one.
The test-specific command-line options for a DLCL_STREAM test are the
same as those for a *note DLCO_STREAM:: test.

The DLCL_STREAM test is only present if netperf was configured with
`--enable-dlpi=yes'.  The remote netserver must have also been
configured with `--enable-dlpi=yes'.


File: netperf.info, Node: STREAM_STREAM, Next: DG_STREAM, Prev: DLCL_STREAM, Up: Options common to TCP UDP and SCTP tests

5.2.10 STREAM_STREAM
--------------------

A Unix Domain Stream Socket Stream test (STREAM_STREAM) is similar in
concept to a *note TCP_STREAM:: test, but using Unix Domain sockets.
It is, naturally, limited to intra-machine traffic.  A STREAM_STREAM
test shares the `-m', `-M', `-s' and `-S' options of the other _STREAM
tests.  In a STREAM_STREAM test the `-p' option sets the directory in
which the pipes will be created rather than setting a port number.  The
default is to create the pipes in the system default directory for the
`tempnam()' call.

The STREAM_STREAM test is only present if netperf was configured with
`--enable-unixdomain=yes'.  The remote netserver must have also been
configured with `--enable-unixdomain=yes'.


File: netperf.info, Node: DG_STREAM, Prev: STREAM_STREAM, Up: Options common to TCP UDP and SCTP tests

5.2.11 DG_STREAM
----------------

A Unix Domain Datagram Socket Stream test (DG_STREAM) is very much like
a *note TCP_STREAM:: test except that message boundaries are preserved.
In this way, it may also be considered similar to certain flavors of
SCTP test which can also preserve message boundaries.

All the options of a *note STREAM_STREAM:: test are applicable to a
DG_STREAM test.

The DG_STREAM test is only present if netperf was configured with
`--enable-unixdomain=yes'.
The remote netserver must have also been configured with
`--enable-unixdomain=yes'.


File: netperf.info, Node: Using Netperf to Measure Request/Response, Next: Using Netperf to Measure Aggregate Performance, Prev: Using Netperf to Measure Bulk Data Transfer, Up: Top

6 Using Netperf to Measure Request/Response
*******************************************

Request/response performance is often overlooked, yet it is just as
important as bulk-transfer performance.  While things like larger
socket buffers and TCP windows, and stateless offloads like TSO and
LRO, can cover a multitude of latency and even path-length sins, those
sins cannot easily hide from a request/response test.  The convention
for a request/response test is to have a _RR suffix.  There are,
however, a few "request/response" tests that have other suffixes.

A request/response test, particularly a synchronous, one transaction
at a time test such as those found by default in netperf, is especially
sensitive to the path length of the networking stack.  An _RR test can
also uncover those platforms where the NICs are strapped by default
with overbearing interrupt avoidance settings in an attempt to increase
the bulk-transfer performance (or rather, decrease the CPU utilization
of a bulk-transfer test).  This sensitivity is most acute for small
request and response sizes, such as the single-byte default for a
netperf _RR test.

While a bulk-transfer test reports its results in units of bits or
bytes transferred per second, by default a mumble_RR test reports
transactions per second, where a transaction is defined as the
completed exchange of a request and a response.  One can invert the
transaction rate to arrive at the average round-trip latency.
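That inversion is simple arithmetic; sketched in Python (the
transaction rate below is just an assumed sample figure):

```python
# With a synchronous, one-transaction-at-a-time test, each transaction
# occupies one full round trip, so RTT = 1 / transaction rate.
def rtt_from_rate(transactions_per_sec):
    """Average round-trip time, in seconds, implied by the rate."""
    return 1.0 / transactions_per_sec

rate = 29150.15                 # transactions/sec (assumed sample figure)
rtt = rtt_from_rate(rate)       # seconds per transaction
print(round(rtt * 1e6, 1))      # average round-trip time: 34.3 usec

# If (and only if) the path is symmetric, one-way latency is half that:
print(round(rtt * 1e6 / 2, 2))  # 17.15 usec
```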
If one is confident about the symmetry of the connection, the average
one-way latency can be taken as one-half the average round-trip
latency.  As of version 2.5.0 (actually slightly before) netperf still
does not do the latter, but will do the former if one sets the
verbosity to 2 for a classic netperf test, or includes the appropriate
*note output selector: Omni Output Selectors. in an *note omni test:
The Omni Tests.  It will also allow the user to switch the throughput
units from transactions per second to bits or bytes per second with the
global `-f' option.

* Menu:

* Issues in Request/Response::
* Options Common to TCP UDP and SCTP _RR tests::


File: netperf.info, Node: Issues in Request/Response, Next: Options Common to TCP UDP and SCTP _RR tests, Prev: Using Netperf to Measure Request/Response, Up: Using Netperf to Measure Request/Response

6.1 Issues in Request/Response
==============================

Most if not all of the *note Issues in Bulk Transfer:: apply to
request/response.  The issue of round-trip latency is even more
important as netperf generally only has one transaction outstanding at
a time.

A single instance of a one-transaction-outstanding _RR test should
_never_ completely saturate the CPU of a system.  If testing between
otherwise evenly matched systems, the symmetric nature of a _RR test
with equal request and response sizes should result in equal CPU
loading on both systems.  However, this may not hold true on MP
systems, particularly if one binds netperf and netserver to CPUs
differently via the global `-T' option.

For smaller request and response sizes packet loss is a bigger issue,
as there is no opportunity for a "fast retransmit" or retransmission
prior to a retransmission timer expiring.
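As a back-of-the-envelope illustration - this simple model is the
editor's, not something netperf computes, and the RTT and RTO figures
are assumptions - each lost request or response stalls a synchronous
_RR test for a full retransmission timeout on top of the usual round
trip:

```python
# Editor's crude model of loss cost in a synchronous _RR test: with a
# single small segment in flight there is nothing to trigger a fast
# retransmit, so a drop costs an entire retransmission timeout (RTO).
def expected_rate(rtt, rto, loss_prob):
    """Expected transactions/sec for a given per-transaction drop rate."""
    mean_time = (1.0 - loss_prob) * rtt + loss_prob * (rtt + rto)
    return 1.0 / mean_time

clean = expected_rate(rtt=100e-6, rto=0.2, loss_prob=0.0)
lossy = expected_rate(rtt=100e-6, rto=0.2, loss_prob=0.001)
print(round(clean))  # 10000 transactions/sec with no loss
print(round(lossy))  # 3333 - a 0.1% drop rate costs two-thirds of the rate
```

Under these assumed numbers, even one drop per thousand transactions
dominates the average, which is why _RR results are so sensitive to
loss.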
Virtualization may considerably increase the effective path length of
a networking stack.  While this may not preclude achieving link-rate on
a comparatively slow link (eg 1 Gigabit Ethernet) on a _STREAM test, it
can show up as measurably fewer transactions per second on an _RR test.
However, this may still be masked by interrupt coalescing in the
NIC/driver.

Certain NICs have ways to minimize the number of interrupts sent to
the host.  If these are strapped badly, they can significantly reduce
the performance of something like a single-byte request/response test.
Such setups are distinguished by seriously low reported CPU utilization
and what seems like a low (even if in the thousands) transaction per
second rate.  Also, if you run such an OS/driver combination on faster
or slower hardware and do not see a corresponding change in the
transaction rate, chances are good that the driver is strapping the NIC
with aggressive interrupt avoidance settings.  Good for bulk
throughput, but bad for latency.

Some drivers may try to automagically adjust the interrupt avoidance
settings.  If they are not terribly good at it, you will see
considerable run-to-run variation in reported transaction rates,
particularly if you "mix up" _STREAM and _RR tests.


File: netperf.info, Node: Options Common to TCP UDP and SCTP _RR tests, Prev: Issues in Request/Response, Up: Using Netperf to Measure Request/Response

6.2 Options Common to TCP UDP and SCTP _RR tests
================================================

Many "test-specific" options are actually common across the different
tests.  For those tests involving TCP, UDP and SCTP, whether using the
BSD Sockets or the XTI interface, those common options include:

`-h'
     Display the test-suite-specific usage string and exit.
     For a TCP_ or UDP_ test this will be the usage string from the
     source file `nettest_bsd.c'.  For an XTI_ test, this will be the
     usage string from the source file `nettest_xti.c'.  For an SCTP
     test, this will be the usage string from the source file
     `nettest_sctp.c'.

`-H <optionspec>'
     Normally, the remote hostname|IP and address family information is
     inherited from the settings for the control connection (eg global
     command-line `-H', `-4' and/or `-6' options).  The test-specific
     `-H' will override those settings for the data (aka test)
     connection only.  Settings for the control connection are left
     unchanged.  This might be used to cause the control and data
     connections to take different paths through the network.

`-L <optionspec>'
     The test-specific `-L' option is identical to the test-specific
     `-H' option except it affects the local hostname|IP and address
     family information.  As with its global command-line counterpart,
     this is generally only useful when measuring through those evil,
     end-to-end breaking things called firewalls.

`-P <optionspec>'
     Set the local and/or remote port numbers for the data connection.

`-r <sizespec>'
     This option sets the request (first value) and/or response (second
     value) sizes for an _RR test.  By default the units are bytes, but
     a suffix of "G," "M," or "K" will specify the units to be 2^30
     (GB), 2^20 (MB) or 2^10 (KB) respectively.  A suffix of "g," "m"
     or "k" will specify units of 10^9, 10^6 or 10^3 bytes
     respectively.  For example:
          `-r 128,16K'
     Will set the request size to 128 bytes and the response size to 16
     KB or 16384 bytes.  [Default: 1 - a single-byte request and
     response]

`-s <sizespec>'
     This option sets the local (netperf) send and receive socket
     buffer sizes for the data connection to the value(s) specified.
Often, 1986 this will affect the advertised and/or effective TCP or other 1987 window, but on some platforms it may not. By default the units are 1988 bytes, but a suffix of "G," "M," or "K" will specify the units to 1989 be 2^30 (GB), 2^20 (MB) or 2^10 (KB) respectively. A suffix of 1990 "g," "m" or "k" will specify units of 10^9, 10^6 or 10^3 bytes 1991 respectively. For example: 1992 `-s 128K' 1993 Will request the local send (netperf) and receive socket buffer 1994 sizes to be 128KB or 131072 bytes. 1995 1996 While the historic expectation is that setting the socket buffer 1997 size has a direct effect on say the TCP window, today that may not 1998 hold true for all stacks. When running under Windows a value of 0 1999 may be used which will be an indication to the stack the user 2000 wants to enable a form of copy avoidance. [Default: -1 - use the 2001 system's default socket buffer sizes] 2002 2003 `-S <sizespec>' 2004 This option sets the remote (netserver) send and/or receive socket 2005 buffer sizes for the data connection to the value(s) specified. 2006 Often, this will affect the advertised and/or effective TCP or 2007 other window, but on some platforms it may not. By default the 2008 units are bytes, but a suffix of "G," "M," or "K" will specify the 2009 units to be 2^30 (GB), 2^20 (MB) or 2^10 (KB) respectively. A 2010 suffix of "g," "m" or "k" will specify units of 10^9, 10^6 or 10^3 2011 bytes respectively. For example: 2012 `-S 128K' 2013 Will request the remote (netserver) send and receive socket buffer 2014 sizes to be 128KB or 131072 bytes. 2015 2016 While the historic expectation is that setting the socket buffer 2017 size has a direct effect on say the TCP window, today that may not 2018 hold true for all stacks. When running under Windows a value of 0 2019 may be used which will be an indication to the stack the user 2020 wants to enable a form of copy avoidance. 
     [Default: -1 - use the system's default socket buffer sizes]

`-4'
     Set the local and remote address family for the data connection to
     AF_INET - ie use IPv4 addressing only.  Just as with their global
     command-line counterparts, the last of the `-4', `-6', `-H' or
     `-L' options wins for their respective address families.

`-6'
     This option is identical to its `-4' cousin, but requests IPv6
     addresses for the local and remote ends of the data connection.


* Menu:

* TCP_RR::
* TCP_CC::
* TCP_CRR::
* UDP_RR::
* XTI_TCP_RR::
* XTI_TCP_CC::
* XTI_TCP_CRR::
* XTI_UDP_RR::
* DLCL_RR::
* DLCO_RR::
* SCTP_RR::


File: netperf.info, Node: TCP_RR, Next: TCP_CC, Prev: Options Common to TCP UDP and SCTP _RR tests, Up: Options Common to TCP UDP and SCTP _RR tests

6.2.1 TCP_RR
------------

A TCP_RR (TCP Request/Response) test is requested by passing a value of
"TCP_RR" to the global `-t' command-line option.  A TCP_RR test can be
thought of as a user-space to user-space `ping' with no think time - it
is by default a synchronous, one transaction at a time,
request/response test.

The transaction rate is the number of complete transactions exchanged
divided by the length of time it took to perform those transactions.

If the two Systems Under Test are otherwise identical, a TCP_RR test
with the same request and response size should be symmetric - it should
not matter which way the test is run, and the CPU utilization measured
should be virtually the same on each system.  If not, it suggests that
the CPU utilization mechanism being used may have some, well, issues
measuring CPU utilization completely and accurately.

Time to establish the TCP connection is not counted in the result.
If you want connection setup overheads included, you should consider the
*note TCP_CC: TCP_CC. or *note TCP_CRR: TCP_CRR. tests.

If specifying the `-D' option to set TCP_NODELAY and disable the Nagle
Algorithm increases the transaction rate reported by a TCP_RR test, it
implies the stack(s) over which the TCP_RR test is running have a
broken implementation of the Nagle Algorithm.  Likely as not they are
interpreting Nagle on a segment by segment basis rather than a user
send by user send basis.  You should contact your stack vendor(s) to
report the problem to them.

Here is an example of two systems running a basic TCP_RR test over a
10 Gigabit Ethernet link:

     netperf -t TCP_RR -H 192.168.2.125
     TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
     Local /Remote
     Socket Size   Request  Resp.   Elapsed  Trans.
     Send   Recv   Size     Size    Time     Rate
     bytes  Bytes  bytes    bytes   secs.    per sec

     16384  87380  1        1       10.00    29150.15
     16384  87380

In this example the request and response sizes were one byte, the
socket buffers were left at their defaults, and the test ran for all of
10 seconds.  The transaction per second rate was rather good for the
time :)


File: netperf.info,  Node: TCP_CC,  Next: TCP_CRR,  Prev: TCP_RR,  Up: Options Common to TCP UDP and SCTP _RR tests

6.2.2 TCP_CC
------------

A TCP_CC (TCP Connect/Close) test is requested by passing a value of
"TCP_CC" to the global `-t' option.  A TCP_CC test simply measures how
fast the pair of systems can open and close connections between one
another in a synchronous (one at a time) manner.  While this is
considered an _RR test, no request or response is exchanged over the
connection.

The issue of TIME_WAIT reuse is an important one for a TCP_CC test.
Basically, TIME_WAIT reuse is when a pair of systems churn through
connections fast enough that they wrap the 16-bit port number space in
less time than the length of the TIME_WAIT state.  While it is indeed
theoretically possible to "reuse" a connection in TIME_WAIT, the
conditions under which such reuse is possible are rather rare.  An
attempt to reuse a connection in TIME_WAIT can result in a non-trivial
delay in connection establishment.

Basically, any time the connection churn rate approaches:

     Sizeof(clientportspace) / Lengthof(TIME_WAIT)

there is the risk of TIME_WAIT reuse.  To minimize the chances of this
happening, netperf will by default select its own client port numbers
from the range of 5000 to 65535.  On systems with a 60 second TIME_WAIT
state, this should allow roughly 1000 transactions per second.  The
size of the client port space used by netperf can be controlled via the
test-specific `-p' option, which takes a "sizespec" as a value setting
the minimum (first value) and maximum (second value) port numbers used
by netperf at the client end.

Since no requests or responses are exchanged during a TCP_CC test, only
the `-H', `-L', `-4' and `-6' of the "common" test-specific options are
likely to have an effect, if any, on the results.  The `-s' and `-S'
options _may_ have some effect if they alter the number and/or type of
options carried in the TCP SYNchronize segments, such as Window Scaling
or Timestamps.  The `-P' and `-r' options are utterly ignored.

Since connection establishment and tear-down for TCP is not symmetric,
a TCP_CC test is not symmetric in its loading of the two systems under
test.
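The churn-rate bound above is easy to evaluate in the shell for
netperf's defaults.  This is a sketch; the remote host name in the
comment is a placeholder, not a real system:

```shell
# Rough TIME_WAIT-reuse bound for netperf's default client port range
# (5000 to 65535) and a typical 60-second TIME_WAIT state:
ports=$((65535 - 5000))   # Sizeof(clientportspace)
tw=60                     # Lengthof(TIME_WAIT) in seconds
echo "$((ports / tw)) connections/sec before TIME_WAIT reuse risk"

# If more headroom is needed, widen the client port range with the
# test-specific -p sizespec (hostname here is hypothetical):
#   netperf -t TCP_CC -H remotehost -- -p 10000,65535
```

The result, roughly 1000 connections per second, matches the figure
quoted in the text for a 60-second TIME_WAIT state.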


File: netperf.info,  Node: TCP_CRR,  Next: UDP_RR,  Prev: TCP_CC,  Up: Options Common to TCP UDP and SCTP _RR tests

6.2.3 TCP_CRR
-------------

The TCP Connect/Request/Response (TCP_CRR) test is requested by passing
a value of "TCP_CRR" to the global `-t' command-line option.  A TCP_CRR
test is like a merger of a *note TCP_RR:: and *note TCP_CC:: test which
measures the performance of establishing a connection, exchanging a
single request/response transaction, and tearing-down that connection.
This is very much like what happens in an HTTP 1.0 or HTTP 1.1
connection when HTTP Keepalives are not used.  In fact, the TCP_CRR
test was added to netperf to simulate just that.

Since a request and response are exchanged, the `-r', `-s' and `-S'
options can have an effect on the performance.

The issue of TIME_WAIT reuse exists for the TCP_CRR test just as it
does for the TCP_CC test.  Similarly, since connection establishment
and tear-down is not symmetric, a TCP_CRR test is not symmetric even
when the request and response sizes are the same.


File: netperf.info,  Node: UDP_RR,  Next: XTI_TCP_RR,  Prev: TCP_CRR,  Up: Options Common to TCP UDP and SCTP _RR tests

6.2.4 UDP_RR
------------

A UDP Request/Response (UDP_RR) test is requested by passing a value of
"UDP_RR" to the global `-t' option.  It is very much the same as a
TCP_RR test except UDP is used rather than TCP.

UDP does not provide for retransmission of lost UDP datagrams, and
netperf does not add anything for that either.  This means that if
_any_ request or response is lost, the exchange of requests and
responses will stop from that point until the test timer expires.
Netperf will not really "know" this has happened - the only symptom
will be a low transaction per second rate.
If `--enable-burst' was included in the `configure' command and a
test-specific `-b' option used, the UDP_RR test will "survive" the loss
of requests and responses until the sum is one more than the value
passed via the `-b' option.  It will though almost certainly run more
slowly.

The netperf side of a UDP_RR test will call `connect()' on its data
socket and thenceforth use the `send()' and `recv()' socket calls.  The
netserver side of a UDP_RR test will not call `connect()' and will use
`recvfrom()' and `sendto()' calls.  This means that even if the request
and response sizes are the same, a UDP_RR test is _not_ symmetric in
its loading of the two systems under test.

Here is an example of a UDP_RR test between two otherwise identical
two-CPU systems joined via a 1 Gigabit Ethernet network:

     $ netperf -T 1 -H 192.168.1.213 -t UDP_RR -c -C
     UDP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.213 (192.168.1.213) port 0 AF_INET
     Local /Remote
     Socket Size   Request  Resp.   Elapsed  Trans.   CPU    CPU    S.dem   S.dem
     Send   Recv   Size     Size    Time     Rate     local  remote local   remote
     bytes  bytes  bytes    bytes   secs.    per sec  % I    % I    us/Tr   us/Tr

     65535  65535  1        1       10.01    15262.48 13.90  16.11  18.221  21.116
     65535  65535

This example includes the `-c' and `-C' options to enable CPU
utilization reporting and shows the asymmetry in CPU loading.  The `-T'
option was used to make sure netperf and netserver ran on a given CPU
and did not move around during the test.


File: netperf.info,  Node: XTI_TCP_RR,  Next: XTI_TCP_CC,  Prev: UDP_RR,  Up: Options Common to TCP UDP and SCTP _RR tests

6.2.5 XTI_TCP_RR
----------------

An XTI_TCP_RR test is essentially the same as a *note TCP_RR:: test,
only using the XTI rather than BSD Sockets interface.
It is requested by passing a value of "XTI_TCP_RR" to the `-t' global
command-line option.

The test-specific options for an XTI_TCP_RR test are the same as those
for a TCP_RR test with the addition of the `-X <devspec>' option to
specify the names of the local and/or remote XTI device file(s).


File: netperf.info,  Node: XTI_TCP_CC,  Next: XTI_TCP_CRR,  Prev: XTI_TCP_RR,  Up: Options Common to TCP UDP and SCTP _RR tests

6.2.6 XTI_TCP_CC
----------------

An XTI_TCP_CC test is essentially the same as a *note TCP_CC: TCP_CC.
test, only using the XTI rather than BSD Sockets interface.

The test-specific options for an XTI_TCP_CC test are the same as those
for a TCP_CC test with the addition of the `-X <devspec>' option to
specify the names of the local and/or remote XTI device file(s).


File: netperf.info,  Node: XTI_TCP_CRR,  Next: XTI_UDP_RR,  Prev: XTI_TCP_CC,  Up: Options Common to TCP UDP and SCTP _RR tests

6.2.7 XTI_TCP_CRR
-----------------

The XTI_TCP_CRR test is essentially the same as a *note TCP_CRR:
TCP_CRR. test, only using the XTI rather than BSD Sockets interface.

The test-specific options for an XTI_TCP_CRR test are the same as those
for a TCP_CRR test with the addition of the `-X <devspec>' option to
specify the names of the local and/or remote XTI device file(s).


File: netperf.info,  Node: XTI_UDP_RR,  Next: DLCL_RR,  Prev: XTI_TCP_CRR,  Up: Options Common to TCP UDP and SCTP _RR tests

6.2.8 XTI_UDP_RR
----------------

An XTI_UDP_RR test is essentially the same as a UDP_RR test, only using
the XTI rather than BSD Sockets interface.  It is requested by passing
a value of "XTI_UDP_RR" to the `-t' global command-line option.
The test-specific options for an XTI_UDP_RR test are the same as those
for a UDP_RR test with the addition of the `-X <devspec>' option to
specify the names of the local and/or remote XTI device file(s).


File: netperf.info,  Node: DLCL_RR,  Next: DLCO_RR,  Prev: XTI_UDP_RR,  Up: Options Common to TCP UDP and SCTP _RR tests

6.2.9 DLCL_RR
-------------


File: netperf.info,  Node: DLCO_RR,  Next: SCTP_RR,  Prev: DLCL_RR,  Up: Options Common to TCP UDP and SCTP _RR tests

6.2.10 DLCO_RR
--------------


File: netperf.info,  Node: SCTP_RR,  Prev: DLCO_RR,  Up: Options Common to TCP UDP and SCTP _RR tests

6.2.11 SCTP_RR
--------------


File: netperf.info,  Node: Using Netperf to Measure Aggregate Performance,  Next: Using Netperf to Measure Bidirectional Transfer,  Prev: Using Netperf to Measure Request/Response,  Up: Top

7 Using Netperf to Measure Aggregate Performance
************************************************

Ultimately, *note Netperf4: Netperf4. will be the preferred benchmark
to use when one wants to measure aggregate performance because netperf
has no support for explicit synchronization of concurrent tests.  Until
netperf4 is ready for prime time, one can make use of the heuristics
and procedures mentioned here for the 85% solution.

There are a few ways to measure aggregate performance with netperf.
The first is to run multiple, concurrent netperf tests and can be
applied to any of the netperf tests.  The second is to configure
netperf with `--enable-burst' and is applicable to the TCP_RR test.
The third is a variation on the first.


* Menu:

* Running Concurrent Netperf Tests::
* Using --enable-burst::
* Using --enable-demo::


File: netperf.info,  Node: Running Concurrent Netperf Tests,  Next: Using --enable-burst,  Prev: Using Netperf to Measure Aggregate Performance,  Up: Using Netperf to Measure Aggregate Performance

7.1 Running Concurrent Netperf Tests
====================================

*note Netperf4: Netperf4. is the preferred benchmark to use when one
wants to measure aggregate performance because netperf has no support
for explicit synchronization of concurrent tests.  This leaves netperf2
results vulnerable to "skew" errors.

However, since there are times when netperf4 is unavailable it may be
necessary to run netperf.  The skew error can be minimized by making
use of the confidence interval functionality.  Then one simply launches
multiple tests from the shell using a `for' loop or the like:

     for i in 1 2 3 4
     do
       netperf -t TCP_STREAM -H tardy.cup.hp.com -i 10 -P 0 &
     done

which will run four, concurrent *note TCP_STREAM: TCP_STREAM. tests
from the system on which it is executed to tardy.cup.hp.com.  Each
concurrent netperf will iterate 10 times thanks to the `-i' option and
will omit the test banners (option `-P') for brevity.  The output looks
something like this:

     87380  16384  16384    10.03     235.15
     87380  16384  16384    10.03     235.09
     87380  16384  16384    10.03     235.38
     87380  16384  16384    10.03     233.96

We can take the sum of the results and be reasonably confident that the
aggregate performance was 940 Mbits/s.  This method does not need to be
limited to one system speaking to one other system.  It can be extended
to one system talking to N other systems.
It could be as simple as:

     for host in foo bar baz bing
     do
       netperf -t TCP_STREAM -H $host -i 10 -P 0 &
     done

A more complicated/sophisticated example can be found in
`doc/examples/runemomniagg2.sh'.

If you see warnings about netperf not achieving the confidence
intervals, the best thing to do is to increase the number of iterations
with `-i' and/or increase the run length of each iteration with `-l'.

You can also enable local (`-c') and/or remote (`-C') CPU utilization:

     for i in 1 2 3 4
     do
       netperf -t TCP_STREAM -H tardy.cup.hp.com -i 10 -P 0 -c -C &
     done

     87380  16384  16384    10.03     235.47   3.67     5.09     10.226  14.180
     87380  16384  16384    10.03     234.73   3.67     5.09     10.260  14.225
     87380  16384  16384    10.03     234.64   3.67     5.10     10.263  14.231
     87380  16384  16384    10.03     234.87   3.67     5.09     10.253  14.215

If the CPU utilizations reported for the same system are the same or
very very close you can be reasonably confident that skew error is
minimized.  Presumably one could then omit `-i' but that is not
advised, particularly when/if the CPU utilization approaches 100
percent.  In the example above we see that the CPU utilization on the
local system remains the same for all four tests, and is only off by
0.01 out of 5.09 on the remote system.  As the number of CPUs in the
system increases, and so too the odds of saturating a single CPU, the
accuracy of similar CPU utilization implying little skew error is
diminished.  This is also the case for those increasingly rare single
CPU systems if the utilization is reported as 100% or very close to it.

     NOTE: It is very important to remember that netperf is calculating
     system-wide CPU utilization.  When calculating the service demand
     (those last two columns in the output above) each netperf assumes
     it is the only thing running on the system.
     This means that for concurrent tests the service demands reported
     by netperf will be wrong.  One has to compute service demands for
     concurrent tests by hand.

If you wish you can add a unique, global `-B' option to each command
line to append the given string to the output:

     for i in 1 2 3 4
     do
       netperf -t TCP_STREAM -H tardy.cup.hp.com -B "this is test $i" -i 10 -P 0 &
     done

     87380  16384  16384    10.03     234.90   this is test 4
     87380  16384  16384    10.03     234.41   this is test 2
     87380  16384  16384    10.03     235.26   this is test 1
     87380  16384  16384    10.03     235.09   this is test 3

You will notice that the tests completed in an order other than they
were started from the shell.  This underscores why there is a threat of
skew error and why netperf4 will eventually be the preferred tool for
aggregate tests.  Even if you see the Netperf Contributing Editor
acting to the contrary!-)

* Menu:

* Issues in Running Concurrent Tests::


File: netperf.info,  Node: Issues in Running Concurrent Tests,  Prev: Running Concurrent Netperf Tests,  Up: Running Concurrent Netperf Tests

7.1.1 Issues in Running Concurrent Tests
----------------------------------------

In addition to the aforementioned issue of skew error, there can be
other issues to consider when running concurrent netperf tests.

For example, when running concurrent tests over multiple interfaces,
one is not always assured that the traffic one thinks went over a given
interface actually did so.  In particular, the Linux networking stack
takes a particularly strong stance on its following the so called `weak
end system model'.  As such, it is willing to answer ARP requests for
any of its local IP addresses on any of its interfaces.
If multiple interfaces are connected to the same broadcast domain, then
even if they are configured into separate IP subnets there is no a
priori way of knowing which interface was actually used for which
connection(s).  This can be addressed by setting the `arp_ignore'
sysctl before configuring interfaces.

As it is quite important, we will repeat: each concurrent netperf
instance is calculating system-wide CPU utilization.  When calculating
the service demand each netperf assumes it is the only thing running on
the system.  This means that for concurrent tests the service demands
reported by netperf will be wrong.  One has to compute service demands
for concurrent tests by hand.

Running concurrent tests can also become difficult when there is no one
"central" node.  Running tests between pairs of systems may be more
difficult, calling for remote shell commands in the for loop rather
than netperf commands.  This introduces more skew error, which the
confidence intervals may not be able to sufficiently mitigate.  One
possibility is to actually run three consecutive netperf tests on each
node - the first being a warm-up, the last being a cool-down.  The idea
then is to ensure that the time it takes to get all the netperfs
started is less than the length of the first netperf command in the
sequence of three.  Similarly, it assumes that all "middle" netperfs
will complete before the first of the "last" netperfs complete.
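On Linux, setting the `arp_ignore' sysctl mentioned earlier might look
like the following.  This is only a sketch: it assumes root privilege,
a system with the `sysctl' and `ip' commands, and the destination
address in the comment is hypothetical:

```shell
# Make interfaces answer ARP only for addresses configured on the
# receiving interface (arp_ignore=1) rather than for any local
# address (arp_ignore=0, the default).  Set this *before* assigning
# IP addresses to the test interfaces.
sysctl -w net.ipv4.conf.all.arp_ignore=1
sysctl -w net.ipv4.conf.default.arp_ignore=1

# Afterwards, one can sanity-check which interface the stack will
# choose for a given destination (address is a placeholder):
#   ip route get 192.168.3.2
```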


File: netperf.info,  Node: Using --enable-burst,  Next: Using --enable-demo,  Prev: Running Concurrent Netperf Tests,  Up: Using Netperf to Measure Aggregate Performance

7.2 Using --enable-burst
========================

Starting in version 2.5.0 `--enable-burst=yes' is the default, which
means one no longer must:

     configure --enable-burst

to have burst-mode functionality present in netperf.  This enables a
test-specific `-b num' option in *note TCP_RR: TCP_RR, *note UDP_RR:
UDP_RR. and *note omni: The Omni Tests. tests.

Normally, netperf will attempt to ramp-up the number of outstanding
requests to `num' plus one transactions in flight at one time.  The
ramp-up is to avoid transactions being smashed together into a smaller
number of segments when the transport's congestion window (if any) is
smaller at the time than what netperf wants to have outstanding at one
time.  If, however, the user specifies a negative value for `num' this
ramp-up is bypassed and the burst of sends is made without
consideration of transport congestion window.

This burst-mode is used as an alternative to or even in conjunction
with multiple-concurrent _RR tests and as a way to implement a
single-connection, bidirectional bulk-transfer test.  When run with
just a single instance of netperf, increasing the burst size can
determine the maximum number of transactions per second which can be
serviced by a single process:

     for b in 0 1 2 4 8 16 32
     do
       netperf -v 0 -t TCP_RR -B "-b $b" -H hpcpc108 -P 0 -- -b $b
     done

     9457.59 -b 0
     9975.37 -b 1
     10000.61 -b 2
     20084.47 -b 4
     29965.31 -b 8
     71929.27 -b 16
     109718.17 -b 32

The global `-v' and `-P' options were used to minimize the output to
the single figure of merit, which in this case is the transaction rate.
The global `-B' option was used to more clearly label the output, and
the test-specific `-b' option enabled by `--enable-burst' increased the
number of transactions in flight at one time.

Now, since the test-specific `-D' option was not specified to set
TCP_NODELAY, the stack was free to "bundle" requests and/or responses
into TCP segments as it saw fit, and since the default request and
response size is one byte, there could have been some considerable
bundling even in the absence of transport congestion window issues.  If
one wants to try to achieve a closer to one-to-one correspondence
between a request and response and a TCP segment, add the test-specific
`-D' option:

     for b in 0 1 2 4 8 16 32
     do
       netperf -v 0 -t TCP_RR -B "-b $b -D" -H hpcpc108 -P 0 -- -b $b -D
     done

     8695.12 -b 0 -D
     19966.48 -b 1 -D
     20691.07 -b 2 -D
     49893.58 -b 4 -D
     62057.31 -b 8 -D
     108416.88 -b 16 -D
     114411.66 -b 32 -D

You can see that this has a rather large effect on the reported
transaction rate.  In this particular instance, the author believes it
relates to interactions between the test and interrupt coalescing
settings in the driver for the NICs used.

     NOTE: Even if you set the `-D' option that is still not a
     guarantee that each transaction is in its own TCP segments.  You
     should get into the habit of verifying the relationship between
     the transaction rate and the packet rate via other means.

You can also combine `--enable-burst' functionality with concurrent
netperf tests.
This would then be an "aggregate of aggregates" if you like:

     for i in 1 2 3 4
     do
       netperf -H hpcpc108 -v 0 -P 0 -i 10 -B "aggregate $i -b 8 -D" -t TCP_RR -- -b 8 -D &
     done

     46668.38 aggregate 4 -b 8 -D
     44890.64 aggregate 2 -b 8 -D
     45702.04 aggregate 1 -b 8 -D
     46352.48 aggregate 3 -b 8 -D

Since each netperf did hit the confidence intervals, we can be
reasonably certain that the aggregate transaction per second rate was
the sum of all four concurrent tests, or something just shy of 184,000
transactions per second.  To get some idea if that was also the packet
per second rate, we could bracket that `for' loop with something to
gather statistics and run the results through beforeafter
(ftp://ftp.cup.hp.com/dist/networking/tools):

     /usr/sbin/ethtool -S eth2 > before
     for i in 1 2 3 4
     do
       netperf -H 192.168.2.108 -l 60 -v 0 -P 0 -B "aggregate $i -b 8 -D" -t TCP_RR -- -b 8 -D &
     done
     wait
     /usr/sbin/ethtool -S eth2 > after

     52312.62 aggregate 2 -b 8 -D
     50105.65 aggregate 4 -b 8 -D
     50890.82 aggregate 1 -b 8 -D
     50869.20 aggregate 3 -b 8 -D

     beforeafter before after > delta

     grep packets delta
          rx_packets: 12251544
          tx_packets: 12251550

This example uses `ethtool' because the system being used is running
Linux.  Other platforms have other tools - for example HP-UX has
lanadmin:

     lanadmin -g mibstats <ppa>

and of course one could instead use `netstat'.

The `wait' is important because we are launching concurrent netperfs in
the background.  Without it, the second ethtool command would be run
before the tests finished and perhaps even before the last of them got
started!

The sum of the reported transaction rates is 204178 over 60 seconds,
which is a total of 12250680 transactions.
Each transaction is the exchange of a request and a response, so we
multiply that by 2 to arrive at 24501360.

The sum of the ethtool stats is 24503094 packets, which matches what
netperf was reporting very well.

Had the request or response size differed, we would need to know how it
compared with the "MSS" for the connection.

Just for grins, here is the exercise repeated, using `netstat' instead
of `ethtool':

     netstat -s -t > before
     for i in 1 2 3 4
     do
       netperf -l 60 -H 192.168.2.108 -v 0 -P 0 -B "aggregate $i -b 8 -D" -t TCP_RR -- -b 8 -D &
     done
     wait
     netstat -s -t > after

     51305.88 aggregate 4 -b 8 -D
     51847.73 aggregate 2 -b 8 -D
     50648.19 aggregate 3 -b 8 -D
     53605.86 aggregate 1 -b 8 -D

     beforeafter before after > delta

     grep segments delta
         12445708 segments received
         12445730 segments send out
         1 segments retransmited
         0 bad segments received.

The sums are left as an exercise to the reader :)

Things become considerably more complicated if there are non-trivial
packet losses and/or retransmissions.

Of course all this checking is unnecessary if the test is a UDP_RR test
because UDP "never" aggregates multiple sends into the same UDP
datagram, and there are no ACKnowledgements in UDP.  The loss of a
single request or response will not bring a "burst" UDP_RR test to a
screeching halt, but it will reduce the number of transactions
outstanding at any one time.  A "burst" UDP_RR test will come to a halt
if the sum of the lost requests and responses reaches the value
specified in the test-specific `-b' option.
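The packet-accounting arithmetic from the ethtool example can be
repeated quickly in the shell.  The figures below are taken directly
from that example; nothing else is assumed:

```shell
# Each transaction is one request and one response, so for a 1-byte
# request/response TCP_RR test the packet count should be close to
# twice the transaction count (ACKs piggy-back on the data segments).
rate=204178                     # sum of the four reported trans/s rates
secs=60                         # test length in seconds
trans=$((rate * secs))          # total transactions
pkts=$((12251544 + 12251550))   # rx_packets + tx_packets from the delta
echo "transactions*2 = $((trans * 2)), packets = $pkts"
```

The two figures, 24501360 and 24503094, differ by well under one
percent, which is the "matches very well" of the text.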


File: netperf.info,  Node: Using --enable-demo,  Prev: Using --enable-burst,  Up: Using Netperf to Measure Aggregate Performance

7.3 Using --enable-demo
=======================

One can:

     configure --enable-demo

and compile netperf to enable it to emit "interim results" at
semi-regular intervals.  This enables a global `-D' option which takes
a reporting interval as an argument.  With that specified, the output
of netperf will then look something like:

     $ src/netperf -D 1.25
     MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain () port 0 AF_INET : demo
     Interim result: 25425.52 10^6bits/s over 1.25 seconds ending at 1327962078.405
     Interim result: 25486.82 10^6bits/s over 1.25 seconds ending at 1327962079.655
     Interim result: 25474.96 10^6bits/s over 1.25 seconds ending at 1327962080.905
     Interim result: 25523.49 10^6bits/s over 1.25 seconds ending at 1327962082.155
     Interim result: 25053.57 10^6bits/s over 1.27 seconds ending at 1327962083.429
     Interim result: 25349.64 10^6bits/s over 1.25 seconds ending at 1327962084.679
     Interim result: 25292.84 10^6bits/s over 1.25 seconds ending at 1327962085.932
     Recv   Send    Send
     Socket Socket  Message  Elapsed
     Size   Size    Size     Time     Throughput
     bytes  bytes   bytes    secs.    10^6bits/sec

      87380  16384  16384    10.00    25375.66

The units of the "Interim result" lines will follow the units selected
via the global `-f' option.  If the test-specific `-o' option is
specified on the command line, the format will be CSV:

     ...
     2978.81,MBytes/s,1.25,1327962298.035
     ...

If the test-specific `-k' option is used the format will be keyval with
each keyval being given an index:

     ...
     NETPERF_INTERIM_RESULT[2]=25.00
     NETPERF_UNITS[2]=10^9bits/s
     NETPERF_INTERVAL[2]=1.25
     NETPERF_ENDING[2]=1327962357.249
     ...
The expectation is it may be easier to utilize the keyvals if they have
indices.

But how does this help with aggregate tests?  Well, what one can do is
start the netperfs via a script, giving each a Very Long (tm) run time.
Direct the output to a file per instance.  Then, once all the netperfs
have been started, take a timestamp and wait for some desired test
interval.  Once that interval expires take another timestamp and then
start terminating the netperfs by sending them a SIGALRM signal via the
likes of the `kill' or `pkill' command.  The netperfs will terminate
and emit the rest of the "usual" output, and you can then bring the
files to a central location for post processing to find the aggregate
performance over the "test interval."

This method has the advantage that it does not require advance
knowledge of how long it takes to get netperf tests started and/or
stopped.  It does though require sufficiently synchronized clocks on
all the test systems.

While calls to get the current time can be inexpensive, that neither
has been nor is universally true.  For that reason netperf tries to
minimize the number of such "timestamping" calls (eg `gettimeofday')
it makes when in demo mode.  Rather than take a timestamp after each
`send' or `recv' call completes, netperf tries to guess how many units
of work will be performed over the desired interval.  Only once that
many units of work have been completed will netperf check the time.  If
the reporting interval has passed, netperf will emit an "interim
result."  If the interval has not passed, netperf will update its
estimate for units and continue.

After a bit of thought one can see that if things "speed-up" netperf
will still honor the interval.  However, if things "slow-down" netperf
may be late with an "interim result."
Here is an example of both of those happening during a test - with the
interval being honored while throughput increases, and then about
half-way through, when another netperf (not shown) is started, things
slowing down and netperf not hitting the interval as desired.

     $ src/netperf -D 2 -H tardy.hpl.hp.com -l 20
     MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to tardy.hpl.hp.com () port 0 AF_INET : demo
     Interim result: 36.46 10^6bits/s over 2.01 seconds ending at 1327963880.565
     Interim result: 59.19 10^6bits/s over 2.00 seconds ending at 1327963882.569
     Interim result: 73.39 10^6bits/s over 2.01 seconds ending at 1327963884.576
     Interim result: 84.01 10^6bits/s over 2.03 seconds ending at 1327963886.603
     Interim result: 75.63 10^6bits/s over 2.21 seconds ending at 1327963888.814
     Interim result: 55.52 10^6bits/s over 2.72 seconds ending at 1327963891.538
     Interim result: 70.94 10^6bits/s over 2.11 seconds ending at 1327963893.650
     Interim result: 80.66 10^6bits/s over 2.13 seconds ending at 1327963895.777
     Interim result: 86.42 10^6bits/s over 2.12 seconds ending at 1327963897.901
     Recv   Send    Send
     Socket Socket  Message  Elapsed
     Size   Size    Size     Time     Throughput
     bytes  bytes   bytes    secs.    10^6bits/sec

      87380  16384  16384    20.34      68.87

So long as your post-processing mechanism can account for that, there
should be no problem.  As time passes there may be changes to try to
improve netperf's honoring of the interval, but one should not ass-u-me
it will always do so.  One should not assume the precision will remain
fixed - future versions may change it - perhaps going beyond tenths of
seconds in reporting the interval length etc.
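The start/measure/terminate procedure described in this section can be
sketched in the shell.  This is only a sketch: the target host names,
the 7200-second run length, and the 300-second measurement interval
are all hypothetical placeholders:

```shell
# Start one long-running, demo-mode netperf per target host, with
# output directed to a file per instance.
for host in hosta hostb hostc        # hypothetical targets
do
    netperf -H $host -D 2.0 -l 7200 > netperf_${host}.out 2>&1 &
done

start=$(date +%s)                    # timestamp: measurement begins
sleep 300                            # the desired test interval
stop=$(date +%s)                     # timestamp: measurement ends

pkill -ALRM netperf                  # SIGALRM: each netperf wraps up
wait                                 # collect the background jobs

echo "measurement interval: $((stop - start)) seconds"
# Post-process the netperf_*.out files, keeping only the "Interim
# result" lines whose ending times fall between $start and $stop.
```

Because only the interim results inside the bracketing timestamps are
kept, the ramp-up and wind-down of the individual netperfs fall outside
the measured window.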


File: netperf.info,  Node: Using Netperf to Measure Bidirectional Transfer,  Next: The Omni Tests,  Prev: Using Netperf to Measure Aggregate Performance,  Up: Top

8 Using Netperf to Measure Bidirectional Transfer
*************************************************

There are two ways to use netperf to measure the performance of
bidirectional transfer.  The first is to run concurrent netperf tests
from the command line.  The second is to configure netperf with
`--enable-burst' and use a single instance of the *note TCP_RR: TCP_RR.
test.

While neither method is more "correct" than the other, each measures
bidirectional transfer in a different way, and that has possible
implications.  For instance, using the concurrent netperf test
mechanism means that multiple TCP connections and multiple processes
are involved, whereas using a single instance of TCP_RR there is only
one TCP connection and one process on each end.  They may behave
differently, especially on an MP system.

* Menu:

* Bidirectional Transfer with Concurrent Tests::
* Bidirectional Transfer with TCP_RR::
* Implications of Concurrent Tests vs Burst Request/Response::


File: netperf.info,  Node: Bidirectional Transfer with Concurrent Tests,  Next: Bidirectional Transfer with TCP_RR,  Prev: Using Netperf to Measure Bidirectional Transfer,  Up: Using Netperf to Measure Bidirectional Transfer

8.1 Bidirectional Transfer with Concurrent Tests
================================================

If we had two hosts Fred and Ethel, we could simply run a netperf *note
TCP_STREAM: TCP_STREAM. test on Fred pointing at Ethel, and a
concurrent netperf TCP_STREAM test on Ethel pointing at Fred, but since
there are no mechanisms to synchronize netperf tests and we would be
starting tests from two different systems, there is a considerable risk
of skew error.

   Far better would be to run simultaneous TCP_STREAM and *note
TCP_MAERTS: TCP_MAERTS. tests from just one system, using the concepts
and procedures outlined in *note Running Concurrent Netperf Tests:
Running Concurrent Netperf Tests.  Here then is an example:

     for i in 1
     do
       netperf -H 192.168.2.108 -t TCP_STREAM -B "outbound" -i 10 -P 0 -v 0 \
         -- -s 256K -S 256K &
       netperf -H 192.168.2.108 -t TCP_MAERTS -B "inbound" -i 10 -P 0 -v 0 \
         -- -s 256K -S 256K &
     done

     892.66 outbound
     891.34 inbound

   We have used a `for' loop in the shell with just one iteration
because that makes it much easier to get both tests started at more or
less the same time than doing it by hand.  The global `-P' and `-v'
options are used because we aren't interested in anything other than
the throughput, and the global `-B' option is used to tag each output
so we know which was inbound and which outbound relative to the system
on which we were running netperf.  Of course that sense is switched on
the system running netserver :)  The use of the global `-i' option is
explained in *note Running Concurrent Netperf Tests: Running
Concurrent Netperf Tests.

   Beginning with version 2.5.0 we can accomplish a similar result
with the *note the omni tests: The Omni Tests.
and *note output selectors: Omni Output Selectors.:

     for i in 1
     do
       netperf -H 192.168.1.3 -t omni -l 10 -P 0 -- \
         -d stream -s 256K -S 256K -o throughput,direction &
       netperf -H 192.168.1.3 -t omni -l 10 -P 0 -- \
         -d maerts -s 256K -S 256K -o throughput,direction &
     done

     805.26,Receive
     828.54,Send


File: netperf.info,  Node: Bidirectional Transfer with TCP_RR,  Next: Implications of Concurrent Tests vs Burst Request/Response,  Prev: Bidirectional Transfer with Concurrent Tests,  Up: Using Netperf to Measure Bidirectional Transfer

8.2 Bidirectional Transfer with TCP_RR
======================================

Starting with version 2.5.0 the `--enable-burst' configure option
defaults to `yes', and starting some time before version 2.5.0 but
after 2.4.0 the global `-f' option began to affect the "throughput"
reported by request/response tests.  If one uses the test-specific
`-b' option to have several "transactions" in flight at one time, and
the test-specific `-r' option to increase their size, the test looks
more and more like a single-connection bidirectional transfer than a
simple request/response test.

   So, putting it all together one can do something like:

     netperf -f m -t TCP_RR -H 192.168.1.3 -v 2 -- -b 6 -r 32K -s 256K -S 256K
     MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.3 (192.168.1.3) port 0 AF_INET : interval : first burst 6
     Local /Remote
     Socket Size   Request  Resp.   Elapsed
     Send   Recv   Size     Size    Time     Throughput
     bytes  Bytes  bytes    bytes   secs.
                                             10^6bits/sec

     16384  87380  32768    32768   10.00    1821.30
     524288 524288
     Alignment      Offset         RoundTrip  Trans     Throughput
     Local  Remote  Local  Remote  Latency    Rate      10^6bits/s
     Send   Recv    Send   Recv    usec/Tran  per sec   Outbound  Inbound
         8      0       0      0   2015.402   3473.252  910.492   910.492

   to get a bidirectional bulk-throughput result.  As one can see, the
`-v 2' output will include a number of interesting, related values.

     NOTE: The logic behind `--enable-burst' is very simple, and
     there are no calls to `poll()' or `select()', which means we
     want to make sure that the `send()' calls will never block, or
     we run the risk of deadlock with each side stuck trying to call
     `send()' and neither calling `recv()'.

   Fortunately, this is easily accomplished by setting a "large
enough" socket buffer size with the test-specific `-s' and `-S'
options.  Presently this must be performed by the user.  Future
versions of netperf might attempt to do this automagically, but there
are some issues to be worked-out.


File: netperf.info,  Node: Implications of Concurrent Tests vs Burst Request/Response,  Prev: Bidirectional Transfer with TCP_RR,  Up: Using Netperf to Measure Bidirectional Transfer

8.3 Implications of Concurrent Tests vs Burst Request/Response
==============================================================

There are perhaps subtle but important differences between using
concurrent unidirectional tests vs a burst-mode request/response test
to measure bidirectional performance.

   Broadly speaking, a single "connection" or "flow" of traffic cannot
make use of the services of more than one or two CPUs at either end.
Whether one or two CPUs will be used processing a flow will depend on
the specifics of the stack(s) involved and whether or not the global
`-T' option has been used to bind netperf/netserver to specific CPUs.
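When comparing the two approaches it can help to move between
transaction rate and throughput.  The `-v 2' figures in the TCP_RR
example of the previous section are related by simple arithmetic: with
`-b 6' there are 6+1 transactions in flight at one time, so the
round-trip latency is (b+1)/(transaction rate), and each direction's
throughput is the transaction rate times the request (or response)
size in bits.  A sketch, with the constants copied from that example
output:

```shell
# Sanity-check the -v 2 figures of the burst-mode TCP_RR example
# above.  The inputs are taken from that example output; only the
# arithmetic is shown here.
check_rr() {
    awk -v burst=6 -v rate=3473.252 -v bytes=32768 'BEGIN {
        # (burst + 1) transactions are in flight at one time
        latency = (burst + 1) / rate * 1e6   # usec/Tran
        tput    = rate * bytes * 8 / 1e6     # 10^6bits/s each direction
        printf "%.3f %.3f\n", latency, tput
    }'
}

check_rr   # prints "2015.402 910.492"
```

These match the usec/Tran and per-direction Throughput columns of the
example exactly.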

   When using concurrent tests there will be two concurrent
connections or flows, which means that upwards of four CPUs will be
employed processing the packets when the global `-T' option is used,
and no more than two if it is not.  With just a single, bidirectional
request/response test, no more than two CPUs will be employed (only
one if the global `-T' is not used).

   If there is a CPU bottleneck on either system this may result in
rather different results between the two methods.

   Also, with a bidirectional request/response test there is something
of a natural balance or synchronization between inbound and outbound -
a response will not be sent until a request is received, and (once the
burst level is reached) a subsequent request will not be sent until a
response is received.  This may mask favoritism in the NIC between
inbound and outbound processing.

   With two concurrent unidirectional tests there is no such
synchronization or balance, and any favoritism in the NIC may be
exposed.


File: netperf.info,  Node: The Omni Tests,  Next: Other Netperf Tests,  Prev: Using Netperf to Measure Bidirectional Transfer,  Up: Top

9 The Omni Tests
****************

Beginning with version 2.5.0, netperf begins a migration to the `omni'
tests, or "Two routines to measure them all."  The code for the omni
tests can be found in `src/nettest_omni.c' and the goal is to make it
easier for netperf to support multiple protocols and to report a great
many additional things about the systems under test.  Additionally, a
flexible output selection mechanism is present which allows the user
to choose specifically what values she wishes to have reported and in
what format.

   The omni tests are included by default in version 2.5.0.  To
disable them, one must:

     ./configure --enable-omni=no ...

and remake netperf.
Remaking netserver is optional because even in
2.5.0 it has "unmigrated" netserver-side routines for the classic (eg
`src/nettest_bsd.c') tests.

* Menu:

* Native Omni Tests::
* Migrated Tests::
* Omni Output Selection::


File: netperf.info,  Node: Native Omni Tests,  Next: Migrated Tests,  Prev: The Omni Tests,  Up: The Omni Tests

9.1 Native Omni Tests
=====================

One accesses the omni tests "natively" by using a value of "OMNI" with
the global `-t' test-selection option.  This will then cause netperf
to use the code in `src/nettest_omni.c' and in particular the
test-specific options parser for the omni tests.  The test-specific
options for the omni tests are a superset of those for the "classic"
tests.  The options added by the omni tests are:

`-c'
     This explicitly declares that the test is to include connection
     establishment and tear-down, as in either a TCP_CRR or TCP_CC
     test.

`-d <direction>'
     This option sets the direction of the test relative to the
     netperf process.  As of version 2.5.0 one can use the following
     in a case-insensitive manner:

    `send, stream, transmit, xmit or 2'
          Any of which will cause netperf to send to the netserver.

    `recv, receive, maerts or 4'
          Any of which will cause netserver to send to netperf.

    `rr or 6'
          Either of which will cause a request/response test.

     Additionally, one can specify two directions separated by a '|'
     character and they will be OR'ed together.  In this way one can
     use the "Send|Recv" that will be emitted by the *note DIRECTION:
     Omni Output Selectors. *note output selector: Omni Output
     Selection. when used with a request/response test.

`-k [*note output selector: Omni Output Selection.]'
     This option sets the style of output to "keyval" where each line
     of output has the form:

          key=value

     For example:

          $ netperf -t omni -- -d rr -k "THROUGHPUT,THROUGHPUT_UNITS"
          OMNI TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
          THROUGHPUT=59092.65
          THROUGHPUT_UNITS=Trans/s

     Using the `-k' option will override any previous, test-specific
     `-o' or `-O' option.

`-o [*note output selector: Omni Output Selection.]'
     This option sets the style of output to "CSV" where there will
     be one line of comma-separated values, preceded by one line of
     column names unless the global `-P' option is used with a value
     of 0:

          $ netperf -t omni -- -d rr -o "THROUGHPUT,THROUGHPUT_UNITS"
          OMNI TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
          Throughput,Throughput Units
          60999.07,Trans/s

     Using the `-o' option will override any previous, test-specific
     `-k' or `-O' option.

`-O [*note output selector: Omni Output Selection.]'
     This option sets the style of output to "human readable" which
     will look quite similar to classic netperf output:

          $ netperf -t omni -- -d rr -O "THROUGHPUT,THROUGHPUT_UNITS"
          OMNI TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
          Throughput Throughput
                     Units


          60492.57   Trans/s

     Using the `-O' option will override any previous, test-specific
     `-k' or `-o' option.

`-t'
     This option explicitly sets the socket type for the test's data
     connection.  As of version 2.5.0 the known socket types include
     "stream" and "dgram" for SOCK_STREAM and SOCK_DGRAM
     respectively.

`-T <protocol>'
     This option is used to explicitly set the protocol used for the
     test.
     It is case-insensitive.  As of version 2.5.0 the protocols
     known to netperf include:

    `TCP'
          Select the Transmission Control Protocol

    `UDP'
          Select the User Datagram Protocol

    `SDP'
          Select the Sockets Direct Protocol

    `DCCP'
          Select the Datagram Congestion Control Protocol

    `SCTP'
          Select the Stream Control Transmission Protocol

    `udplite'
          Select UDP Lite

     The default is implicit based on other settings.

   The omni tests also extend the interpretation of some of the
classic, test-specific options for the BSD Sockets tests:

`-m <optionspec>'
     This can set the send size for either or both of the netperf and
     netserver sides of the test:

          -m 32K
     sets only the netperf-side send size to 32768 bytes, and or's-in
     transmit for the direction.  This is effectively the same
     behaviour as for the classic tests.

          -m ,32K
     sets only the netserver-side send size to 32768 bytes and
     or's-in receive for the direction.

          -m 16K,32K
     sets the netperf-side send size to 16384 bytes, the
     netserver-side send size to 32768 bytes, and the direction will
     be "Send|Recv."

`-M <optionspec>'
     This can set the receive size for either or both of the netperf
     and netserver sides of the test:

          -M 32K
     sets only the netserver-side receive size to 32768 bytes and
     or's-in send for the test direction.

          -M ,32K
     sets only the netperf-side receive size to 32768 bytes and
     or's-in receive for the test direction.

          -M 16K,32K
     sets the netserver-side receive size to 16384 bytes and the
     netperf-side receive size to 32768 bytes, and the direction will
     be "Send|Recv."
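For scripting around netperf it can be handy to mimic the
<optionspec> parsing just described.  The helper below is hypothetical
- it is not part of netperf - and handles only the K (1024) suffix
shown in the examples above:

```shell
# Hypothetical helper (not netperf code) mimicking the <optionspec>
# forms above: "32K" sets only the first value, ",32K" only the
# second, "16K,32K" both.  K is interpreted as 1024, as netperf does;
# other suffixes are not handled in this sketch.
parse_optionspec() {
    spec=$1
    first=${spec%%,*}                 # text before any comma
    case $spec in
        *,*) second=${spec#*,} ;;     # text after the comma, if any
        *)   second= ;;
    esac
    for part in "$first" "$second"; do
        case $part in
            '')  echo "unset" ;;
            *K)  echo $(( ${part%K} * 1024 )) ;;
            *)   echo "$part" ;;
        esac
    done
}

parse_optionspec 16K,32K   # prints 16384 then 32768
parse_optionspec ,32K      # prints unset then 32768
```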


File: netperf.info,  Node: Migrated Tests,  Next: Omni Output Selection,  Prev: Native Omni Tests,  Up: The Omni Tests

9.2 Migrated Tests
==================

As of version 2.5.0 several tests have been migrated to use the omni
code in `src/nettest_omni.c' for the core of their testing.  A
migrated test retains all of its previous output code and so should
still "look and feel" just like a pre-2.5.0 test, with one exception -
the first line of the test banners will include the word "MIGRATED" at
the beginning, as in:

     $ netperf
     MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
     Recv   Send    Send
     Socket Socket  Message  Elapsed
     Size   Size    Size     Time     Throughput
     bytes  bytes   bytes    secs.    10^6bits/sec

      87380  16384  16384    10.00    27175.27

   The tests migrated in version 2.5.0 are:

   * TCP_STREAM

   * TCP_MAERTS

   * TCP_RR

   * TCP_CRR

   * UDP_STREAM

   * UDP_RR

   It is expected that future releases will have additional tests
migrated to use the "omni" functionality.

   If one uses "omni-specific" test-specific options in conjunction
with a migrated test, instead of using the classic output code, the
new omni output code will be used.
For example, if one uses the `-k'
test-specific option with a value of "THROUGHPUT,THROUGHPUT_UNITS"
with a migrated TCP_RR test one will see:

     $ netperf -t tcp_rr -- -k THROUGHPUT,THROUGHPUT_UNITS
     MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
     THROUGHPUT=60074.74
     THROUGHPUT_UNITS=Trans/s

rather than:

     $ netperf -t tcp_rr
     MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
     Local /Remote
     Socket Size   Request  Resp.   Elapsed  Trans.
     Send   Recv   Size     Size    Time     Rate
     bytes  Bytes  bytes    bytes   secs.    per sec

     16384  87380  1        1       10.00    59421.52
     16384  87380


File: netperf.info,  Node: Omni Output Selection,  Prev: Migrated Tests,  Up: The Omni Tests

9.3 Omni Output Selection
=========================

The omni test-specific `-k', `-o' and `-O' options take an optional
`output selector' by which the user can configure what values are
reported.  The output selector can take several forms:

``filename''
     The output selections will be read from the named file.  Within
     the file there can be up to four lines of comma-separated output
     selectors.  This controls how many multi-line blocks of output
     are emitted when the `-O' option is used.  This output, while
     not identical to "classic" netperf output, is inspired by it.
     Multiple lines have no effect for the `-k' and `-o' options.
     Putting output selections in a file can be useful when the list
     of selections is long.

`comma and/or semi-colon-separated list'
     The output selections will be parsed from a comma and/or
     semi-colon-separated list of output selectors.  When the list is
     given to a `-O' option a semi-colon specifies a new output block
     should be started.
     Semi-colons have the same meaning as commas
     when used with the `-k' or `-o' options.  Depending on the
     command interpreter being used, the semi-colon may have to be
     escaped somehow to keep it from being interpreted by the command
     interpreter.  This can often be done by enclosing the entire
     list in quotes.

`all'
     If the keyword all is specified it means that all known output
     values should be displayed at the end of the test.  This can be
     a great deal of output.  As of version 2.5.0 there are 157
     different output selectors.

`?'
     If a "?" is given as the output selection, the list of all known
     output selectors will be displayed and no test will actually be
     run.  When passed to the `-O' option they will be listed one per
     line.  Otherwise they will be listed as a comma-separated list.
     It may be necessary to protect the "?" from the command
     interpreter by escaping it or enclosing it in quotes.

`no selector'
     If nothing is given to the `-k', `-o' or `-O' option then the
     code selects a default set of output selectors inspired by
     classic netperf output.  The format will be the `human readable'
     format emitted by the test-specific `-O' option.

   The order of evaluation will first check for an output selection.
If none is specified with the `-k', `-o' or `-O' option netperf will
select a default based on the characteristics of the test.  If there
is an output selection, the code will first check for `?', then check
to see if it is the magic `all' keyword.  After that it will check
for either `,' or `;' in the selection and take that to mean it is a
comma and/or semi-colon-separated list.  If none of those checks
match, netperf will then assume the output specification is a
filename and attempt to open and parse the file.
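That order of evaluation amounts to a simple dispatch, which can be
sketched as a shell case statement (a hypothetical illustration, not
netperf code):

```shell
# A sketch of the selector-dispatch order described above: "?" first,
# then the magic "all" keyword, then a comma/semi-colon list, and as
# a last resort the value is treated as a filename.
classify_selector() {
    case $1 in
        '')      echo "default"       ;;  # no selection: test-derived default
        '?')     echo "list"          ;;  # list known selectors, run no test
        all)     echo "all"           ;;  # every known output value
        *[,\;]*) echo "selector-list" ;;  # comma/semi-colon-separated list
        *)       echo "filename"      ;;  # open and parse as a file
    esac
}

classify_selector "THROUGHPUT;LOCAL_CPU_UTIL"   # prints selector-list
classify_selector my_selectors.txt              # prints filename
```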

* Menu:

* Omni Output Selectors::


File: netperf.info,  Node: Omni Output Selectors,  Prev: Omni Output Selection,  Up: Omni Output Selection

9.3.1 Omni Output Selectors
---------------------------

As of version 2.5.0 the output selectors are:

`OUTPUT_NONE'
     This is essentially a null output.  For `-k' output it will
     simply add a line that reads "OUTPUT_NONE=" to the output.  For
     `-o' it will cause an empty "column" to be included.  For `-O'
     output it will cause extra spaces to separate "real" output.

`SOCKET_TYPE'
     This will cause the socket type (eg SOCK_STREAM, SOCK_DGRAM) for
     the data connection to be output.

`PROTOCOL'
     This will cause the protocol used for the data connection to be
     displayed.

`DIRECTION'
     This will display the data flow direction relative to the
     netperf process.  Units: Send or Recv for a unidirectional
     bulk-transfer test, or Send|Recv for a request/response test.

`ELAPSED_TIME'
     This will display the elapsed time in seconds for the test.

`THROUGHPUT'
     This will display the throughput for the test.  Units: As
     requested via the global `-f' option and displayed by the
     THROUGHPUT_UNITS output selector.

`THROUGHPUT_UNITS'
     This will display the units for what is displayed by the
     `THROUGHPUT' output selector.

`LSS_SIZE_REQ'
     This will display the local (netperf) send socket buffer size
     (aka SO_SNDBUF) requested via the command line.  Units: Bytes.

`LSS_SIZE'
     This will display the local (netperf) send socket buffer size
     (SO_SNDBUF) immediately after the data connection socket was
     created.  Peculiarities of different networking stacks may lead
     to this differing from the size requested via the command line.
     Units: Bytes.

`LSS_SIZE_END'
     This will display the local (netperf) send socket buffer size
     (SO_SNDBUF) immediately before the data connection socket is
     closed.  Peculiarities of different networking stacks may lead
     this to differ from the size requested via the command line
     and/or the size immediately after the data connection socket was
     created.  Units: Bytes.

`LSR_SIZE_REQ'
     This will display the local (netperf) receive socket buffer size
     (aka SO_RCVBUF) requested via the command line.  Units: Bytes.

`LSR_SIZE'
     This will display the local (netperf) receive socket buffer size
     (SO_RCVBUF) immediately after the data connection socket was
     created.  Peculiarities of different networking stacks may lead
     to this differing from the size requested via the command line.
     Units: Bytes.

`LSR_SIZE_END'
     This will display the local (netperf) receive socket buffer size
     (SO_RCVBUF) immediately before the data connection socket is
     closed.  Peculiarities of different networking stacks may lead
     this to differ from the size requested via the command line
     and/or the size immediately after the data connection socket was
     created.  Units: Bytes.

`RSS_SIZE_REQ'
     This will display the remote (netserver) send socket buffer size
     (aka SO_SNDBUF) requested via the command line.  Units: Bytes.

`RSS_SIZE'
     This will display the remote (netserver) send socket buffer size
     (SO_SNDBUF) immediately after the data connection socket was
     created.  Peculiarities of different networking stacks may lead
     to this differing from the size requested via the command line.
     Units: Bytes.

`RSS_SIZE_END'
     This will display the remote (netserver) send socket buffer size
     (SO_SNDBUF) immediately before the data connection socket is
     closed.
     Peculiarities of different networking stacks may lead
     this to differ from the size requested via the command line
     and/or the size immediately after the data connection socket was
     created.  Units: Bytes.

`RSR_SIZE_REQ'
     This will display the remote (netserver) receive socket buffer
     size (aka SO_RCVBUF) requested via the command line.  Units:
     Bytes.

`RSR_SIZE'
     This will display the remote (netserver) receive socket buffer
     size (SO_RCVBUF) immediately after the data connection socket
     was created.  Peculiarities of different networking stacks may
     lead to this differing from the size requested via the command
     line.  Units: Bytes.

`RSR_SIZE_END'
     This will display the remote (netserver) receive socket buffer
     size (SO_RCVBUF) immediately before the data connection socket
     is closed.  Peculiarities of different networking stacks may
     lead this to differ from the size requested via the command line
     and/or the size immediately after the data connection socket was
     created.  Units: Bytes.

`LOCAL_SEND_SIZE'
     This will display the size of the buffers netperf passed in any
     "send" calls it made on the data connection for a
     non-request/response test.  Units: Bytes.

`LOCAL_RECV_SIZE'
     This will display the size of the buffers netperf passed in any
     "receive" calls it made on the data connection for a
     non-request/response test.  Units: Bytes.

`REMOTE_SEND_SIZE'
     This will display the size of the buffers netserver passed in
     any "send" calls it made on the data connection for a
     non-request/response test.  Units: Bytes.

`REMOTE_RECV_SIZE'
     This will display the size of the buffers netserver passed in
     any "receive" calls it made on the data connection for a
     non-request/response test.  Units: Bytes.

`REQUEST_SIZE'
     This will display the size of the requests netperf sent in a
     request/response test.  Units: Bytes.

`RESPONSE_SIZE'
     This will display the size of the responses netserver sent in a
     request/response test.  Units: Bytes.

`LOCAL_CPU_UTIL'
     This will display the overall CPU utilization during the test as
     measured by netperf.  Units: 0 to 100 percent.

`LOCAL_CPU_PERCENT_USER'
     This will display the CPU fraction spent in user mode during the
     test as measured by netperf.  Only supported by netcpu_procstat.
     Units: 0 to 100 percent.

`LOCAL_CPU_PERCENT_SYSTEM'
     This will display the CPU fraction spent in system mode during
     the test as measured by netperf.  Only supported by
     netcpu_procstat.  Units: 0 to 100 percent.

`LOCAL_CPU_PERCENT_IOWAIT'
     This will display the fraction of time waiting for I/O to
     complete during the test as measured by netperf.  Only supported
     by netcpu_procstat.  Units: 0 to 100 percent.

`LOCAL_CPU_PERCENT_IRQ'
     This will display the fraction of time servicing interrupts
     during the test as measured by netperf.  Only supported by
     netcpu_procstat.  Units: 0 to 100 percent.

`LOCAL_CPU_PERCENT_SWINTR'
     This will display the fraction of time servicing softirqs during
     the test as measured by netperf.  Only supported by
     netcpu_procstat.  Units: 0 to 100 percent.

`LOCAL_CPU_METHOD'
     This will display the method used by netperf to measure CPU
     utilization.  Units: single character denoting method.

`LOCAL_SD'
     This will display the service demand, or units of CPU consumed
     per unit of work, as measured by netperf.  Units: microseconds
     of CPU consumed per either KB (K==1024) of data transferred or
     request/response transaction.

`REMOTE_CPU_UTIL'
     This will display the overall CPU utilization during the test as
     measured by netserver.  Units: 0 to 100 percent.

`REMOTE_CPU_PERCENT_USER'
     This will display the CPU fraction spent in user mode during the
     test as measured by netserver.  Only supported by
     netcpu_procstat.  Units: 0 to 100 percent.

`REMOTE_CPU_PERCENT_SYSTEM'
     This will display the CPU fraction spent in system mode during
     the test as measured by netserver.  Only supported by
     netcpu_procstat.  Units: 0 to 100 percent.

`REMOTE_CPU_PERCENT_IOWAIT'
     This will display the fraction of time waiting for I/O to
     complete during the test as measured by netserver.  Only
     supported by netcpu_procstat.  Units: 0 to 100 percent.

`REMOTE_CPU_PERCENT_IRQ'
     This will display the fraction of time servicing interrupts
     during the test as measured by netserver.  Only supported by
     netcpu_procstat.  Units: 0 to 100 percent.

`REMOTE_CPU_PERCENT_SWINTR'
     This will display the fraction of time servicing softirqs during
     the test as measured by netserver.  Only supported by
     netcpu_procstat.  Units: 0 to 100 percent.

`REMOTE_CPU_METHOD'
     This will display the method used by netserver to measure CPU
     utilization.  Units: single character denoting method.

`REMOTE_SD'
     This will display the service demand, or units of CPU consumed
     per unit of work, as measured by netserver.  Units: microseconds
     of CPU consumed per either KB (K==1024) of data transferred or
     request/response transaction.

`SD_UNITS'
     This will display the units for LOCAL_SD and REMOTE_SD.

`CONFIDENCE_LEVEL'
     This will display the confidence level requested by the user
     either explicitly via the global `-I' option, or implicitly via
     the global `-i' option.
     The value will be either 95 or 99 if
     confidence intervals have been requested, or 0 if they were not.
     Units: Percent.

`CONFIDENCE_INTERVAL'
     This will display the width of the confidence interval requested
     either explicitly via the global `-I' option or implicitly via
     the global `-i' option.  Units: Width in percent of mean value
     computed.  A value of -1.0 means that confidence intervals were
     not requested.

`CONFIDENCE_ITERATION'
     This will display the number of test iterations netperf
     undertook, perhaps while attempting to achieve the requested
     confidence interval and level.  If confidence intervals were
     requested via the command line then the value will be between 3
     and 30.  If confidence intervals were not requested the value
     will be 1.  Units: Iterations.

`THROUGHPUT_CONFID'
     This will display the width of the confidence interval actually
     achieved for `THROUGHPUT' during the test.  Units: Width of
     interval as percentage of reported throughput value.

`LOCAL_CPU_CONFID'
     This will display the width of the confidence interval actually
     achieved for overall CPU utilization on the system running
     netperf (`LOCAL_CPU_UTIL') during the test, if CPU utilization
     measurement was enabled.  Units: Width of interval as percentage
     of reported CPU utilization.

`REMOTE_CPU_CONFID'
     This will display the width of the confidence interval actually
     achieved for overall CPU utilization on the system running
     netserver (`REMOTE_CPU_UTIL') during the test, if CPU
     utilization measurement was enabled.  Units: Width of interval
     as percentage of reported CPU utilization.

`TRANSACTION_RATE'
     This will display the transaction rate in transactions per
     second for a request/response test even if the user has
     requested a throughput in units of bits or bytes per second via
     the global `-f' option.
     It is undefined for a non-request/response
     test.  Units: Transactions per second.

`RT_LATENCY'
     This will display the average round-trip latency for a
     request/response test, accounting for the number of transactions
     in flight at one time.  It is undefined for a
     non-request/response test.  Units: Microseconds per transaction.

`BURST_SIZE'
     This will display the "burst size" or added transactions in
     flight in a request/response test as requested via a
     test-specific `-b' option.  The number of transactions in flight
     at one time will be one greater than this value.  It is
     undefined for a non-request/response test.  Units: added
     Transactions in flight.

`LOCAL_TRANSPORT_RETRANS'
     This will display the number of retransmissions experienced on
     the data connection during the test as determined by netperf.  A
     value of -1 means the attempt to determine the number of
     retransmissions failed, or the concept was not valid for the
     given protocol, or the mechanism is not known for the platform.
     A value of -2 means it was not attempted.  As of version 2.5.0
     the meanings of the values are in flux and subject to change.
     Units: number of retransmissions.

`REMOTE_TRANSPORT_RETRANS'
     This will display the number of retransmissions experienced on
     the data connection during the test as determined by netserver.
     A value of -1 means the attempt to determine the number of
     retransmissions failed, or the concept was not valid for the
     given protocol, or the mechanism is not known for the platform.
     A value of -2 means it was not attempted.  As of version 2.5.0
     the meanings of the values are in flux and subject to change.
     Units: number of retransmissions.

`TRANSPORT_MSS'
     This will display the Maximum Segment Size (aka MSS) or its
     equivalent for the protocol being used during the test.
     A value of -1 means either the concept of an MSS did not apply
     to the protocol being used, or there was an error in retrieving
     it.  Units: Bytes.

`LOCAL_SEND_THROUGHPUT'
     The throughput as measured by netperf for the successful "send"
     calls it made on the data connection.  Units: as requested via
     the global `-f' option and displayed via the `THROUGHPUT_UNITS'
     output selector.

`LOCAL_RECV_THROUGHPUT'
     The throughput as measured by netperf for the successful
     "receive" calls it made on the data connection.  Units: as
     requested via the global `-f' option and displayed via the
     `THROUGHPUT_UNITS' output selector.

`REMOTE_SEND_THROUGHPUT'
     The throughput as measured by netserver for the successful
     "send" calls it made on the data connection.  Units: as
     requested via the global `-f' option and displayed via the
     `THROUGHPUT_UNITS' output selector.

`REMOTE_RECV_THROUGHPUT'
     The throughput as measured by netserver for the successful
     "receive" calls it made on the data connection.  Units: as
     requested via the global `-f' option and displayed via the
     `THROUGHPUT_UNITS' output selector.

`LOCAL_CPU_BIND'
     The CPU to which netperf was bound, if at all, during the test.
     A value of -1 means that netperf was not explicitly bound to a
     CPU during the test.  Units: CPU ID.

`LOCAL_CPU_COUNT'
     The number of CPUs (cores, threads) detected by netperf.  Units:
     CPU count.

`LOCAL_CPU_PEAK_UTIL'
     The utilization of the CPU most heavily utilized during the
     test, as measured by netperf.  This can be used to see if any
     one CPU of a multi-CPU system was saturated even though the
     overall CPU utilization as reported by `LOCAL_CPU_UTIL' was low.
     Units: 0 to 100%.

`LOCAL_CPU_PEAK_ID'
     The id of the CPU most heavily utilized during the test as
     determined by netperf.  Units: CPU ID.
`LOCAL_CPU_MODEL'
     Model information for the processor(s) present on the system
     running netperf.  Assumes all processors in the system (as
     perceived by netperf) on which netperf is running are the same
     model.  Units: Text

`LOCAL_CPU_FREQUENCY'
     The frequency of the processor(s) on the system running netperf,
     at the time netperf made the call.  Assumes that all processors
     present in the system running netperf are running at the same
     frequency.  Units: MHz

`REMOTE_CPU_BIND'
     The CPU to which netserver was bound, if at all, during the test.
     A value of -1 means that netserver was not explicitly bound to a
     CPU during the test.  Units: CPU ID

`REMOTE_CPU_COUNT'
     The number of CPUs (cores, threads) detected by netserver.  Units:
     CPU count.

`REMOTE_CPU_PEAK_UTIL'
     The utilization of the CPU most heavily utilized during the test,
     as measured by netserver.  This can be used to see if any one CPU
     of a multi-CPU system was saturated even though the overall CPU
     utilization as reported by `REMOTE_CPU_UTIL' was low.  Units: 0 to
     100%

`REMOTE_CPU_PEAK_ID'
     The id of the CPU most heavily utilized during the test as
     determined by netserver.  Units: CPU ID.

`REMOTE_CPU_MODEL'
     Model information for the processor(s) present on the system
     running netserver.  Assumes all processors in the system (as
     perceived by netserver) on which netserver is running are the same
     model.  Units: Text

`REMOTE_CPU_FREQUENCY'
     The frequency of the processor(s) on the system running netserver,
     at the time netserver made the call.  Assumes that all processors
     present in the system running netserver are running at the same
     frequency.  Units: MHz

`SOURCE_PORT'
     The port ID/service name to which the data socket created by
     netperf was bound.
A value of 0 means the data socket was not
     explicitly bound to a port number.  Units: ASCII text.

`SOURCE_ADDR'
     The name/address to which the data socket created by netperf was
     bound.  A value of 0.0.0.0 means the data socket was not
     explicitly bound to an address.  Units: ASCII text.

`SOURCE_FAMILY'
     The address family to which the data socket created by netperf was
     bound.  A value of 0 means the data socket was not explicitly
     bound to a given address family.  Units: ASCII text.

`DEST_PORT'
     The port ID to which the data socket created by netserver was
     bound.  A value of 0 means the data socket was not explicitly
     bound to a port number.  Units: ASCII text.

`DEST_ADDR'
     The name/address of the data socket created by netserver.  Units:
     ASCII text.

`DEST_FAMILY'
     The address family to which the data socket created by netserver
     was bound.  A value of 0 means the data socket was not explicitly
     bound to a given address family.  Units: ASCII text.

`LOCAL_SEND_CALLS'
     The number of successful "send" calls made by netperf against its
     data socket.  Units: Calls.

`LOCAL_RECV_CALLS'
     The number of successful "receive" calls made by netperf against
     its data socket.  Units: Calls.

`LOCAL_BYTES_PER_RECV'
     The average number of bytes per "receive" call made by netperf
     against its data socket.  Units: Bytes.

`LOCAL_BYTES_PER_SEND'
     The average number of bytes per "send" call made by netperf
     against its data socket.  Units: Bytes.

`LOCAL_BYTES_SENT'
     The number of bytes successfully sent by netperf through its data
     socket.  Units: Bytes.

`LOCAL_BYTES_RECVD'
     The number of bytes successfully received by netperf through its
     data socket.  Units: Bytes.

`LOCAL_BYTES_XFERD'
     The sum of bytes sent and received by netperf through its data
     socket.
Units: Bytes.

`LOCAL_SEND_OFFSET'
     The offset from the alignment of the buffers passed by netperf in
     its "send" calls.  Specified via the global `-o' option and
     defaults to 0.  Units: Bytes.

`LOCAL_RECV_OFFSET'
     The offset from the alignment of the buffers passed by netperf in
     its "receive" calls.  Specified via the global `-o' option and
     defaults to 0.  Units: Bytes.

`LOCAL_SEND_ALIGN'
     The alignment of the buffers passed by netperf in its "send" calls
     as specified via the global `-a' option.  Defaults to 8.  Units:
     Bytes.

`LOCAL_RECV_ALIGN'
     The alignment of the buffers passed by netperf in its "receive"
     calls as specified via the global `-a' option.  Defaults to 8.
     Units: Bytes.

`LOCAL_SEND_WIDTH'
     The "width" of the ring of buffers through which netperf cycles as
     it makes its "send" calls.  Defaults to one more than the local
     send socket buffer size divided by the send size as determined at
     the time the data socket is created.  Can be used to make netperf
     more processor data cache unfriendly.  Units: number of buffers.

`LOCAL_RECV_WIDTH'
     The "width" of the ring of buffers through which netperf cycles as
     it makes its "receive" calls.  Defaults to one more than the local
     receive socket buffer size divided by the receive size as
     determined at the time the data socket is created.  Can be used to
     make netperf more processor data cache unfriendly.  Units: number
     of buffers.

`LOCAL_SEND_DIRTY_COUNT'
     The number of bytes to "dirty" (write to) before netperf makes a
     "send" call.  Specified via the global `-k' option, which requires
     that `--enable-dirty=yes' was specified with the configure command
     prior to building netperf.  Units: Bytes.

`LOCAL_RECV_DIRTY_COUNT'
     The number of bytes to "dirty" (write to) before netperf makes a
     "recv" call.
Specified via the global `-k' option, which requires
     that `--enable-dirty' was specified with the configure command
     prior to building netperf.  Units: Bytes.

`LOCAL_RECV_CLEAN_COUNT'
     The number of bytes netperf should read "cleanly" before making a
     "receive" call.  Specified via the global `-k' option, which
     requires that `--enable-dirty' was specified with the configure
     command prior to building netperf.  Clean reads start where dirty
     writes ended.  Units: Bytes.

`LOCAL_NODELAY'
     Indicates whether or not setting the test protocol-specific "no
     delay" (eg TCP_NODELAY) option on the data socket used by netperf
     was requested by the test-specific `-D' option and successful.
     Units: 0 means no, 1 means yes.

`LOCAL_CORK'
     Indicates whether or not TCP_CORK was set on the data socket used
     by netperf as requested via the test-specific `-C' option.  1
     means yes, 0 means no/not applicable.

`REMOTE_SEND_CALLS'

`REMOTE_RECV_CALLS'

`REMOTE_BYTES_PER_RECV'

`REMOTE_BYTES_PER_SEND'

`REMOTE_BYTES_SENT'

`REMOTE_BYTES_RECVD'

`REMOTE_BYTES_XFERD'

`REMOTE_SEND_OFFSET'

`REMOTE_RECV_OFFSET'

`REMOTE_SEND_ALIGN'

`REMOTE_RECV_ALIGN'

`REMOTE_SEND_WIDTH'

`REMOTE_RECV_WIDTH'

`REMOTE_SEND_DIRTY_COUNT'

`REMOTE_RECV_DIRTY_COUNT'

`REMOTE_RECV_CLEAN_COUNT'

`REMOTE_NODELAY'

`REMOTE_CORK'
     These are all like their "LOCAL_" counterparts, only for the
     netserver rather than netperf.

`LOCAL_SYSNAME'
     The name of the OS (eg "Linux") running on the system on which
     netperf was running.  Units: ASCII Text

`LOCAL_SYSTEM_MODEL'
     The model name of the system on which netperf was running.  Units:
     ASCII Text.
`LOCAL_RELEASE'
     The release name/number of the OS running on the system on which
     netperf was running.  Units: ASCII Text

`LOCAL_VERSION'
     The version number of the OS running on the system on which
     netperf was running.  Units: ASCII Text

`LOCAL_MACHINE'
     The machine architecture of the machine on which netperf was
     running.  Units: ASCII Text.

`REMOTE_SYSNAME'

`REMOTE_SYSTEM_MODEL'

`REMOTE_RELEASE'

`REMOTE_VERSION'

`REMOTE_MACHINE'
     These are all like their "LOCAL_" counterparts, only for the
     netserver rather than netperf.

`LOCAL_INTERFACE_NAME'
     The name of the probable egress interface through which the data
     connection went on the system running netperf.  Example: eth0.
     Units: ASCII Text.

`LOCAL_INTERFACE_VENDOR'
     The vendor ID of the probable egress interface through which
     traffic on the data connection went on the system running netperf.
     Units: Hexadecimal IDs as might be found in a `pci.ids' file or at
     the PCI ID Repository (http://pciids.sourceforge.net/).

`LOCAL_INTERFACE_DEVICE'
     The device ID of the probable egress interface through which
     traffic on the data connection went on the system running netperf.
     Units: Hexadecimal IDs as might be found in a `pci.ids' file or at
     the PCI ID Repository (http://pciids.sourceforge.net/).

`LOCAL_INTERFACE_SUBVENDOR'
     The sub-vendor ID of the probable egress interface through which
     traffic on the data connection went on the system running netperf.
     Units: Hexadecimal IDs as might be found in a `pci.ids' file or at
     the PCI ID Repository (http://pciids.sourceforge.net/).

`LOCAL_INTERFACE_SUBDEVICE'
     The sub-device ID of the probable egress interface through which
     traffic on the data connection went on the system running netperf.
Units: Hexadecimal IDs as might be found in a `pci.ids' file or at
     the PCI ID Repository (http://pciids.sourceforge.net/).

`LOCAL_DRIVER_NAME'
     The name of the driver used for the probable egress interface
     through which traffic on the data connection went on the system
     running netperf.  Units: ASCII Text.

`LOCAL_DRIVER_VERSION'
     The version string for the driver used for the probable egress
     interface through which traffic on the data connection went on the
     system running netperf.  Units: ASCII Text.

`LOCAL_DRIVER_FIRMWARE'
     The firmware version for the driver used for the probable egress
     interface through which traffic on the data connection went on the
     system running netperf.  Units: ASCII Text.

`LOCAL_DRIVER_BUS'
     The bus address of the probable egress interface through which
     traffic on the data connection went on the system running netperf.
     Units: ASCII Text.

`LOCAL_INTERFACE_SLOT'
     The slot ID of the probable egress interface through which traffic
     on the data connection went on the system running netperf.  Units:
     ASCII Text.

`REMOTE_INTERFACE_NAME'

`REMOTE_INTERFACE_VENDOR'

`REMOTE_INTERFACE_DEVICE'

`REMOTE_INTERFACE_SUBVENDOR'

`REMOTE_INTERFACE_SUBDEVICE'

`REMOTE_DRIVER_NAME'

`REMOTE_DRIVER_VERSION'

`REMOTE_DRIVER_FIRMWARE'

`REMOTE_DRIVER_BUS'

`REMOTE_INTERFACE_SLOT'
     These are all like their "LOCAL_" counterparts, only for the
     netserver rather than netperf.

`LOCAL_INTERVAL_USECS'
     The interval at which bursts of operations (sends, receives,
     transactions) were attempted by netperf.  Specified by the global
     `-w' option, which requires `--enable-intervals' to have been
     specified with the configure command prior to building netperf.
Units: Microseconds (though specified by default in milliseconds
     on the command line)

`LOCAL_INTERVAL_BURST'
     The number of operations (sends, receives, transactions depending
     on the test) which were attempted by netperf each
     LOCAL_INTERVAL_USECS units of time.  Specified by the global `-b'
     option, which requires `--enable-intervals' to have been specified
     with the configure command prior to building netperf.  Units:
     number of operations per burst.

`REMOTE_INTERVAL_USECS'
     The interval at which bursts of operations (sends, receives,
     transactions) were attempted by netserver.  Specified by the
     global `-w' option, which requires `--enable-intervals' to have
     been specified with the configure command prior to building
     netperf.  Units: Microseconds (though specified by default in
     milliseconds on the command line)

`REMOTE_INTERVAL_BURST'
     The number of operations (sends, receives, transactions depending
     on the test) which were attempted by netserver each
     REMOTE_INTERVAL_USECS units of time.  Specified by the global `-b'
     option, which requires `--enable-intervals' to have been specified
     with the configure command prior to building netperf.  Units:
     number of operations per burst.

`LOCAL_SECURITY_TYPE_ID'

`LOCAL_SECURITY_TYPE'

`LOCAL_SECURITY_ENABLED_NUM'

`LOCAL_SECURITY_ENABLED'

`LOCAL_SECURITY_SPECIFIC'

`REMOTE_SECURITY_TYPE_ID'

`REMOTE_SECURITY_TYPE'

`REMOTE_SECURITY_ENABLED_NUM'

`REMOTE_SECURITY_ENABLED'

`REMOTE_SECURITY_SPECIFIC'
     A bunch of stuff related to what sort of security mechanisms (eg
     SELinux) were enabled on the systems during the test.

`RESULT_BRAND'
     The string specified by the user with the global `-B' option.
     Units: ASCII Text.
`UUID'
     The universally unique identifier associated with this test,
     either generated automagically by netperf, or passed to netperf
     via an omni test-specific `-u' option.  Note: Future versions may
     make this a global command-line option.  Units: ASCII Text.

`MIN_LATENCY'
     The minimum "latency" or operation time (send, receive or
     request/response exchange depending on the test) as measured on
     the netperf side when the global `-j' option was specified.
     Units: Microseconds.

`MAX_LATENCY'
     The maximum "latency" or operation time (send, receive or
     request/response exchange depending on the test) as measured on
     the netperf side when the global `-j' option was specified.
     Units: Microseconds.

`P50_LATENCY'
     The 50th percentile value of "latency" or operation time (send,
     receive or request/response exchange depending on the test) as
     measured on the netperf side when the global `-j' option was
     specified.  Units: Microseconds.

`P90_LATENCY'
     The 90th percentile value of "latency" or operation time (send,
     receive or request/response exchange depending on the test) as
     measured on the netperf side when the global `-j' option was
     specified.  Units: Microseconds.

`P99_LATENCY'
     The 99th percentile value of "latency" or operation time (send,
     receive or request/response exchange depending on the test) as
     measured on the netperf side when the global `-j' option was
     specified.  Units: Microseconds.

`MEAN_LATENCY'
     The average "latency" or operation time (send, receive or
     request/response exchange depending on the test) as measured on
     the netperf side when the global `-j' option was specified.
     Units: Microseconds.
`STDDEV_LATENCY'
     The standard deviation of "latency" or operation time (send,
     receive or request/response exchange depending on the test) as
     measured on the netperf side when the global `-j' option was
     specified.  Units: Microseconds.

`COMMAND_LINE'
     The full command line used when invoking netperf.  Units: ASCII
     Text.

`OUTPUT_END'
     While emitted with the list of output selectors, it is ignored
     when specified as an output selector.


File: netperf.info, Node: Other Netperf Tests, Next: Address Resolution, Prev: The Omni Tests, Up: Top

10 Other Netperf Tests
**********************

Apart from the typical performance tests, netperf contains some tests
which can be used to streamline measurements and reporting.  These
include CPU rate calibration (present) and host identification (future
enhancement).

* Menu:

* CPU rate calibration::
* UUID Generation::


File: netperf.info, Node: CPU rate calibration, Next: UUID Generation, Prev: Other Netperf Tests, Up: Other Netperf Tests

10.1 CPU rate calibration
=========================

Some of the CPU utilization measurement mechanisms of netperf work by
comparing the rate at which some counter increments when the system is
idle with the rate at which that same counter increments when the
system is running a netperf test.  The ratio of those rates is used to
arrive at a CPU utilization percentage.

   This means that netperf must know the rate at which the counter
increments when the system is presumed to be "idle."  If it does not
know the rate, netperf will measure it before starting a data transfer
test.  This calibration step takes 40 seconds for each of the local or
remote systems, and if repeated for each netperf test would make taking
repeated measurements rather slow.
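The ratio computation described above reduces to simple arithmetic; a
minimal sketch in shell, using made-up counter rates (the variable
names and numbers are illustrations, not netperf internals):

```shell
# CPU utilization from the idle-counter method: the busier the system,
# the fewer cycles remain for the counter, so the loaded rate drops
# relative to the calibrated idle rate.
idle_rate=1000000     # counter increments/sec with the system idle
loaded_rate=250000    # counter increments/sec while the test runs
awk -v i="$idle_rate" -v l="$loaded_rate" \
    'BEGIN { printf "CPU utilization: %.1f%%\n", 100 * (1 - l / i) }'
```

With these illustrative numbers the counter slowed to a quarter of its
idle rate, so the computed utilization is 75%.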
Thus, the netperf CPU utilization options `-c' and `-C' can take
an optional calibration value.  This value is used as the "idle rate"
and the calibration step is not performed.  To determine the idle rate,
netperf can be used to run special tests which only report the value of
the calibration - they are the LOC_CPU and REM_CPU tests.  These return
the calibration value for the local and remote system respectively.  A
common way to use these tests is to store their results into an
environment variable and use that in subsequent netperf commands:

     LOC_RATE=`netperf -t LOC_CPU`
     REM_RATE=`netperf -H <remote> -t REM_CPU`
     netperf -H <remote> -c $LOC_RATE -C $REM_RATE ... -- ...
     ...
     netperf -H <remote> -c $LOC_RATE -C $REM_RATE ... -- ...

   If you are going to use netperf to measure aggregate results, it is
important to use the LOC_CPU and REM_CPU tests to get the calibration
values first, to avoid issues with some of the aggregate netperf tests
transferring data while others are "idle" and getting bogus calibration
values.  When running aggregate tests, it is very important to remember
that any one instance of netperf does not know about the other
instances of netperf.  It will report global CPU utilization and will
calculate service demand believing it was the only thing causing that
CPU utilization.  So, you can use the CPU utilization reported by
netperf in an aggregate test, but you have to calculate service demands
by hand.


File: netperf.info, Node: UUID Generation, Prev: CPU rate calibration, Up: Other Netperf Tests

10.2 UUID Generation
====================

Beginning with version 2.5.0 netperf can generate Universally Unique
IDentifiers (UUIDs).
This can be done explicitly via the "UUID" test:

     $ netperf -t UUID
     2c8561ae-9ebd-11e0-a297-0f5bfa0349d0

   In and of itself, this is not terribly useful, but used in
conjunction with the test-specific `-u' option of an "omni" test to set
the UUID emitted by the *note UUID: Omni Output Selectors. output
selector, it can be used to tie together the separate instances of an
aggregate netperf test - say, for instance, if the results were being
inserted into a database of some sort.


File: netperf.info, Node: Address Resolution, Next: Enhancing Netperf, Prev: Other Netperf Tests, Up: Top

11 Address Resolution
*********************

Netperf versions 2.4.0 and later have merged IPv4 and IPv6 tests, so
the functionality of the tests in `src/nettest_ipv6.c' has been
subsumed into the tests in `src/nettest_bsd.c'.  This has been
accomplished in part by switching from `gethostbyname()' to
`getaddrinfo()' exclusively.  While it was theoretically possible to
get multiple results for a hostname from `gethostbyname()', it was
generally unlikely and netperf's ignoring of the second and later
results was not much of an issue.

   Now, with `getaddrinfo()' and particularly with AF_UNSPEC, it is
increasingly likely that a given hostname will have multiple associated
addresses.  The `establish_control()' routine of `src/netlib.c' will
indeed attempt to choose from among all the matching IP addresses when
establishing the control connection.  Netperf does not _really_ care if
the control connection is IPv4 or IPv6 or even mixed on either end.

   However, the individual tests still ass-u-me that the first result
in the address list is the one to be used.  Whether or not this will
turn out to be an issue has yet to be determined.
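Until then, a defensive sketch is to hand netperf numeric addresses so
that no `getaddrinfo()' ordering decision matters for the data
connection.  The addresses below are placeholders, and the `echo'
makes this a dry run rather than an actual invocation:

```shell
# Pin both ends of the data connection to explicit numeric addresses
# via the test-specific -H and -L options (placeholder addresses).
REMOTE=192.168.2.10   # remote control/data address (placeholder)
LOCAL=192.168.2.20    # local data address (placeholder)
echo netperf -H "$REMOTE" -t TCP_RR -- -H "$REMOTE" -L "$LOCAL"
```

Dropping the `echo' would run the test; with numeric addresses there
is nothing for the resolver to reorder.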
If you do run into problems with this, the easiest workaround is to
specify IP addresses for the data connection explicitly in the
test-specific `-H' and `-L' options.  At some point, the netperf tests
_may_ try to be more sophisticated in their parsing of returns from
`getaddrinfo()' - straw-man patches to <netperf-feedback@netperf.org>
would of course be most welcome :)

   Netperf has leveraged code from other open-source projects with
amenable licensing to provide a replacement `getaddrinfo()' call on
those platforms where the `configure' script believes there is no
native getaddrinfo call.  As of this writing, the replacement
`getaddrinfo()' has been tested on HP-UX 11.0 and then presumed to run
elsewhere.


File: netperf.info, Node: Enhancing Netperf, Next: Netperf4, Prev: Address Resolution, Up: Top

12 Enhancing Netperf
********************

Netperf is constantly evolving.  If you find you want to make
enhancements to netperf, by all means do so.  If you wish to add a new
"suite" of tests to netperf the general idea is to:

  1. Add files `src/nettest_mumble.c' and `src/nettest_mumble.h', where
     mumble is replaced with something meaningful for the test-suite.

  2. Add support for an appropriate `--enable-mumble' option in
     `configure.ac'.

  3. Edit `src/netperf.c', `netsh.c', and `netserver.c' as required,
     using #ifdef WANT_MUMBLE.

  4. Compile and test.

   However, with the addition of the "omni" tests in version 2.5.0 it
is preferred that one attempt to make the necessary changes to
`src/nettest_omni.c' rather than adding new source files, unless this
would make the omni tests entirely too complicated.

   If you wish to submit your changes for possible inclusion into the
mainline sources, please try to base your changes on the latest
available sources.
(*Note Getting Netperf Bits::.)  Then send email
describing the changes at a high level to
<netperf-feedback@netperf.org> or perhaps <netperf-talk@netperf.org>.
If the consensus is positive, then sending context `diff' results to
<netperf-feedback@netperf.org> is the next step.  From that point, it
is a matter of pestering the Netperf Contributing Editor until he gets
the changes incorporated :)


File: netperf.info, Node: Netperf4, Next: Concept Index, Prev: Enhancing Netperf, Up: Top

13 Netperf4
***********

Netperf4 is the shorthand name given to version 4.X.X of netperf.  This
is really a separate benchmark more than a newer version of netperf,
but it is a descendant of netperf so the netperf name is kept.  The
facetious way to describe netperf4 is to say it is the
egg-laying-woolly-milk-pig version of netperf :) The more respectful
way to describe it is to say it is the version of netperf with support
for synchronized, multiple-thread, multiple-test, multiple-system,
network-oriented benchmarking.

   Netperf4 is still undergoing evolution.  Those wishing to work with
or on netperf4 are encouraged to join the netperf-dev
(http://www.netperf.org/cgi-bin/mailman/listinfo/netperf-dev) mailing
list and/or peruse the current sources
(http://www.netperf.org/svn/netperf4/trunk).


File: netperf.info, Node: Concept Index, Next: Option Index, Prev: Netperf4, Up: Top

Concept Index
*************

[index]
* Menu:

* Aggregate Performance: Using Netperf to Measure Aggregate Performance. (line 6)
* Bandwidth Limitation: Installing Netperf Bits. (line 64)
* Connection Latency: TCP_CC. (line 6)
* CPU Utilization: CPU Utilization. (line 6)
* Design of Netperf: The Design of Netperf. (line 6)
* Installation: Installing Netperf.
(line 6)
* Introduction: Introduction. (line 6)
* Latency, Connection Establishment <1>: XTI_TCP_CRR. (line 6)
* Latency, Connection Establishment <2>: XTI_TCP_CC. (line 6)
* Latency, Connection Establishment <3>: TCP_CRR. (line 6)
* Latency, Connection Establishment: TCP_CC. (line 6)
* Latency, Request-Response <1>: SCTP_RR. (line 6)
* Latency, Request-Response <2>: DLCO_RR. (line 6)
* Latency, Request-Response <3>: DLCL_RR. (line 6)
* Latency, Request-Response <4>: XTI_UDP_RR. (line 6)
* Latency, Request-Response <5>: XTI_TCP_CRR. (line 6)
* Latency, Request-Response <6>: XTI_TCP_RR. (line 6)
* Latency, Request-Response <7>: UDP_RR. (line 6)
* Latency, Request-Response <8>: TCP_CRR. (line 6)
* Latency, Request-Response: TCP_RR. (line 6)
* Limiting Bandwidth <1>: UDP_STREAM. (line 9)
* Limiting Bandwidth: Installing Netperf Bits. (line 64)
* Measuring Latency: TCP_RR. (line 6)
* Packet Loss: UDP_RR. (line 6)
* Port Reuse: TCP_CC. (line 13)
* TIME_WAIT: TCP_CC. (line 13)


File: netperf.info, Node: Option Index, Prev: Concept Index, Up: Top

Option Index
************

[index]
* Menu:

* --enable-burst, Configure: Using Netperf to Measure Aggregate Performance. (line 6)
* --enable-cpuutil, Configure: Installing Netperf Bits. (line 24)
* --enable-dlpi, Configure: Installing Netperf Bits. (line 30)
* --enable-histogram, Configure: Installing Netperf Bits. (line 64)
* --enable-intervals, Configure: Installing Netperf Bits. (line 64)
* --enable-omni, Configure: Installing Netperf Bits. (line 36)
* --enable-sctp, Configure: Installing Netperf Bits. (line 30)
* --enable-unixdomain, Configure: Installing Netperf Bits. (line 30)
* --enable-xti, Configure: Installing Netperf Bits. (line 30)
* -4, Global: Global Options.
(line 489)
* -4, Test-specific <1>: Options Common to TCP UDP and SCTP _RR tests. (line 88)
* -4, Test-specific: Options common to TCP UDP and SCTP tests. (line 110)
* -6 Test-specific: Options Common to TCP UDP and SCTP _RR tests. (line 94)
* -6, Global: Global Options. (line 498)
* -6, Test-specific: Options common to TCP UDP and SCTP tests. (line 116)
* -A, Global: Global Options. (line 18)
* -a, Global: Global Options. (line 6)
* -B, Global: Global Options. (line 29)
* -b, Global: Global Options. (line 22)
* -C, Global: Global Options. (line 42)
* -c, Global: Global Options. (line 33)
* -c, Test-specific: Native Omni Tests. (line 13)
* -D, Global: Global Options. (line 56)
* -d, Global: Global Options. (line 47)
* -d, Test-specific: Native Omni Tests. (line 17)
* -F, Global: Global Options. (line 76)
* -f, Global: Global Options. (line 67)
* -H, Global: Global Options. (line 95)
* -h, Global: Global Options. (line 91)
* -H, Test-specific: Options Common to TCP UDP and SCTP _RR tests. (line 17)
* -h, Test-specific <1>: Options Common to TCP UDP and SCTP _RR tests. (line 10)
* -h, Test-specific: Options common to TCP UDP and SCTP tests. (line 10)
* -i, Global: Global Options. (line 179)
* -I, Global: Global Options. (line 130)
* -j, Global: Global Options. (line 205)
* -k, Test-specific: Native Omni Tests. (line 37)
* -L, Global: Global Options. (line 263)
* -l, Global: Global Options. (line 242)
* -L, Test-specific <1>: Options Common to TCP UDP and SCTP _RR tests. (line 26)
* -L, Test-specific: Options common to TCP UDP and SCTP tests. (line 25)
* -M, Test-specific: Options common to TCP UDP and SCTP tests. (line 48)
* -m, Test-specific: Options common to TCP UDP and SCTP tests. (line 32)
* -N, Global: Global Options.
(line 293)
* -n, Global: Global Options. (line 275)
* -O, Global: Global Options. (line 338)
* -o, Global: Global Options. (line 329)
* -O, Test-specific: Native Omni Tests. (line 62)
* -o, Test-specific: Native Omni Tests. (line 50)
* -P, Global: Global Options. (line 363)
* -p, Global: Global Options. (line 343)
* -P, Test-specific <1>: Options Common to TCP UDP and SCTP _RR tests. (line 33)
* -P, Test-specific: Options common to TCP UDP and SCTP tests. (line 61)
* -r, Test-specific: Options Common to TCP UDP and SCTP _RR tests. (line 36)
* -S Test-specific: Options common to TCP UDP and SCTP tests. (line 87)
* -S, Global: Global Options. (line 381)
* -s, Global: Global Options. (line 372)
* -S, Test-specific: Options Common to TCP UDP and SCTP _RR tests. (line 68)
* -s, Test-specific <1>: Options Common to TCP UDP and SCTP _RR tests. (line 48)
* -s, Test-specific: Options common to TCP UDP and SCTP tests. (line 64)
* -T, Global: Global Options. (line 423)
* -t, Global: Global Options. (line 391)
* -T, Test-specific: Native Omni Tests. (line 81)
* -t, Test-specific: Native Omni Tests. (line 76)
* -V, Global: Global Options. (line 468)
* -v, Global: Global Options. (line 440)
* -W, Global: Global Options. (line 480)
* -w, Global: Global Options.
(line 473)


Tag Table:
Node: Top439
Node: Introduction1476
Node: Conventions4150
Node: Installing Netperf5913
Node: Getting Netperf Bits7467
Node: Installing Netperf Bits9326
Node: Verifying Installation17820
Node: The Design of Netperf18524
Node: CPU Utilization20120
Node: CPU Utilization in a Virtual Guest28844
Node: Global Command-line Options30431
Node: Command-line Options Syntax30970
Node: Global Options32366
Node: Using Netperf to Measure Bulk Data Transfer56529
Node: Issues in Bulk Transfer57202
Node: Options common to TCP UDP and SCTP tests61463
Node: TCP_STREAM67788
Node: TCP_MAERTS71873
Node: TCP_SENDFILE73110
Node: UDP_STREAM75610
Node: XTI_TCP_STREAM79046
Node: XTI_UDP_STREAM79691
Node: SCTP_STREAM80336
Node: DLCO_STREAM81036
Node: DLCL_STREAM83009
Node: STREAM_STREAM83883
Node: DG_STREAM84741
Node: Using Netperf to Measure Request/Response85422
Node: Issues in Request/Response87740
Node: Options Common to TCP UDP and SCTP _RR tests90114
Node: TCP_RR95138
Node: TCP_CC97538
Node: TCP_CRR99772
Node: UDP_RR100834
Node: XTI_TCP_RR103138
Node: XTI_TCP_CC103721
Node: XTI_TCP_CRR104226
Node: XTI_UDP_RR104738
Node: DLCL_RR105315
Node: DLCO_RR105468
Node: SCTP_RR105620
Node: Using Netperf to Measure Aggregate Performance105756
Node: Running Concurrent Netperf Tests106788
Node: Issues in Running Concurrent Tests111429
Node: Using --enable-burst113693
Node: Using --enable-demo120592
Node: Using Netperf to Measure Bidirectional Transfer126148
Node: Bidirectional Transfer with Concurrent Tests127280
Node: Bidirectional Transfer with TCP_RR129636
Node: Implications of Concurrent Tests vs Burst Request/Response132020
Node: The Omni Tests133834
Node: Native Omni Tests134881
Node: Migrated Tests140159
Node: Omni Output Selection142264
Node: Omni Output Selectors145247
Node: Other Netperf Tests174980
Node: CPU rate calibration175415
Node: UUID Generation177783
Node: Address Resolution178499
Node: Enhancing Netperf180475
Node: Netperf4181970
Node: Concept Index182875
Node: Option Index185201

End Tag Table