\input texinfo   @c -*-texinfo-*-
@c %**start of header
@setfilename netperf.info
@settitle Care and Feeding of Netperf 2.6.X
@c %**end of header

@copying
This is Rick Jones' feeble attempt at a Texinfo-based manual for the
netperf benchmark.

Copyright @copyright{} 2005-2012 Hewlett-Packard Company
@quotation
Permission is granted to copy, distribute and/or modify this document
per the terms of the netperf source license, a copy of which can be
found in the file @file{COPYING} of the basic netperf distribution.
@end quotation
@end copying

@titlepage
@title Care and Feeding of Netperf
@subtitle Versions 2.6.0 and Later
@author Rick Jones @email{rick.jones2@@hp.com}
@c this is here to start the copyright page
@page
@vskip 0pt plus 1filll
@insertcopying
@end titlepage

@c begin with a table of contents
@contents

@ifnottex
@node Top, Introduction, (dir), (dir)
@top Netperf Manual

@insertcopying
@end ifnottex

@menu
* Introduction::                An introduction to netperf - what it
                                is and what it is not.
* Installing Netperf::          How to go about installing netperf.
* The Design of Netperf::
* Global Command-line Options::
* Using Netperf to Measure Bulk Data Transfer::
* Using Netperf to Measure Request/Response ::
* Using Netperf to Measure Aggregate Performance::
* Using Netperf to Measure Bidirectional Transfer::
* The Omni Tests::
* Other Netperf Tests::
* Address Resolution::
* Enhancing Netperf::
* Netperf4::
* Concept Index::
* Option Index::
@end menu

@node Introduction, Installing Netperf, Top, Top
@chapter Introduction

@cindex Introduction

Netperf is a benchmark that can be used to measure various aspects of
networking performance.  The primary foci are bulk (aka
unidirectional) data transfer and request/response performance using
either TCP or UDP and the Berkeley Sockets interface.  As of this
writing, the tests available either unconditionally or conditionally
include:

@itemize @bullet
@item
TCP and UDP unidirectional transfer and request/response over IPv4 and
IPv6 using the Sockets interface.
@item
TCP and UDP unidirectional transfer and request/response over IPv4
using the XTI interface.
@item
Link-level unidirectional transfer and request/response using the DLPI
interface.
@item
Unix domain sockets
@item
SCTP unidirectional transfer and request/response over IPv4 and IPv6
using the sockets interface.
@end itemize

While not every revision of netperf will work on every platform
listed, the intention is that at least some version of netperf will
work on the following platforms:

@itemize @bullet
@item
Unix - at least all the major variants.
@item
Linux
@item
Windows
@item
Others
@end itemize

Netperf is maintained and informally supported primarily by Rick
Jones, who can perhaps be best described as Netperf Contributing
Editor.  Non-trivial and very appreciated assistance comes from others
in the network performance community, who are too numerous to mention
here. While it is often used by them, netperf is NOT supported via any
of the formal Hewlett-Packard support channels.  You should feel free
to make enhancements and modifications to netperf to suit your
nefarious porpoises, so long as you stay within the guidelines of the
netperf copyright.  If you feel so inclined, you can send your changes
to
@email{netperf-feedback@@netperf.org,netperf-feedback} for possible
inclusion into subsequent versions of netperf.

It is the Contributing Editor's belief that the netperf license walks
like open source and talks like open source. However, the license was
never submitted for ``certification'' as an open source license.  If
you would prefer to make contributions to a networking benchmark using
a certified open source license, please consider netperf4, which is
distributed under the terms of the GPLv2.

The @email{netperf-talk@@netperf.org,netperf-talk} mailing list is
available to discuss the care and feeding of netperf with others who
share your interest in network performance benchmarking. The
netperf-talk mailing list is a closed list (to deal with spam) and you
must first subscribe by sending email to
@email{netperf-talk-request@@netperf.org,netperf-talk-request}.


@menu
* Conventions::
@end menu

@node Conventions,  , Introduction, Introduction
@section Conventions

A @dfn{sizespec} is a one- or two-item, comma-separated list used as an
argument to a command-line option that can set one or two related
netperf parameters.  If you wish to set both parameters to separate
values, items should be separated by a comma:

@example
parameter1,parameter2
@end example

If you wish to set the first parameter without altering the value of
the second from its default, you should follow the first item with a
comma:

@example
parameter1,
@end example

Likewise, precede the item with a comma if you wish to set only the
second parameter:

@example
,parameter2
@end example

An item with no commas:

@example
parameter1and2
@end example

will set both parameters to the same value.  This last mode is one of
the most frequently used.

There is another variant of the comma-separated, two-item list called
an @dfn{optionspec}, which is like a sizespec with the exception that a
single item with no comma:

@example
parameter1
@end example

will only set the value of the first parameter and will leave the
second parameter at its default value.
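
The sizespec and optionspec rules above can be sketched as a small
shell function.  This is only an illustration of the rules, not
netperf's own parser; the name @code{parse_spec} and its argument
order are made up for this example.

@example
# parse_spec KIND SPEC DEFAULT1 DEFAULT2
# KIND is "sizespec" or "optionspec" (hypothetical helper, not part
# of netperf).  Prints the resulting "first second" pair.
parse_spec() (
    kind=$1 spec=$2 first=$3 second=$4
    case $spec in
        *,*)
            # Items around the comma; an empty item keeps its default.
            left=$(printf '%s' "$spec" | cut -d, -f1)
            right=$(printf '%s' "$spec" | cut -d, -f2)
            if [ -n "$left" ]; then first=$left; fi
            if [ -n "$right" ]; then second=$right; fi
            ;;
        *)
            # No comma: both kinds set the first parameter, but only
            # a sizespec also sets the second.
            first=$spec
            if [ "$kind" = sizespec ]; then second=$spec; fi
            ;;
    esac
    echo "$first $second"
)

parse_spec sizespec 32768, 1 2
parse_spec optionspec 8192 1 2
@end example

With the placeholder defaults 1 and 2, the first call prints
@code{32768 2} and the second prints @code{8192 2}.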

Netperf has two types of command-line options.  The first are global
command-line options.  They are essentially any option not tied to a
particular test or group of tests.  An example of a global
command-line option is the one which sets the test type - @option{-t}.

The second are test-specific options.  These are options which are
only applicable to a particular test or set of tests.  An example of
a test-specific option would be the send socket buffer size for a
TCP_STREAM test.

Global command-line options are specified first, with test-specific
options following after a @code{--} as in:

@example
netperf <global> -- <test-specific>
@end example
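
As a concrete sketch, the following invocation passes the test type,
remote host and test length as global options, and a local socket
buffer sizespec as a test-specific option.  The host name
@code{remotehost} and the sizes are placeholders chosen for
illustration, and the example assumes that @option{-l} and
@option{-s} behave as described in the chapters on global and
test-specific options.

@example
netperf -t TCP_STREAM -H remotehost -l 30 -- -s 32768,65536
@end example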


@node Installing Netperf, The Design of Netperf, Introduction, Top
@chapter Installing Netperf

@cindex Installation

Netperf's primary form of distribution is source code.  This allows
installation on systems other than those to which the authors have
ready access and thus the ability to create binaries.  There are two
styles of netperf installation.  The first runs the netperf server
program - netserver - as a child of inetd.  This requires the
installer to have sufficient privileges to edit the files
@file{/etc/services} and @file{/etc/inetd.conf} or their
platform-specific equivalents.

The second style is to run netserver as a standalone daemon.  This
second method does not require edit privileges on @file{/etc/services}
and @file{/etc/inetd.conf} but does mean you must remember to run the
netserver program explicitly after every system reboot.

This manual assumes that those wishing to measure networking
performance already know how to use anonymous FTP and/or a web
browser. It is also expected that you have at least a passing
familiarity with the networking protocols and interfaces involved. In
all honesty, if you do not have such familiarity, likely as not you
have some experience to gain before attempting network performance
measurements.  The excellent texts by authors such as Stevens, Fenner
and Rudoff and/or Stallings would be good starting points. There are
likely other excellent sources out there as well.

@menu
* Getting Netperf Bits::
* Installing Netperf Bits::
* Verifying Installation::
@end menu

@node Getting Netperf Bits, Installing Netperf Bits, Installing Netperf, Installing Netperf
@section Getting Netperf Bits

Gzipped tar files of netperf sources can be retrieved via
@uref{ftp://ftp.netperf.org/netperf,anonymous FTP}
for ``released'' versions of the bits.  Pre-release versions of the
bits can be retrieved via anonymous FTP from the
@uref{ftp://ftp.netperf.org/netperf/experimental,experimental} subdirectory.

For convenience and ease of remembering, a link to the download site
is provided via the
@uref{http://www.netperf.org/, NetperfPage}.

The bits corresponding to each discrete release of netperf are
@uref{http://www.netperf.org/svn/netperf2/tags,tagged} for retrieval
via subversion.  For example, there is a tag for the first version
corresponding to this version of the manual -
@uref{http://www.netperf.org/svn/netperf2/tags/netperf-2.6.0,netperf
2.6.0}.  Those wishing to be on the bleeding edge of netperf
development can use subversion to grab the
@uref{http://www.netperf.org/svn/netperf2/trunk,top of trunk}.  When
fixing bugs or making enhancements, patches against the top-of-trunk
are preferred.

There are likely other places around the Internet from which one can
download netperf bits.  These may be simple mirrors of the main
netperf site, or they may be local variants on netperf.  As with
anything one downloads from the Internet, take care to make sure it is
what you really wanted and isn't some malicious Trojan or whatnot.
Caveat downloader.

As a general rule, binaries of netperf and netserver are not
distributed from ftp.netperf.org.  From time to time a kind soul or
souls has packaged netperf as a Debian package available via the
apt-get mechanism or as an RPM.  I would be most interested in
learning how to enhance the makefiles to make that easier for people.

@node Installing Netperf Bits, Verifying Installation, Getting Netperf Bits, Installing Netperf
@section Installing Netperf

Once you have downloaded the tar file of netperf sources onto your
system(s), it is necessary to unpack the tar file, cd to the netperf
directory, run configure and then make.  Most of the time it should be
sufficient to just:

@example
gzcat netperf-<version>.tar.gz | tar xf -
cd netperf-<version>
./configure
make
make install
@end example

Most of the ``usual'' configure script options should be present
dealing with where to install binaries and whatnot.
@example
./configure --help
@end example
should list all of those and more.  You may find the @code{--prefix}
option helpful in deciding where the binaries and such will be put
during the @code{make install}.

@vindex --enable-cpuutil, Configure
If the netperf configure script does not know how to automagically
detect which CPU utilization mechanism to use on your platform, you may
want to add a @code{--enable-cpuutil=mumble} option to the configure
command.   If you have knowledge and/or experience to contribute to
that area, feel free to contact @email{netperf-feedback@@netperf.org}.

@vindex --enable-xti, Configure
@vindex --enable-unixdomain, Configure
@vindex --enable-dlpi, Configure
@vindex --enable-sctp, Configure
Similarly, if you want tests using the XTI interface, Unix Domain
Sockets, DLPI or SCTP, it will be necessary to add one or more
@code{--enable-[xti|unixdomain|dlpi|sctp]=yes} options to the configure
command.  As of this writing, the configure script will not include
those tests automagically.

@vindex --enable-omni, Configure
Starting with version 2.5.0, netperf began migrating most of the
``classic'' netperf tests found in @file{src/nettest_bsd.c} to the
so-called ``omni'' tests (aka ``two routines to run them all'') found
in @file{src/nettest_omni.c}.  This migration enables a number of new
features such as greater control over what output is included, and new
things to output.  The ``omni'' test is enabled by default in 2.5.0
and a number of the classic tests are migrated - you can tell if a
test has been migrated from the presence of @code{MIGRATED} in the
test banner.  If you encounter problems with either the omni or
migrated tests, please first attempt to obtain resolution via
@email{netperf-talk@@netperf.org} or
@email{netperf-feedback@@netperf.org}.  If that is unsuccessful, you
can add a @code{--enable-omni=no} to the configure command and the
omni tests will not be compiled-in and the classic tests will not be
migrated.

Starting with version 2.5.0, netperf includes the ``burst mode''
functionality in a default compilation of the bits.  If you encounter
problems with this, please first attempt to obtain help via
@email{netperf-talk@@netperf.org} or
@email{netperf-feedback@@netperf.org}.  If that is unsuccessful, you
can add a @code{--enable-burst=no} to the configure command and the
burst mode functionality will not be compiled-in.

On some platforms, it may be necessary to precede the configure
command with a CFLAGS and/or LIBS variable as the netperf configure
script is not yet smart enough to set them itself.  Whenever possible,
these requirements will be found in @file{README.@var{platform}} files.
Expertise and assistance in making that more automagic in the
configure script would be most welcome.
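
Putting the options above together, a configure run might look like
the following sketch.  The installation prefix and the compiler flag
are placeholders, and which @code{--enable} options make sense depends
on your platform and on the tests you want.

@example
CFLAGS="-O2" ./configure --prefix=/opt/netperf \
    --enable-unixdomain=yes --enable-burst=no
@end example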

@cindex Limiting Bandwidth
@cindex Bandwidth Limitation
@vindex --enable-intervals, Configure
@vindex --enable-histogram, Configure
Other optional configure-time settings include
@code{--enable-intervals=yes} to give netperf the ability to ``pace''
its _STREAM tests and @code{--enable-histogram=yes} to have netperf
keep a histogram of interesting times.  Each of these will have some
effect on the measured result.  If your system supports
@code{gethrtime()} the effect of the histogram measurement should be
minimized but probably still measurable.  For example, the histogram
of a netperf TCP_RR test will be of the individual transaction times:
@example
netperf -t TCP_RR -H lag -v 2
TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET : histogram
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

16384  87380  1        1       10.00    3538.82
32768  32768
Alignment      Offset
Local  Remote  Local  Remote
Send   Recv    Send   Recv
    8      0       0      0
Histogram of request/response times
UNIT_USEC     :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
TEN_USEC      :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
HUNDRED_USEC  :    0: 34480:  111:   13:   12:    6:    9:    3:    4:    7
UNIT_MSEC     :    0:   60:   50:   51:   44:   44:   72:  119:  100:  101
TEN_MSEC      :    0:  105:    0:    0:    0:    0:    0:    0:    0:    0
HUNDRED_MSEC  :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
UNIT_SEC      :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
TEN_SEC       :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
>100_SECS: 0
HIST_TOTAL:      35391
@end example

The histogram you see above is basically a base-10 log histogram where
we can see that most of the transaction times were on the order of one
hundred to one hundred ninety-nine microseconds, but they were
occasionally as long as ten to nineteen milliseconds.
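
To make the bucket layout concrete, here is a minimal sketch of how a
time in microseconds maps to a row and column of such a base-10 log
histogram.  It is inferred from the output format above rather than
taken from the netperf sources, and the function name
@code{hist_bucket} is made up for this example.

@example
# hist_bucket USEC
# Print the histogram row and column (0-9) for a time in microseconds.
hist_bucket() (
    usec=$1
    unit=1
    for row in UNIT_USEC TEN_USEC HUNDRED_USEC UNIT_MSEC TEN_MSEC \
               HUNDRED_MSEC UNIT_SEC TEN_SEC
    do
        # Each row spans one decade, up to but not including 10*unit.
        if [ "$usec" -lt $((unit * 10)) ]; then
            echo "$row $((usec / unit))"
            return 0
        fi
        unit=$((unit * 10))
    done
    # Anything of 100 seconds or more lands in the overflow counter.
    echo ">100_SECS"
)

hist_bucket 150
hist_bucket 12000
@end example

A 150 microsecond transaction lands in column 1 of the
@code{HUNDRED_USEC} row, and a 12 millisecond transaction in column 1
of the @code{TEN_MSEC} row, which are the two populated regions called
out above.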

The @option{--enable-demo=yes} configure option will cause code to be
included to report interim results during a test run.  The rate at
which interim results are reported can then be controlled via the
global @option{-D} option.  Here is an example of @option{-D} output:

@example
$ src/netperf -D 1.35 -H tardy.hpl.hp.com -f M
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to tardy.hpl.hp.com (15.9.116.144) port 0 AF_INET : demo
Interim result:    5.41 MBytes/s over 1.35 seconds ending at 1308789765.848
Interim result:   11.07 MBytes/s over 1.36 seconds ending at 1308789767.206
Interim result:   16.00 MBytes/s over 1.36 seconds ending at 1308789768.566
Interim result:   20.66 MBytes/s over 1.36 seconds ending at 1308789769.922
Interim result:   22.74 MBytes/s over 1.36 seconds ending at 1308789771.285
Interim result:   23.07 MBytes/s over 1.36 seconds ending at 1308789772.647
Interim result:   23.77 MBytes/s over 1.37 seconds ending at 1308789774.016
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    MBytes/sec

 87380  16384  16384    10.06      17.81
@end example

Notice how the units of the interim result track that requested by the
@option{-f} option.  Also notice that sometimes the interval will be
longer than the value specified in the @option{-D} option.  This is
normal and stems from how demo mode is implemented: it does not rely
on interval timers or frequent calls to get the current time, but
instead calculates how many units of work must be performed to take at
least the desired interval.

Those familiar with this option in earlier versions of netperf will
note the addition of the ``ending at'' text.  This is the time as
reported by a @code{gettimeofday()} call (or its emulation) with a
@code{NULL} timezone pointer.  This addition is intended to make it
easier to insert interim results into an
@uref{http://oss.oetiker.ch/rrdtool/doc/rrdtool.en.html,rrdtool}
Round-Robin Database (RRD).  A likely bug-riddled example of doing so
can be found in @file{doc/examples/netperf_interim_to_rrd.sh}.  The
time is reported out to milliseconds rather than microseconds because
that is the most rrdtool understands as of the time of this writing.
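
As a rough sketch of the first step in such a pipeline, the following
function pulls the timestamp and value out of each interim result
line.  It is not the script shipped in @file{doc/examples}; the name
@code{interim_to_pairs} is made up here, and the
@code{timestamp:value} output form is an assumption about what the
consumer expects.

@example
# interim_to_pairs
# Read netperf -D output on stdin and print "timestamp:value" for
# each interim result line; all other lines are ignored.
interim_to_pairs() (
    while read -r w1 w2 value units over secs seconds ending at stamp
    do
        if [ "$w1 $w2" = "Interim result:" ]; then
            printf '%s:%s\n' "$stamp" "$value"
        fi
    done
)
@end example

Fed the first interim line of the example output above, it prints
@code{1308789765.848:5.41}.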

As of this writing, a @code{make install} will not actually update the
files @file{/etc/services} and/or @file{/etc/inetd.conf} or their
platform-specific equivalents.  It remains necessary to perform that
bit of installation magic by hand.  Patches to the makefile sources to
effect an automagic editing of the necessary files to have netperf
installed as a child of inetd would be most welcome.
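
A sketch of those manual edits follows, assuming netserver was
installed in @file{/usr/local/bin} and listens on port 12865, the port
shown in the netserver startup message.  The path and the user are
assumptions; adjust them for your system and check your platform's
inetd documentation for the exact field layout.

A line for @file{/etc/services}:
@example
netperf         12865/tcp
@end example

A line for @file{/etc/inetd.conf}:
@example
netperf stream tcp nowait root /usr/local/bin/netserver netserver
@end example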

Starting the netserver as a standalone daemon should be as easy as:
@example
$ netserver
Starting netserver at port 12865
Starting netserver at hostname 0.0.0.0 port 12865 and family 0
@end example

Over time the specifics of the messages netserver prints to the screen
may change but the gist will remain the same.

If the compilation of netperf or netserver happens to fail, feel free
to contact @email{netperf-feedback@@netperf.org} or join and ask in
@email{netperf-talk@@netperf.org}.  However, it is quite important
that you include the actual compilation errors and perhaps even the
configure log in your email.  Otherwise, it will be that much more
difficult for someone to assist you.

@node Verifying Installation,  , Installing Netperf Bits, Installing Netperf
@section Verifying Installation

Basically, once netperf is installed and netserver is configured as a
child of inetd, or launched as a standalone daemon, simply typing:
@example
netperf
@end example
should result in output similar to the following:
@example
$ netperf
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00    2997.84
@end example


@node The Design of Netperf, Global Command-line Options, Installing Netperf, Top
@chapter The Design of Netperf

@cindex Design of Netperf

Netperf is designed around a basic client-server model.  There are
two executables - netperf and netserver.  Generally you will only
execute the netperf program, with the netserver program being invoked
by the remote system's inetd or having been previously started as its
own standalone daemon.

When you execute netperf it will establish a ``control connection'' to
the remote system.  This connection will be used to pass test
configuration information and results to and from the remote system.
Regardless of the type of test to be run, the control connection will
be a TCP connection using BSD sockets.  The control connection can use
either IPv4 or IPv6.

Once the control connection is up and the configuration information
has been passed, a separate ``data'' connection will be opened for the
measurement itself using the APIs and protocols appropriate for the
specified test.  When the test is completed, the data connection will
be torn down and results from the netserver will be passed back via the
control connection and combined with netperf's result for display to
the user.

Netperf places no traffic on the control connection while a test is in
progress.  Certain TCP options, such as SO_KEEPALIVE, if set as your
system's default, may put packets out on the control connection while
a test is in progress.  Generally speaking this will have no effect on
the results.

@menu
* CPU Utilization::
@end menu

@node CPU Utilization,  , The Design of Netperf, The Design of Netperf
@section CPU Utilization
@cindex CPU Utilization

CPU utilization is an important, and alas all too infrequently
reported, component of networking performance.  Unfortunately, it can
be one of the most difficult metrics to measure accurately and
portably.  Netperf will do its level best to report accurate
CPU utilization figures, but some combinations of processor, OS and
configuration may make that difficult.

CPU utilization in netperf is reported as a value between 0 and 100%
regardless of the number of CPUs involved.  In addition to CPU
utilization, netperf will report a metric called a @dfn{service
demand}.  The service demand is the normalization of CPU utilization
and work performed.  For a _STREAM test it is the microseconds of CPU
time consumed to transfer one KB (K == 1024) of data.  For a _RR test
it is the microseconds of CPU time consumed processing a single
transaction.   For both CPU utilization and service demand, lower is
better.
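
The definition above can be turned into a small worked calculation.
This sketch assumes the reported utilization is an average over all
CPUs, so that the CPU time consumed is the utilization fraction times
the number of CPUs times the elapsed time.  It is derived from the
definition, not copied from the netperf sources, and the function
name @code{service_demand} is made up for this example.

@example
# service_demand UTIL_PCT NUM_CPUS ELAPSED_SECS UNITS
# UNITS is KB transferred (_STREAM) or transactions (_RR).
# Prints microseconds of CPU time per unit, truncated to an integer.
service_demand() (
    util=$1 ncpus=$2 secs=$3 units=$4
    # CPU microseconds consumed = (util / 100) * ncpus * secs * 10^6
    echo $(( util * ncpus * secs * 10000 / units ))
)

service_demand 25 4 10 1000000
@end example

Here 25% utilization on a four-CPU system over ten seconds is ten
million microseconds of CPU time; spread over one million KB
transferred, that is a service demand of 10 microseconds per KB.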

Service demand can be particularly useful when trying to gauge the
effect of a performance change.  It is essentially a measure of
efficiency, with smaller values being more efficient and thus
``better.''

Netperf is coded to be able to use one of several, generally
platform-specific CPU utilization measurement mechanisms.  Single
letter codes will be included in the CPU portion of the test banner to
indicate which mechanism was used on each of the local (netperf) and
remote (netserver) systems.

As of this writing those codes are:

@table @code
@item U
The CPU utilization measurement mechanism was unknown to netperf or
netperf/netserver was not compiled to include CPU utilization
measurements. The code for the null CPU utilization mechanism can be
found in @file{src/netcpu_none.c}.
@item I
An HP-UX-specific CPU utilization mechanism whereby the kernel
incremented a per-CPU counter by one for each trip through the idle
loop. This mechanism was only available on specially-compiled HP-UX
kernels prior to HP-UX 10 and is mentioned here only for the sake of
historical completeness and perhaps as a suggestion to those who might
be altering other operating systems. While rather simple, perhaps even
simplistic, this mechanism was quite robust and was not affected by
the concerns of statistical methods, or methods attempting to track
time in each of user, kernel, interrupt and idle modes which require
quite careful accounting.  It can be thought of as the in-kernel
version of the looper @code{L} mechanism without the context switch
overhead. This mechanism required calibration.
@item P
An HP-UX-specific CPU utilization mechanism whereby the kernel
keeps track of time (in the form of CPU cycles) spent in the kernel
idle loop (HP-UX 10.0 to 11.31 inclusive), or where the kernel keeps
track of time spent in idle, user, kernel and interrupt processing
(HP-UX 11.23 and later).  The former requires calibration, the latter
does not.  Values in either case are retrieved via one of the pstat(2)
family of calls, hence the use of the letter @code{P}.  The code for
these mechanisms is found in @file{src/netcpu_pstat.c} and
@file{src/netcpu_pstatnew.c} respectively.
    577 @item K
    578 A Solaris-specific CPU utilization mechanism whereby the kernel keeps
    579 track of ticks (eg HZ) spent in the idle loop.  This method is
    580 statistical and is known to be inaccurate when the interrupt rate is
    581 above epsilon as time spent processing interrupts is not subtracted
    582 from idle.  The value is retrieved via a kstat() call - hence the use
    583 of the letter @code{K}.  Since this mechanism uses units of ticks (HZ)
    584 the calibration value should invariably match HZ. (Eg 100) The code
    585 for this mechanism is implemented in @file{src/netcpu_kstat.c}.
    586 @item M
    587 A Solaris-specific mechanism available on Solaris 10 and latter which
    588 uses the new microstate accounting mechanisms.  There are two, alas,
    589 overlapping, mechanisms.  The first tracks nanoseconds spent in user,
    590 kernel, and idle modes. The second mechanism tracks nanoseconds spent
    591 in interrupt.  Since the mechanisms overlap, netperf goes through some
    592 hand-waving to try to ``fix'' the problem.  Since the accuracy of the
    593 handwaving cannot be completely determined, one must presume that
    594 while better than the @code{K} mechanism, this mechanism too is not
    595 without issues.  The values are retrieved via kstat() calls, but the
    596 letter code is set to @code{M} to distinguish this mechanism from the
    597 even less accurate @code{K} mechanism.  The code for this mechanism is
    598 implemented in @file{src/netcpu_kstat10.c}.
    599 @item L
    600 A mechanism based on ``looper''or ``soaker'' processes which sit in
    601 tight loops counting as fast as they possibly can. This mechanism
    602 starts a looper process for each known CPU on the system.  The effect
    603 of processor hyperthreading on the mechanism is not yet known.  This
    604 mechanism definitely requires calibration.  The code for the
    605 ``looper''mechanism can be found in @file{src/netcpu_looper.c}
    606 @item N
    607 A Microsoft Windows-specific mechanism, the code for which can be
    608 found in @file{src/netcpu_ntperf.c}.  This mechanism too is based on
    609 what appears to be a form of micro-state accounting and requires no
    610 calibration.  On laptops, or other systems which may dynamically alter
    611 the CPU frequency to minimize power consumption, it has been suggested
    612 that this mechanism may become slightly confused, in which case using
    613 BIOS/uEFI settings to disable the power saving would be indicated.
    614 
    615 @item S
    616 This mechanism uses @file{/proc/stat} on Linux to retrieve time
    617 (ticks) spent in idle mode.  It is thought but not known to be
    618 reasonably accurate.  The code for this mechanism can be found in
    619 @file{src/netcpu_procstat.c}.
    620 @item C
    621 A mechanism somewhat similar to @code{S} but using the sysctl() call
    622 on BSD-like Operating systems (*BSD and MacOS X).  The code for this
    623 mechanism can be found in @file{src/netcpu_sysctl.c}.
    624 @item Others
    625 Other mechanisms included in netperf in the past have included using
    626 the times() and getrusage() calls.  These calls are actually rather
    627 poorly suited to the task of measuring CPU overhead for networking as
    628 they tend to be process-specific and much network-related processing
    629 can happen outside the context of a process, in places where it is not
    630 a given it will be charged to the correct process, or to any process.  They
    631 are mentioned here as a warning to anyone seeing those mechanisms used
    632 in other networking benchmarks.  These mechanisms are not available in
    633 netperf 2.4.0 and later.
    634 @end table
    635 
    636 For many platforms, the configure script will choose the best available
    637 CPU utilization mechanism.  However, some platforms have no
    638 particularly good mechanisms.  On those platforms, it is probably best
    639 to use the ``LOOPER'' mechanism which is basically some number of
    640 processes (as many as there are processors) sitting in tight little
    641 loops counting as fast as they can.  The rate at which the loopers
    642 count when the system is believed to be idle is compared with the rate
    643 when the system is running netperf and the ratio is used to compute
    644 CPU utilization.
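
As a rough illustration of the ratio described above, the following
Python sketch derives a utilization figure from two looper counting
rates.  The function name and the exact formula (one minus the ratio
of the loaded rate to the idle rate) are assumptions made for
illustration only; the actual calculation is in
@file{src/netcpu_looper.c} and may well differ.

```python
def looper_cpu_util(idle_rate, loaded_rate):
    """Estimate percent CPU utilization from looper counting rates.

    idle_rate   -- counts per second when the system is believed idle
    loaded_rate -- counts per second while the test is running
    Assumed formula: the fraction of counting capacity the loopers lost.
    """
    if idle_rate <= 0:
        raise ValueError("idle_rate must be positive")
    # Clamp so that measurement noise cannot yield a negative utilization.
    lost = max(0.0, 1.0 - loaded_rate / idle_rate)
    return 100.0 * lost
```

With loopers that count 1000000 times per second when idle and 750000
times per second during the test, this sketch would report 25% CPU
utilization.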
    645 
    646 In the past, netperf included some mechanisms that only reported CPU
    647 time charged to the calling process.  Those mechanisms have been
    648 removed from netperf versions 2.4.0 and later because they are
    649 hopelessly inaccurate.  Networking can, and often does, result in CPU
    650 time being spent in places - such as interrupt contexts - that do not
    651 get charged to the correct process, or to any process at all.
    652 
    653 In fact, time spent in the processing of interrupts is a common issue
    654 for many CPU utilization mechanisms.  In particular, the ``PSTAT''
    655 mechanism was eventually known to have problems accounting for certain
    656 interrupt time prior to HP-UX 11.11 (11iv1).  HP-UX 11iv2 and later
    657 are known/presumed to be good. The ``KSTAT'' mechanism is known to
    658 have problems on all versions of Solaris up to and including Solaris
    659 10.  Even the microstate accounting available via kstat in Solaris 10
    660 has issues, though perhaps not as bad as those of prior versions.
    661 
    662 The /proc/stat mechanism under Linux is in what the author would
    663 consider an ``uncertain'' category as it appears to be statistical,
    664 which may also have issues with time spent processing interrupts.
    665 
    666 In summary, be sure to ``sanity-check'' the CPU utilization figures
    667 with other mechanisms.  However, platform tools such as top, vmstat or
    668 mpstat are often based on the same mechanisms used by netperf.
    669 
    670 @menu
    671 * CPU Utilization in a Virtual Guest::  
    672 @end menu
    673 
    674 @node CPU Utilization in a Virtual Guest,  , CPU Utilization, CPU Utilization
    675 @subsection CPU Utilization in a Virtual Guest
    676 
    677 The CPU utilization mechanisms used by netperf are ``inline'' in that
    678 they are run by the same netperf or netserver process as is running
    679 the test itself.  This works just fine for ``bare iron'' tests but
    680 runs into a problem when using virtual machines.
    681 
    682 The relationship between virtual guest and hypervisor can be thought
    683 of as being similar to that between a process and kernel in a bare
    684 iron system.  As such, (m)any CPU utilization mechanisms used in the
    685 virtual guest are similar to ``process-local'' mechanisms in a bare
    686 iron situation.  However, just as with bare iron and process-local
    687 mechanisms, much networking processing happens outside the context of
    688 the virtual guest.  It takes place in the hypervisor, and is not
    689 visible to mechanisms running in the guest(s).  For this reason, one
    690 should not really trust CPU utilization figures reported by netperf or
    691 netserver when running in a virtual guest.
    692 
    693 If one is looking to measure the added overhead of a virtualization
    694 mechanism, rather than rely on CPU utilization, one can rely instead
    695 on netperf _RR tests - path-lengths and overheads can be a significant
    696 fraction of the latency, so increases in overhead should appear as
    697 decreases in transaction rate.  Whatever you do, @b{DO NOT} rely on
    698 the throughput of a _STREAM test.  Achieving link-rate can be done via
    699 a multitude of options that mask overhead rather than eliminate it.
    700 
    701 @node Global Command-line Options, Using Netperf to Measure Bulk Data Transfer, The Design of Netperf, Top
    702 @chapter Global Command-line Options
    703 
    704 This section describes each of the global command-line options
    705 available in the netperf and netserver binaries.  Essentially, it is
    706 an expanded version of the usage information displayed by netperf or
    707 netserver when invoked with the @option{-h} global command-line
    708 option.
    709 
    710 @menu
    711 * Command-line Options Syntax::  
    712 * Global Options::              
    713 @end menu
    714 
    715 @node Command-line Options Syntax, Global Options, Global Command-line Options, Global Command-line Options
    716 @comment  node-name,  next,  previous,  up
    717 @section Command-line Options Syntax
    718 
    719 Revision 1.8 of netperf introduced enough new functionality to overrun
    720 the English alphabet for mnemonic command-line option names, and the
    721 author was not and is not quite ready to switch to the contemporary
    722 @option{--mumble} style of command-line options. (Call him a Luddite
    723 if you wish :).
    724 
    725 For this reason, the command-line options were split into two parts -
    726 the first are the global command-line options.  They are options that
    727 affect nearly any and every test type of netperf.  The second type are
    728 the test-specific command-line options.  Both are entered on the same
    729 command line, but they must be separated from one another by a @code{--}
    730 for correct parsing.  Global command-line options come first, followed
    731 by the @code{--} and then test-specific command-line options.  If there
    732 are no test-specific options to be set, the @code{--} may be omitted.  If
    733 there are no global command-line options to be set, test-specific
    734 options must still be preceded by a @code{--}.  For example:
    735 @example
    736 netperf <global> -- <test-specific>
    737 @end example
    738 sets both global and test-specific options:
    739 @example
    740 netperf <global>
    741 @end example
    742 sets just global options and:
    743 @example
    744 netperf -- <test-specific>
    745 @end example
    746 sets just test-specific options.
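
The separation rule can be pictured with a small Python sketch.  It is
not netperf's own parser, and the name @code{split_options} is
hypothetical; it only shows how an argument list divides at the first
@code{--}.

```python
def split_options(argv):
    """Split arguments into (global, test_specific) at the first '--'."""
    if "--" in argv:
        i = argv.index("--")
        return argv[:i], argv[i + 1:]
    # No separator: everything is treated as a global option.
    return list(argv), []
```

Everything before the separator is handed to the global option
handling, and everything after it to the test selected with the global
@option{-t} option.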
    747 
    748 @node Global Options,  , Command-line Options Syntax, Global Command-line Options
    749 @comment  node-name,  next,  previous,  up
    750 @section Global Options
    751 
    752 @table @code
    753 @vindex -a, Global
    754 @item -a <sizespec>
    755 This option allows you to alter the alignment of the buffers used in
    756 the sending and receiving calls on the local system. Changing the
    757 alignment of the buffers can force the system to use different copy
    758 schemes, which can have a measurable effect on performance.  If the
    759 page size for the system were 4096 bytes, and you want to pass
    760 page-aligned buffers beginning on page boundaries, you could use
    761 @samp{-a 4096}.  By default the units are bytes, but a suffix of ``G,''
    762 ``M,'' or ``K'' will specify the units to be 2^30 (GB), 2^20 (MB) or
    763 2^10 (KB) respectively. A suffix of ``g,'' ``m'' or ``k'' will specify
    764 units of 10^9, 10^6 or 10^3 bytes respectively. [Default: 8 bytes]
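
The suffix rules can be summarized in a short Python sketch.  It
handles a single value only (a sizespec may carry two), and the names
@code{SUFFIXES} and @code{parse_size} are hypothetical rather than
taken from the netperf sources.

```python
# Multipliers as described for the -a option: upper case is a power of
# two, lower case is a power of ten.
SUFFIXES = dict(G=2**30, M=2**20, K=2**10, g=10**9, m=10**6, k=10**3)

def parse_size(text):
    """Convert a single size such as '4096', '4K' or '4k' to bytes."""
    if text and text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)
```

Under these rules @samp{-a 4K} requests the same 4096-byte alignment
as @samp{-a 4096}, while @samp{-a 4k} requests 4000 bytes.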
    765 
    766 @vindex -A, Global
    767 @item -A <sizespec>
    768 This option is identical to the @option{-a} option with the difference
    769 being it affects alignments for the remote system.
    770 
    771 @vindex -b, Global
    772 @item -b <size>
    773 This option is only present when netperf has been configured with
    774 --enable-intervals=yes prior to compilation.  It sets the size of the
    775 burst of send calls in a _STREAM test.  When used in conjunction with
    776 the @option{-w} option it can cause the rate at which data is sent to
    777 be ``paced.''
    778 
    779 @vindex -B, Global
    780 @item -B <string>
    781 This option will cause @option{<string>} to be appended to the brief
    782 (see -P) output of netperf.
    783 
    784 @vindex -c, Global
    785 @item -c [rate]
    786 This option will ask that CPU utilization and service demand be
    787 calculated for the local system.  For those CPU utilization mechanisms
    788 requiring calibration, the optional rate parameter may be specified to
    789 preclude running another calibration step, saving 40 seconds of time.
    790 For those CPU utilization mechanisms requiring no calibration, the
    791 optional rate parameter will be utterly and completely ignored.
    792 [Default: no CPU measurements]
    793 
    794 @vindex -C, Global
    795 @item -C [rate]
    796 This option requests CPU utilization and service demand calculations
    797 for the remote system.  It is otherwise identical to the @option{-c}
    798 option.
    799 
    800 @vindex -d, Global
    801 @item -d
    802 Each instance of this option will increase the quantity of debugging
    803 output displayed during a test.  If the debugging output level is set
    804 high enough, it may have a measurable effect on performance.
    805 Debugging information for the local system is printed to stdout.
    806 Debugging information for the remote system is sent by default to the
    807 file @file{/tmp/netperf.debug}. [Default: no debugging output]
    808 
    809 @vindex -D, Global
    810 @item -D [interval,units]
    811 This option is only available when netperf is configured with
    812 --enable-demo=yes.  When set, it will cause netperf to emit periodic
    813 reports of performance during the run.  [@var{interval},@var{units}]
    814 follow the semantics of an optionspec. If specified,
    815 @var{interval} gives the minimum interval in real seconds; it does not
    816 have to be a whole number of seconds.  The @var{units} value can be used for the
    817 first guess as to how many units of work (bytes or transactions) must
    818 be done to take at least @var{interval} seconds. If omitted,
    819 @var{interval} defaults to one second and @var{units} to values
    820 specific to each test type.
    821 
    822 @vindex -f, Global
    823 @item -f G|M|K|g|m|k|x
    824 This option can be used to change the reporting units for _STREAM
    825 tests.  Arguments of ``G,'' ``M,'' or ``K'' will set the units to
    826 2^30, 2^20 or 2^10 bytes/s respectively (i.e. power-of-two GB, MB or
    827 KB).  Arguments of ``g,'' ``m'' or ``k'' will set the units to 10^9,
    828 10^6 or 10^3 bits/s respectively.  An argument of ``x'' requests the
    829 units be transactions per second and is only meaningful for a
    830 request-response test. [Default: ``m'' or 10^6 bits/s]
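
To make the difference between the upper-case and lower-case units
concrete, here is a Python sketch that converts a rate in bytes per
second into the unit selected by @option{-f}.  The names are
hypothetical, and the ``x'' argument is deliberately left out since it
applies to transactions rather than bytes.

```python
# Divisors per the -f description: upper case selects bytes/s in powers
# of two, lower case selects bits/s in powers of ten.
BYTE_UNITS = dict(G=2**30, M=2**20, K=2**10)
BIT_UNITS = dict(g=10**9, m=10**6, k=10**3)

def format_throughput(bytes_per_sec, unit="m"):
    """Express a rate measured in bytes per second in the -f unit."""
    if unit in BYTE_UNITS:
        return bytes_per_sec / BYTE_UNITS[unit]
    if unit in BIT_UNITS:
        # Bit-based units need the byte rate multiplied by eight first.
        return bytes_per_sec * 8 / BIT_UNITS[unit]
    raise ValueError("unsupported unit: " + unit)
```

A transfer moving 125000000 bytes each second would thus be reported
as 1000 with the default ``m'' and as roughly 119.2 with ``M.''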
    831 
    832 @vindex -F, Global
    833 @item -F <fillfile>
    834 This option specifies the file from which the send buffers will be
    835 pre-filled.  While the buffers will contain data from the specified
    836 file, the file is not fully transferred to the remote system as the
    837 receiving end of the test will not write the contents of what it
    838 receives to a file.  This can be used to pre-fill the send buffers
    839 with data having different compressibility and so is useful when
    840 measuring performance over mechanisms which perform compression. 
    841 
    842 While previously required for a TCP_SENDFILE test, later versions of
    843 netperf removed that restriction, creating a temporary file as
    844 needed.  While the author cannot recall exactly when that took place,
    845 it is known to be unnecessary in version 2.5.0 and later.
    846 
    847 @vindex -h, Global
    848 @item -h
    849 This option causes netperf to display its ``global'' usage string and
    850 exit to the exclusion of all else.
    851 
    852 @vindex -H, Global
    853 @item -H <optionspec>
    854 This option will set the name of the remote system and/or the address
    855 family used for the control connection.  For example:
    856 @example
    857 -H linger,4
    858 @end example
    859 will set the name of the remote system to ``linger'' and tells netperf to
    860 use IPv4 addressing only.
    861 @example
    862 -H ,6
    863 @end example
    864 will leave the name of the remote system at its default, and request
    865 that only IPv6 addresses be used for the control connection.
    866 @example
    867 -H lag
    868 @end example
    869 will set the name of the remote system to ``lag'' and leave the
    870 address family to AF_UNSPEC which means selection of IPv4 vs IPv6 is
    871 left to the system's address resolution.  
    872 
    873 A value of ``inet'' can be used in place of ``4'' to request IPv4 only
    874 addressing.  Similarly, a value of ``inet6'' can be used in place of
    875 ``6'' to request IPv6 only addressing.  A value of ``0'' can be used
    876 to request either IPv4 or IPv6 addressing as name resolution dictates.
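
The forms shown above can be read as ``name, comma, family,'' with
either half optional.  The following Python sketch mirrors that
reading; it is not the netperf parser, the names @code{FAMILIES} and
@code{parse_host_spec} are hypothetical, and it ignores any earlier
@option{-4} or @option{-6} option.

```python
# Family names accepted after the comma, per the -H description.
FAMILIES = dict([("4", "AF_INET"), ("inet", "AF_INET"),
                 ("6", "AF_INET6"), ("inet6", "AF_INET6"),
                 ("0", "AF_UNSPEC")])

def parse_host_spec(spec, default_host="localhost"):
    """Return (host, family) for an optionspec such as 'linger,4'."""
    host, _, family = spec.partition(",")
    if family == "":
        # No family given; this sketch falls back to AF_UNSPEC and
        # ignores any earlier -4 or -6 option.
        family_name = "AF_UNSPEC"
    else:
        family_name = FAMILIES[family]  # KeyError on an unknown family
    return host or default_host, family_name
```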
    877 
    878 By default, the options set with the global @option{-H} option are
    879 inherited by the test for its data connection, unless a test-specific
    880 @option{-H} option is specified.
    881 
    882 If a @option{-H} option follows either the @option{-4} or @option{-6}
    883 options, the family setting specified with the -H option will override
    884 the @option{-4} or @option{-6} options for the remote address
    885 family. If no address family is specified, settings from a previous
    886 @option{-4} or @option{-6} option will remain.  In a nutshell, the
    887 last explicit global command-line option wins.
    888 
    889 [Default:  ``localhost'' for the remote name/IP address and ``0'' (eg
    890 AF_UNSPEC) for the remote address family.]
    891 
    892 @vindex -I, Global
    893 @item -I <optionspec>
    894 This option enables the calculation of confidence intervals and sets
    895 the confidence and width parameters with the first half of the
    896 optionspec being either 99 or 95 for 99% or 95% confidence
    897 respectively.  The second value of the optionspec specifies the width
    898 of the desired confidence interval.  For example
    899 @example
    900 -I 99,5
    901 @end example
    902 asks netperf to be 99% confident that the measured mean values for
    903 throughput and CPU utilization are within +/- 2.5% of the ``real''
    904 mean values.  If the @option{-i} option is specified and the
    905 @option{-I} option is omitted, the confidence defaults to 99% and the
    906 width to 5% (giving +/- 2.5%).
    907 
    908 If a classic netperf test calculates that the desired confidence
    909 intervals have not been met, it emits a noticeable warning that cannot
    910 be suppressed with the @option{-P} or @option{-v} options:
    911 
    912 @example
    913 netperf -H tardy.cup -i 3 -I 99,5
    914 TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to tardy.cup.hp.com (15.244.44.58) port 0 AF_INET : +/-2.5% @ 99% conf.
    915 !!! WARNING
    916 !!! Desired confidence was not achieved within the specified iterations.
    917 !!! This implies that there was variability in the test environment that
    918 !!! must be investigated before going further.
    919 !!! Confidence intervals: Throughput      :  6.8%
    920 !!!                       Local CPU util  :  0.0%
    921 !!!                       Remote CPU util :  0.0%
    922 
    923 Recv   Send    Send                          
    924 Socket Socket  Message  Elapsed              
    925 Size   Size    Size     Time     Throughput  
    926 bytes  bytes   bytes    secs.    10^6bits/sec  
    927 
    928  32768  16384  16384    10.01      40.23   
    929 @end example
    930 
    931 In the example above we see that netperf did not meet the desired
    932 confidence intervals.  Instead of being 99% confident it was within
    933 +/- 2.5% of the real mean value of throughput it is only confident it
    934 was within +/-3.4%.  In this example, increasing the @option{-i}
    935 option (described below) and/or increasing the iteration length with
    936 the @option{-l} option might resolve the situation.
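
The arithmetic relating the width given to @option{-I} and the +/-
figures quoted above is simple enough to sketch in Python.  The
function names are hypothetical and the comparison is only an
illustration of the idea, not netperf's statistics code.

```python
def half_width(width_percent):
    """The -I width is the full interval; the +/- bound is half of it."""
    return width_percent / 2.0

def confidence_met(requested_width, achieved_width):
    """True when the achieved interval is no wider than requested."""
    return achieved_width <= requested_width
```

For the example above the requested width of 5 gives +/- 2.5%, while
the reported width of 6.8 gives +/- 3.4% and so fails the check.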
    937 
    938 In an explicit ``omni'' test, failure to meet the confidence intervals
    939 will not result in netperf emitting a warning.  To verify the hitting,
    940 or not, of the confidence intervals one will need to include them as
    941 part of an @ref{Omni Output Selection,output selection} in the
    942 test-specific @option{-o}, @option{-O} or @option{-k} output selection
    943 options.  The warning about not hitting the confidence intervals will
    944 remain in a ``migrated'' classic netperf test.
    945 
    946 @vindex -i, Global
    947 @item -i <sizespec>
    948 This option enables the calculation of confidence intervals and sets
    949 the minimum and maximum number of iterations to run in attempting to
    950 achieve the desired confidence interval.  The first value sets the
    951 maximum number of iterations to run, the second, the minimum.  The
    952 maximum number of iterations is silently capped at 30 and the minimum
    953 is silently floored at 3.  Netperf repeats the measurement the minimum
    954 number of iterations and continues until it reaches either the
    955 desired confidence interval, or the maximum number of iterations,
    956 whichever comes first.  A classic or migrated netperf test will not
    957 display the actual number of iterations run. An @ref{The Omni
    958 Tests,omni test} will emit the number of iterations run if the
    959 @code{CONFIDENCE_ITERATION} output selector is included in the
    960 @ref{Omni Output Selection,output selection}.
    961 
    962 If the @option{-I} option is specified and the @option{-i} option
    963 omitted, the maximum number of iterations is set to 10 and the minimum
    964 to three.
    965 
    966 Output of a warning upon not hitting the desired confidence intervals
    967 follows the description provided for the @option{-I} option.
    968 
    969 The total test time will be somewhere between the minimum and maximum
    970 number of iterations multiplied by the test length supplied by the
    971 @option{-l} option.
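
A Python sketch of the limits and of the resulting range of total run
time follows.  The function names are hypothetical, and keeping the
minimum at or below the maximum is an assumption of the sketch rather
than documented behaviour.

```python
def iteration_bounds(max_iter, min_iter):
    """Apply the documented limits: maximum capped at 30, minimum floored at 3."""
    max_iter = max(min(max_iter, 30), 3)
    # Assumption: the minimum is never allowed to exceed the maximum.
    min_iter = min(max(min_iter, 3), max_iter)
    return max_iter, min_iter

def total_time_range(max_iter, min_iter, testlen):
    """Shortest and longest total run time, in seconds, for -i max,min."""
    max_iter, min_iter = iteration_bounds(max_iter, min_iter)
    return min_iter * testlen, max_iter * testlen
```

For instance, @samp{-i 10,3 -l 10} would be expected to run for
somewhere between 30 and 100 seconds.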
    972 
    973 @vindex -j, Global
    974 @item -j
    975 This option instructs netperf to keep additional timing statistics
    976 when explicitly running an @ref{The Omni Tests,omni test}.  These can
    977 be output when the test-specific @option{-o}, @option{-O} or
    978 @option{-k} @ref{Omni Output Selectors,output selectors} include one
    979 or more of:
    980 
    981 @itemize
    982 @item MIN_LATENCY
    983 @item MAX_LATENCY
    984 @item P50_LATENCY
    985 @item P90_LATENCY
    986 @item P99_LATENCY
    987 @item MEAN_LATENCY
    988 @item STDDEV_LATENCY
    989 @end itemize
    990 
    991 These statistics will be based on an expanded (100 buckets per row
    992 rather than 10) histogram of times rather than a terribly long list of
    993 individual times.  As such, there will be some slight error thanks to
    994 the bucketing. However, the reduction in storage and processing
    995 overheads is well worth it.  When running a request/response test, one
    996 might get some idea of the error by comparing the @ref{Omni Output
    997 Selectors,@code{MEAN_LATENCY}} calculated from the histogram with the
    998 @code{RT_LATENCY} calculated from the number of request/response
    999 transactions and the test run time.
   1000 
   1001 In the case of a request/response test the latencies will be
   1002 transaction latencies.  In the case of a receive-only test they will
   1003 be time spent in the receive call.  In the case of a send-only test
   1004 they will be time spent in the send call. The units will be
   1005 microseconds. Added in netperf 2.5.0.
   1006 
   1007 @vindex -l, Global
   1008 @item -l testlen
   1009 This option controls the length of any @b{one} iteration of the requested
   1010 test.  A positive value for @var{testlen} will run each iteration of
   1011 the test for at least @var{testlen} seconds.  A negative value for
   1012 @var{testlen} will run each iteration for the absolute value of
   1013 @var{testlen} transactions for a _RR test or bytes for a _STREAM test.
   1014 Certain tests, notably those using UDP, can only be timed; they cannot
   1015 be limited by transaction or byte count.  This limitation may be
   1016 relaxed in an @ref{The Omni Tests,omni} test.
   1017 
   1018 In some situations, individual iterations of a test may run for longer
   1019 than the number of seconds specified by the @option{-l} option.  In
   1020 particular, this may occur for those tests where the socket buffer
   1021 size(s) are significantly larger than the bandwidthXdelay product of
   1022 the link(s) over which the data connection passes, or those tests
   1023 where there may be non-trivial numbers of retransmissions.
   1024 
   1025 If confidence intervals are enabled via either @option{-I} or
   1026 @option{-i} the total length of the netperf test will be somewhere
   1027 between the minimum and maximum iteration count multiplied by
   1028 @var{testlen}.
   1029 
   1030 @vindex -L, Global
   1031 @item -L <optionspec>
   1032 This option is identical to the @option{-H} option with the difference
   1033 being it sets the _local_ hostname/IP and/or address family
   1034 information.  This option is generally unnecessary, but can be useful
   1035 when you wish to make sure that the netperf control and data
   1036 connections go via different paths.  It can also come in handy if one
   1037 is trying to run netperf through those evil, end-to-end breaking
   1038 things known as firewalls.
   1039 
   1040 [Default: 0.0.0.0 (eg INADDR_ANY) for IPv4 and ::0 for IPv6 for the
   1041 local name.  AF_UNSPEC for the local address family.]
   1042 
   1043 @vindex -n, Global
   1044 @item -n numcpus
   1045 This option tells netperf how many CPUs it should ass-u-me are active
   1046 on the system running netperf.  In particular, this is used for the
   1047 @ref{CPU Utilization,CPU utilization} and service demand calculations.
   1048 On certain systems, netperf is able to determine the number of CPUs
   1049 automagically. This option will override any number netperf might be
   1050 able to determine on its own.
   1051 
   1052 Note that this option does _not_ set the number of CPUs on the system
   1053 running netserver.  When netserver cannot automagically determine the
   1054 number of CPUs, that number can only be set via a netserver
   1055 @option{-n} command-line option.
   1056 
   1057 As it is almost universally possible for netperf/netserver to
   1058 determine the number of CPUs on the system automagically, 99 times out
   1059 of 10 this option should not be necessary and may be removed in a
   1060 future release of netperf.
   1061 
   1062 @vindex -N, Global
   1063 @item -N
   1064 This option tells netperf to forgo establishing a control
   1065 connection. This makes it possible to run some limited netperf
   1066 tests without a corresponding netserver on the remote system.
   1067 
   1068 With this option set, the test to be run is to get all the addressing
   1069 information it needs to establish its data connection from the command
   1070 line or internal defaults.  If not otherwise specified by
   1071 test-specific command line options, the data connection for a
   1072 ``STREAM'' or ``SENDFILE'' test will be to the ``discard'' port, an
   1073 ``RR'' test will be to the ``echo'' port, and a ``MAERTS'' test will
   1074 be to the ``chargen'' port.
   1075 
   1076 The response size of an ``RR'' test will be silently set to be the
   1077 same as the request size.  Otherwise the test would hang if the
   1078 response size was larger than the request size, or would report an
   1079 incorrect, inflated transaction rate if the response size was less
   1080 than the request size.
   1081 
   1082 Since there is no control connection when this option is specified, it
   1083 is not possible to set ``remote'' properties such as socket buffer
   1084 size and the like via the netperf command line. Nor is it possible to
   1085 retrieve such interesting remote information as CPU utilization.
   1086 These items will be displayed as values which should make it
   1087 immediately obvious that was the case.
   1088 
   1089 The only way to change remote characteristics such as socket buffer
   1090 size or to obtain information such as CPU utilization is to employ
   1091 platform-specific methods on the remote system.  Frankly, if one has
   1092 access to the remote system to employ those methods one ought to be
   1093 able to run a netserver there.  However, that ability may not be
   1094 present in certain ``support'' situations, hence the addition of this
   1095 option.
   1096 
   1097 Added in netperf 2.4.3.
   1098 
   1099 @vindex -o, Global
   1100 @item -o <sizespec>
   1101 The value(s) passed-in with this option will be used as an offset
   1102 added to the alignment specified with the @option{-a} option.  For
   1103 example:
   1104 @example
   1105 -o 3 -a 4096
   1106 @end example
   1107 will cause the buffers passed to the local (netperf) send and receive
   1108 calls to begin three bytes past an address aligned to 4096
   1109 bytes. [Default: 0 bytes]
   1110 
   1111 @vindex -O, Global
   1112 @item -O <sizespec>
   1113 This option behaves just as the @option{-o} option but on the remote
   1114 (netserver) system and in conjunction with the @option{-A}
   1115 option. [Default: 0 bytes]
   1116 
   1117 @vindex -p, Global
   1118 @item -p <optionspec>
   1119 The first value of the optionspec passed-in with this option tells
   1120 netperf the port number at which it should expect the remote netserver
   1121 to be listening for control connections.  The second value of the
   1122 optionspec will request netperf to bind to that local port number
   1123 before establishing the control connection.  For example
   1124 @example
   1125 -p 12345
   1126 @end example
   1127 tells netperf that the remote netserver is listening on port 12345 and
   1128 leaves selection of the local port number for the control connection
   1129 up to the local TCP/IP stack whereas
   1130 @example
   1131 -p ,32109
   1132 @end example
   1133 leaves the remote netserver port at the default value of 12865 and
   1134 causes netperf to bind to the local port number 32109 before
   1135 connecting to the remote netserver.
   1136 
   1137 In general, setting the local port number is only necessary when one
   1138 is looking to run netperf through those evil, end-to-end breaking
   1139 things known as firewalls.
   1140 
   1141 @vindex -P, Global
   1142 @item -P 0|1
   1143 A value of ``1'' for the @option{-P} option will enable display of
   1144 the test banner.  A value of ``0'' will disable display of the test
   1145 banner. One might want to disable display of the test banner when
   1146 running the same basic test type (eg TCP_STREAM) multiple times in
   1147 succession where the test banners would then simply be redundant and
   1148 unnecessarily clutter the output. [Default: 1 - display test banners]
   1149 
   1150 @vindex -s, Global
   1151 @item -s <seconds>
   1152 This option will cause netperf to sleep @samp{<seconds>} before
   1153 actually transferring data over the data connection.  This may be
   1154 useful in situations where one wishes to start a great many netperf
   1155 instances and does not want the earlier ones affecting the ability of
   1156 the later ones to get established.
   1157 
   1158 Added somewhere between versions 2.4.3 and 2.5.0.
   1159 
   1160 @vindex -S, Global
   1161 @item -S
   1162 This option will cause an attempt to be made to set SO_KEEPALIVE on
   1163 the data socket of a test using the BSD sockets interface.  The
   1164 attempt will be made on the netperf side of all tests, and will be
   1165 made on the netserver side of an @ref{The Omni Tests,omni} or
   1166 @ref{Migrated Tests,migrated} test.  No indication of failure is given
   1167 unless debug output is enabled with the global @option{-d} option.
   1168 
   1169 Added in version 2.5.0.
   1170 
   1171 @vindex -t, Global
   1172 @item -t testname
   1173 This option is used to tell netperf which test you wish to run.  As of
   1174 this writing, valid values for @var{testname} include:
   1175 @itemize
   1176 @item
   1177 @ref{TCP_STREAM}, @ref{TCP_MAERTS}, @ref{TCP_SENDFILE}, @ref{TCP_RR}, @ref{TCP_CRR}, @ref{TCP_CC}
   1178 @item
   1179 @ref{UDP_STREAM}, @ref{UDP_RR}
   1180 @item
   1181 @ref{XTI_TCP_STREAM},  @ref{XTI_TCP_RR}, @ref{XTI_TCP_CRR}, @ref{XTI_TCP_CC}
   1182 @item
   1183 @ref{XTI_UDP_STREAM}, @ref{XTI_UDP_RR}
   1184 @item
   1185 @ref{SCTP_STREAM}, @ref{SCTP_RR}
   1186 @item
   1187 @ref{DLCO_STREAM}, @ref{DLCO_RR},  @ref{DLCL_STREAM}, @ref{DLCL_RR}
   1188 @item
   1189 @ref{Other Netperf Tests,LOC_CPU}, @ref{Other Netperf Tests,REM_CPU}
   1190 @item
   1191 @ref{The Omni Tests,OMNI}
   1192 @end itemize
   1193 Not all tests are always compiled into netperf.  In particular, the
   1194 ``XTI,'' ``SCTP,'' ``UNIXDOMAIN,'' and ``DL*'' tests are only included in
   1195 netperf when configured with
   1196 @option{--enable-[xti|sctp|unixdomain|dlpi]=yes}.
   1197 
   1198 Netperf only runs one type of test no matter how many @option{-t}
   1199 options may be present on the command-line.  The last @option{-t}
   1200 global command-line option will determine the test to be
   1201 run. [Default: TCP_STREAM]
   1202 
   1203 @vindex -T, Global
   1204 @item -T <optionspec>
   1205 This option controls the CPU, and probably by extension memory,
   1206 affinity of netperf and/or netserver.
   1207 @example
   1208 netperf -T 1
   1209 @end example
   1210 will bind both netperf and netserver to ``CPU 1'' on their respective
   1211 systems.
   1212 @example
   1213 netperf -T 1,
   1214 @end example
   1215 will bind just netperf to ``CPU 1'' and will leave netserver unbound.
   1216 @example
   1217 netperf -T ,2
   1218 @end example
   1219 will leave netperf unbound and will bind netserver to ``CPU 2.''
   1220 @example
   1221 netperf -T 1,2
   1222 @end example
   1223 will bind netperf to ``CPU 1'' and netserver to ``CPU 2.''
   1224 
   1225 This can be particularly useful when investigating performance issues
   1226 involving where processes run relative to where NIC interrupts are
   1227 processed or where NICs allocate their DMA buffers.
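
The four forms shown above follow one pattern, sketched here in
Python.  This is an illustration only; the name
@code{parse_affinity} is hypothetical and the netperf sources may
handle the optionspec differently.

```python
def parse_affinity(spec):
    """Return (netperf_cpu, netserver_cpu); None means leave unbound."""
    if "," not in spec:
        # A lone value applies to both sides, as in 'netperf -T 1'.
        cpu = int(spec)
        return cpu, cpu
    local, remote = spec.split(",", 1)
    return (int(local) if local else None,
            int(remote) if remote else None)
```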
   1228 
   1229 @vindex -v, Global
   1230 @item -v verbosity
   1231 This option controls how verbose netperf will be in its output, and is
   1232 often used in conjunction with the @option{-P} option. If the
   1233 verbosity is set to a value of ``0'' then only the test's SFM (Single
   1234 Figure of Merit) is displayed.  If local @ref{CPU Utilization,CPU
   1235 utilization} is requested via the @option{-c} option then the SFM is
   1236 the local service demand.  Otherwise, if remote CPU utilization is
   1237 requested via the @option{-C} option then the SFM is the remote
   1238 service demand.  If neither local nor remote CPU utilization is
   1239 requested the SFM will be the measured throughput or transaction rate
   1240 as implied by the test specified with the @option{-t} option.
   1241 
   1242 If the verbosity level is set to ``1'' then the ``normal'' netperf
   1243 result output for each test is displayed.
   1244 
   1245 If the verbosity level is set to ``2'' then ``extra'' information will
   1246 be displayed.  This may include, but is not limited to the number of
   1247 send or recv calls made and the average number of bytes per send or
   1248 recv call, or a histogram of the time spent in each send() call or for
   1249 each transaction if netperf was configured with
   1250 @option{--enable-histogram=yes}. [Default: 1 - normal verbosity]
   1251 
   1252 In an @ref{The Omni Tests,omni} test the verbosity setting is largely
   1253 ignored, save for when asking for the time histogram to be displayed.
   1254 In version 2.5.0 and later there is no @ref{Omni Output Selectors,output
   1255 selector} for the histogram and so it remains displayed only when the
   1256 verbosity level is set to 2.
   1257 
   1258 @vindex -V, Global
   1259 @item -V
   1260 This option displays the netperf version and then exits.
   1261 
   1262 Added in netperf 2.4.4.
   1263 
   1264 @vindex -w, Global
   1265 @item -w time
   1266 If netperf was configured with @option{--enable-intervals=yes} then
   1267 this value will set the inter-burst time to @var{time} milliseconds, and the
   1268 @option{-b} option will set the number of sends per burst.  The actual
   1269 inter-burst time may vary depending on the system's timer resolution.
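
As a rough illustration of how @option{-b} and @option{-w} combine, the
following shell arithmetic estimates the load a paced test would offer
to the network.  The burst size, interval and send size are assumed
values picked for the example, not netperf defaults, and the result is
only an estimate because the actual inter-burst time depends on the
system's timer resolution:

@example
burst=10          # assumed value for -b: sends per burst
interval_ms=10    # assumed value for -w: milliseconds between bursts
send_size=1472    # assumed test-specific -m value, in bytes
bits_per_sec=$((burst * send_size * 8 * 1000 / interval_ms))
echo "offered load: about $bits_per_sec bits/sec"
@end example

With these assumed values the estimate works out to 11776000 bits per
second.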
   1270 
   1271 @vindex -W, Global
   1272 @item -W <sizespec>
   1273 This option controls the number of buffers in the send (first or only
   1274 value) and/or receive (second or only value) buffer rings.  Unlike
   1275 some benchmarks, netperf does not continuously send or receive from a
   1276 single buffer.  Instead it rotates through a ring of
   1277 buffers. [Default: One more than the size of the send or receive
   1278 socket buffer sizes (@option{-s} and/or @option{-S} options) divided
   1279 by the send @option{-m} or receive @option{-M} buffer size
   1280 respectively]
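
The default described above can be worked through with a little shell
arithmetic.  This is only a sketch of the stated rule, using assumed
socket buffer and send sizes; it is not taken from the netperf source:

@example
socket_buffer=131072   # assumed send socket buffer size in bytes (-s)
send_size=16384        # assumed send size in bytes (-m)
send_width=$((socket_buffer / send_size + 1))
echo "default send ring: $send_width buffers"
@end example

Here a 128KB socket buffer and 16KB sends give a ring of 9 buffers.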
   1281 
   1282 @vindex -4, Global
   1283 @item -4
   1284 Specifying this option will set both the local and remote address
   1285 families to AF_INET - that is, use only IPv4 addresses on the control
   1286 connection.  This can be overridden by a subsequent @option{-6},
   1287 @option{-H} or @option{-L} option.  Basically, the last option
   1288 explicitly specifying an address family wins.  Unless overridden by a
   1289 test-specific option, this will be inherited for the data connection
   1290 as well.
   1291 
   1292 @vindex -6, Global
   1293 @item -6
   1294 Specifying this option will set both the local and remote address
   1295 families to AF_INET6 - that is, use only IPv6 addresses on the control
   1296 connection.  This can be overridden by a subsequent @option{-4},
   1297 @option{-H} or @option{-L} option.  Basically, the last address family
   1298 explicitly specified wins.  Unless overridden by a test-specific
   1299 option, this will be inherited for the data connection as well.
   1300 
   1301 @end table
   1302 
   1303 
   1304 @node Using Netperf to Measure Bulk Data Transfer, Using Netperf to Measure Request/Response , Global Command-line Options, Top
   1305 @chapter Using Netperf to Measure Bulk Data Transfer
   1306 
   1307 The most commonly measured aspect of networked system performance is
   1308 that of bulk or unidirectional transfer performance.  Everyone wants
   1309 to know how many bits or bytes per second they can push across the
   1310 network. The classic netperf convention for a bulk data transfer test
   1311 name is to tack a ``_STREAM'' suffix to a test name.
   1312 
   1313 @menu
   1314 * Issues in Bulk Transfer::     
   1315 * Options common to TCP UDP and SCTP tests::  
   1316 @end menu
   1317 
   1318 @node Issues in Bulk Transfer, Options common to TCP UDP and SCTP tests, Using Netperf to Measure Bulk Data Transfer, Using Netperf to Measure Bulk Data Transfer
   1319 @comment  node-name,  next,  previous,  up
   1320 @section Issues in Bulk Transfer
   1321 
   1322 There are any number of things which can affect the performance of a
   1323 bulk transfer test.  
   1324 
   1325 Certainly, absent compression, bulk-transfer tests can be limited by
   1326 the speed of the slowest link in the path from the source to the
   1327 destination.  If testing over a gigabit link, you will not see more
   1328 than a gigabit :) Such situations can be described as being
   1329 @dfn{network-limited} or @dfn{NIC-limited}.
   1330 
   1331 CPU utilization can also affect the results of a bulk-transfer test.
   1332 If the networking stack requires a certain number of instructions or
   1333 CPU cycles per KB of data transferred, and the CPU is limited in the
   1334 number of instructions or cycles it can provide, then the transfer can
   1335 be described as being @dfn{CPU-bound}.  
   1336 
   1337 A bulk-transfer test can be CPU bound even when netperf reports less
   1338 than 100% CPU utilization.  This can happen on an MP system where one
   1339 or more of the CPUs saturate at 100% but other CPUs remain idle.
   1340 Typically, a single flow of data, such as that from a single instance
   1341 of a netperf _STREAM test, cannot make use of much more than the power
   1342 of one CPU. Exceptions to this generally occur when netperf and/or
   1343 netserver run on CPU(s) other than the CPU(s) taking interrupts from
   1344 the NIC(s). In that case, one might see as much as two CPUs' worth of
   1345 processing being used to service the flow of data.
   1346 
   1347 Distance and the speed-of-light can affect performance for a
   1348 bulk-transfer; often this can be mitigated by using larger windows.
   1349 One common limit to the performance of a transport using window-based
   1350 flow-control is:
   1351 @example
   1352 Throughput <= WindowSize/RoundTripTime
   1353 @end example
   1354 This holds because the sender can only have a window's worth of data
   1355 outstanding on the network at any one time, and the soonest the sender
   1356 can receive a window update from the receiver is one RoundTripTime
   1357 (RTT).  TCP and SCTP are examples of such protocols.
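
To get a feel for the window limit, one can plug numbers into the
inequality.  The window size, round-trip time and target rate below
are assumed values chosen purely for illustration:

@example
window_bytes=65536   # assumed window size: 64KB
rtt_ms=100           # assumed round-trip time in milliseconds
max_bits_per_sec=$((window_bytes * 8 * 1000 / rtt_ms))
echo "throughput limit: $max_bits_per_sec bits/sec"

# Turned around: the window needed to sustain an assumed target rate
target_bits_per_sec=1000000000
needed_bytes=$((target_bits_per_sec / 8 * rtt_ms / 1000))
echo "window needed: $needed_bytes bytes"
@end example

So a 64KB window over a 100 millisecond path is limited to 5242880
bits per second no matter how fast the links are, and sustaining a
gigabit per second over that same path would take a window of
12500000 bytes.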
   1358 
   1359 Packet losses and their effects can be particularly bad for
   1360 performance.  This is especially true if the packet losses result in
   1361 retransmission timeouts for the protocol(s) involved.  By the time a
   1362 retransmission timeout has happened, the flow or connection has sat
   1363 idle for a considerable length of time.
   1364 
   1365 On many platforms, some variant on the @command{netstat} command can
   1366 be used to retrieve statistics about packet loss and
   1367 retransmission. For example:
   1368 @example
   1369 netstat -p tcp
   1370 @end example
   1371 will retrieve TCP statistics on the HP-UX Operating System.  On other
   1372 platforms, it may not be possible to retrieve statistics for a
   1373 specific protocol and something like:
   1374 @example
   1375 netstat -s
   1376 @end example
   1377 would be used instead.
   1378 
   1379 Many times, such network statistics are kept since the time the stack
   1380 started, and we are only really interested in statistics from when
   1381 netperf was running.  In such situations something along the lines of:
   1382 @example
   1383 netstat -p tcp > before
   1384 netperf -t TCP_mumble...
   1385 netstat -p tcp > after
   1386 @end example
   1387 is indicated.  The
   1388 @uref{ftp://ftp.cup.hp.com/dist/networking/tools/,beforeafter} utility
   1389 can be used to subtract the statistics in @file{before} from the
   1390 statistics in @file{after}:
   1391 @example
   1392 beforeafter before after > delta
   1393 @end example
   1394 and then one can look at the statistics in @file{delta}.  Beforeafter
   1395 is distributed in source form so one can compile it on the platform(s)
   1396 of interest. 
   1397 
   1398 If running a version 2.5.0 or later ``omni'' test under Linux one can
   1399 include either or both of:
   1400 @itemize
   1401 @item LOCAL_TRANSPORT_RETRANS
   1402 @item REMOTE_TRANSPORT_RETRANS
   1403 @end itemize
   1404 
   1405 in the values provided via a test-specific @option{-o}, @option{-O},
   1406 or @option{-k} output selection option and netperf will report the
   1407 retransmissions experienced on the data connection, as reported via a
   1408 @code{getsockopt(TCP_INFO)} call.  If confidence intervals have been
   1409 requested via the global @option{-I} or @option{-i} options, the
   1410 reported value(s) will be for the last iteration.  If the test is over
   1411 a protocol other than TCP, or on a platform other than Linux, the
   1412 results are undefined.
   1413 
   1414 While it was written with HP-UX's netstat in mind, the
   1415 @uref{ftp://ftp.cup.hp.com/dist/networking/briefs/annotated_netstat.txt,annotated
   1416 netstat} writeup may be helpful with other platforms as well.
   1417 
   1418 @node Options common to TCP UDP and SCTP tests,  , Issues in Bulk Transfer, Using Netperf to Measure Bulk Data Transfer
   1419 @comment  node-name,  next,  previous,  up
   1420 @section Options common to TCP UDP and SCTP tests
   1421 
   1422 Many ``test-specific'' options are actually common across the
   1423 different tests.  For those tests involving TCP, UDP and SCTP, whether
   1424 using the BSD Sockets or the XTI interface those common options
   1425 include:
   1426 
   1427 @table @code
   1428 @vindex -h, Test-specific
   1429 @item -h
   1430 Display the test-suite-specific usage string and exit.  For a TCP_ or
   1431 UDP_ test this will be the usage string from the source file
   1432 @file{nettest_bsd.c}.  For an XTI_ test, this will be the usage string from
   1433 the source file @file{nettest_xti.c}.  For an SCTP test, this will be the
   1434 usage string from the source file @file{nettest_sctp.c}.
   1435 
   1436 @item -H <optionspec>
   1437 Normally, the remote hostname|IP and address family information is
   1438 inherited from the settings for the control connection (e.g., global
   1439 command-line @option{-H}, @option{-4} and/or @option{-6} options).
   1440 The test-specific @option{-H} will override those settings for the
   1441 data (aka test) connection only.  Settings for the control connection
   1442 are left unchanged.
   1443 
   1444 @vindex -L, Test-specific
   1445 @item -L <optionspec>
   1446 The test-specific @option{-L} option is identical to the test-specific
   1447 @option{-H} option except it affects the local hostname|IP and address
   1448 family information.  As with its global command-line counterpart, this
   1449 is generally only useful when measuring through those evil, end-to-end
   1450 breaking things called firewalls.
   1451 
   1452 @vindex -m, Test-specific
   1453 @item -m bytes
   1454 Set the size of the buffer passed-in to the ``send'' calls of a
   1455 _STREAM test.  Note that this may have only an indirect effect on the
   1456 size of the packets sent over the network, and certain Layer 4
   1457 protocols do _not_ preserve or enforce message boundaries, so setting
   1458 @option{-m} for the send size does not necessarily mean the receiver
   1459 will receive that many bytes at any one time. By default the units are
   1460 bytes, but a suffix of ``G,'' ``M,'' or ``K'' will specify the units to
   1461 be 2^30 (GB), 2^20 (MB) or 2^10 (KB) respectively. A suffix of ``g,''
   1462 ``m'' or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
   1463 respectively. For example:
   1464 @example
   1465 @code{-m 32K}
   1466 @end example
   1467 will set the size to 32KB or 32768 bytes. [Default: the local send
   1468 socket buffer size for the connection - either the system's default or
   1469 the value set via the @option{-s} option.]
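
The suffix rules can be sketched as a small shell function.  The
function name @code{to_bytes} is hypothetical and this is not
netperf's own parser; it merely reproduces the arithmetic described
above for a single value:

@example
# to_bytes is a hypothetical helper, not part of netperf
to_bytes() (
    num=$(expr "$1" : '\([0-9]*\)')
    suffix=$(expr "$1" : '[0-9]*\(.*\)' || true)
    case "$suffix" in
        G) mult=1073741824 ;;   # 2^30
        M) mult=1048576 ;;      # 2^20
        K) mult=1024 ;;         # 2^10
        g) mult=1000000000 ;;   # 10^9
        m) mult=1000000 ;;      # 10^6
        k) mult=1000 ;;         # 10^3
        "") mult=1 ;;           # no suffix: plain bytes
        *) echo "unknown suffix: $suffix" >&2; exit 1 ;;
    esac
    echo $((num * mult))
)

to_bytes 32K
to_bytes 32k
@end example

The upper-case suffix gives 32768 bytes while the lower-case one gives
32000 bytes, a difference worth keeping in mind when comparing results
between runs.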
   1470 
   1471 @vindex -M, Test-specific
   1472 @item -M bytes
   1473 Set the size of the buffer passed-in to the ``recv'' calls of a
   1474 _STREAM test.  This will be an upper bound on the number of bytes
   1475 received per receive call. By default the units are bytes, but a suffix
   1476 of ``G,'' ``M,'' or ``K'' will specify the units to be 2^30 (GB), 2^20
   1477 (MB) or 2^10 (KB) respectively.  A suffix of ``g,'' ``m'' or ``k''
   1478 will specify units of 10^9, 10^6 or 10^3 bytes respectively. For
   1479 example:
   1480 @example
   1481 @code{-M 32K}
   1482 @end example
   1483 will set the size to 32KB or 32768 bytes. [Default: the remote receive
   1484 socket buffer size for the data connection - either the system's
   1485 default or the value set via the @option{-S} option.]
   1486 
   1487 @vindex -P, Test-specific
   1488 @item -P <optionspec>
   1489 Set the local and/or remote port numbers for the data connection.
   1490 
   1491 @vindex -s, Test-specific
   1492 @item -s <sizespec>
   1493 This option sets the local (netperf) send and receive socket buffer
   1494 sizes for the data connection to the value(s) specified.  Often, this
   1495 will affect the advertised and/or effective TCP or other window, but
   1496 on some platforms it may not. By default the units are bytes, but
   1497 a suffix of ``G,'' ``M,'' or ``K'' will specify the units to be 2^30
   1498 (GB), 2^20 (MB) or 2^10 (KB) respectively.  A suffix of ``g,'' ``m''
   1499 or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
   1500 respectively. For example:
   1501 @example
   1502 @code{-s 128K}
   1503 @end example
   1504 will request the local send and receive socket buffer sizes to be
   1505 128KB or 131072 bytes. 
   1506 
   1507 While the historic expectation is that setting the socket buffer size
   1508 has a direct effect on say the TCP window, today that may not hold
   1509 true for all stacks. Further, while the historic expectation is that
   1510 the value specified in a @code{setsockopt()} call will be the value returned
   1511 via a @code{getsockopt()} call, at least one stack is known to deliberately
   1512 ignore history.  When running under Windows a value of 0 may be used
   1513 which will be an indication to the stack the user wants to enable a
   1514 form of copy avoidance. [Default: -1 - use the system's default socket
   1515 buffer sizes]
   1516 
   1517 @vindex -S, Test-specific
   1518 @item -S <sizespec>
   1519 This option sets the remote (netserver) send and/or receive socket
   1520 buffer sizes for the data connection to the value(s) specified.
   1521 Often, this will affect the advertised and/or effective TCP or other
   1522 window, but on some platforms it may not. By default the units are
   1523 bytes, but a suffix of ``G,'' ``M,'' or ``K'' will specify the units to
   1524 be 2^30 (GB), 2^20 (MB) or 2^10 (KB) respectively.  A suffix of ``g,''
   1525 ``m'' or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
   1526 respectively.  For example:
   1527 @example
   1528 @code{-S 128K}
   1529 @end example
   1530 will request the remote send and receive socket buffer sizes to be
   1531 128KB or 131072 bytes. 
   1532 
   1533 While the historic expectation is that setting the socket buffer size
   1534 has a direct effect on say the TCP window, today that may not hold
   1535 true for all stacks.  Further, while the historic expectation is that
   1536 the value specified in a @code{setsockopt()} call will be the value returned
   1537 via a @code{getsockopt()} call, at least one stack is known to deliberately
   1538 ignore history.  When running under Windows a value of 0 may be used
   1539 which will be an indication to the stack the user wants to enable a
   1540 form of copy avoidance. [Default: -1 - use the system's default socket
   1541 buffer sizes]
   1542 
   1543 @vindex -4, Test-specific
   1544 @item -4
   1545 Set the local and remote address family for the data connection to
   1546 AF_INET - i.e., use IPv4 addressing only.  Just as with their global
   1547 command-line counterparts the last of the @option{-4}, @option{-6},
   1548 @option{-H} or @option{-L} option wins for their respective address
   1549 families.
   1550 
   1551 @vindex -6, Test-specific
   1552 @item -6
   1553 This option is identical to its @option{-4} cousin, but requests IPv6
   1554 addresses for the local and remote ends of the data connection.
   1555 
   1556 @end table
   1557 
   1558 
   1559 @menu
   1560 * TCP_STREAM::                  
   1561 * TCP_MAERTS::                  
   1562 * TCP_SENDFILE::                
   1563 * UDP_STREAM::                  
   1564 * XTI_TCP_STREAM::              
   1565 * XTI_UDP_STREAM::              
   1566 * SCTP_STREAM::                 
   1567 * DLCO_STREAM::                 
   1568 * DLCL_STREAM::                 
   1569 * STREAM_STREAM::               
   1570 * DG_STREAM::                   
   1571 @end menu
   1572 
   1573 @node TCP_STREAM, TCP_MAERTS, Options common to TCP UDP and SCTP tests, Options common to TCP UDP and SCTP tests
   1574 @subsection TCP_STREAM
   1575 
   1576 The TCP_STREAM test is the default test in netperf.  It is quite
   1577 simple, transferring some quantity of data from the system running
   1578 netperf to the system running netserver.  While time spent
   1579 establishing the connection is not included in the throughput
   1580 calculation, time spent flushing the last of the data to the remote at
   1581 the end of the test is.  This is how netperf knows that all the data
   1582 it sent was received by the remote.  In addition to the @ref{Options
   1583 common to TCP UDP and SCTP tests,options common to STREAM tests}, the
   1584 following test-specific options can be included to possibly alter the
   1585 behavior of the test:
   1586 
   1587 @table @code
   1588 @item -C
   1589 This option will set TCP_CORK mode on the data connection on those
   1590 systems where TCP_CORK is defined (typically Linux).  A full
   1591 description of TCP_CORK is beyond the scope of this manual, but in a
   1592 nutshell it forces sub-MSS sends to be buffered so every segment sent
   1593 is a full Maximum Segment Size (MSS) unless the application performs an
   1594 explicit flush operation or the connection is closed.  At present
   1595 netperf does not perform any explicit flush operations.  Setting
   1596 TCP_CORK may improve the bitrate of tests where the ``send size''
   1597 (@option{-m} option) is smaller than the MSS.  It should also improve
   1598 (make smaller) the service demand.
   1599 
   1600 The Linux tcp(7) manpage states that TCP_CORK cannot be used in
   1601 conjunction with TCP_NODELAY (set via the @option{-D} option); however,
   1602 netperf does not validate command-line options to enforce that.
   1603 
   1604 @item -D
   1605 This option will set TCP_NODELAY on the data connection on those
   1606 systems where TCP_NODELAY is defined.  This disables something known
   1607 as the Nagle Algorithm, which is intended to make the segments TCP
   1608 sends as large as reasonably possible.  Setting TCP_NODELAY for a
   1609 TCP_STREAM test should either have no effect when the send size
   1610 (@option{-m} option) is larger than the MSS or will decrease reported
   1611 bitrate and increase service demand when the send size is smaller than
   1612 the MSS.  This stems from TCP_NODELAY causing each sub-MSS send to be
   1613 its own TCP segment rather than being aggregated with other small
   1614 sends.  This means more trips up and down the protocol stack per KB of
   1615 data transferred, which means greater CPU utilization.
   1616 
   1617 If setting TCP_NODELAY with @option{-D} affects throughput and/or
   1618 service demand for tests where the send size (@option{-m}) is larger
   1619 than the MSS it suggests the TCP/IP stack's implementation of the
   1620 Nagle Algorithm _may_ be broken, perhaps interpreting the Nagle
   1621 Algorithm on a segment by segment basis rather than the proper user
   1622 send by user send basis.  However, a better test of this can be
   1623 achieved with the @ref{TCP_RR} test.
   1624 
   1625 @end table
   1626 
   1627 Here is an example of a basic TCP_STREAM test, in this case from a
   1628 Debian Linux (2.6 kernel) system to an HP-UX 11iv2 (HP-UX 11.23)
   1629 system:
   1630 
   1631 @example
   1632 $ netperf -H lag
   1633 TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
   1634 Recv   Send    Send                          
   1635 Socket Socket  Message  Elapsed              
   1636 Size   Size    Size     Time     Throughput  
   1637 bytes  bytes   bytes    secs.    10^6bits/sec  
   1638 
   1639  32768  16384  16384    10.00      80.42   
   1640 @end example
   1641 
   1642 We see that the default receive socket buffer size for the receiver
   1643 (lag - HP-UX 11.23) is 32768 bytes, and the default socket send buffer
   1644 size for the sender (Debian 2.6 kernel) is 16384 bytes.  However, Linux
   1645 does ``auto tuning'' of socket buffer and TCP window sizes, which
   1646 means the send socket buffer size may be different at the end of the
   1647 test than it was at the beginning.  This is addressed in the @ref{The
   1648 Omni Tests,omni tests} added in version 2.5.0 and @ref{Omni Output
   1649 Selection,output selection}.  Throughput is expressed as 10^6 (aka
   1650 Mega) bits per second, and the test ran for 10 seconds.  IPv4
   1651 addresses (AF_INET) were used.
   1652 
   1653 @node TCP_MAERTS, TCP_SENDFILE, TCP_STREAM, Options common to TCP UDP and SCTP tests
   1654 @comment  node-name,  next,  previous,  up
   1655 @subsection TCP_MAERTS
   1656 
   1657 A TCP_MAERTS (MAERTS is STREAM backwards) test is ``just like'' a
   1658 @ref{TCP_STREAM} test except the data flows from the netserver to the
   1659 netperf. The global command-line @option{-F} option is ignored for
   1660 this test type.  The test-specific command-line @option{-C} option is
   1661 ignored for this test type.
   1662 
   1663 Here is an example of a TCP_MAERTS test between the same two systems
   1664 as in the example for the @ref{TCP_STREAM} test.  This time we request
   1665 larger socket buffers with @option{-s} and @option{-S} options:
   1666 
   1667 @example
   1668 $ netperf -H lag -t TCP_MAERTS -- -s 128K -S 128K
   1669 TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
   1670 Recv   Send    Send                          
   1671 Socket Socket  Message  Elapsed              
   1672 Size   Size    Size     Time     Throughput  
   1673 bytes  bytes   bytes    secs.    10^6bits/sec  
   1674 
   1675 221184 131072 131072    10.03      81.14   
   1676 @end example
   1677 
   1678 Where we see that Linux, unlike HP-UX, may not return the same value
   1679 in a @code{getsockopt()} as was requested in the prior @code{setsockopt()}.
   1680 
   1681 This test is included more for benchmarking convenience than anything
   1682 else.
   1683 
   1684 @node TCP_SENDFILE, UDP_STREAM, TCP_MAERTS, Options common to TCP UDP and SCTP tests
   1685 @comment  node-name,  next,  previous,  up
   1686 @subsection TCP_SENDFILE
   1687 
   1688 The TCP_SENDFILE test is ``just like'' a @ref{TCP_STREAM} test except
   1689 netperf calls the platform's @code{sendfile()} instead of calling
   1690 @code{send()}.  Often this results in a @dfn{zero-copy} operation
   1691 where data is sent directly from the filesystem buffer cache.  This
   1692 _should_ result in lower CPU utilization and possibly higher
   1693 throughput.  If it does not, then you may want to contact your
   1694 vendor(s) because they have a problem on their hands.
   1695 
   1696 Zero-copy mechanisms may also alter the characteristics (size and
   1697 number of buffers per) of packets passed to the NIC.  In many stacks,
   1698 when a copy is performed, the stack can ``reserve'' space at the
   1699 beginning of the destination buffer for things like TCP, IP and Link
   1700 headers.  This then has the packet contained in a single buffer which
   1701 can be easier to DMA to the NIC.  When no copy is performed, there is
   1702 no opportunity to reserve space for headers and so a packet will be
   1703 contained in two or more buffers.
   1704 
   1705 As of some time before version 2.5.0, the @ref{Global Options,global
   1706 @option{-F} option} is no longer required for this test.  If it is not
   1707 specified, netperf will create a temporary file, which it will delete
   1708 at the end of the test.  If the @option{-F} option is specified it
   1709 must reference a file of at least the size of the send ring
   1710 (@pxref{Global Options,the global @option{-W} option}) multiplied by
   1711 the send size (@pxref{Options common to TCP UDP and SCTP tests,the
   1712 test-specific @option{-m} option}).  All other TCP-specific options
   1713 remain available and optional.
   1714 
   1715 In this first example:
   1716 @example
   1717 $ netperf -H lag -F ../src/netperf -t TCP_SENDFILE -- -s 128K -S 128K
   1718 TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
   1719 alloc_sendfile_buf_ring: specified file too small.
   1720 file must be larger than send_width * send_size
   1721 @end example
   1722 
   1723 we see what happens when the file is too small.  Here:
   1724 
   1725 @example
   1726 $ netperf -H lag -F /boot/vmlinuz-2.6.8-1-686 -t TCP_SENDFILE -- -s 128K -S 128K
   1727 TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
   1728 Recv   Send    Send                          
   1729 Socket Socket  Message  Elapsed              
   1730 Size   Size    Size     Time     Throughput  
   1731 bytes  bytes   bytes    secs.    10^6bits/sec  
   1732 
   1733 131072 221184 221184    10.02      81.83   
   1734 @end example
   1735 
   1736 we resolve that issue by selecting a larger file.
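
The size check behind that error message can be approximated ahead of
time.  The sketch below assumes the send ring width was left at its
documented default (see the global @option{-W} option) and takes the
socket buffer and send sizes from the successful run above; a
different @option{-W} or @option{-m} setting would change the result:

@example
socket_buffer=221184   # send socket size reported in the output above
send_size=221184       # send message size reported in the output above
send_width=$((socket_buffer / send_size + 1))   # documented -W default
required=$((send_width * send_size))
echo "the -F file must be larger than $required bytes"
@end example

Under those assumptions the file would need to be larger than 442368
bytes, which can be compared against the size reported by
@command{ls -l} for the candidate file.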
   1737 
   1738 
   1739 @node UDP_STREAM, XTI_TCP_STREAM, TCP_SENDFILE, Options common to TCP UDP and SCTP tests
   1740 @subsection UDP_STREAM
   1741 
   1742 A UDP_STREAM test is similar to a @ref{TCP_STREAM} test except UDP is
   1743 used as the transport rather than TCP.
   1744 
   1745 @cindex Limiting Bandwidth
   1746 A UDP_STREAM test has no end-to-end flow control - UDP provides none
   1747 and neither does netperf.  However, if you wish, you can configure
   1748 netperf with @code{--enable-intervals=yes} to enable the global
   1749 command-line @option{-b} and @option{-w} options to pace bursts of
   1750 traffic onto the network.
   1751 
   1752 This has a number of implications.
   1753 
   1754 The biggest of these implications is that the data which is sent might not
   1755 be received by the remote.  For this reason, the output of a
   1756 UDP_STREAM test shows both the sending and receiving throughput.  On
   1757 some platforms, it may be possible for the sending throughput to be
   1758 reported as a value greater than the maximum rate of the link.  This
   1759 is common when the CPU(s) are faster than the network and there is no
   1760 @dfn{intra-stack} flow-control.
   1761 
   1762 Here is an example of a UDP_STREAM test between two systems connected
   1763 by a 10 Gigabit Ethernet link:
   1764 @example
   1765 $ netperf -t UDP_STREAM -H 192.168.2.125 -- -m 32768
   1766 UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
   1767 Socket  Message  Elapsed      Messages                
   1768 Size    Size     Time         Okay Errors   Throughput
   1769 bytes   bytes    secs            #      #   10^6bits/sec
   1770 
   1771 124928   32768   10.00      105672      0    2770.20
   1772 135168           10.00      104844           2748.50
   1773 
   1774 @end example
   1775 
   1776 The first line of numbers are statistics from the sending (netperf)
   1777 side. The second line of numbers are from the receiving (netserver)
   1778 side.  In this case, 105672 - 104844 or 828 messages did not make it
   1779 all the way to the remote netserver process.
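
The same figures can be checked with a few lines of shell arithmetic.
The numbers are copied from the output above, with the elapsed time
rounded to whole seconds, so the computed rate is only approximate:

@example
sent=105672        # messages sent, first line of results
received=104844    # messages received, second line of results
msg_size=32768     # bytes per message, from the -m option
elapsed=10         # seconds, rounded from the reported 10.00
lost=$((sent - received))
loss_hundredths=$((lost * 10000 / sent))   # hundredths of a percent
recv_mbits=$((received * msg_size * 8 / elapsed / 1000000))
echo "lost $lost messages; received about $recv_mbits * 10^6 bits/sec"
@end example

This gives 828 lost messages, a loss rate of roughly 0.78%, and a
receive rate of about 2748 * 10^6 bits per second, in line with the
2748.50 netperf reported.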
   1780 
   1781 If the value of the @option{-m} option is larger than the local send
   1782 socket buffer size (@option{-s} option) netperf will likely abort with
   1783 an error message about how the send call failed:
   1784 
   1785 @example
   1786 $ netperf -t UDP_STREAM -H 192.168.2.125
   1787 UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
   1788 udp_send: data send error: Message too long
   1789 @end example
   1790 
   1791 If the value of the @option{-m} option is larger than the remote
   1792 socket receive buffer, the reported receive throughput will likely be
   1793 zero as the remote UDP will discard the messages as being too large to
   1794 fit into the socket buffer.
   1795 
   1796 @example
   1797 $ netperf -t UDP_STREAM -H 192.168.2.125 -- -m 65000 -S 32768
   1798 UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
   1799 Socket  Message  Elapsed      Messages                
   1800 Size    Size     Time         Okay Errors   Throughput
   1801 bytes   bytes    secs            #      #   10^6bits/sec
   1802 
   1803 124928   65000   10.00       53595      0    2786.99
   1804  65536           10.00           0              0.00
   1805 @end example
   1806 
   1807 The example above was between a pair of systems running a ``Linux''
   1808 kernel. Notice that the remote Linux system returned a value larger
   1809 than that passed-in to the @option{-S} option.  In fact, this value
   1810 was larger than the message size set with the @option{-m} option.
   1811 That the remote socket buffer size is reported as 65536 bytes would
   1812 suggest to any sane person that a message of 65000 bytes would fit,
   1813 but the socket isn't _really_ 65536 bytes, even though Linux is
   1814 telling us so.  Go figure.
   1815 
   1816 @node XTI_TCP_STREAM, XTI_UDP_STREAM, UDP_STREAM, Options common to TCP UDP and SCTP tests
   1817 @subsection XTI_TCP_STREAM
   1818 
   1819 An XTI_TCP_STREAM test is simply a @ref{TCP_STREAM} test using the XTI
   1820 rather than BSD Sockets interface.  The test-specific @option{-X
   1821 <devspec>} option can be used to specify the names of the local and/or
   1822 remote XTI device files, which are required by the @code{t_open()} call
   1823 made by netperf XTI tests.
   1824 
   1825 The XTI_TCP_STREAM test is only present if netperf was configured with
   1826 @code{--enable-xti=yes}.  The remote netserver must have also been
   1827 configured with @code{--enable-xti=yes}.
   1828 
   1829 @node XTI_UDP_STREAM, SCTP_STREAM, XTI_TCP_STREAM, Options common to TCP UDP and SCTP tests
   1830 @subsection XTI_UDP_STREAM
   1831 
   1832 An XTI_UDP_STREAM test is simply a @ref{UDP_STREAM} test using the XTI
   1833 rather than BSD Sockets interface.  The test-specific @option{-X
   1834 <devspec>} option can be used to specify the names of the local and/or
   1835 remote XTI device files, which are required by the @code{t_open()} call
   1836 made by netperf XTI tests.
   1837 
   1838 The XTI_UDP_STREAM test is only present if netperf was configured with
   1839 @code{--enable-xti=yes}. The remote netserver must have also been
   1840 configured with @code{--enable-xti=yes}.
   1841 
   1842 @node SCTP_STREAM, DLCO_STREAM, XTI_UDP_STREAM, Options common to TCP UDP and SCTP tests
   1843 @subsection SCTP_STREAM
   1844 
   1845 An SCTP_STREAM test is essentially a @ref{TCP_STREAM} test using SCTP
   1846 rather than TCP.  The @option{-D} option will set SCTP_NODELAY, which
   1847 is much like the TCP_NODELAY option for TCP.  The @option{-C} option
   1848 is not applicable to an SCTP test as there is no corresponding
   1849 SCTP_CORK option.  The author is still figuring-out what the
   1850 test-specific @option{-N} option does :)
   1851 
   1852 The SCTP_STREAM test is only present if netperf was configured with
   1853 @code{--enable-sctp=yes}. The remote netserver must have also been
   1854 configured with @code{--enable-sctp=yes}.
   1855 
   1856 @node DLCO_STREAM, DLCL_STREAM, SCTP_STREAM, Options common to TCP UDP and SCTP tests
   1857 @subsection DLCO_STREAM
   1858 
   1859 A DLPI Connection Oriented Stream (DLCO_STREAM) test is very similar
   1860 in concept to a @ref{TCP_STREAM} test.  Both use reliable,
   1861 connection-oriented protocols.  The DLPI test differs from the TCP
   1862 test in that its protocol operates only at the link-level and does not
   1863 include TCP-style segmentation and reassembly.  This last difference
means that the value passed in with the @option{-m} option must be
   1865 less than the interface MTU.  Otherwise, the @option{-m} and
   1866 @option{-M} options are just like their TCP/UDP/SCTP counterparts.
   1867 
   1868 Other DLPI-specific options include:
   1869 
   1870 @table @code
   1871 @item -D <devspec>
   1872 This option is used to provide the fully-qualified names for the local
   1873 and/or remote DLPI device files.  The syntax is otherwise identical to
   1874 that of a @dfn{sizespec}.
   1875 @item -p <ppaspec>
   1876 This option is used to specify the local and/or remote DLPI PPA(s).
   1877 The PPA is used to identify the interface over which traffic is to be
   1878 sent/received. The syntax of a @dfn{ppaspec} is otherwise the same as
   1879 a @dfn{sizespec}.
   1880 @item -s sap 
   1881 This option specifies the 802.2 SAP for the test.  A SAP is somewhat
   1882 like either the port field of a TCP or UDP header or the protocol
   1883 field of an IP header.  The specified SAP should not conflict with any
other active SAPs on the specified PPAs (@option{-p} option).
   1885 @item -w <sizespec>
   1886 This option specifies the local send and receive window sizes in units
   1887 of frames on those platforms which support setting such things.
   1888 @item -W <sizespec>
   1889 This option specifies the remote send and receive window sizes in
   1890 units of frames on those platforms which support setting such things.
   1891 @end table
   1892 
   1893 The DLCO_STREAM test is only present if netperf was configured with
   1894 @code{--enable-dlpi=yes}. The remote netserver must have also been
   1895 configured with @code{--enable-dlpi=yes}.
   1896 
   1897 
   1898 @node DLCL_STREAM, STREAM_STREAM, DLCO_STREAM, Options common to TCP UDP and SCTP tests
   1899 @subsection DLCL_STREAM
   1900 
   1901 A DLPI ConnectionLess Stream (DLCL_STREAM) test is analogous to a
   1902 @ref{UDP_STREAM} test in that both make use of unreliable/best-effort,
   1903 connection-less transports.  The DLCL_STREAM test differs from the
   1904 @ref{UDP_STREAM} test in that the message size (@option{-m} option) must
   1905 always be less than the link MTU as there is no IP-like fragmentation
   1906 and reassembly available and netperf does not presume to provide one.
   1907 
   1908 The test-specific command-line options for a DLCL_STREAM test are the
   1909 same as those for a @ref{DLCO_STREAM} test.
   1910 
   1911 The DLCL_STREAM test is only present if netperf was configured with
   1912 @code{--enable-dlpi=yes}. The remote netserver must have also been
   1913 configured with @code{--enable-dlpi=yes}.
   1914 
   1915 @node STREAM_STREAM, DG_STREAM, DLCL_STREAM, Options common to TCP UDP and SCTP tests
   1916 @comment  node-name,  next,  previous,  up
   1917 @subsection STREAM_STREAM
   1918 
   1919 A Unix Domain Stream Socket Stream test (STREAM_STREAM) is similar in
   1920 concept to a @ref{TCP_STREAM} test, but using Unix Domain sockets.  It is,
   1921 naturally, limited to intra-machine traffic.  A STREAM_STREAM test
   1922 shares the @option{-m}, @option{-M}, @option{-s} and @option{-S}
   1923 options of the other _STREAM tests.  In a STREAM_STREAM test the
   1924 @option{-p} option sets the directory in which the pipes will be
   1925 created rather than setting a port number.  The default is to create
   1926 the pipes in the system default for the @code{tempnam()} call.
   1927 
   1928 The STREAM_STREAM test is only present if netperf was configured with
   1929 @code{--enable-unixdomain=yes}. The remote netserver must have also been
   1930 configured with @code{--enable-unixdomain=yes}.
   1931 
   1932 @node DG_STREAM,  , STREAM_STREAM, Options common to TCP UDP and SCTP tests
   1933 @comment  node-name,  next,  previous,  up
   1934 @subsection DG_STREAM
   1935 
A Unix Domain Datagram Socket Stream test (DG_STREAM) is very much
   1937 like a @ref{TCP_STREAM} test except that message boundaries are preserved.
   1938 In this way, it may also be considered similar to certain flavors of
   1939 SCTP test which can also preserve message boundaries.
   1940 
   1941 All the options of a @ref{STREAM_STREAM} test are applicable to a DG_STREAM
   1942 test. 
   1943 
   1944 The DG_STREAM test is only present if netperf was configured with
   1945 @code{--enable-unixdomain=yes}. The remote netserver must have also been
   1946 configured with @code{--enable-unixdomain=yes}.
   1947 
   1948 
   1949 @node Using Netperf to Measure Request/Response , Using Netperf to Measure Aggregate Performance, Using Netperf to Measure Bulk Data Transfer, Top
   1950 @chapter Using Netperf to Measure Request/Response 
   1951 
   1952 Request/response performance is often overlooked, yet it is just as
   1953 important as bulk-transfer performance.  While things like larger
   1954 socket buffers and TCP windows, and stateless offloads like TSO and
   1955 LRO can cover a multitude of latency and even path-length sins, those
   1956 sins cannot easily hide from a request/response test.  The convention
   1957 for a request/response test is to have a _RR suffix.  There are
   1958 however a few ``request/response'' tests that have other suffixes.
   1959 
A request/response test, especially a synchronous, one transaction at
a time test such as those found by default in netperf, is particularly
   1962 sensitive to the path-length of the networking stack.  An _RR test can
   1963 also uncover those platforms where the NICs are strapped by default
   1964 with overbearing interrupt avoidance settings in an attempt to
   1965 increase the bulk-transfer performance (or rather, decrease the CPU
   1966 utilization of a bulk-transfer test).  This sensitivity is most acute
   1967 for small request and response sizes, such as the single-byte default
   1968 for a netperf _RR test.
   1969 
   1970 While a bulk-transfer test reports its results in units of bits or
   1971 bytes transferred per second, by default a mumble_RR test reports
   1972 transactions per second where a transaction is defined as the
   1973 completed exchange of a request and a response.  One can invert the
   1974 transaction rate to arrive at the average round-trip latency.  If one
   1975 is confident about the symmetry of the connection, the average one-way
   1976 latency can be taken as one-half the average round-trip latency. As of
   1977 version 2.5.0 (actually slightly before) netperf still does not do the
   1978 latter, but will do the former if one sets the verbosity to 2 for a
   1979 classic netperf test, or includes the appropriate @ref{Omni Output
   1980 Selectors,output selector} in an @ref{The Omni Tests,omni test}.  It
   1981 will also allow the user to switch the throughput units from
   1982 transactions per second to bits or bytes per second with the global
   1983 @option{-f} option.
   1984 
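As a rough illustration of the inversion described above, the
following shell sketch converts a transaction rate into an average
round-trip latency and, assuming the path is symmetric, into an
average one-way latency.  The function name @code{rr_latency} is
hypothetical and is not part of netperf; the sketch assumes only a
POSIX shell and @command{awk}.

```shell
# rr_latency RATE
# Hypothetical helper, not part of netperf.  Prints the average
# round-trip and one-way latency, in microseconds, implied by a
# transaction rate given in transactions per second.
rr_latency() {
    awk -v rate="$1" 'BEGIN {
        if (rate <= 0) {
            print "rate must be positive" > "/dev/stderr"
            exit 1
        }
        rtt = 1000000 / rate          # one transaction is one round trip
        # The one-way figure assumes a symmetric path.
        printf "%.1f %.1f\n", rtt, rtt / 2
    }'
}

# Rate taken from the TCP_RR example later in this chapter.
rr_latency 29150.15
```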
   1985 @menu
   1986 * Issues in Request/Response::  
   1987 * Options Common to TCP UDP and SCTP _RR tests::  
   1988 @end menu
   1989 
   1990 @node Issues in Request/Response, Options Common to TCP UDP and SCTP _RR tests, Using Netperf to Measure Request/Response , Using Netperf to Measure Request/Response
   1991 @comment  node-name,  next,  previous,  up
   1992 @section Issues in Request/Response
   1993 
   1994 Most if not all the @ref{Issues in Bulk Transfer} apply to
   1995 request/response.  The issue of round-trip latency is even more
   1996 important as netperf generally only has one transaction outstanding at
   1997 a time.
   1998 
   1999 A single instance of a one transaction outstanding _RR test should
   2000 _never_ completely saturate the CPU of a system.  If testing between
   2001 otherwise evenly matched systems, the symmetric nature of a _RR test
   2002 with equal request and response sizes should result in equal CPU
loading on both systems. However, this may not hold true on MP
systems, particularly if one binds netperf and netserver to CPUs
differently via the global @option{-T} option.
   2006 
   2007 For smaller request and response sizes packet loss is a bigger issue
   2008 as there is no opportunity for a @dfn{fast retransmit} or
   2009 retransmission prior to a retransmission timer expiring.
   2010 
   2011 Virtualization may considerably increase the effective path length of
   2012 a networking stack.  While this may not preclude achieving link-rate
   2013 on a comparatively slow link (eg 1 Gigabit Ethernet) on a _STREAM
   2014 test, it can show-up as measurably fewer transactions per second on an
   2015 _RR test.  However, this may still be masked by interrupt coalescing
   2016 in the NIC/driver.
   2017 
   2018 Certain NICs have ways to minimize the number of interrupts sent to
   2019 the host.  If these are strapped badly they can significantly reduce
   2020 the performance of something like a single-byte request/response test.
   2021 Such setups are distinguished by seriously low reported CPU utilization
   2022 and what seems like a low (even if in the thousands) transaction per
   2023 second rate.  Also, if you run such an OS/driver combination on faster
   2024 or slower hardware and do not see a corresponding change in the
   2025 transaction rate, chances are good that the driver is strapping the
   2026 NIC with aggressive interrupt avoidance settings.  Good for bulk
   2027 throughput, but bad for latency.
   2028 
   2029 Some drivers may try to automagically adjust the interrupt avoidance
settings.  If they are not terribly good at it, you will see
considerable run-to-run variation in reported transaction rates,
particularly if you ``mix-up'' _STREAM and _RR tests.
   2033 
   2034 
   2035 @node Options Common to TCP UDP and SCTP _RR tests,  , Issues in Request/Response, Using Netperf to Measure Request/Response
   2036 @comment  node-name,  next,  previous,  up
   2037 @section Options Common to TCP UDP and SCTP _RR tests
   2038 
   2039 Many ``test-specific'' options are actually common across the
   2040 different tests.  For those tests involving TCP, UDP and SCTP, whether
   2041 using the BSD Sockets or the XTI interface those common options
   2042 include:
   2043 
   2044 @table @code
   2045 @vindex -h, Test-specific
   2046 @item -h
   2047 Display the test-suite-specific usage string and exit.  For a TCP_ or
UDP_ test this will be the usage string from the source file
@file{src/nettest_bsd.c}.  For an XTI_ test, this will be the usage string
   2050 from the source file @file{src/nettest_xti.c}.  For an SCTP test, this
   2051 will be the usage string from the source file
   2052 @file{src/nettest_sctp.c}.
   2053 
   2054 @vindex -H, Test-specific
   2055 @item -H <optionspec>
   2056 Normally, the remote hostname|IP and address family information is
inherited from the settings for the control connection (eg global
command-line @option{-H}, @option{-4} and/or @option{-6} options).
   2059 The test-specific @option{-H} will override those settings for the
   2060 data (aka test) connection only.  Settings for the control connection
   2061 are left unchanged.  This might be used to cause the control and data
   2062 connections to take different paths through the network.
   2063 
   2064 @vindex -L, Test-specific
   2065 @item -L <optionspec>
   2066 The test-specific @option{-L} option is identical to the test-specific
   2067 @option{-H} option except it affects the local hostname|IP and address
   2068 family information.  As with its global command-line counterpart, this
is generally only useful when measuring through those evil, end-to-end
   2070 breaking things called firewalls.
   2071 
   2072 @vindex -P, Test-specific
   2073 @item -P <optionspec>
   2074 Set the local and/or remote port numbers for the data connection.
   2075 
   2076 @vindex -r, Test-specific
   2077 @item -r <sizespec>
   2078 This option sets the request (first value) and/or response (second
   2079 value) sizes for an _RR test. By default the units are bytes, but a
   2080 suffix of ``G,'' ``M,'' or ``K'' will specify the units to be 2^30
   2081 (GB), 2^20 (MB) or 2^10 (KB) respectively.  A suffix of ``g,'' ``m''
   2082 or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
   2083 respectively. For example:
   2084 @example
   2085 @code{-r 128,16K}
   2086 @end example
   2087 Will set the request size to 128 bytes and the response size to 16 KB
   2088 or 16384 bytes. [Default: 1 - a single-byte request and response ]
   2089 
   2090 @vindex -s, Test-specific
   2091 @item -s <sizespec>
   2092 This option sets the local (netperf) send and receive socket buffer
   2093 sizes for the data connection to the value(s) specified.  Often, this
   2094 will affect the advertised and/or effective TCP or other window, but
   2095 on some platforms it may not. By default the units are bytes, but a
   2096 suffix of ``G,'' ``M,'' or ``K'' will specify the units to be 2^30
   2097 (GB), 2^20 (MB) or 2^10 (KB) respectively.  A suffix of ``g,'' ``m''
   2098 or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
   2099 respectively. For example:
   2100 @example
   2101 @code{-s 128K}
   2102 @end example
Will request the local (netperf) send and receive socket buffer sizes
to be 128KB or 131072 bytes.
   2105 
   2106 While the historic expectation is that setting the socket buffer size
   2107 has a direct effect on say the TCP window, today that may not hold
   2108 true for all stacks.  When running under Windows a value of 0 may be
   2109 used which will be an indication to the stack the user wants to enable
   2110 a form of copy avoidance. [Default: -1 - use the system's default
   2111 socket buffer sizes]
   2112 
   2113 @vindex -S, Test-specific
   2114 @item -S <sizespec>
   2115 This option sets the remote (netserver) send and/or receive socket
   2116 buffer sizes for the data connection to the value(s) specified.
   2117 Often, this will affect the advertised and/or effective TCP or other
   2118 window, but on some platforms it may not. By default the units are
   2119 bytes, but a suffix of ``G,'' ``M,'' or ``K'' will specify the units
   2120 to be 2^30 (GB), 2^20 (MB) or 2^10 (KB) respectively.  A suffix of
   2121 ``g,'' ``m'' or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
   2122 respectively.  For example:
   2123 @example
   2124 @code{-S 128K}
   2125 @end example
   2126 Will request the remote (netserver) send and receive socket buffer
   2127 sizes to be 128KB or 131072 bytes.
   2128 
   2129 While the historic expectation is that setting the socket buffer size
   2130 has a direct effect on say the TCP window, today that may not hold
   2131 true for all stacks.  When running under Windows a value of 0 may be
   2132 used which will be an indication to the stack the user wants to enable
   2133 a form of copy avoidance.  [Default: -1 - use the system's default
   2134 socket buffer sizes]
   2135 
   2136 @vindex -4, Test-specific
   2137 @item -4
   2138 Set the local and remote address family for the data connection to
   2139 AF_INET - ie use IPv4 addressing only.  Just as with their global
   2140 command-line counterparts the last of the @option{-4}, @option{-6},
   2141 @option{-H} or @option{-L} option wins for their respective address
   2142 families.
   2143 
@vindex -6, Test-specific
   2145 @item -6
   2146 This option is identical to its @option{-4} cousin, but requests IPv6
   2147 addresses for the local and remote ends of the data connection.
   2148 
   2149 @end table
   2150 
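The suffix rules shared by the @option{-r}, @option{-s} and
@option{-S} options can be sketched in shell as follows.  This only
illustrates the arithmetic and is not netperf's own parsing code; the
helper name @code{size_to_bytes} is made up for this example, and it
handles a single value rather than a full comma-separated
@dfn{sizespec}.

```shell
# size_to_bytes VALUE
# Hypothetical helper, not part of netperf.  Expands one size value
# such as 128, 16K or 1m into bytes: upper-case suffixes are powers of
# two, lower-case suffixes are powers of ten, no suffix means bytes.
size_to_bytes() {
    case "$1" in
        *G) echo $(( ${1%G} * 1024 * 1024 * 1024 )) ;;
        *M) echo $(( ${1%M} * 1024 * 1024 )) ;;
        *K) echo $(( ${1%K} * 1024 )) ;;
        *g) echo $(( ${1%g} * 1000000000 )) ;;
        *m) echo $(( ${1%m} * 1000000 )) ;;
        *k) echo $(( ${1%k} * 1000 )) ;;
        *)  echo "$1" ;;
    esac
}

# The two values used in the examples above.
size_to_bytes 16K     # response size from "-r 128,16K"
size_to_bytes 128K    # socket buffer size from "-s 128K"
```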
   2151 @menu
   2152 * TCP_RR::                      
   2153 * TCP_CC::                      
   2154 * TCP_CRR::                     
   2155 * UDP_RR::                      
   2156 * XTI_TCP_RR::                  
   2157 * XTI_TCP_CC::                  
   2158 * XTI_TCP_CRR::                 
   2159 * XTI_UDP_RR::                  
   2160 * DLCL_RR::                     
   2161 * DLCO_RR::                     
   2162 * SCTP_RR::                     
   2163 @end menu
   2164 
   2165 @node TCP_RR, TCP_CC, Options Common to TCP UDP and SCTP _RR tests, Options Common to TCP UDP and SCTP _RR tests
   2166 @subsection TCP_RR
   2167 @cindex Measuring Latency
   2168 @cindex Latency, Request-Response
   2169 
   2170 A TCP_RR (TCP Request/Response) test is requested by passing a value
   2171 of ``TCP_RR'' to the global @option{-t} command-line option.  A TCP_RR
   2172 test can be thought-of as a user-space to user-space @code{ping} with
   2173 no think time - it is by default a synchronous, one transaction at a
   2174 time, request/response test.
   2175 
   2176 The transaction rate is the number of complete transactions exchanged
   2177 divided by the length of time it took to perform those transactions.
   2178 
   2179 If the two Systems Under Test are otherwise identical, a TCP_RR test
   2180 with the same request and response size should be symmetric - it
   2181 should not matter which way the test is run, and the CPU utilization
   2182 measured should be virtually the same on each system.  If not, it
   2183 suggests that the CPU utilization mechanism being used may have some,
   2184 well, issues measuring CPU utilization completely and accurately.
   2185 
   2186 Time to establish the TCP connection is not counted in the result.  If
   2187 you want connection setup overheads included, you should consider the
@ref{TCP_CC,TCP_CC} or @ref{TCP_CRR,TCP_CRR} tests.
   2189 
   2190 If specifying the @option{-D} option to set TCP_NODELAY and disable
   2191 the Nagle Algorithm increases the transaction rate reported by a
   2192 TCP_RR test, it implies the stack(s) over which the TCP_RR test is
   2193 running have a broken implementation of the Nagle Algorithm.  Likely
   2194 as not they are interpreting Nagle on a segment by segment basis
   2195 rather than a user send by user send basis.  You should contact your
   2196 stack vendor(s) to report the problem to them.
   2197 
   2198 Here is an example of two systems running a basic TCP_RR test over a
   2199 10 Gigabit Ethernet link:
   2200 
   2201 @example
   2202 netperf -t TCP_RR -H 192.168.2.125
   2203 TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
   2204 Local /Remote
   2205 Socket Size   Request  Resp.   Elapsed  Trans.
   2206 Send   Recv   Size     Size    Time     Rate         
   2207 bytes  Bytes  bytes    bytes   secs.    per sec   
   2208 
   2209 16384  87380  1        1       10.00    29150.15   
   2210 16384  87380 
   2211 @end example
   2212 
   2213 In this example the request and response sizes were one byte, the
   2214 socket buffers were left at their defaults, and the test ran for all
   2215 of 10 seconds.  The transaction per second rate was rather good for
   2216 the time :)
   2217 
   2218 @node TCP_CC, TCP_CRR, TCP_RR, Options Common to TCP UDP and SCTP _RR tests
   2219 @subsection TCP_CC
   2220 @cindex Connection Latency
   2221 @cindex Latency, Connection Establishment
   2222 
   2223 A TCP_CC (TCP Connect/Close) test is requested by passing a value of
   2224 ``TCP_CC'' to the global @option{-t} option.  A TCP_CC test simply
   2225 measures how fast the pair of systems can open and close connections
   2226 between one another in a synchronous (one at a time) manner.  While
   2227 this is considered an _RR test, no request or response is exchanged
   2228 over the connection.
   2229 
   2230 @cindex Port Reuse
   2231 @cindex TIME_WAIT
   2232 The issue of TIME_WAIT reuse is an important one for a TCP_CC test.
   2233 Basically, TIME_WAIT reuse is when a pair of systems churn through
   2234 connections fast enough that they wrap the 16-bit port number space in
   2235 less time than the length of the TIME_WAIT state.  While it is indeed
   2236 theoretically possible to ``reuse'' a connection in TIME_WAIT, the
   2237 conditions under which such reuse is possible are rather rare.  An
   2238 attempt to reuse a connection in TIME_WAIT can result in a non-trivial
   2239 delay in connection establishment.
   2240 
   2241 Basically, any time the connection churn rate approaches:
   2242 
   2243 Sizeof(clientportspace) / Lengthof(TIME_WAIT)
   2244 
   2245 there is the risk of TIME_WAIT reuse.  To minimize the chances of this
   2246 happening, netperf will by default select its own client port numbers
   2247 from the range of 5000 to 65535.  On systems with a 60 second
   2248 TIME_WAIT state, this should allow roughly 1000 transactions per
   2249 second.  The size of the client port space used by netperf can be
   2250 controlled via the test-specific @option{-p} option, which takes a
   2251 @dfn{sizespec} as a value setting the minimum (first value) and
   2252 maximum (second value) port numbers used by netperf at the client end.
   2253 
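The churn-rate limit above is simple enough to work out in the shell.
The sketch below divides the size of the client port space by the
length of TIME_WAIT, using the default port range of 5000 to 65535 and
a 60 second TIME_WAIT as inputs.  The function name
@code{max_churn_rate} is hypothetical, and integer division is used,
so the result is rounded down.

```shell
# max_churn_rate MIN_PORT MAX_PORT TIME_WAIT_SECONDS
# Hypothetical helper, not part of netperf.  Prints the connection
# rate, in connections per second, at which the client port space
# would wrap within one TIME_WAIT period.
max_churn_rate() {
    echo $(( ($2 - $1 + 1) / $3 ))
}

# Default netperf client port range with a 60 second TIME_WAIT.
max_churn_rate 5000 65535 60
```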
   2254 Since no requests or responses are exchanged during a TCP_CC test,
   2255 only the @option{-H}, @option{-L}, @option{-4} and @option{-6} of the
   2256 ``common'' test-specific options are likely to have an effect, if any,
   2257 on the results.  The @option{-s} and @option{-S} options _may_ have
   2258 some effect if they alter the number and/or type of options carried in
   2259 the TCP SYNchronize segments, such as Window Scaling or Timestamps.
   2260 The @option{-P} and @option{-r} options are utterly ignored.
   2261 
   2262 Since connection establishment and tear-down for TCP is not symmetric,
   2263 a TCP_CC test is not symmetric in its loading of the two systems under
   2264 test.
   2265 
   2266 @node TCP_CRR, UDP_RR, TCP_CC, Options Common to TCP UDP and SCTP _RR tests
   2267 @subsection TCP_CRR
   2268 @cindex Latency, Connection Establishment
   2269 @cindex Latency, Request-Response
   2270 
   2271 The TCP Connect/Request/Response (TCP_CRR) test is requested by
   2272 passing a value of ``TCP_CRR'' to the global @option{-t} command-line
   2273 option.  A TCP_CRR test is like a merger of a @ref{TCP_RR} and
   2274 @ref{TCP_CC} test which measures the performance of establishing a
   2275 connection, exchanging a single request/response transaction, and
   2276 tearing-down that connection.  This is very much like what happens in
   2277 an HTTP 1.0 or HTTP 1.1 connection when HTTP Keepalives are not used.
   2278 In fact, the TCP_CRR test was added to netperf to simulate just that.
   2279 
   2280 Since a request and response are exchanged the @option{-r},
   2281 @option{-s} and @option{-S} options can have an effect on the
   2282 performance.
   2283 
   2284 The issue of TIME_WAIT reuse exists for the TCP_CRR test just as it
   2285 does for the TCP_CC test.  Similarly, since connection establishment
   2286 and tear-down is not symmetric, a TCP_CRR test is not symmetric even
   2287 when the request and response sizes are the same.
   2288 
   2289 @node UDP_RR, XTI_TCP_RR, TCP_CRR, Options Common to TCP UDP and SCTP _RR tests
   2290 @subsection UDP_RR
   2291 @cindex Latency, Request-Response
   2292 @cindex Packet Loss
   2293 
   2294 A UDP Request/Response (UDP_RR) test is requested by passing a value
of ``UDP_RR'' to the global @option{-t} option.  It is very much the
   2296 same as a TCP_RR test except UDP is used rather than TCP.
   2297 
   2298 UDP does not provide for retransmission of lost UDP datagrams, and
   2299 netperf does not add anything for that either.  This means that if
   2300 _any_ request or response is lost, the exchange of requests and
   2301 responses will stop from that point until the test timer expires.
   2302 Netperf will not really ``know'' this has happened - the only symptom
   2303 will be a low transaction per second rate.  If @option{--enable-burst}
   2304 was included in the @code{configure} command and a test-specific
   2305 @option{-b} option used, the UDP_RR test will ``survive'' the loss of
   2306 requests and responses until the sum is one more than the value passed
   2307 via the @option{-b} option. It will though almost certainly run more
   2308 slowly.
   2309 
   2310 The netperf side of a UDP_RR test will call @code{connect()} on its
   2311 data socket and thenceforth use the @code{send()} and @code{recv()}
   2312 socket calls.  The netserver side of a UDP_RR test will not call
   2313 @code{connect()} and will use @code{recvfrom()} and @code{sendto()}
   2314 calls.  This means that even if the request and response sizes are the
   2315 same, a UDP_RR test is _not_ symmetric in its loading of the two
   2316 systems under test.
   2317 
   2318 Here is an example of a UDP_RR test between two otherwise
   2319 identical two-CPU systems joined via a 1 Gigabit Ethernet network:
   2320 
   2321 @example
   2322 $ netperf -T 1 -H 192.168.1.213 -t UDP_RR -c -C
   2323 UDP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.213 (192.168.1.213) port 0 AF_INET
   2324 Local /Remote
   2325 Socket Size   Request Resp.  Elapsed Trans.   CPU    CPU    S.dem   S.dem
   2326 Send   Recv   Size    Size   Time    Rate     local  remote local   remote
   2327 bytes  bytes  bytes   bytes  secs.   per sec  % I    % I    us/Tr   us/Tr
   2328 
   2329 65535  65535  1       1      10.01   15262.48   13.90  16.11  18.221  21.116
   2330 65535  65535 
   2331 @end example
   2332 
   2333 This example includes the @option{-c} and @option{-C} options to
   2334 enable CPU utilization reporting and shows the asymmetry in CPU
   2335 loading.  The @option{-T} option was used to make sure netperf and
   2336 netserver ran on a given CPU and did not move around during the test.
   2337 
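The service demand columns in the example can be approximated from the
other reported figures.  The sketch below assumes that service demand
is the CPU time consumed across all CPUs per transaction, that is, CPU
utilization times the number of CPUs divided by the transaction rate.
That formula, the function name @code{service_demand} and the CPU
count of two (taken from the description of the systems above) are
assumptions of this example.  Because the displayed utilization is
rounded, the results are expected to be close to, rather than
identical to, the reported S.dem values.

```shell
# service_demand CPU_PERCENT NUM_CPUS TRANS_PER_SEC
# Hypothetical helper, not part of netperf.  Prints an estimate of
# service demand in microseconds of CPU time per transaction.
service_demand() {
    awk -v util="$1" -v ncpu="$2" -v rate="$3" 'BEGIN {
        printf "%.2f\n", (util / 100) * ncpu * 1000000 / rate
    }'
}

# Local and remote figures from the UDP_RR example above.
service_demand 13.90 2 15262.48
service_demand 16.11 2 15262.48
```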
   2338 @node XTI_TCP_RR, XTI_TCP_CC, UDP_RR, Options Common to TCP UDP and SCTP _RR tests
   2339 @subsection XTI_TCP_RR
   2340 @cindex Latency, Request-Response
   2341 
   2342 An XTI_TCP_RR test is essentially the same as a @ref{TCP_RR} test only
   2343 using the XTI rather than BSD Sockets interface. It is requested by
   2344 passing a value of ``XTI_TCP_RR'' to the @option{-t} global
   2345 command-line option.
   2346 
   2347 The test-specific options for an XTI_TCP_RR test are the same as those
   2348 for a TCP_RR test with the addition of the @option{-X <devspec>} option to
   2349 specify the names of the local and/or remote XTI device file(s).
   2350 
   2351 @node XTI_TCP_CC, XTI_TCP_CRR, XTI_TCP_RR, Options Common to TCP UDP and SCTP _RR tests
   2352 @comment  node-name,  next,  previous,  up
   2353 @subsection XTI_TCP_CC
   2354 @cindex Latency, Connection Establishment
   2355 
   2356 An XTI_TCP_CC test is essentially the same as a @ref{TCP_CC,TCP_CC}
   2357 test, only using the XTI rather than BSD Sockets interface.
   2358 
   2359 The test-specific options for an XTI_TCP_CC test are the same as those
   2360 for a TCP_CC test with the addition of the @option{-X <devspec>} option to
   2361 specify the names of the local and/or remote XTI device file(s).
   2362 
   2363 @node XTI_TCP_CRR, XTI_UDP_RR, XTI_TCP_CC, Options Common to TCP UDP and SCTP _RR tests
   2364 @comment  node-name,  next,  previous,  up
   2365 @subsection XTI_TCP_CRR
   2366 @cindex Latency, Connection Establishment
   2367 @cindex Latency, Request-Response
   2368 
   2369 The XTI_TCP_CRR test is essentially the same as a
   2370 @ref{TCP_CRR,TCP_CRR} test, only using the XTI rather than BSD Sockets
   2371 interface.
   2372 
   2373 The test-specific options for an XTI_TCP_CRR test are the same as those
for a TCP_CRR test with the addition of the @option{-X <devspec>} option to
   2375 specify the names of the local and/or remote XTI device file(s).
   2376 
   2377 @node XTI_UDP_RR, DLCL_RR, XTI_TCP_CRR, Options Common to TCP UDP and SCTP _RR tests
   2378 @subsection XTI_UDP_RR
   2379 @cindex Latency, Request-Response
   2380 
   2381 An XTI_UDP_RR test is essentially the same as a UDP_RR test only using
   2382 the XTI rather than BSD Sockets interface.  It is requested by passing
   2383 a value of ``XTI_UDP_RR'' to the @option{-t} global command-line
   2384 option.
   2385 
   2386 The test-specific options for an XTI_UDP_RR test are the same as those
   2387 for a UDP_RR test with the addition of the @option{-X <devspec>}
   2388 option to specify the name of the local and/or remote XTI device
   2389 file(s).
   2390 
   2391 @node DLCL_RR, DLCO_RR, XTI_UDP_RR, Options Common to TCP UDP and SCTP _RR tests
   2392 @comment  node-name,  next,  previous,  up
   2393 @subsection DLCL_RR
   2394 @cindex Latency, Request-Response
   2395 
   2396 @node DLCO_RR, SCTP_RR, DLCL_RR, Options Common to TCP UDP and SCTP _RR tests
   2397 @comment  node-name,  next,  previous,  up
   2398 @subsection DLCO_RR
   2399 @cindex Latency, Request-Response
   2400 
   2401 @node SCTP_RR,  , DLCO_RR, Options Common to TCP UDP and SCTP _RR tests
   2402 @comment  node-name,  next,  previous,  up
   2403 @subsection SCTP_RR
   2404 @cindex Latency, Request-Response
   2405 
   2406 @node Using Netperf to Measure Aggregate Performance, Using Netperf to Measure Bidirectional Transfer, Using Netperf to Measure Request/Response , Top
   2407 @comment  node-name,  next,  previous,  up
   2408 @chapter Using Netperf to Measure Aggregate Performance
   2409 @cindex Aggregate Performance
   2410 @vindex --enable-burst, Configure
   2411 
   2412 Ultimately, @ref{Netperf4,Netperf4} will be the preferred benchmark to
   2413 use when one wants to measure aggregate performance because netperf
   2414 has no support for explicit synchronization of concurrent tests. Until
   2415 netperf4 is ready for prime time, one can make use of the heuristics
   2416 and procedures mentioned here for the 85% solution.
   2417 
   2418 There are a few ways to measure aggregate performance with netperf.
   2419 The first is to run multiple, concurrent netperf tests and can be
   2420 applied to any of the netperf tests.  The second is to configure
   2421 netperf with @code{--enable-burst} and is applicable to the TCP_RR
   2422 test. The third is a variation on the first.
   2423 
   2424 @menu
   2425 * Running Concurrent Netperf Tests::  
   2426 * Using --enable-burst::        
   2427 * Using --enable-demo::         
   2428 @end menu
   2429 
   2430 @node  Running Concurrent Netperf Tests, Using --enable-burst, Using Netperf to Measure Aggregate Performance, Using Netperf to Measure Aggregate Performance
   2431 @comment  node-name,  next,  previous,  up
   2432 @section Running Concurrent Netperf Tests
   2433 
   2434 @ref{Netperf4,Netperf4} is the preferred benchmark to use when one
   2435 wants to measure aggregate performance because netperf has no support
   2436 for explicit synchronization of concurrent tests.  This leaves
   2437 netperf2 results vulnerable to @dfn{skew} errors.
   2438 
   2439 However, since there are times when netperf4 is unavailable it may be
   2440 necessary to run netperf. The skew error can be minimized by making
   2441 use of the confidence interval functionality.  Then one simply
   2442 launches multiple tests from the shell using a @code{for} loop or the
   2443 like:
   2444 
   2445 @example
   2446 for i in 1 2 3 4
   2447 do
   2448 netperf -t TCP_STREAM -H tardy.cup.hp.com -i 10 -P 0 &
   2449 done
   2450 @end example
   2451 
   2452 which will run four, concurrent @ref{TCP_STREAM,TCP_STREAM} tests from
   2453 the system on which it is executed to tardy.cup.hp.com.  Each
   2454 concurrent netperf will iterate 10 times thanks to the @option{-i}
   2455 option and will omit the test banners (option @option{-P}) for
   2456 brevity.  The output looks something like this:
   2457 
   2458 @example
   2459  87380  16384  16384    10.03     235.15   
   2460  87380  16384  16384    10.03     235.09   
   2461  87380  16384  16384    10.03     235.38   
   2462  87380  16384  16384    10.03     233.96
   2463 @end example
   2464 
   2465 We can take the sum of the results and be reasonably confident that
   2466 the aggregate performance was 940 Mbits/s.  This method does not need
   2467 to be limited to one system speaking to one other system.  It can be
   2468 extended to one system talking to N other systems.  It could be as simple as:
@example
for host in foo bar baz bing
do
netperf -t TCP_STREAM -H $host -i 10 -P 0 &
done
@end example
A more complicated/sophisticated example can be found in
@file{doc/examples/runemomniagg2.sh}.
   2477 
   2478 If you see warnings about netperf not achieving the confidence
   2479 intervals, the best thing to do is to increase the number of
   2480 iterations with @option{-i} and/or increase the run length of each
   2481 iteration with @option{-l}.
   2482 
   2483 You can also enable local (@option{-c}) and/or remote (@option{-C})
   2484 CPU utilization:
   2485 
   2486 @example
   2487 for i in 1 2 3 4
   2488 do
   2489 netperf -t TCP_STREAM -H tardy.cup.hp.com -i 10 -P 0 -c -C &
   2490 done
   2491 
   2492 87380  16384  16384    10.03       235.47   3.67     5.09     10.226  14.180 
   2493 87380  16384  16384    10.03       234.73   3.67     5.09     10.260  14.225 
   2494 87380  16384  16384    10.03       234.64   3.67     5.10     10.263  14.231 
   2495 87380  16384  16384    10.03       234.87   3.67     5.09     10.253  14.215
   2496 @end example
   2497 
If the CPU utilizations reported for the same system are the same or
very close, you can be reasonably confident that skew error is
minimized.  Presumably one could then omit @option{-i} but that is
not advised, particularly when/if the CPU utilization approaches 100
percent.  In the example above we see that the CPU utilization on the
local system remains the same for all four tests, and is only off by
0.01 out of 5.09 on the remote system.  As the number of CPUs in the
system increases, and so too the odds of saturating a single CPU,
similar CPU utilization becomes a less reliable sign that skew error
is small.  This is also the case for those increasingly rare single
CPU systems if the utilization is reported as 100% or very close to
it.
   2510 
   2511 @quotation
   2512 @b{NOTE: It is very important to remember that netperf is calculating
   2513 system-wide CPU utilization.  When calculating the service demand
   2514 (those last two columns in the output above) each netperf assumes it
   2515 is the only thing running on the system.  This means that for
   2516 concurrent tests the service demands reported by netperf will be
   2517 wrong.  One has to compute service demands for concurrent tests by
   2518 hand.}
   2519 @end quotation
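
As a rough sketch of that hand computation: each netperf divides the
same system-wide CPU consumption by only its own throughput.  If one
assumes every instance measured the same CPU utilization, the
aggregate service demand is then the reciprocal of the sum of the
reciprocals of the individual service demands.  The helper name
@code{aggregate_sd} is hypothetical and not part of netperf, and the
use of @code{bc} is simply one convenient way to do the arithmetic:

@example
# aggregate_sd: hypothetical helper, not part of netperf.
# Reads one per-instance service demand per line on stdin and prints
# 1 / (1/sd1 + 1/sd2 + ...).  Assumes every instance measured the
# same system-wide CPU utilization.
aggregate_sd() (
  terms=$(sed 's|^|1/|' | paste -sd+ -)
  echo "scale=6; 1/($terms)" | bc -l
)

# local service demands from the four concurrent tests above
printf '%s\n' 10.226 10.260 10.263 10.253 | aggregate_sd
@end example

For the local service demands shown above this works out to roughly
2.56, in the same units netperf uses for service demand, or about one
quarter of what each individual netperf reported.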
   2520 
   2521 If you wish you can add a unique, global @option{-B} option to each
   2522 command line to append the given string to the output:
   2523 
   2524 @example
   2525 for i in 1 2 3 4
   2526 do
   2527 netperf -t TCP_STREAM -H tardy.cup.hp.com -B "this is test $i" -i 10 -P 0 &
   2528 done
   2529 
   2530 87380  16384  16384    10.03     234.90   this is test 4
   2531 87380  16384  16384    10.03     234.41   this is test 2
   2532 87380  16384  16384    10.03     235.26   this is test 1
   2533 87380  16384  16384    10.03     235.09   this is test 3
   2534 @end example
   2535 
You will notice that the tests completed in an order other than the
one in which they were started from the shell.  This underscores why
there is a threat of skew error and why netperf4 will eventually be
the preferred tool for aggregate tests.  Even if you see the Netperf
Contributing Editor acting to the contrary!-)
   2541 
   2542 @menu
   2543 * Issues in Running Concurrent Tests::  
   2544 @end menu
   2545 
   2546 @node Issues in Running Concurrent Tests,  , Running Concurrent Netperf Tests, Running Concurrent Netperf Tests
   2547 @subsection Issues in Running Concurrent Tests
   2548 
   2549 In addition to the aforementioned issue of skew error, there can be
   2550 other issues to consider when running concurrent netperf tests.
   2551 
   2552 For example, when running concurrent tests over multiple interfaces,
   2553 one is not always assured that the traffic one thinks went over a
   2554 given interface actually did so.  In particular, the Linux networking
   2555 stack takes a particularly strong stance on its following the so
   2556 called @samp{weak end system model}.  As such, it is willing to answer
   2557 ARP requests for any of its local IP addresses on any of its
   2558 interfaces.  If multiple interfaces are connected to the same
   2559 broadcast domain, then even if they are configured into separate IP
   2560 subnets there is no a priori way of knowing which interface was
   2561 actually used for which connection(s).  This can be addressed by
   2562 setting the @samp{arp_ignore} sysctl before configuring interfaces.
   2563 
As it is quite important, we will repeat that each concurrent netperf
instance is calculating system-wide CPU utilization.  When calculating
the service demand each netperf assumes it is the only thing running
on the system.  This means that for concurrent tests the service
demands reported by netperf @b{will be wrong}.  One has to compute
service demands for concurrent tests by hand.
   2571 
   2572 Running concurrent tests can also become difficult when there is no
   2573 one ``central'' node.  Running tests between pairs of systems may be
   2574 more difficult, calling for remote shell commands in the for loop
   2575 rather than netperf commands.  This introduces more skew error, which
   2576 the confidence intervals may not be able to sufficiently mitigate.
   2577 One possibility is to actually run three consecutive netperf tests on
   2578 each node - the first being a warm-up, the last being a cool-down.
   2579 The idea then is to ensure that the time it takes to get all the
   2580 netperfs started is less than the length of the first netperf command
   2581 in the sequence of three.  Similarly, it assumes that all ``middle''
   2582 netperfs will complete before the first of the ``last'' netperfs
   2583 complete.
   2584 
   2585 @node  Using --enable-burst, Using --enable-demo, Running Concurrent Netperf Tests, Using Netperf to Measure Aggregate Performance
   2586 @comment  node-name,  next,  previous,  up
   2587 @section Using - -enable-burst
   2588 
   2589 Starting in version 2.5.0 @code{--enable-burst=yes} is the default,
   2590 which means one no longer must:
   2591 
   2592 @example
   2593 configure --enable-burst
   2594 @end example
   2595 
to have burst-mode functionality present in netperf.  This enables a
test-specific @option{-b num} option in @ref{TCP_RR,TCP_RR},
@ref{UDP_RR,UDP_RR} and @ref{The Omni Tests,omni} tests.
   2599 
   2600 Normally, netperf will attempt to ramp-up the number of outstanding
   2601 requests to @option{num} plus one transactions in flight at one time.
   2602 The ramp-up is to avoid transactions being smashed together into a
   2603 smaller number of segments when the transport's congestion window (if
   2604 any) is smaller at the time than what netperf wants to have
   2605 outstanding at one time. If, however, the user specifies a negative
   2606 value for @option{num} this ramp-up is bypassed and the burst of sends
   2607 is made without consideration of transport congestion window.
   2608 
   2609 This burst-mode is used as an alternative to or even in conjunction
   2610 with multiple-concurrent _RR tests and as a way to implement a
   2611 single-connection, bidirectional bulk-transfer test.  When run with
   2612 just a single instance of netperf, increasing the burst size can
   2613 determine the maximum number of transactions per second which can be
   2614 serviced by a single process:
   2615 
   2616 @example
   2617 for b in 0 1 2 4 8 16 32
   2618 do 
   2619  netperf -v 0 -t TCP_RR -B "-b $b" -H hpcpc108 -P 0 -- -b $b
   2620 done
   2621 
   2622 9457.59 -b 0
   2623 9975.37 -b 1
   2624 10000.61 -b 2
   2625 20084.47 -b 4
   2626 29965.31 -b 8
   2627 71929.27 -b 16
   2628 109718.17 -b 32
   2629 @end example
   2630 
The global @option{-v} and @option{-P} options were used to minimize
the output to the single figure of merit, which in this case is the
transaction rate.  The global @code{-B} option was used to more
clearly label the output, and the test-specific @option{-b} option
enabled by @code{--enable-burst} was used to increase the number of
transactions in flight at one time.
   2637 
   2638 Now, since the test-specific @option{-D} option was not specified to
   2639 set TCP_NODELAY, the stack was free to ``bundle'' requests and/or
   2640 responses into TCP segments as it saw fit, and since the default
   2641 request and response size is one byte, there could have been some
   2642 considerable bundling even in the absence of transport congestion
   2643 window issues.  If one wants to try to achieve a closer to
   2644 one-to-one correspondence between a request and response and a TCP
   2645 segment, add the test-specific @option{-D} option:
   2646 
   2647 @example
   2648 for b in 0 1 2 4 8 16 32
   2649 do
   2650  netperf -v 0 -t TCP_RR -B "-b $b -D" -H hpcpc108 -P 0 -- -b $b -D
   2651 done
   2652 
   2653  8695.12 -b 0 -D
   2654  19966.48 -b 1 -D
   2655  20691.07 -b 2 -D
   2656  49893.58 -b 4 -D
   2657  62057.31 -b 8 -D
   2658  108416.88 -b 16 -D
   2659  114411.66 -b 32 -D
   2660 @end example
   2661 
   2662 You can see that this has a rather large effect on the reported
   2663 transaction rate.  In this particular instance, the author believes it
   2664 relates to interactions between the test and interrupt coalescing
   2665 settings in the driver for the NICs used.
   2666 
   2667 @quotation
@b{NOTE: Even if you set the @option{-D} option that is still not a
guarantee that each transaction is in its own TCP segment.  You
should get into the habit of verifying the relationship between the
transaction rate and the packet rate via other means.}
   2672 @end quotation
   2673 
   2674 You can also combine @code{--enable-burst} functionality with
   2675 concurrent netperf tests.  This would then be an ``aggregate of
   2676 aggregates'' if you like:
   2677 
   2678 @example
   2679 
   2680 for i in 1 2 3 4
   2681 do
   2682  netperf -H hpcpc108 -v 0 -P 0 -i 10 -B "aggregate $i -b 8 -D" -t TCP_RR -- -b 8 -D &
   2683 done
   2684 
   2685  46668.38 aggregate 4 -b 8 -D
   2686  44890.64 aggregate 2 -b 8 -D
   2687  45702.04 aggregate 1 -b 8 -D
   2688  46352.48 aggregate 3 -b 8 -D
   2689 
   2690 @end example
   2691 
   2692 Since each netperf did hit the confidence intervals, we can be
   2693 reasonably certain that the aggregate transaction per second rate was
   2694 the sum of all four concurrent tests, or something just shy of 184,000
   2695 transactions per second.  To get some idea if that was also the packet
   2696 per second rate, we could bracket that @code{for} loop with something
   2697 to gather statistics and run the results through
   2698 @uref{ftp://ftp.cup.hp.com/dist/networking/tools,beforeafter}:
   2699 
   2700 @example
   2701 /usr/sbin/ethtool -S eth2 > before
   2702 for i in 1 2 3 4
   2703 do
   2704  netperf -H 192.168.2.108 -l 60 -v 0 -P 0 -B "aggregate $i -b 8 -D" -t TCP_RR -- -b 8 -D &
   2705 done
   2706 wait
   2707 /usr/sbin/ethtool -S eth2 > after
   2708 
   2709  52312.62 aggregate 2 -b 8 -D
   2710  50105.65 aggregate 4 -b 8 -D
   2711  50890.82 aggregate 1 -b 8 -D
   2712  50869.20 aggregate 3 -b 8 -D
   2713 
   2714 beforeafter before after > delta
   2715 
   2716 grep packets delta
   2717      rx_packets: 12251544
   2718      tx_packets: 12251550
   2719 
   2720 @end example
   2721 
   2722 This example uses @code{ethtool} because the system being used is
   2723 running Linux.  Other platforms have other tools - for example HP-UX
   2724 has lanadmin:
   2725 
   2726 @example
   2727 lanadmin -g mibstats <ppa>
   2728 @end example
   2729 
   2730 and of course one could instead use @code{netstat}.
   2731 
   2732 The @code{wait} is important because we are launching concurrent
   2733 netperfs in the background.  Without it, the second ethtool command
   2734 would be run before the tests finished and perhaps even before the
   2735 last of them got started!
   2736 
   2737 The sum of the reported transaction rates is 204178 over 60 seconds,
   2738 which is a total of 12250680 transactions.  Each transaction is the
   2739 exchange of a request and a response, so we multiply that by 2 to
   2740 arrive at 24501360.
   2741 
   2742 The sum of the ethtool stats is 24503094 packets which matches what
   2743 netperf was reporting very well. 
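
That comparison can be scripted.  The following is a minimal sketch
in which the helper name @code{packets_per_transaction} is
hypothetical and not part of netperf; it simply divides the packets
counted on the interface by the transactions netperf reported:

@example
# packets_per_transaction: hypothetical helper, not part of netperf.
# $1 = sum of the reported transaction rates (per second)
# $2 = test length in seconds
# $3 = rx_packets delta, $4 = tx_packets delta
packets_per_transaction() (
  echo "scale=4; ($3 + $4) / ($1 * $2)" | bc
)

# figures from the ethtool example above
packets_per_transaction 204178 60 12251544 12251550
@end example

A result very close to 2 suggests one packet per request and one per
response.  A noticeably smaller value would hint that requests or
responses were being bundled into fewer segments.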
   2744 
   2745 Had the request or response size differed, we would need to know how
   2746 it compared with the @dfn{MSS} for the connection.
   2747 
Just for grins, here is the exercise repeated, using @code{netstat}
instead of @code{ethtool}:
   2750 
@example
netstat -s -t > before
for i in 1 2 3 4
do
 netperf -l 60 -H 192.168.2.108 -v 0 -P 0 -B "aggregate $i -b 8 -D" -t TCP_RR -- -b 8 -D &
done
wait
netstat -s -t > after

 51305.88 aggregate 4 -b 8 -D
 51847.73 aggregate 2 -b 8 -D
 50648.19 aggregate 3 -b 8 -D
 53605.86 aggregate 1 -b 8 -D

beforeafter before after > delta

grep segments delta
    12445708 segments received
    12445730 segments send out
    1 segments retransmited
    0 bad segments received.
@end example
   2772 
   2773 The sums are left as an exercise to the reader :)
   2774 
Things become considerably more complicated if there are non-trivial
packet losses and/or retransmissions.
   2777 
   2778 Of course all this checking is unnecessary if the test is a UDP_RR
   2779 test because UDP ``never'' aggregates multiple sends into the same UDP
   2780 datagram, and there are no ACKnowledgements in UDP.  The loss of a
   2781 single request or response will not bring a ``burst'' UDP_RR test to a
   2782 screeching halt, but it will reduce the number of transactions
   2783 outstanding at any one time.  A ``burst'' UDP_RR test @b{will} come to a
   2784 halt if the sum of the lost requests and responses reaches the value
   2785 specified in the test-specific @option{-b} option.
   2786 
   2787 @node Using --enable-demo,  , Using --enable-burst, Using Netperf to Measure Aggregate Performance
   2788 @section Using - -enable-demo
   2789 
   2790 One can
   2791 @example
   2792 configure --enable-demo
   2793 @end example
   2794 and compile netperf to enable netperf to emit ``interim results'' at
   2795 semi-regular intervals.  This enables a global @code{-D} option which
   2796 takes a reporting interval as an argument.  With that specified, the
   2797 output of netperf will then look something like
   2798 
   2799 @example
   2800 $ src/netperf -D 1.25
   2801 MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain () port 0 AF_INET : demo
   2802 Interim result: 25425.52 10^6bits/s over 1.25 seconds ending at 1327962078.405
   2803 Interim result: 25486.82 10^6bits/s over 1.25 seconds ending at 1327962079.655
   2804 Interim result: 25474.96 10^6bits/s over 1.25 seconds ending at 1327962080.905
   2805 Interim result: 25523.49 10^6bits/s over 1.25 seconds ending at 1327962082.155
   2806 Interim result: 25053.57 10^6bits/s over 1.27 seconds ending at 1327962083.429
   2807 Interim result: 25349.64 10^6bits/s over 1.25 seconds ending at 1327962084.679
   2808 Interim result: 25292.84 10^6bits/s over 1.25 seconds ending at 1327962085.932
   2809 Recv   Send    Send                          
   2810 Socket Socket  Message  Elapsed              
   2811 Size   Size    Size     Time     Throughput  
   2812 bytes  bytes   bytes    secs.    10^6bits/sec  
   2813 
   2814  87380  16384  16384    10.00    25375.66   
   2815 @end example
   2816 The units of the ``Interim result'' lines will follow the units
   2817 selected via the global @code{-f} option.  If the test-specific
   2818 @code{-o} option is specified on the command line, the format will be
   2819 CSV:
   2820 @example
   2821 ...
   2822 2978.81,MBytes/s,1.25,1327962298.035
   2823 ...
   2824 @end example
   2825 If the test-specific @code{-k} option is used the format will be
   2826 keyval with each keyval being given an index:
   2827 @example
   2828 ...
   2829 NETPERF_INTERIM_RESULT[2]=25.00
   2830 NETPERF_UNITS[2]=10^9bits/s
   2831 NETPERF_INTERVAL[2]=1.25
   2832 NETPERF_ENDING[2]=1327962357.249
   2833 ...
   2834 @end example
   2835 The expectation is it may be easier to utilize the keyvals if they
   2836 have indices.
   2837 
   2838 But how does this help with aggregate tests?  Well, what one can do is
   2839 start the netperfs via a script, giving each a Very Long (tm) run
   2840 time.  Direct the output to a file per instance.  Then, once all the
   2841 netperfs have been started, take a timestamp and wait for some desired
   2842 test interval.  Once that interval expires take another timestamp and
   2843 then start terminating the netperfs by sending them a SIGALRM signal
   2844 via the likes of the @code{kill} or @code{pkill} command.  The
   2845 netperfs will terminate and emit the rest of the ``usual'' output, and
   2846 you can then bring the files to a central location for post
   2847 processing to find the aggregate performance over the ``test interval.''  
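
The procedure above might be scripted along the following lines.
This is only a sketch: the function name @code{demo_aggregate}, the
@code{NETPERF} variable, the file names and the one-hour run time are
all assumptions made for illustration.  It also assumes a @code{date}
which understands @code{+%s}, and a netperf which was configured with
@code{--enable-demo} so that the global @option{-D} option is
accepted:

@example
NETPERF=netperf    # path to the netperf binary, adjust as needed

# demo_aggregate: hypothetical helper, not part of netperf.
# $1 = remote host, $2 = number of netperfs, $3 = test interval (secs)
demo_aggregate() (
  pids=""
  i=1
  while [ "$i" -le "$2" ]
  do
    # one output file per instance; -l 3600 stands in for "Very Long"
    "$NETPERF" -H "$1" -t TCP_STREAM -D 1 -l 3600 > "netperf.$i.out" &
    pids="$pids $!"
    i=$((i + 1))
  done
  date +%s > start.stamp    # every netperf has been started
  sleep "$3"
  date +%s > end.stamp      # the test interval has expired
  kill -ALRM $pids          # netperfs finish up and emit final output
  wait
)
@end example

One would then run something like
@code{demo_aggregate tardy.cup.hp.com 4 60} and post-process the
interim results whose ``ending at'' times fall between the two
timestamps.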
   2848 
   2849 This method has the advantage that it does not require advance
   2850 knowledge of how long it takes to get netperf tests started and/or
   2851 stopped.  It does though require sufficiently synchronized clocks on
   2852 all the test systems.
   2853 
While calls to get the current time can be inexpensive, that neither
has been nor is universally true.  For that reason netperf tries to
minimize the number of ``timestamping'' calls (eg
@code{gettimeofday}) it makes when in demo mode.  Rather than take a
timestamp after each @code{send} or @code{recv} call completes,
netperf tries to guess how many units of work will be performed over
the desired interval.  Only once that many units of work have been
completed will netperf check the time.  If the reporting interval has
passed, netperf will emit an ``interim result.''  If the interval has
not passed, netperf will update its estimate for units and continue.
   2864 
   2865 After a bit of thought one can see that if things ``speed-up'' netperf
   2866 will still honor the interval.  However, if things ``slow-down''
   2867 netperf may be late with an ``interim result.''  Here is an example of
   2868 both of those happening during a test - with the interval being
   2869 honored while throughput increases, and then about half-way through
   2870 when another netperf (not shown) is started we see things slowing down
   2871 and netperf not hitting the interval as desired.
   2872 @example
   2873 $ src/netperf -D 2 -H tardy.hpl.hp.com -l 20
   2874 MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to tardy.hpl.hp.com () port 0 AF_INET : demo
   2875 Interim result:   36.46 10^6bits/s over 2.01 seconds ending at 1327963880.565
   2876 Interim result:   59.19 10^6bits/s over 2.00 seconds ending at 1327963882.569
   2877 Interim result:   73.39 10^6bits/s over 2.01 seconds ending at 1327963884.576
   2878 Interim result:   84.01 10^6bits/s over 2.03 seconds ending at 1327963886.603
   2879 Interim result:   75.63 10^6bits/s over 2.21 seconds ending at 1327963888.814
   2880 Interim result:   55.52 10^6bits/s over 2.72 seconds ending at 1327963891.538
   2881 Interim result:   70.94 10^6bits/s over 2.11 seconds ending at 1327963893.650
   2882 Interim result:   80.66 10^6bits/s over 2.13 seconds ending at 1327963895.777
   2883 Interim result:   86.42 10^6bits/s over 2.12 seconds ending at 1327963897.901
   2884 Recv   Send    Send                          
   2885 Socket Socket  Message  Elapsed              
   2886 Size   Size    Size     Time     Throughput  
   2887 bytes  bytes   bytes    secs.    10^6bits/sec  
   2888 
   2889  87380  16384  16384    20.34      68.87   
   2890 @end example
So long as your post-processing mechanism can account for that, there
should be no problem.  As time passes there may be changes to try to
improve netperf's honoring of the interval but one should not
ass-u-me it will always do so.  Nor should one assume the precision
will remain fixed - future versions may change it - perhaps going
beyond tenths of seconds in reporting the interval length etc.
   2897 
   2898 @node Using Netperf to Measure Bidirectional Transfer, The Omni Tests, Using Netperf to Measure Aggregate Performance, Top
   2899 @comment  node-name,  next,  previous,  up
   2900 @chapter Using Netperf to Measure Bidirectional Transfer
   2901 
   2902 There are two ways to use netperf to measure the performance of
   2903 bidirectional transfer.  The first is to run concurrent netperf tests
   2904 from the command line.  The second is to configure netperf with
   2905 @code{--enable-burst} and use a single instance of the
   2906 @ref{TCP_RR,TCP_RR} test.
   2907 
While neither method is more ``correct'' than the other, each measures
bidirectional transfer in a different way, and that has possible
implications.  For instance, using the concurrent netperf test
mechanism means that multiple TCP connections and multiple processes
are involved, whereas using the single instance of TCP_RR there is
only one TCP connection and one process on each end.  They may behave
differently, especially on an MP system.
   2915 
   2916 @menu
   2917 * Bidirectional Transfer with Concurrent Tests::  
   2918 * Bidirectional Transfer with TCP_RR::  
   2919 * Implications of Concurrent Tests vs Burst Request/Response::  
   2920 @end menu
   2921 
   2922 @node  Bidirectional Transfer with Concurrent Tests, Bidirectional Transfer with TCP_RR, Using Netperf to Measure Bidirectional Transfer, Using Netperf to Measure Bidirectional Transfer
   2923 @comment  node-name,  next,  previous,  up
   2924 @section Bidirectional Transfer with Concurrent Tests
   2925 
   2926 If we had two hosts Fred and Ethel, we could simply run a netperf
   2927 @ref{TCP_STREAM,TCP_STREAM} test on Fred pointing at Ethel, and a
   2928 concurrent netperf TCP_STREAM test on Ethel pointing at Fred, but
   2929 since there are no mechanisms to synchronize netperf tests and we
   2930 would be starting tests from two different systems, there is a
   2931 considerable risk of skew error.
   2932 
   2933 Far better would be to run simultaneous TCP_STREAM and
   2934 @ref{TCP_MAERTS,TCP_MAERTS} tests from just @b{one} system, using the
   2935 concepts and procedures outlined in @ref{Running Concurrent Netperf
   2936 Tests,Running Concurrent Netperf Tests}. Here then is an example:
   2937 
   2938 @example
   2939 for i in 1
   2940 do
   2941  netperf -H 192.168.2.108 -t TCP_STREAM -B "outbound" -i 10 -P 0 -v 0 \
   2942    -- -s 256K -S 256K &
   2943  netperf -H 192.168.2.108 -t TCP_MAERTS -B "inbound"  -i 10 -P 0 -v 0 \
   2944    -- -s 256K -S 256K &
   2945 done
   2946 
   2947  892.66 outbound
   2948  891.34 inbound
   2949 @end example
   2950 
We have used a @code{for} loop in the shell with just one iteration
because that makes it @b{much} easier to get both tests started at
more or less the same time than doing it by hand.  The global
@option{-P} and @option{-v} options are used because we aren't
interested in anything other than the throughput, and the global
@option{-B} option is used to tag each output so we know which was
inbound and which outbound relative to the system on which we were
running netperf.  Of course that sense is switched on the system
running netserver :)  The use of the global @option{-i} option is
explained in @ref{Running Concurrent Netperf Tests,Running Concurrent
Netperf Tests}.
   2961 
   2962 Beginning with version 2.5.0 we can accomplish a similar result with
   2963 the @ref{The Omni Tests,the omni tests} and @ref{Omni Output
   2964 Selectors,output selectors}:
   2965 
   2966 @example
   2967 for i in 1
   2968 do
   2969   netperf -H 192.168.1.3 -t omni -l 10 -P 0 -- \
   2970     -d stream -s 256K -S 256K -o throughput,direction &
   2971   netperf -H 192.168.1.3 -t omni -l 10 -P 0 -- \
   2972     -d maerts -s 256K -S 256K -o throughput,direction &
   2973 done
   2974 
   2975 805.26,Receive
   2976 828.54,Send
   2977 @end example
   2978 
   2979 @node  Bidirectional Transfer with TCP_RR, Implications of Concurrent Tests vs Burst Request/Response, Bidirectional Transfer with Concurrent Tests, Using Netperf to Measure Bidirectional Transfer
   2980 @comment  node-name,  next,  previous,  up
   2981 @section Bidirectional Transfer with TCP_RR
   2982 
   2983 Starting with version 2.5.0 the @code{--enable-burst} configure option
   2984 defaults to @code{yes}, and starting some time before version 2.5.0
   2985 but after 2.4.0 the global @option{-f} option would affect the
   2986 ``throughput'' reported by request/response tests.  If one uses the
   2987 test-specific @option{-b} option to have several ``transactions'' in
   2988 flight at one time and the test-specific @option{-r} option to
   2989 increase their size, the test looks more and more like a
   2990 single-connection bidirectional transfer than a simple
   2991 request/response test.
   2992 
   2993 So, putting it all together one can do something like:
   2994 
   2995 @example
   2996 netperf -f m -t TCP_RR -H 192.168.1.3 -v 2 -- -b 6 -r 32K -S 256K -S 256K
   2997 MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.3 (192.168.1.3) port 0 AF_INET : interval : first burst 6
   2998 Local /Remote
   2999 Socket Size   Request  Resp.   Elapsed  
   3000 Send   Recv   Size     Size    Time     Throughput 
   3001 bytes  Bytes  bytes    bytes   secs.    10^6bits/sec   
   3002 
   3003 16384  87380  32768    32768   10.00    1821.30   
   3004 524288 524288
   3005 Alignment      Offset         RoundTrip  Trans    Throughput
   3006 Local  Remote  Local  Remote  Latency    Rate     10^6bits/s
   3007 Send   Recv    Send   Recv    usec/Tran  per sec  Outbound   Inbound
   3008     8      0       0      0   2015.402   3473.252 910.492    910.492
   3009 @end example
   3010 
   3011 to get a bidirectional bulk-throughput result. As one can see, the -v
   3012 2 output will include a number of interesting, related values.
   3013 
   3014 @quotation
   3015 @b{NOTE: The logic behind @code{--enable-burst} is very simple, and there
   3016 are no calls to @code{poll()} or @code{select()} which means we want
   3017 to make sure that the @code{send()} calls will never block, or we run
   3018 the risk of deadlock with each side stuck trying to call @code{send()}
   3019 and neither calling @code{recv()}.}
   3020 @end quotation
   3021 
   3022 Fortunately, this is easily accomplished by setting a ``large enough''
   3023 socket buffer size with the test-specific @option{-s} and @option{-S}
   3024 options.  Presently this must be performed by the user.  Future
   3025 versions of netperf might attempt to do this automagically, but there
   3026 are some issues to be worked-out. 
   3027 
   3028 @node Implications of Concurrent Tests vs Burst Request/Response,  , Bidirectional Transfer with TCP_RR, Using Netperf to Measure Bidirectional Transfer
   3029 @section Implications of Concurrent Tests vs Burst Request/Response
   3030 
There are perhaps subtle but important differences between using
concurrent unidirectional tests vs a burst-mode request/response test
to measure bidirectional performance.
   3034 
   3035 Broadly speaking, a single ``connection'' or ``flow'' of traffic
   3036 cannot make use of the services of more than one or two CPUs at either
   3037 end.  Whether one or two CPUs will be used processing a flow will
   3038 depend on the specifics of the stack(s) involved and whether or not
   3039 the global @option{-T} option has been used to bind netperf/netserver
   3040 to specific CPUs.
   3041 
When using concurrent tests there will be two concurrent connections
or flows, which means that as many as four CPUs will be employed
processing the packets if the global @option{-T} option is used, and
no more than two if it is not.  With just a single, bidirectional
request/response test, no more than two CPUs will be employed (only
one if the global @option{-T} option is not used).
   3048 
   3049 If there is a CPU bottleneck on either system this may result in
   3050 rather different results between the two methods.
   3051 
   3052 Also, with a bidirectional request/response test there is something of
   3053 a natural balance or synchronization between inbound and outbound - a
   3054 response will not be sent until a request is received, and (once the
   3055 burst level is reached) a subsequent request will not be sent until a
   3056 response is received.  This may mask favoritism in the NIC between
   3057 inbound and outbound processing.
   3058 
   3059 With two concurrent unidirectional tests there is no such
   3060 synchronization or balance and any favoritism in the NIC may be exposed.
   3061 
   3062 @node The Omni Tests, Other Netperf Tests, Using Netperf to Measure Bidirectional Transfer, Top
   3063 @chapter The Omni Tests
   3064 
Beginning with version 2.5.0, netperf begins a migration to the
@samp{omni} tests or ``Two routines to measure them all.''  The code for
the omni tests can be found in @file{src/nettest_omni.c} and the goal
is to make it easier for netperf to support multiple protocols and
report a great many additional things about the systems under test.
Additionally, a flexible output selection mechanism is present which
allows the user to choose specifically what values she wishes to have
reported and in what format.
   3073 
   3074 The omni tests are included by default in version 2.5.0.  To disable
   3075 them, one must:
   3076 @example
   3077 ./configure --enable-omni=no ...
   3078 @end example
   3079 
   3080 and remake netperf.  Remaking netserver is optional because even in
   3081 2.5.0 it has ``unmigrated'' netserver side routines for the classic
   3082 (eg @file{src/nettest_bsd.c}) tests.
   3083 
   3084 @menu
   3085 * Native Omni Tests::           
   3086 * Migrated Tests::              
   3087 * Omni Output Selection::       
   3088 @end menu
   3089 
   3090 @node Native Omni Tests, Migrated Tests, The Omni Tests, The Omni Tests
   3091 @section Native Omni Tests
   3092 
One accesses the omni tests ``natively'' by using a value of ``OMNI''
with the global @option{-t} test-selection option.  This will then
cause netperf to use the code in @file{src/nettest_omni.c} and in
particular the test-specific options parser for the omni tests.  The
test-specific options for the omni tests are a superset of those for
``classic'' tests.  The options added by the omni tests are:
   3099 
   3100 @table @code
   3101 @vindex -c, Test-specific
   3102 @item -c
   3103 This explicitly declares that the test is to include connection
   3104 establishment and tear-down as in either a TCP_CRR or TCP_CC test.
   3105 
   3106 @vindex -d, Test-specific
   3107 @item -d <direction>
   3108 This option sets the direction of the test relative to the netperf
   3109 process.  As of version 2.5.0 one can use the following in a
   3110 case-insensitive manner:
   3111 
   3112 @table @code
   3113 @item send, stream, transmit, xmit or 2 
   3114 Any of which will cause netperf to send to the netserver.
   3115 @item recv, receive, maerts or 4
   3116 Any of which will cause netserver to send to netperf.
   3117 @item rr or 6
   3118 Either of which will cause a request/response test.
   3119 @end table
   3120 
Additionally, one can specify two directions separated by a '|'
character and they will be OR'ed together.  In this way one can use
the ``Send|Recv'' that will be emitted by the @ref{Omni Output
Selectors,DIRECTION} @ref{Omni Output Selection,output selector} when
used with a request/response test.
   3126 
   3127 @vindex -k, Test-specific
   3128 @item -k [@ref{Omni Output Selection,output selector}]
   3129 This option sets the style of output to ``keyval'' where each line of
   3130 output has the form:
   3131 @example
   3132 key=value
   3133 @end example
   3134 For example:
   3135 @example
   3136 $ netperf -t omni -- -d rr -k "THROUGHPUT,THROUGHPUT_UNITS"
   3137 OMNI TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
   3138 THROUGHPUT=59092.65
   3139 THROUGHPUT_UNITS=Trans/s
   3140 @end example
   3141 
   3142 Using the @option{-k} option will override any previous, test-specific
   3143 @option{-o} or @option{-O} option.
   3144 
   3145 @vindex -o, Test-specific
   3146 @item -o [@ref{Omni Output Selection,output selector}]
   3147 This option sets the style of output to ``CSV'' where there will be
   3148 one line of comma-separated values, preceded by one line of column
   3149 names unless the global @option{-P} option is used with a value of 0:
   3150 @example
   3151 $ netperf -t omni -- -d rr -o "THROUGHPUT,THROUGHPUT_UNITS"
   3152 OMNI TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
   3153 Throughput,Throughput Units
   3154 60999.07,Trans/s
   3155 @end example
   3156 
   3157 Using the @option{-o} option will override any previous, test-specific
   3158 @option{-k} or @option{-O} option.
   3159 
   3160 @vindex -O, Test-specific
   3161 @item -O [@ref{Omni Output Selection,output selector}]
   3162 This option sets the style of output to ``human readable'' which will
   3163 look quite similar to classic netperf output:
   3164 @example
   3165 $ netperf -t omni -- -d rr -O "THROUGHPUT,THROUGHPUT_UNITS"
   3166 OMNI TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
   3167 Throughput Throughput 
   3168            Units      
   3169                       
   3170                       
   3171 60492.57   Trans/s
   3172 @end example
   3173 
   3174 Using the @option{-O} option will override any previous, test-specific
   3175 @option{-k} or @option{-o} option.
   3176 
   3177 @vindex -t, Test-specific
   3178 @item -t
   3179 This option explicitly sets the socket type for the test's data
   3180 connection. As of version 2.5.0 the known socket types include
   3181 ``stream'' and ``dgram'' for SOCK_STREAM and SOCK_DGRAM respectively.
   3182 
   3183 @vindex -T, Test-specific
   3184 @item -T <protocol>
   3185 This option is used to explicitly set the protocol used for the
   3186 test. It is case-insensitive. As of version 2.5.0 the protocols known
   3187 to netperf include:
   3188 @table @code
   3189 @item TCP
   3190 Select the Transmission Control Protocol
   3191 @item UDP
   3192 Select the User Datagram Protocol
   3193 @item SDP
   3194 Select the Sockets Direct Protocol
   3195 @item DCCP
   3196 Select the Datagram Congestion Control Protocol
   3197 @item SCTP
Select the Stream Control Transmission Protocol
   3199 @item udplite
   3200 Select UDP Lite
   3201 @end table
   3202 
   3203 The default is implicit based on other settings.
   3204 @end table
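
The way the test-specific -d option combines directions can be sketched in a few lines of Python.  This is only an illustration of the values listed in the table above (send is 2, receive is 4, request/response is 6) and not a copy of the parser in netperf itself; the names parse_direction and direction_label are hypothetical.

```python
# Direction bit values as listed for the -d option: send=2, recv=4, rr=6.
SEND = 2
RECV = 4

# Spellings accepted by -d according to the table; matching is
# case-insensitive, so keys are stored in lower case.
DIRECTION_NAMES = dict([
    ("send", SEND), ("stream", SEND), ("transmit", SEND),
    ("xmit", SEND), ("2", SEND),
    ("recv", RECV), ("receive", RECV), ("maerts", RECV), ("4", RECV),
    ("rr", SEND | RECV), ("6", SEND | RECV),
])


def parse_direction(spec):
    """Return the OR of every '|'-separated direction named in spec."""
    value = 0
    for part in spec.split("|"):
        key = part.strip().lower()
        if key not in DIRECTION_NAMES:
            raise ValueError("unknown direction: %r" % part)
        value |= DIRECTION_NAMES[key]
    return value


def direction_label(value):
    """Render a direction value in the Send, Recv or Send|Recv form."""
    names = []
    if value & SEND:
        names.append("Send")
    if value & RECV:
        names.append("Recv")
    return "|".join(names)


print(direction_label(parse_direction("Send|Recv")))  # prints: Send|Recv
```

Because request/response is simply the OR of send and receive, giving ``rr'', ``6'' or ``Send|Recv'' yields the same value in this sketch.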
   3205 
   3206 The omni tests also extend the interpretation of some of the classic,
   3207 test-specific options for the BSD Sockets tests:
   3208 
   3209 @table @code
   3210 @item -m <optionspec>
   3211 This can set the send size for either or both of the netperf and
   3212 netserver sides of the test:
   3213 @example
   3214 -m 32K
   3215 @end example
   3216 sets only the netperf-side send size to 32768 bytes, and or's-in
   3217 transmit for the direction. This is effectively the same behaviour as
   3218 for the classic tests.
   3219 @example
   3220 -m ,32K
   3221 @end example
   3222 sets only the netserver side send size to 32768 bytes and or's-in
   3223 receive for the direction.
@example
-m 16K,32K
@end example
sets the netperf side send size to 16384 bytes, the netserver side
send size to 32768 bytes and the direction will be "Send|Recv."
   3229 @item -M <optionspec>
   3230 This can set the receive size for either or both of the netperf and
   3231 netserver sides of the test:
   3232 @example
   3233 -M 32K
   3234 @end example
   3235 sets only the netserver side receive size to 32768 bytes and or's-in
   3236 send for the test direction.
   3237 @example
   3238 -M ,32K
   3239 @end example
   3240 sets only the netperf side receive size to 32768 bytes and or's-in
   3241 receive for the test direction.
   3242 @example
   3243 -M 16K,32K
   3244 @end example
   3245 sets the netserver side receive size to 16384 bytes and the netperf
   3246 side receive size to 32768 bytes and the direction will be "Send|Recv."
   3247 @end table
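
As a rough model of how the extended -m option is interpreted, the following Python sketch splits an option value into the netperf side and netserver side send sizes and works out which direction bits are or'ed-in.  It is an assumption-laden illustration: the function names are hypothetical, only the ``K'' suffix used in the examples above is handled, and the direction values 2 (send) and 4 (receive) are those listed for the -d option.  The -M option orders its two values differently, so it is not modelled here.

```python
SEND = 2  # direction bit or'ed-in when a netperf side send size is given
RECV = 4  # direction bit or'ed-in when a netserver side send size is given


def parse_size(text):
    """Convert '32K' or '32768' to bytes; only a 'K' suffix is handled."""
    text = text.strip()
    if text.endswith("K"):
        return int(text[:-1]) * 1024
    return int(text)


def parse_send_sizes(optionspec):
    """Return (netperf_send_size, netserver_send_size, direction_bits).

    A side that is not named in optionspec is reported as None.
    """
    first, _, second = optionspec.partition(",")
    local = parse_size(first) if first.strip() else None
    remote = parse_size(second) if second.strip() else None
    direction = 0
    if local is not None:
        direction |= SEND
    if remote is not None:
        direction |= RECV
    return local, remote, direction


print(parse_send_sizes("16K,32K"))  # prints: (16384, 32768, 6)
```

A direction value of 6 corresponds to the ``Send|Recv'' direction described for the two-value form.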
   3248 
   3249 @node Migrated Tests, Omni Output Selection, Native Omni Tests, The Omni Tests
   3250 @section Migrated Tests
   3251 
   3252 As of version 2.5.0 several tests have been migrated to use the omni
   3253 code in @file{src/nettest_omni.c} for the core of their testing.  A
   3254 migrated test retains all its previous output code and so should still
   3255 ``look and feel'' just like a pre-2.5.0 test with one exception - the
   3256 first line of the test banners will include the word ``MIGRATED'' at
   3257 the beginning as in:
   3258 
   3259 @example
   3260 $ netperf
   3261 MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
   3262 Recv   Send    Send                          
   3263 Socket Socket  Message  Elapsed              
   3264 Size   Size    Size     Time     Throughput  
   3265 bytes  bytes   bytes    secs.    10^6bits/sec  
   3266 
   3267  87380  16384  16384    10.00    27175.27   
   3268 @end example
   3269 
   3270 The tests migrated in version 2.5.0 are:
   3271 @itemize
   3272 @item TCP_STREAM
   3273 @item TCP_MAERTS
   3274 @item TCP_RR
   3275 @item TCP_CRR
   3276 @item UDP_STREAM
   3277 @item UDP_RR
   3278 @end itemize
   3279 
   3280 It is expected that future releases will have additional tests
   3281 migrated to use the ``omni'' functionality.
   3282 
   3283 If one uses ``omni-specific'' test-specific options in conjunction
   3284 with a migrated test, instead of using the classic output code, the
new omni output code will be used. For example if one uses the
@option{-k} test-specific option with a value of
``THROUGHPUT,THROUGHPUT_UNITS'' with a migrated TCP_RR test one will see:
   3288 
   3289 @example
   3290 $ netperf -t tcp_rr -- -k THROUGHPUT,THROUGHPUT_UNITS
   3291 MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
   3292 THROUGHPUT=60074.74
   3293 THROUGHPUT_UNITS=Trans/s
   3294 @end example
   3295 rather than:
   3296 @example
   3297 $ netperf -t tcp_rr
   3298 MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
   3299 Local /Remote
   3300 Socket Size   Request  Resp.   Elapsed  Trans.
   3301 Send   Recv   Size     Size    Time     Rate         
   3302 bytes  Bytes  bytes    bytes   secs.    per sec   
   3303 
   3304 16384  87380  1        1       10.00    59421.52   
   3305 16384  87380 
   3306 @end example
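
The keyval format is intended to be easy for scripts to consume.  A minimal Python sketch of such a consumer follows; the function name parse_keyval is hypothetical, and the sketch assumes that the banner line contains no ``='' character, as is the case in the examples shown above.

```python
def parse_keyval(output):
    """Collect the key=value lines of -k output into a dictionary."""
    results = dict()
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        # Lines without '=' (such as the test banner in the examples
        # above) are not results and are skipped.
        if not sep:
            continue
        results[key.strip()] = value.strip()
    return results


sample = """MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
THROUGHPUT=60074.74
THROUGHPUT_UNITS=Trans/s"""

parsed = parse_keyval(sample)
print(parsed["THROUGHPUT"], parsed["THROUGHPUT_UNITS"])  # prints: 60074.74 Trans/s
```

The values are left as strings; a caller wanting numbers would convert the ones it knows to be numeric.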
   3307 
   3308 @node Omni Output Selection,  , Migrated Tests, The Omni Tests
   3309 @section Omni Output Selection
   3310 
   3311 The omni test-specific @option{-k}, @option{-o} and @option{-O}
   3312 options take an optional @code{output selector} by which the user can
   3313 configure what values are reported.  The output selector can take
   3314 several forms:
   3315 
   3316 @table @code
   3317 @item @file{filename}
   3318 The output selections will be read from the named file. Within the
   3319 file there can be up to four lines of comma-separated output
   3320 selectors. This controls how many multi-line blocks of output are emitted
   3321 when the @option{-O} option is used.  This output, while not identical to
   3322 ``classic'' netperf output, is inspired by it.  Multiple lines have no
   3323 effect for @option{-k} and @option{-o} options.  Putting output
   3324 selections in a file can be useful when the list of selections is long.
   3325 @item comma and/or semi-colon-separated list
   3326 The output selections will be parsed from a comma and/or
   3327 semi-colon-separated list of output selectors. When the list is given
   3328 to a @option{-O} option a semi-colon specifies a new output block
   3329 should be started.  Semi-colons have the same meaning as commas when
   3330 used with the @option{-k} or @option{-o} options.  Depending on the
   3331 command interpreter being used, the semi-colon may have to be escaped
   3332 somehow to keep it from being interpreted by the command interpreter.
   3333 This can often be done by enclosing the entire list in quotes.
   3334 @item all
   3335 If the keyword @b{all} is specified it means that all known output
   3336 values should be displayed at the end of the test.  This can be a
   3337 great deal of output.  As of version 2.5.0 there are 157 different
   3338 output selectors.
   3339 @item ?
   3340 If a ``?'' is given as the output selection, the list of all known
   3341 output selectors will be displayed and no test actually run.  When
   3342 passed to the @option{-O} option they will be listed one per
   3343 line. Otherwise they will be listed as a comma-separated list.  It may
   3344 be necessary to protect the ``?'' from the command interpreter by
   3345 escaping it or enclosing it in quotes.
   3346 @item no selector
   3347 If nothing is given to the @option{-k}, @option{-o} or @option{-O}
   3348 option then the code selects a default set of output selectors
   3349 inspired by classic netperf output. The format will be the @samp{human
   3350 readable} format emitted by the test-specific @option{-O} option.
   3351 @end table
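
The different treatment of semi-colons by the @option{-O} option versus the @option{-k} and @option{-o} options can be sketched in Python as follows.  The function name split_selectors and its human_readable flag are hypothetical and exist only for this illustration; this is not the code netperf uses.

```python
def split_selectors(selection, human_readable):
    """Split a selector list into blocks, each a list of selector names.

    With human_readable true (modelling -O) a ';' starts a new output
    block.  Otherwise (modelling -k or -o) a ';' means the same as a
    ',' and a single block results.
    """
    if human_readable:
        chunks = selection.split(";")
    else:
        chunks = [selection.replace(";", ",")]
    blocks = []
    for chunk in chunks:
        names = [name.strip() for name in chunk.split(",") if name.strip()]
        if names:
            blocks.append(names)
    return blocks


spec = "THROUGHPUT,THROUGHPUT_UNITS;ELAPSED_TIME"
print(len(split_selectors(spec, True)))   # prints: 2
print(len(split_selectors(spec, False)))  # prints: 1
```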
   3352 
   3353 The order of evaluation will first check for an output selection.  If
   3354 none is specified with the @option{-k}, @option{-o} or @option{-O}
   3355 option netperf will select a default based on the characteristics of the
   3356 test.  If there is an output selection, the code will first check for
   3357 @samp{?}, then check to see if it is the magic @samp{all} keyword.
   3358 After that it will check for either @samp{,} or @samp{;} in the
   3359 selection and take that to mean it is a comma and/or
   3360 semi-colon-separated list. If none of those checks match, netperf will then
   3361 assume the output specification is a filename and attempt to open and
   3362 parse the file.
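
The order of evaluation just described can be written out as a short Python sketch.  The function name and the returned labels are hypothetical, and exact string matches for ``?'' and ``all'' are an assumption of this sketch rather than a statement about the netperf source.

```python
def classify_selection(selection):
    """Classify an output selection following the order described above."""
    if selection is None or selection == "":
        return "default"   # no selection: defaults based on the test
    if selection == "?":
        return "help"      # list the known output selectors, run no test
    if selection == "all":
        return "all"       # display every known output value
    if "," in selection or ";" in selection:
        return "list"      # comma and/or semi-colon-separated list
    return "file"          # anything else is taken to be a filename


print(classify_selection("THROUGHPUT,THROUGHPUT_UNITS"))  # prints: list
print(classify_selection("my_selectors.txt"))             # prints: file
```

Under this reading, a single selector name given with no comma or semi-colon reaches the last step and is treated as a filename.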
   3363 
   3364 @menu
   3365 * Omni Output Selectors::       
   3366 @end menu
   3367 
   3368 @node Omni Output Selectors,  , Omni Output Selection, Omni Output Selection
   3369 @subsection Omni Output Selectors
   3370 
   3371 As of version 2.5.0 the output selectors are:
   3372 
   3373 @table @code
   3374 @item OUTPUT_NONE
   3375 This is essentially a null output.  For @option{-k} output it will
   3376 simply add a line that reads ``OUTPUT_NONE='' to the output. For
   3377 @option{-o} it will cause an empty ``column'' to be included. For
   3378 @option{-O} output it will cause extra spaces to separate ``real'' output.
   3379 @item SOCKET_TYPE
This will cause the socket type (e.g., SOCK_STREAM, SOCK_DGRAM) for the
   3381 data connection to be output.
   3382 @item PROTOCOL
   3383 This will cause the protocol used for the data connection to be displayed.
   3384 @item DIRECTION
   3385 This will display the data flow direction relative to the netperf
   3386 process. Units: Send or Recv for a unidirectional bulk-transfer test,
   3387 or Send|Recv for a request/response test.
   3388 @item ELAPSED_TIME
   3389 This will display the elapsed time in seconds for the test.
   3390 @item THROUGHPUT
   3391 This will display the throughput for the test. Units: As requested via
   3392 the global @option{-f} option and displayed by the THROUGHPUT_UNITS
   3393 output selector.
   3394 @item THROUGHPUT_UNITS
   3395 This will display the units for what is displayed by the
   3396 @code{THROUGHPUT} output selector.
   3397 @item LSS_SIZE_REQ
   3398 This will display the local (netperf) send socket buffer size (aka
   3399 SO_SNDBUF) requested via the command line. Units: Bytes.
   3400 @item LSS_SIZE
   3401 This will display the local (netperf) send socket buffer size
   3402 (SO_SNDBUF) immediately after the data connection socket was created.
   3403 Peculiarities of different networking stacks may lead to this
   3404 differing from the size requested via the command line. Units: Bytes.
   3405 @item LSS_SIZE_END
   3406 This will display the local (netperf) send socket buffer size
   3407 (SO_SNDBUF) immediately before the data connection socket is closed.
   3408 Peculiarities of different networking stacks may lead this to differ
   3409 from the size requested via the command line and/or the size
   3410 immediately after the data connection socket was created. Units: Bytes.
   3411 @item LSR_SIZE_REQ
   3412 This will display the local (netperf) receive socket buffer size (aka
   3413 SO_RCVBUF) requested via the command line. Units: Bytes.
   3414 @item LSR_SIZE
   3415 This will display the local (netperf) receive socket buffer size
   3416 (SO_RCVBUF) immediately after the data connection socket was created.
   3417 Peculiarities of different networking stacks may lead to this
   3418 differing from the size requested via the command line. Units: Bytes.
   3419 @item LSR_SIZE_END
   3420 This will display the local (netperf) receive socket buffer size
   3421 (SO_RCVBUF) immediately before the data connection socket is closed.
   3422 Peculiarities of different networking stacks may lead this to differ
   3423 from the size requested via the command line and/or the size
   3424 immediately after the data connection socket was created. Units: Bytes.
   3425 @item RSS_SIZE_REQ
   3426 This will display the remote (netserver) send socket buffer size (aka
   3427 SO_SNDBUF) requested via the command line. Units: Bytes.
   3428 @item RSS_SIZE
   3429 This will display the remote (netserver) send socket buffer size
   3430 (SO_SNDBUF) immediately after the data connection socket was created.
   3431 Peculiarities of different networking stacks may lead to this
   3432 differing from the size requested via the command line. Units: Bytes.
   3433 @item RSS_SIZE_END
   3434 This will display the remote (netserver) send socket buffer size
   3435 (SO_SNDBUF) immediately before the data connection socket is closed.
   3436 Peculiarities of different networking stacks may lead this to differ
   3437 from the size requested via the command line and/or the size
   3438 immediately after the data connection socket was created. Units: Bytes.
   3439 @item RSR_SIZE_REQ
   3440 This will display the remote (netserver) receive socket buffer size (aka
   3441 SO_RCVBUF) requested via the command line. Units: Bytes.
   3442 @item RSR_SIZE
   3443 This will display the remote (netserver) receive socket buffer size
   3444 (SO_RCVBUF) immediately after the data connection socket was created.
   3445 Peculiarities of different networking stacks may lead to this
   3446 differing from the size requested via the command line. Units: Bytes.
   3447 @item RSR_SIZE_END
   3448 This will display the remote (netserver) receive socket buffer size
   3449 (SO_RCVBUF) immediately before the data connection socket is closed.
   3450 Peculiarities of different networking stacks may lead this to differ
   3451 from the size requested via the command line and/or the size
   3452 immediately after the data connection socket was created. Units: Bytes.
   3453 @item LOCAL_SEND_SIZE
   3454 This will display the size of the buffers netperf passed in any
   3455 ``send'' calls it made on the data connection for a
   3456 non-request/response test. Units: Bytes.
   3457 @item LOCAL_RECV_SIZE
   3458 This will display the size of the buffers netperf passed in any
   3459 ``receive'' calls it made on the data connection for a
   3460 non-request/response test. Units: Bytes.
   3461 @item REMOTE_SEND_SIZE
   3462 This will display the size of the buffers netserver passed in any
   3463 ``send'' calls it made on the data connection for a
   3464 non-request/response test. Units: Bytes.
   3465 @item REMOTE_RECV_SIZE
   3466 This will display the size of the buffers netserver passed in any
   3467 ``receive'' calls it made on the data connection for a
   3468 non-request/response test. Units: Bytes.
   3469 @item REQUEST_SIZE
   3470 This will display the size of the requests netperf sent in a
   3471 request-response test. Units: Bytes.
   3472 @item RESPONSE_SIZE
   3473 This will display the size of the responses netserver sent in a
   3474 request-response test. Units: Bytes.
   3475 @item LOCAL_CPU_UTIL
   3476 This will display the overall CPU utilization during the test as
   3477 measured by netperf. Units: 0 to 100 percent.
   3478 @item LOCAL_CPU_PERCENT_USER
   3479 This will display the CPU fraction spent in user mode during the test
   3480 as measured by netperf. Only supported by netcpu_procstat. Units: 0 to
   3481 100 percent.
   3482 @item LOCAL_CPU_PERCENT_SYSTEM
   3483 This will display the CPU fraction spent in system mode during the test
   3484 as measured by netperf. Only supported by netcpu_procstat. Units: 0 to
   3485 100 percent.
   3486 @item LOCAL_CPU_PERCENT_IOWAIT
   3487 This will display the fraction of time waiting for I/O to complete
   3488 during the test as measured by netperf. Only supported by
   3489 netcpu_procstat. Units: 0 to 100 percent.
   3490 @item LOCAL_CPU_PERCENT_IRQ
   3491 This will display the fraction of time servicing interrupts during the
   3492 test as measured by netperf. Only supported by netcpu_procstat. Units:
   3493 0 to 100 percent.
   3494 @item LOCAL_CPU_PERCENT_SWINTR
   3495 This will display the fraction of time servicing softirqs during the
   3496 test as measured by netperf. Only supported by netcpu_procstat. Units:
   3497 0 to 100 percent.
   3498 @item LOCAL_CPU_METHOD
   3499 This will display the method used by netperf to measure CPU
   3500 utilization. Units: single character denoting method.
   3501 @item LOCAL_SD
   3502 This will display the service demand, or units of CPU consumed per
   3503 unit of work, as measured by netperf. Units: microseconds of CPU
   3504 consumed per either KB (K==1024) of data transferred or request/response
   3505 transaction. 
   3506 @item REMOTE_CPU_UTIL
   3507 This will display the overall CPU utilization during the test as
measured by netserver. Units: 0 to 100 percent.
   3509 @item REMOTE_CPU_PERCENT_USER
   3510 This will display the CPU fraction spent in user mode during the test
   3511 as measured by netserver. Only supported by netcpu_procstat. Units: 0 to
   3512 100 percent.
   3513 @item REMOTE_CPU_PERCENT_SYSTEM
   3514 This will display the CPU fraction spent in system mode during the test
   3515 as measured by netserver. Only supported by netcpu_procstat. Units: 0 to
   3516 100 percent.
   3517 @item REMOTE_CPU_PERCENT_IOWAIT
   3518 This will display the fraction of time waiting for I/O to complete
   3519 during the test as measured by netserver. Only supported by
   3520 netcpu_procstat. Units: 0 to 100 percent.
   3521 @item REMOTE_CPU_PERCENT_IRQ
   3522 This will display the fraction of time servicing interrupts during the
   3523 test as measured by netserver. Only supported by netcpu_procstat. Units:
   3524 0 to 100 percent.
   3525 @item REMOTE_CPU_PERCENT_SWINTR
   3526 This will display the fraction of time servicing softirqs during the
   3527 test as measured by netserver. Only supported by netcpu_procstat. Units:
   3528 0 to 100 percent.
   3529 @item REMOTE_CPU_METHOD
   3530 This will display the method used by netserver to measure CPU
   3531 utilization. Units: single character denoting method.
   3532 @item REMOTE_SD
   3533 This will display the service demand, or units of CPU consumed per
   3534 unit of work, as measured by netserver. Units: microseconds of CPU
   3535 consumed per either KB (K==1024) of data transferred or
   3536 request/response transaction.
   3537 @item SD_UNITS
This will display the units for @code{LOCAL_SD} and @code{REMOTE_SD}.
   3539 @item CONFIDENCE_LEVEL
   3540 This will display the confidence level requested by the user either
   3541 explicitly via the global @option{-I} option, or implicitly via the
   3542 global @option{-i} option.  The value will be either 95 or 99 if
   3543 confidence intervals have been requested or 0 if they were not. Units:
   3544 Percent
   3545 @item CONFIDENCE_INTERVAL
   3546 This will display the width of the confidence interval requested
   3547 either explicitly via the global @option{-I} option or implicitly via
   3548 the global @option{-i} option.  Units: Width in percent of mean value
   3549 computed. A value of -1.0 means that confidence intervals were not requested.
   3550 @item CONFIDENCE_ITERATION
   3551 This will display the number of test iterations netperf undertook,
   3552 perhaps while attempting to achieve the requested confidence interval
   3553 and level. If confidence intervals were requested via the command line
   3554 then the value will be between 3 and 30.  If confidence intervals were
   3555 not requested the value will be 1.  Units: Iterations
   3556 @item THROUGHPUT_CONFID
   3557 This will display the width of the confidence interval actually
   3558 achieved for @code{THROUGHPUT} during the test.  Units: Width of
   3559 interval as percentage of reported throughput value.
   3560 @item LOCAL_CPU_CONFID
   3561 This will display the width of the confidence interval actually
   3562 achieved for overall CPU utilization on the system running netperf
   3563 (@code{LOCAL_CPU_UTIL}) during the test, if CPU utilization measurement
   3564 was enabled.  Units: Width of interval as percentage of reported CPU
   3565 utilization.
   3566 @item REMOTE_CPU_CONFID
   3567 This will display the width of the confidence interval actually
   3568 achieved for overall CPU utilization on the system running netserver
   3569 (@code{REMOTE_CPU_UTIL}) during the test, if CPU utilization
   3570 measurement was enabled. Units: Width of interval as percentage of
   3571 reported CPU utilization.
   3572 @item TRANSACTION_RATE
   3573 This will display the transaction rate in transactions per second for
   3574 a request/response test even if the user has requested a throughput in
   3575 units of bits or bytes per second via the global @option{-f}
   3576 option. It is undefined for a non-request/response test. Units:
   3577 Transactions per second.
   3578 @item RT_LATENCY
   3579 This will display the average round-trip latency for a
   3580 request/response test, accounting for number of transactions in flight
   3581 at one time. It is undefined for a non-request/response test. Units:
   3582 Microseconds per transaction
   3583 @item BURST_SIZE
   3584 This will display the ``burst size'' or added transactions in flight
   3585 in a request/response test as requested via a test-specific
   3586 @option{-b} option.  The number of transactions in flight at one time
   3587 will be one greater than this value.  It is undefined for a
   3588 non-request/response test. Units: added Transactions in flight.
   3589 @item LOCAL_TRANSPORT_RETRANS
   3590 This will display the number of retransmissions experienced on the
   3591 data connection during the test as determined by netperf.  A value of
   3592 -1 means the attempt to determine the number of retransmissions failed
   3593 or the concept was not valid for the given protocol or the mechanism
   3594 is not known for the platform. A value of -2 means it was not
attempted. As of version 2.5.0 the meanings of these values are in flux
and subject to change.  Units: number of retransmissions.
   3597 @item REMOTE_TRANSPORT_RETRANS
   3598 This will display the number of retransmissions experienced on the
   3599 data connection during the test as determined by netserver.  A value
   3600 of -1 means the attempt to determine the number of retransmissions
   3601 failed or the concept was not valid for the given protocol or the
   3602 mechanism is not known for the platform. A value of -2 means it was
not attempted. As of version 2.5.0 the meanings of these values are in
flux and subject to change.  Units: number of retransmissions.
   3605 @item TRANSPORT_MSS
   3606 This will display the Maximum Segment Size (aka MSS) or its equivalent
   3607 for the protocol being used during the test.  A value of -1 means
   3608 either the concept of an MSS did not apply to the protocol being used,
   3609 or there was an error in retrieving it. Units: Bytes.
   3610 @item LOCAL_SEND_THROUGHPUT
   3611 The throughput as measured by netperf for the successful ``send''
   3612 calls it made on the data connection. Units: as requested via the
   3613 global @option{-f} option and displayed via the @code{THROUGHPUT_UNITS}
   3614 output selector.
   3615 @item LOCAL_RECV_THROUGHPUT
   3616 The throughput as measured by netperf for the successful ``receive''
   3617 calls it made on the data connection. Units: as requested via the
   3618 global @option{-f} option and displayed via the @code{THROUGHPUT_UNITS}
   3619 output selector.
   3620 @item REMOTE_SEND_THROUGHPUT
   3621 The throughput as measured by netserver for the successful ``send''
   3622 calls it made on the data connection. Units: as requested via the
   3623 global @option{-f} option and displayed via the @code{THROUGHPUT_UNITS}
   3624 output selector.
   3625 @item REMOTE_RECV_THROUGHPUT
   3626 The throughput as measured by netserver for the successful ``receive''
   3627 calls it made on the data connection. Units: as requested via the
   3628 global @option{-f} option and displayed via the @code{THROUGHPUT_UNITS}
   3629 output selector.
   3630 @item LOCAL_CPU_BIND
   3631 The CPU to which netperf was bound, if at all, during the test. A
   3632 value of -1 means that netperf was not explicitly bound to a CPU
   3633 during the test. Units: CPU ID
   3634 @item LOCAL_CPU_COUNT
   3635 The number of CPUs (cores, threads) detected by netperf. Units: CPU count.
   3636 @item LOCAL_CPU_PEAK_UTIL
   3637 The utilization of the CPU most heavily utilized during the test, as
   3638 measured by netperf. This can be used to see if any one CPU of a
   3639 multi-CPU system was saturated even though the overall CPU utilization
as reported by @code{LOCAL_CPU_UTIL} was low. Units: 0 to 100 percent.
   3641 @item LOCAL_CPU_PEAK_ID
   3642 The id of the CPU most heavily utilized during the test as determined
   3643 by netperf. Units: CPU ID.
   3644 @item LOCAL_CPU_MODEL
   3645 Model information for the processor(s) present on the system running
   3646 netperf. Assumes all processors in the system (as perceived by
   3647 netperf) on which netperf is running are the same model. Units: Text
   3648 @item LOCAL_CPU_FREQUENCY
   3649 The frequency of the processor(s) on the system running netperf, at
   3650 the time netperf made the call.  Assumes that all processors present
   3651 in the system running netperf are running at the same
   3652 frequency. Units: MHz
   3653 @item REMOTE_CPU_BIND
   3654 The CPU to which netserver was bound, if at all, during the test. A
value of -1 means that netserver was not explicitly bound to a CPU
   3656 during the test. Units: CPU ID
   3657 @item REMOTE_CPU_COUNT
   3658 The number of CPUs (cores, threads) detected by netserver. Units: CPU
   3659 count.
   3660 @item REMOTE_CPU_PEAK_UTIL
   3661 The utilization of the CPU most heavily utilized during the test, as
   3662 measured by netserver. This can be used to see if any one CPU of a
   3663 multi-CPU system was saturated even though the overall CPU utilization
as reported by @code{REMOTE_CPU_UTIL} was low. Units: 0 to 100 percent.
   3665 @item REMOTE_CPU_PEAK_ID
   3666 The id of the CPU most heavily utilized during the test as determined
   3667 by netserver. Units: CPU ID.
   3668 @item REMOTE_CPU_MODEL
   3669 Model information for the processor(s) present on the system running
   3670 netserver. Assumes all processors in the system (as perceived by
   3671 netserver) on which netserver is running are the same model. Units:
   3672 Text
   3673 @item REMOTE_CPU_FREQUENCY
   3674 The frequency of the processor(s) on the system running netserver, at
   3675 the time netserver made the call.  Assumes that all processors present
   3676 in the system running netserver are running at the same
   3677 frequency. Units: MHz
   3678 @item SOURCE_PORT
   3679 The port ID/service name to which the data socket created by netperf
   3680 was bound.  A value of 0 means the data socket was not explicitly
   3681 bound to a port number. Units: ASCII text.
   3682 @item SOURCE_ADDR
   3683 The name/address to which the data socket created by netperf was
   3684 bound. A value of 0.0.0.0 means the data socket was not explicitly
   3685 bound to an address. Units: ASCII text.
   3686 @item SOURCE_FAMILY
   3687 The address family to which the data socket created by netperf was
   3688 bound.  A value of 0 means the data socket was not explicitly bound to
   3689 a given address family. Units: ASCII text.
   3690 @item DEST_PORT
   3691 The port ID to which the data socket created by netserver was bound. A
   3692 value of 0 means the data socket was not explicitly bound to a port
   3693 number.  Units: ASCII text.
   3694 @item DEST_ADDR
   3695 The name/address of the data socket created by netserver.  Units:
   3696 ASCII text.
   3697 @item DEST_FAMILY
   3698 The address family to which the data socket created by netserver was
   3699 bound. A value of 0 means the data socket was not explicitly bound to
   3700 a given address family. Units: ASCII text.
   3701 @item LOCAL_SEND_CALLS
   3702 The number of successful ``send'' calls made by netperf against its
   3703 data socket. Units: Calls.
   3704 @item LOCAL_RECV_CALLS
   3705 The number of successful ``receive'' calls made by netperf against its
   3706 data socket. Units: Calls.
   3707 @item LOCAL_BYTES_PER_RECV
   3708 The average number of bytes per ``receive'' call made by netperf
   3709 against its data socket. Units: Bytes.
   3710 @item LOCAL_BYTES_PER_SEND
   3711 The average number of bytes per ``send'' call made by netperf against
   3712 its data socket. Units: Bytes.
   3713 @item LOCAL_BYTES_SENT
   3714 The number of bytes successfully sent by netperf through its data
   3715 socket. Units: Bytes.
   3716 @item LOCAL_BYTES_RECVD
   3717 The number of bytes successfully received by netperf through its data
   3718 socket. Units: Bytes.
   3719 @item LOCAL_BYTES_XFERD
   3720 The sum of bytes sent and received by netperf through its data
   3721 socket. Units: Bytes.
   3722 @item LOCAL_SEND_OFFSET
   3723 The offset from the alignment of the buffers passed by netperf in its
   3724 ``send'' calls. Specified via the global @option{-o} option and
   3725 defaults to 0. Units: Bytes.
   3726 @item LOCAL_RECV_OFFSET
   3727 The offset from the alignment of the buffers passed by netperf in its
   3728 ``receive'' calls. Specified via the global @option{-o} option and
   3729 defaults to 0. Units: Bytes.
   3730 @item LOCAL_SEND_ALIGN
   3731 The alignment of the buffers passed by netperf in its ``send'' calls
   3732 as specified via the global @option{-a} option. Defaults to 8. Units:
   3733 Bytes.
   3734 @item LOCAL_RECV_ALIGN
   3735 The alignment of the buffers passed by netperf in its ``receive''
   3736 calls as specified via the global @option{-a} option. Defaults to
   3737 8. Units: Bytes.
   3738 @item LOCAL_SEND_WIDTH
   3739 The ``width'' of the ring of buffers through which netperf cycles as
   3740 it makes its ``send'' calls.  Defaults to one more than the local send
   3741 socket buffer size divided by the send size as determined at the time
   3742 the data socket is created. Can be used to make netperf more processor
   3743 data cache unfriendly. Units: number of buffers.
   3744 @item LOCAL_RECV_WIDTH
   3745 The ``width'' of the ring of buffers through which netperf cycles as
   3746 it makes its ``receive'' calls.  Defaults to one more than the local
   3747 receive socket buffer size divided by the receive size as determined
   3748 at the time the data socket is created. Can be used to make netperf
   3749 more processor data cache unfriendly. Units: number of buffers.
   3750 @item LOCAL_SEND_DIRTY_COUNT
   3751 The number of bytes to ``dirty'' (write to) before netperf makes a
   3752 ``send'' call. Specified via the global @option{-k} option, which
   3753 requires that --enable-dirty=yes was specified with the configure
   3754 command prior to building netperf. Units: Bytes.
   3755 @item LOCAL_RECV_DIRTY_COUNT
   3756 The number of bytes to ``dirty'' (write to) before netperf makes a
   3757 ``recv'' call. Specified via the global @option{-k} option which
   3758 requires that --enable-dirty was specified with the configure command
   3759 prior to building netperf. Units: Bytes.
   3760 @item LOCAL_RECV_CLEAN_COUNT
   3761 The number of bytes netperf should read ``cleanly'' before making a
   3762 ``receive'' call. Specified via the global @option{-k} option which
requires that --enable-dirty was specified with the configure command
prior to building netperf.  Clean reads start where dirty writes ended.
   3765 Units: Bytes.
   3766 @item LOCAL_NODELAY
   3767 Indicates whether or not setting the test protocol-specific ``no
   3768 delay'' (eg TCP_NODELAY) option on the data socket used by netperf was
   3769 requested by the test-specific @option{-D} option and
   3770 successful. Units: 0 means no, 1 means yes.
   3771 @item LOCAL_CORK
   3772 Indicates whether or not TCP_CORK was set on the data socket used by
   3773 netperf as requested via the test-specific @option{-C} option. 1 means
   3774 yes, 0 means no/not applicable.
   3775 @item REMOTE_SEND_CALLS
   3776 @item REMOTE_RECV_CALLS
   3777 @item REMOTE_BYTES_PER_RECV
   3778 @item REMOTE_BYTES_PER_SEND
   3779 @item REMOTE_BYTES_SENT
   3780 @item REMOTE_BYTES_RECVD
   3781 @item REMOTE_BYTES_XFERD
   3782 @item REMOTE_SEND_OFFSET
   3783 @item REMOTE_RECV_OFFSET
   3784 @item REMOTE_SEND_ALIGN
   3785 @item REMOTE_RECV_ALIGN
   3786 @item REMOTE_SEND_WIDTH
   3787 @item REMOTE_RECV_WIDTH
   3788 @item REMOTE_SEND_DIRTY_COUNT
   3789 @item REMOTE_RECV_DIRTY_COUNT
   3790 @item REMOTE_RECV_CLEAN_COUNT
   3791 @item REMOTE_NODELAY
   3792 @item REMOTE_CORK
   3793 These are all like their ``LOCAL_'' counterparts only for the
   3794 netserver rather than netperf.
   3795 @item LOCAL_SYSNAME
   3796 The name of the OS (eg ``Linux'') running on the system on which
   3797 netperf was running. Units: ASCII Text
   3798 @item LOCAL_SYSTEM_MODEL
   3799 The model name of the system on which netperf was running. Units:
   3800 ASCII Text.
   3801 @item LOCAL_RELEASE
   3802 The release name/number of the OS running on the system on which
   3803 netperf  was running. Units: ASCII Text
   3804 @item LOCAL_VERSION
   3805 The version number of the OS running on the system on which netperf
   3806 was running. Units: ASCII Text
   3807 @item LOCAL_MACHINE
   3808 The machine architecture of the machine on which netperf was
   3809 running. Units: ASCII Text.
   3810 @item REMOTE_SYSNAME
   3811 @item REMOTE_SYSTEM_MODEL
   3812 @item REMOTE_RELEASE
   3813 @item REMOTE_VERSION
   3814 @item REMOTE_MACHINE
   3815 These are all like their ``LOCAL_'' counterparts only for the
   3816 netserver rather than netperf.
   3817 @item LOCAL_INTERFACE_NAME
   3818 The name of the probable egress interface through which the data
   3819 connection went on the system running netperf. Example: eth0. Units:
   3820 ASCII Text.
   3821 @item LOCAL_INTERFACE_VENDOR
   3822 The vendor ID of the probable egress interface through which traffic
   3823 on the data connection went on the system running netperf. Units:
   3824 Hexadecimal IDs as might be found in a @file{pci.ids} file or at
   3825 @uref{http://pciids.sourceforge.net/,the PCI ID Repository}.
   3826 @item LOCAL_INTERFACE_DEVICE
   3827 The device ID of the probable egress interface through which traffic
   3828 on the data connection went on the system running netperf. Units:
   3829 Hexadecimal IDs as might be found in a @file{pci.ids} file or at
   3830 @uref{http://pciids.sourceforge.net/,the PCI ID Repository}.
   3831 @item LOCAL_INTERFACE_SUBVENDOR
   3832 The sub-vendor ID of the probable egress interface through which
   3833 traffic on the data connection went on the system running
   3834 netperf. Units: Hexadecimal IDs as might be found in a @file{pci.ids}
   3835 file or at @uref{http://pciids.sourceforge.net/,the PCI ID
   3836 Repository}.
   3837 @item LOCAL_INTERFACE_SUBDEVICE
   3838 The sub-device ID of the probable egress interface through which
   3839 traffic on the data connection went on the system running
   3840 netperf. Units: Hexadecimal IDs as might be found in a @file{pci.ids}
   3841 file or at @uref{http://pciids.sourceforge.net/,the PCI ID
   3842 Repository}.
   3843 @item LOCAL_DRIVER_NAME
   3844 The name of the driver used for the probable egress interface through
   3845 which traffic on the data connection went on the system running
   3846 netperf. Units: ASCII Text.
   3847 @item LOCAL_DRIVER_VERSION
   3848 The version string for the driver used for the probable egress
   3849 interface through which traffic on the data connection went on the
   3850 system running netperf. Units: ASCII Text.
   3851 @item LOCAL_DRIVER_FIRMWARE
   3852 The firmware version for the driver used for the probable egress
   3853 interface through which traffic on the data connection went on the
   3854 system running netperf. Units: ASCII Text.
   3855 @item LOCAL_DRIVER_BUS
   3856 The bus address of the probable egress interface through which traffic
   3857 on the data connection went on the system running netperf. Units:
   3858 ASCII Text.
   3859 @item LOCAL_INTERFACE_SLOT
   3860 The slot ID of the probable egress interface through which traffic
   3861 on the data connection went on the system running netperf. Units:
   3862 ASCII Text.
   3863 @item REMOTE_INTERFACE_NAME
   3864 @item REMOTE_INTERFACE_VENDOR
   3865 @item REMOTE_INTERFACE_DEVICE
   3866 @item REMOTE_INTERFACE_SUBVENDOR
   3867 @item REMOTE_INTERFACE_SUBDEVICE
   3868 @item REMOTE_DRIVER_NAME
   3869 @item REMOTE_DRIVER_VERSION
   3870 @item REMOTE_DRIVER_FIRMWARE
   3871 @item REMOTE_DRIVER_BUS
   3872 @item REMOTE_INTERFACE_SLOT
   3873 These are all like their ``LOCAL_'' counterparts only for the
   3874 netserver rather than netperf.
   3875 @item LOCAL_INTERVAL_USECS
   3876 The interval at which bursts of operations (sends, receives,
   3877 transactions) were attempted by netperf.  Specified by the
   3878 global @option{-w} option which requires --enable-intervals to have
   3879 been specified with the configure command prior to building
   3880 netperf. Units: Microseconds (though specified by default in
   3881 milliseconds on the command line)
   3882 @item LOCAL_INTERVAL_BURST
   3883 The number of operations (sends, receives, transactions depending on
   3884 the test) which were attempted by netperf each LOCAL_INTERVAL_USECS
   3885 units of time. Specified by the global @option{-b} option which
   3886 requires --enable-intervals to have been specified with the configure
   3887 command prior to building netperf.  Units: number of operations per burst.
   3888 @item REMOTE_INTERVAL_USECS
   3889 The interval at which bursts of operations (sends, receives,
   3890 transactions) were attempted by netserver.  Specified by the
   3891 global @option{-w} option which requires --enable-intervals to have
   3892 been specified with the configure command prior to building
   3893 netperf. Units: Microseconds (though specified by default in
   3894 milliseconds on the command line)
   3895 @item REMOTE_INTERVAL_BURST
   3896 The number of operations (sends, receives, transactions depending on
the test) which were attempted by netserver each REMOTE_INTERVAL_USECS
   3898 units of time. Specified by the global @option{-b} option which
   3899 requires --enable-intervals to have been specified with the configure
   3900 command prior to building netperf.  Units: number of operations per burst.
   3901 @item LOCAL_SECURITY_TYPE_ID
   3902 @item LOCAL_SECURITY_TYPE
   3903 @item LOCAL_SECURITY_ENABLED_NUM
   3904 @item LOCAL_SECURITY_ENABLED
   3905 @item LOCAL_SECURITY_SPECIFIC
   3906 @item REMOTE_SECURITY_TYPE_ID
   3907 @item REMOTE_SECURITY_TYPE
   3908 @item REMOTE_SECURITY_ENABLED_NUM
   3909 @item REMOTE_SECURITY_ENABLED
   3910 @item REMOTE_SECURITY_SPECIFIC
   3911 A bunch of stuff related to what sort of security mechanisms (eg
   3912 SELINUX) were enabled on the systems during the test.
   3913 @item RESULT_BRAND
   3914 The string specified by the user with the global @option{-B}
   3915 option. Units: ASCII Text.
   3916 @item UUID
   3917 The universally unique identifier associated with this test, either
   3918 generated automagically by netperf, or passed to netperf via an omni
   3919 test-specific @option{-u} option. Note: Future versions may make this
   3920 a global command-line option. Units: ASCII Text.
   3921 @item MIN_LATENCY
   3922 The minimum ``latency'' or operation time (send, receive or
   3923 request/response exchange depending on the test) as measured on the
   3924 netperf side when the global @option{-j} option was specified. Units:
   3925 Microseconds.
   3926 @item MAX_LATENCY
   3927 The maximum ``latency'' or operation time (send, receive or
   3928 request/response exchange depending on the test) as measured on the
   3929 netperf side when the global @option{-j} option was specified. Units:
   3930 Microseconds.
   3931 @item P50_LATENCY
   3932 The 50th percentile value of ``latency'' or operation time (send, receive or
   3933 request/response exchange depending on the test) as measured on the
   3934 netperf side when the global @option{-j} option was specified. Units:
   3935 Microseconds.
   3936 @item P90_LATENCY
   3937 The 90th percentile value of ``latency'' or operation time (send, receive or
   3938 request/response exchange depending on the test) as measured on the
   3939 netperf side when the global @option{-j} option was specified. Units:
   3940 Microseconds.
   3941 @item P99_LATENCY
   3942 The 99th percentile value of ``latency'' or operation time (send, receive or
   3943 request/response exchange depending on the test) as measured on the
   3944 netperf side when the global @option{-j} option was specified. Units:
   3945 Microseconds.
   3946 @item MEAN_LATENCY
   3947 The average ``latency'' or operation time (send, receive or
   3948 request/response exchange depending on the test) as measured on the
   3949 netperf side when the global @option{-j} option was specified. Units:
   3950 Microseconds.
   3951 @item STDDEV_LATENCY
   3952 The standard deviation of ``latency'' or operation time (send, receive or
   3953 request/response exchange depending on the test) as measured on the
   3954 netperf side when the global @option{-j} option was specified. Units:
   3955 Microseconds.
   3956 @item COMMAND_LINE
   3957 The full command line used when invoking netperf. Units: ASCII Text.
   3958 @item OUTPUT_END
   3959 While emitted with the list of output selectors, it is ignored when
   3960 specified as an output selector.
   3961 @end table
   3962 
   3963 @node Other Netperf Tests, Address Resolution, The Omni Tests, Top
   3964 @chapter Other Netperf Tests
   3965 
   3966 Apart from the typical performance tests, netperf contains some tests
which can be used to streamline measurements and reporting.  These
include CPU rate calibration and UUID generation (both present) and
host identification (future enhancement).
   3970 
   3971 @menu
   3972 * CPU rate calibration::        
   3973 * UUID Generation::             
   3974 @end menu
   3975 
   3976 @node CPU rate calibration, UUID Generation, Other Netperf Tests, Other Netperf Tests
   3977 @section CPU rate calibration
   3978 
   3979 Some of the CPU utilization measurement mechanisms of netperf work by
   3980 comparing the rate at which some counter increments when the system is
   3981 idle with the rate at which that same counter increments when the
   3982 system is running a netperf test.  The ratio of those rates is used to
   3983 arrive at a CPU utilization percentage.
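
As a rough sketch of that ratio, assume the counter only advances while
the CPU is idle, so utilization is the share of the idle rate that went
missing during the test.  The function name and the sample rates are
illustrative and are not taken from the netperf sources:

```shell
# Hypothetical helper: CPU utilization (percent) from an idle-time counter.
#   $1  idle_rate:     counter increments per second on an idle system
#   $2  measured_rate: increments per second observed while the test ran
cpu_util_percent() {
    awk -v idle="$1" -v measured="$2" \
        'BEGIN { printf "%.1f\n", (1 - measured / idle) * 100 }'
}

# A counter that ran at 1000000/s when idle and 750000/s during the test
cpu_util_percent 1000000 750000
```

With these sample rates a quarter of the idle increments are missing, so
the helper reports 25.0 percent.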
   3984 
   3985 This means that netperf must know the rate at which the counter
   3986 increments when the system is presumed to be ``idle.''  If it does not
   3987 know the rate, netperf will measure it before starting a data transfer
   3988 test.  This calibration step takes 40 seconds for each of the local or
   3989 remote systems, and if repeated for each netperf test would make taking
   3990 repeated measurements rather slow.
   3991 
Thus, the netperf CPU utilization options @option{-c} and
   3993 @option{-C} can take an optional calibration value.  This value is
   3994 used as the ``idle rate'' and the calibration step is not
   3995 performed. To determine the idle rate, netperf can be used to run
   3996 special tests which only report the value of the calibration - they
   3997 are the LOC_CPU and REM_CPU tests.  These return the calibration value
   3998 for the local and remote system respectively.  A common way to use
   3999 these tests is to store their results into an environment variable and
   4000 use that in subsequent netperf commands:
   4001 
   4002 @example
   4003 LOC_RATE=`netperf -t LOC_CPU`
   4004 REM_RATE=`netperf -H <remote> -t REM_CPU`
   4005 netperf -H <remote> -c $LOC_RATE -C $REM_RATE ... -- ...
   4006 ...
   4007 netperf -H <remote> -c $LOC_RATE -C $REM_RATE ... -- ...
   4008 @end example
   4009 
   4010 If you are going to use netperf to measure aggregate results, it is
   4011 important to use the LOC_CPU and REM_CPU tests to get the calibration
   4012 values first to avoid issues with some of the aggregate netperf tests
   4013 transferring data while others are ``idle'' and getting bogus
   4014 calibration values.  When running aggregate tests, it is very
   4015 important to remember that any one instance of netperf does not know
   4016 about the other instances of netperf.  It will report global CPU
   4017 utilization and will calculate service demand believing it was the
   4018 only thing causing that CPU utilization.  So, you can use the CPU
   4019 utilization reported by netperf in an aggregate test, but you have to
   4020 calculate service demands by hand.
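
One way to do that arithmetic by hand is sketched below.  It assumes
service demand is wanted as microseconds of CPU consumed per KB
transferred, and that you have already summed the throughput of every
netperf instance and converted it to KB per second.  The function name
and the sample figures are hypothetical:

```shell
# Hypothetical helper: aggregate service demand in usec of CPU per KB.
#   $1  util_percent: system-wide CPU utilization during the aggregate run
#   $2  num_cpus:     number of CPUs that utilization figure covers
#   $3  kb_per_sec:   summed throughput of all netperf instances, in KB/s
service_demand_usec_per_kb() {
    awk -v util="$1" -v ncpu="$2" -v kbps="$3" \
        'BEGIN { printf "%.2f\n", (util / 100) * ncpu * 1000000 / kbps }'
}

# 50% utilization of a 4-CPU system while moving 100000 KB/s in aggregate
service_demand_usec_per_kb 50 4 100000
```

Here two CPUs' worth of time (2000000 usec per second) is spread over
100000 KB per second, giving 20.00 usec/KB.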
   4021 
   4022 @node UUID Generation,  , CPU rate calibration, Other Netperf Tests
   4023 @section UUID Generation
   4024 
   4025 Beginning with version 2.5.0 netperf can generate Universally Unique
   4026 IDentifiers (UUIDs).  This can be done explicitly via the ``UUID''
   4027 test:
   4028 @example
   4029 $ netperf -t UUID
   4030 2c8561ae-9ebd-11e0-a297-0f5bfa0349d0
   4031 @end example
   4032 
   4033 In and of itself, this is not terribly useful, but used in conjunction
   4034 with the test-specific @option{-u} option of an ``omni'' test to set
   4035 the UUID emitted by the @ref{Omni Output Selectors,UUID} output
selector, it can be used to tie together the separate instances of an
aggregate netperf test, for instance when their results are inserted
into a database of some sort.
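
A minimal sketch of that use follows.  The NETPERF variable, the function
name, the instance count and the particular output selectors are
illustrative choices rather than anything netperf requires:

```shell
# Sketch: start several omni instances that all report the same UUID.
# NETPERF lets the binary be overridden; it defaults to "netperf".
NETPERF=${NETPERF:-netperf}

run_tagged_aggregate() {
    remote=$1
    instances=$2
    uuid=$("$NETPERF" -t UUID)       # one UUID for the whole aggregate
    i=0
    while [ "$i" -lt "$instances" ]; do
        # test-specific -u sets the UUID; -o selects the fields to emit
        "$NETPERF" -H "$remote" -t omni -- -u "$uuid" -o UUID,THROUGHPUT &
        i=$((i + 1))
    done
    wait                             # let every instance finish
}
```

Calling the function with a remote host and a count of 4 would produce
four result lines sharing one UUID, which can then serve as the key that
groups them.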
   4039 
   4040 @node Address Resolution, Enhancing Netperf, Other Netperf Tests, Top
   4041 @comment  node-name,  next,  previous,  up
   4042 @chapter Address Resolution
   4043 
   4044 Netperf versions 2.4.0 and later have merged IPv4 and IPv6 tests so
   4045 the functionality of the tests in @file{src/nettest_ipv6.c} has been
subsumed into the tests in @file{src/nettest_bsd.c}.  This has been
accomplished in part by switching from @code{gethostbyname()} to
   4048 @code{getaddrinfo()} exclusively.  While it was theoretically possible
   4049 to get multiple results for a hostname from @code{gethostbyname()} it
   4050 was generally unlikely and netperf's ignoring of the second and later
   4051 results was not much of an issue.
   4052 
Now with @code{getaddrinfo()} and particularly with AF_UNSPEC it is
increasingly likely that a given hostname will have multiple
associated addresses.  The @code{establish_control()} routine of
@file{src/netlib.c} will indeed attempt to choose from among all the
   4057 matching IP addresses when establishing the control connection.
   4058 Netperf does not _really_ care if the control connection is IPv4 or
   4059 IPv6 or even mixed on either end.
   4060 
   4061 However, the individual tests still ass-u-me that the first result in
   4062 the address list is the one to be used.  Whether or not this will
turn out to be an issue has yet to be determined.
   4064 
   4065 If you do run into problems with this, the easiest workaround is to
   4066 specify IP addresses for the data connection explicitly in the
   4067 test-specific @option{-H} and @option{-L} options.  At some point, the
   4068 netperf tests _may_ try to be more sophisticated in their parsing of
   4069 returns from @code{getaddrinfo()} - straw-man patches to
   4070 @email{netperf-feedback@@netperf.org} would of course be most welcome
   4071 :)
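
The workaround might look like the sketch below.  The wrapper function,
the NETPERF variable and the choice of a TCP_STREAM test are
illustrative.  The point is only that the global -H names the control
connection while the test-specific -H and -L, after the "--", name the
data connection:

```shell
# Sketch: pin the data connection to explicit addresses.
# NETPERF lets the binary be overridden; it defaults to "netperf".
NETPERF=${NETPERF:-netperf}

run_with_explicit_addrs() {
    control_host=$1        # resolved as usual for the control connection
    remote_data_addr=$2    # test-specific -H: remote end of the data socket
    local_data_addr=$3     # test-specific -L: local end of the data socket
    "$NETPERF" -H "$control_host" -t TCP_STREAM -- \
        -H "$remote_data_addr" -L "$local_data_addr"
}
```

Passing literal addresses of a single family for both data endpoints
sidesteps whichever entry happens to come first in the address list.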
   4072 
   4073 Netperf has leveraged code from other open-source projects with
   4074 amenable licensing to provide a replacement @code{getaddrinfo()} call
   4075 on those platforms where the @command{configure} script believes there
   4076 is no native getaddrinfo call.  As of this writing, the replacement
@code{getaddrinfo()} has been tested on HP-UX 11.0 and then presumed to
   4078 run elsewhere.
   4079 
   4080 @node Enhancing Netperf, Netperf4, Address Resolution, Top
   4081 @comment  node-name,  next,  previous,  up
   4082 @chapter Enhancing Netperf
   4083 
   4084 Netperf is constantly evolving.  If you find you want to make
   4085 enhancements to netperf, by all means do so.  If you wish to add a new
   4086 ``suite'' of tests to netperf the general idea is to:
   4087 
   4088 @enumerate
   4089 @item
   4090 Add files @file{src/nettest_mumble.c} and @file{src/nettest_mumble.h}
   4091 where mumble is replaced with something meaningful for the test-suite.
   4092 @item
   4093 Add support for an appropriate @option{--enable-mumble} option in
   4094 @file{configure.ac}.
   4095 @item
   4096 Edit @file{src/netperf.c}, @file{netsh.c}, and @file{netserver.c} as
   4097 required, using #ifdef WANT_MUMBLE.
   4098 @item
   4099 Compile and test
   4100 @end enumerate
   4101 
   4102 However, with the addition of the ``omni'' tests in version 2.5.0 it
   4103 is preferred that one attempt to make the necessary changes to
   4104 @file{src/nettest_omni.c} rather than adding new source files, unless
   4105 this would make the omni tests entirely too complicated.
   4106 
   4107 If you wish to submit your changes for possible inclusion into the
   4108 mainline sources, please try to base your changes on the latest
available sources (@pxref{Getting Netperf Bits}) and then send email
   4110 describing the changes at a high level to
   4111 @email{netperf-feedback@@netperf.org} or perhaps
   4112 @email{netperf-talk@@netperf.org}.  If the consensus is positive, then
   4113 sending context @command{diff} results to
   4114 @email{netperf-feedback@@netperf.org} is the next step.  From that
   4115 point, it is a matter of pestering the Netperf Contributing Editor
   4116 until he gets the changes incorporated :)
   4117 
   4118 @node  Netperf4, Concept Index, Enhancing Netperf, Top
   4119 @comment  node-name,  next,  previous,  up
   4120 @chapter Netperf4
   4121 
   4122 Netperf4 is the shorthand name given to version 4.X.X of netperf.
   4123 This is really a separate benchmark more than a newer version of
   4124 netperf, but it is a descendant of netperf so the netperf name is
   4125 kept.  The facetious way to describe netperf4 is to say it is the
   4126 egg-laying-woolly-milk-pig version of netperf :)  The more respectful
   4127 way to describe it is to say it is the version of netperf with support
   4128 for synchronized, multiple-thread, multiple-test, multiple-system,
   4129 network-oriented benchmarking.
   4130 
   4131 Netperf4 is still undergoing evolution. Those wishing to work with or
   4132 on netperf4 are encouraged to join the
   4133 @uref{http://www.netperf.org/cgi-bin/mailman/listinfo/netperf-dev,netperf-dev}
   4134 mailing list and/or peruse the
   4135 @uref{http://www.netperf.org/svn/netperf4/trunk,current sources}.
   4136 
   4137 @node Concept Index, Option Index, Netperf4, Top
   4138 @unnumbered Concept Index
   4139 
   4140 @printindex cp
   4141 
   4142 @node Option Index,  , Concept Index, Top
   4143 @comment  node-name,  next,  previous,  up
   4144 @unnumbered Option Index
   4145 
   4146 @printindex vr
   4147 @bye                                      
   4148 
   4149 @c  LocalWords:  texinfo setfilename settitle titlepage vskip pt filll ifnottex
   4150 @c  LocalWords:  insertcopying cindex dfn uref printindex cp
   4151