      1 This document describes the operation of the test scheduling framework in
      2 the pounder30 package.  This document reflects pounder30 as of 2011-8-09.
      3 
      4 Authors:
      5 Darrick Wong <djwong@us.ibm.com>
      6 Lucy Liang <lgliang@us.ibm.com>
      7 
      8 Copyright (C) 2011 IBM.
      9 
     10 Contents
     11 ========
     12 1. Overview
     13 2. Test Files
     14 3. Build Scripts
     15 4. Test Scripts
     16 5. Scheduling Tests
     17 6. Running Tests Repeatedly
     18 7. The Provided Test Schedulers
     19 8. Creating Your Own Test Schedulers
     20 9. Including and Excluding Tests
     21 
     22 Overview
     23 ========
     24 The scheduler in the original pounder release was too simplistic: it kicked
     25 off every test simultaneously.  There was no attempt to ramp up the
     26 machine's stress levels test by test, to run only certain combinations, or
     27 even to run the tests one by one before beginning the real load testing.
     28 
     29 In addition, the test scripts had a very simple pass/fail mechanism--failure
     30 was defined by a kernel panic/oops/bug, and passing was defined by the lack of
     31 that condition.  There was no attempt to find soft failures--situations where
     32 a test program would fail, but without bringing the machine down.  The test
     33 suite would not alert the user that these failures had occurred.
     34 
     35 Consequently, Darrick Wong rewrote the test scheduling framework to achieve
     36 several goals--first, to separate the test automation code from the tests
     37 themselves, to allow for more intelligent scheduling of tests, to give better
     38 summary reports of what passed (and what didn't), and finally to improve the
     39 load testing that this suite could do.
     40 
     41 Test Files
     42 ==========
     43 Each test should only need to provide three files:
     44 
     45 1) build_scripts/<testname>
     46 	- The build_scripts/ directory contains scripts that take care of checking for
     47 	system requirements, downloading the relevant packages and binaries, and building
     48 	any code necessary to run the subtests. See the "Build Scripts" section below for
     49 	more information.
     50 
     51 2) test_scripts/<testname>
     52 	- The test_scripts/ directory contains scripts that take care of running the actual tests.
     53 	See the "Test Scripts" section below for more information.
     54 
     55 3) tests/.../[T|D]XX<testname>
     56 	- The tests/ directory represents our unpackaged "test scheduler" (if your tests/
     57 	directory is empty, that means you haven't unpacked any test schedulers yet and will
     58 	need to run "make install" to unpack a scheduler - see "The Provided Test Schedulers"
     59 	section for more information. The test_repo/ directory also provides an example of what
     60 	an unpacked test scheduler should look like). The files in the tests/ directory are
     61 	usually symlinks that point to files in test_scripts/. The order in which the subtests are
     62 	run upon starting pounder depends on how the files in tests/ are named and organized.
     63 	See the "Scheduling Tests" section below for more information.
     64 
     65 Note: <testname> should be the same in the build_scripts/, test_scripts/, and tests/ folders.
     66 (Example: build_scripts/subtest1, test_scripts/subtest1, and tests/D99subtest1 would be valid.
     67 build_scripts/subtest1, test_scripts/subtest1_different, and tests/D99subtest1 would not.)
     68 See "Scheduling Tests" below for a detailed description of naming rules for files in the tests/
     69 directory.
     70 
     71 Build Scripts
     72 =============
     73 As the name implies, a script in build_scripts/ is in charge of downloading
     74 and building whatever bits of code are necessary to make the test run.
     75 
     76 Temporary files needed to run a test should go in $POUNDER_TMPDIR. Third-party source,
     77 packages, and binaries should go in $POUNDER_OPTDIR. Third-party packages can be fetched
     78 from the web or from a user-created cache: a web-accessible directory containing
     79 cached tarballs and other files needed for the build.
     80 (See "$POUNDER_CACHE" in doc/CONFIGURATION for more information.)
     81 
     82 Should there be a failure in the build script that is essential to the ability
     83 to run a test, the build script should exit with error to halt the main build
     84 process immediately.
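
As a rough illustration of the fail-fast rule above, a build script might check
its prerequisites up front and give up as soon as one is missing. This is only a
sketch: the helper name require_tool and the tools named in the comments are
hypothetical and are not taken from the pounder sources.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the named tool is on $PATH.
require_tool() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "build: required tool '$1' not found" >&2
        return 1
    fi
}

# In a real build script each essential check would be followed by
# "|| exit 1" so that the main build process halts immediately, e.g.:
#     require_tool gcc  || exit 1
#     require_tool make || exit 1
```

Keeping the check in a function and the "exit 1" at the call site makes it easy
to tell essential requirements apart from optional ones.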
     85 
     86 Also, be aware that distributing pre-built binary tarballs is not always a good
     87 idea. Though one could cache pre-built binary tarballs rather than source, distros
     88 are not always good at ABI/library path compatibility,
     89 despite the efforts of LSB, FHS, etc.  It is always safest to build your
     90 subtests from source on your target system.
     91 
     92 The build_scripts/ directory provides some examples.
     93 
     94 Test Scripts
     95 ============
     96 A script in test_scripts/ is in charge of running the actual test.
     97 
     98 The requirements on test scripts are pretty light.  First, the building of the
     99 test ought to go in the build script unless it's absolutely necessary to build
    100 a test component at run time. Any checking for system requirements should also
    101 go in the build script.
    102 
    103 Second, the script must catch SIGTERM and clean up after itself.  SIGTERM is
    104 used by the test scheduler to stop tests.
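
A minimal sketch of the SIGTERM requirement might look like the following. The
scratch directory name, the /tmp fallback, and the choice of exit status 1 in
the handler are assumptions made for illustration; only $POUNDER_TMPDIR comes
from this document.

```shell
#!/bin/sh
# Sketch of a test script that cleans up when the scheduler sends SIGTERM.
# $POUNDER_TMPDIR is assumed to be set by pounder; /tmp is a fallback so
# the sketch can also be run standalone.
SCRATCH="${POUNDER_TMPDIR:-/tmp}/example_test.$$"

cleanup() {
    rm -rf "$SCRATCH"
}

on_sigterm() {
    cleanup
    exit 1      # assumption: a stopped test reports an ordinary failure
}

trap on_sigterm TERM
mkdir -p "$SCRATCH"
# ... run the real workload here, writing scratch files under $SCRATCH ...
cleanup
```

Routing both the normal path and the signal path through the same cleanup
function keeps the two from drifting apart.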
    105 
    106 The third requirement is much more stringent: return codes.  The script should
    107 return 0 to indicate success, 1-254 to indicate failure (the common use is to
    108 signify the number of failures), and -1 or 255 to indicate that there was
    109 a failure that cannot be fixed.
    110 
    111 Note: If a test is being run in a timed or infinite loop (see the
    112 "Running Tests Repeatedly" section below for details), returning -1 or 255
    113 has the effect of cancelling all subsequent loops.
    114 
    115 Quick map of return codes to what gets reported:
    116 0             = "PASS"
    117 -1            = "ABORT"
    118 255           = "ABORT"
    119 anything else = "FAIL"
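
The mapping above can be written as a small shell function. The function name
is hypothetical and this is not the scheduler's own code; it only restates the
table.

```shell
# Map a test script's exit status to the word that gets reported.
# A parent process only ever observes statuses 0-255, so a program that
# calls exit(-1) is seen here as 255.
report_status() {
    case "$1" in
        0)      echo "PASS"  ;;
        -1|255) echo "ABORT" ;;
        *)      echo "FAIL"  ;;
    esac
}
```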
    120 
    121 Also note: If a test is killed by an unhandled signal, the test is reported as
    122 failing.
    123 
    124 Put any temporary files created during test run in $POUNDER_TMPDIR.
    125 
    126 The test_scripts/ directory provides some examples.
    127 
    128 Scheduling Tests
    129 ================
    130 Everything under the tests/ directory is used for scheduling purposes. The current
    131 test scheduler borrows a System V rc script-like structure for specifying how and
    132 when tests should be run. Files under tests/ should have names that follow this
    133 standard:
    134 
    135    [type][sequence number][name]
    136 
    137 "type" is the type of test. Currently, there are two types, 'D' and 'T'.  'T'
    138 signifies a test, which means that the scheduler starts the test, waits for the
    139 test to complete, and reports on its exit status.  'D' signifies a daemon
    140 "test", which is to say that the scheduler will start the test, let it run in
    141 the background, and kill it when it's done running all the tests in that
    142 directory.
    143 
    144 The "sequence number" dictates the order in which the tests are run. 00 goes
    145 first, 99 goes last.  Tests with the same number are started simultaneously,
    146 regardless of the type.
    147 
    148 "name" is just a convenient mnemonic to distinguish between tests. However,
    149 it should be the same as the corresponding name used in build_scripts/ and
    150 test_scripts/. (A test with build script "build_scripts/subtest" and
    151 test script "test_scripts/subtest" should be defined as something like
    152 "tests/T00subtest" as opposed to "tests/T00whatever_i_feel_like".)
    153 
    154 Test names must be unique!
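
To make the naming rule concrete, the sketch below splits an entry name into
its three parts. The function name is hypothetical, and the sketch assumes a
two-digit sequence number, as in every example in this document.

```shell
# Split a tests/ entry name into "type sequence name".
parse_test_name() {
    entry=$(basename "$1")
    case "$entry" in
        [DT][0-9][0-9]?*) ;;    # type letter, two digits, non-empty name
        *) echo "invalid test name: $entry" >&2; return 1 ;;
    esac
    kind=$(printf '%s' "$entry" | cut -c1)
    seq=$(printf '%s' "$entry" | cut -c2-3)
    name=$(printf '%s' "$entry" | cut -c4-)
    echo "$kind $seq $name"
}
```

For instance, "parse_test_name tests/D99subtest1" prints "D 99 subtest1", and
the last field is the part that has to match the file names in build_scripts/
and test_scripts/.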
    155 
    156 File system objects under the tests/ directory can be nearly anything--
    157 directories, symbolic links, or files.  The test scheduler will not run
    158 anything that doesn't have the execute bit set.  If a FS object is a
    159 directory, then the contents of the directory are executed sequentially.
    160 
    161 Example:
    162 
    163 Let's examine the following test scheduler hierarchy:
    164 
    165 tests/
    166     D00stats
    167     T01foo
    168     T01bar
    169     T02dir/
    170         T00gav -> ../../test_scripts/gav
    171         T01hic -> ../../test_scripts/hic
    172     T03lat
    173 
    174 Let's see how the tests are run.  The test scheduler will start off by scanning
    175 the tests/ directory.  First it spawns D00stats and lets it run in the
    176 background.  Next, T01foo and T01bar are launched at the same time; the
    177 scheduler will wait for both of them to complete before proceeding.  Since T01foo
    178 is a file and not just a symbolic link, there is a fair chance that T01foo runs
    179 some test in a loop for a certain amount of time.  In any case, the scheduler
    180 next sees T02dir and proceeds into it.
    181 
    182 In T02dir/, we find two test scripts.  First T00gav runs, followed by
    183 T01hic.  Now there are no more tests to run in T02dir/, so the scheduler heads
    184 back up to the parent directory.  T03lat is forked and allowed to run to
    185 completion, after which D00stats is killed, and the test suite exits.
    186 
    187 Running Tests Repeatedly
    188 ========================
    189 Two helper programs are provided to run tests repeatedly, timed_loop and infinite_loop.
    190 (This version of pounder currently also includes a fancy_timed_loop.c file, but it's only
    191 meant to be used for the random_syscall test and will most likely be merged with timed_loop.c
    192 in the future, so we will ignore it here for now.)
    193 
    194 1. timed_loop
    195 
    196     timed_loop [-m max_failures] duration_in_seconds command [arguments]
    197 
    198 This program will run "command" with the given arguments repeatedly
    199 until the number of seconds given as "duration_in_seconds" has passed or the
    200 command has failed a total of "max_failures" times, whichever comes first.
    201 If the $MAX_FAILURES variable is set (defined in config, see CONFIGURATION
    202 for details), it serves as the failure limit unless overridden by the
    203 -m option.
    204 
    205 2. infinite_loop
    206 
    207     infinite_loop [-m max_failures] command [arguments]
    208 
    209 This program runs "command" repeatedly until sent SIGTERM or the
    210 command has failed a total of "max_failures" times. If the $MAX_FAILURES
    211 variable is set (defined in config, see CONFIGURATION for details), then
    212 the program will run until command has failed a total of $MAX_FAILURES times
    213 (as long as it's not overridden by the -m option).
    214 
    215 Examples:
    216 
    217 1. test_repo/T90ramp/D02build_kernel contains the following line:
    218 
    219 	"$POUNDER_HOME/infinite_loop $POUNDER_HOME/test_scripts/build_kernel"
    220 
    221 	which will run the build_kernel test script repeatedly until sent SIGTERM
    222 	or until it has failed a total of $MAX_FAILURES times.
    223 
    224 	"$POUNDER_HOME/infinite_loop -m 10 $POUNDER_HOME/test_scripts/build_kernel"
    225 
    226 	would run the build_kernel test script repeatedly until sent SIGTERM or
    227 	until it has failed 10 times, regardless of what $MAX_FAILURES is.
    228 
    229 2. test_scripts/time_drift contains the following line:
    230 
    231 	"$POUNDER_HOME/timed_loop 900 "$POUNDER_SRCDIR/time_tests/drift-test.py" $NTP_SERVER $FREQ"
    232 
    233 	which will run the drift-test.py script ($NTP_SERVER and $FREQ are some args passed to drift-test.py)
    234 	for 15 minutes or until it has failed a total of $MAX_FAILURES times.
    235 
    236 	"$POUNDER_HOME/timed_loop -m 10 900 "$POUNDER_SRCDIR/time_tests/drift-test.py" $NTP_SERVER $FREQ"
    237 
    238 	would run the drift-test.py script for 15 minutes or until it has failed 10 times, regardless of
    239 	what $MAX_FAILURES is.
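
The stopping rules described in this section can be summarized in a shell
sketch. The real timed_loop is a compiled helper and may differ in detail; the
function name, the positional argument order (failure limit first, instead of
the -m option), and the return value are all assumptions made to keep the
sketch short.

```shell
# Usage: timed_loop_sketch max_failures duration_in_seconds command [args]
# Re-runs the command until the time is up, the failure limit is reached,
# or the command aborts with status 255.
timed_loop_sketch() {
    max_failures=$1
    duration=$2
    shift 2
    deadline=$(( $(date +%s) + duration ))
    failures=0
    while [ "$(date +%s)" -lt "$deadline" ]; do
        rc=0
        "$@" || rc=$?
        if [ "$rc" -eq 255 ]; then
            return 255          # abort: cancel all subsequent loops
        elif [ "$rc" -ne 0 ]; then
            failures=$((failures + 1))
            if [ "$failures" -ge "$max_failures" ]; then
                break
            fi
        fi
    done
    # Assumption: report failure if any iteration failed.
    if [ "$failures" -gt 0 ]; then
        return 1
    fi
    return 0
}
```

Dropping the deadline check from the loop condition gives the corresponding
sketch for infinite_loop, which then stops only on the failure limit, an abort,
or SIGTERM.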
    240 
    241 The Provided Test Schedulers
    242 ============================
    243 This version of pounder provides 3 test schedulers: the "default," "fast," and "test" test schedulers.
    244 The tarred versions can be found in the schedulers/ directory as default-tests.tar.gz, fast-tests.tar.gz,
    245 and test-tests.tar.gz respectively.
    246 
    247 To unpack a test scheduler, run "make install" in the pounder/ directory and enter the name of the
    248 scheduler you would like to unpack at the first prompt.
    249 
    250 Example of unpacking the "fast" test scheduler:
    251 
    252 	# make install
    253 	./Install
    254 	Looking for tools...make g++ lex gcc python wget sudo diff patch egrep rm echo test which cp mkdir .
    255 	All tools were found.
    256 	WHICH TEST SCHEDULER SETUP DO YOU WANT TO UNPACK?
    257 	[Choose from:
    258 	default-tests.tar.gz
    259 	fast-tests.tar.gz
    260 	test-tests.tar.gz]
    261 	[Or simply press ENTER for the default scheduler]
    262 	Scheduler selection: fast
    263 
    264 Descriptions of the provided test schedulers:
    265 
    266 1. default - provides a general purpose stress test, runs for 48 hours unless the -d option
    267 		is used when starting pounder.
    268 2. fast - basically the same as default, except it runs for 12 hours by default.
    269 3. test - provides a set of useless tests. Each test simply passes, fails, aborts, or sleeps for
    270 		some period of time. They don't do anything useful but can be used to see how
    271 		the test scheduling setup works.
    272 
    273 Creating Your Own Test Schedulers
    274 =================================
    275 From the pounder directory, place the desired tests in the tests/ directory according to
    276 the rules described in the "Scheduling Tests" section above. Then run the following command:
    277 
    278 ./pounder -c name_of_scheduler
    279 
    280 to create a new test scheduler, which will be tarred as name_of_scheduler-tests.tar.gz and
    281 placed in the schedulers/ directory.
    282 
    283 Example Usage:
    284 
    285 	# ls ./schedulers
    286 	default-tests.tar.gz  fast-tests.tar.gz     test-tests.tar.gz
    287 
    288 	# ls ./tests
    289 	T00hwinfo
    290 
    291 	# ./pounder -c new_sched
    292 
    293 	# ls ./schedulers
    294 	default-tests.tar.gz  fast-tests.tar.gz     new_sched-tests.tar.gz      test-tests.tar.gz
    295 
    296 	After unpacking the "new_sched" test scheduler during install, the tests/ directory should
    297 	contain the T00hwinfo subtest along with a tests/excluded/ directory (see the "Including and
    298 	Excluding Tests" section below for details regarding the tests/excluded directory).
    299 
    300 Including and Excluding Tests
    301 =============================
    302 After unpacking the test scheduler and building each individual test, running
    303 "./pounder" will automatically run every test included in the tests folder. If you
    304 would like to run only ONE test, run "./pounder ./tests/<some subtest>". If you would
    305 like to run a portion of tests, you can use the "./pounder -e" option to exclude
    306 certain subtests from subsequent pounder runs:
    307 
    308 Example:
    309 
    310 Suppose you have already run "make install" and unpacked the default test scheduler.
    311 The tests/ directory should now contain the subtests to be run.
    312 
    313 1) ./pounder -l
    314 	- lists all of the subtests that came with the currently active test scheduler.
    315 	The output should look something like:
    316 
    317 	------------------
    318 	#./pounder -l
    319 	Included subtests:
    320 	...
    321 	.../ltp-full-xxxxxxxx/tools/pounder/tests/T10single/T00xterm_stress
    322 	.../ltp-full-xxxxxxxx/tools/pounder/tests/T00hwinfo
    323 	...
    324 
    325 	Excluded subtests:
    326 	[NONE]
    327 	------------------
    328 
    329 2) ./pounder -e "tests/T10single/T00xterm_stress tests/T00hwinfo"
    330 	- will exclude T00xterm_stress and T00hwinfo from any subsequent pounder runs.
    331 	This command essentially moves the two tests from the "tests" folder to the
    332 	"tests/excluded" folder for temporary storage, where they will remain until
    333 	re-included in the test scheduler (this is also why all test names
    334 	should be unique). A file "tests/excluded/testlist" keeps track of which tests
    335 	have been excluded from the test scheduler and what their original paths were.
    336 
    337 3) ./pounder -l
    338 	- should now output something like:
    339 
    340 	------------------
    341 	#./pounder -l
    342 	Included subtests:
    343 	...
    344 
    345 	Excluded subtests:
    346 	T00xterm_stress
    347 	T00hwinfo
    348 	------------------
    349 
    350 4) ./pounder -i "T00xterm_stress T00hwinfo" - will re-include these subtests in
    351 	the test scheduler. They will be moved from the tests/excluded folder back into
    352 	the tests folder under their original paths.
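
The bookkeeping behind -e and -i can be sketched as below. This is not the
code pounder uses: the function names are hypothetical, and the
"name<TAB>original path" layout of tests/excluded/testlist is an assumption,
since this document does not specify the file format. Both functions expect to
be run from the pounder directory.

```shell
# Move a test into tests/excluded/ and remember where it came from.
exclude_test() {
    mkdir -p tests/excluded
    printf '%s\t%s\n' "$(basename "$1")" "$1" >> tests/excluded/testlist
    mv "$1" tests/excluded/
}

# Move a previously excluded test back to its original path.
include_test() {
    orig=$(awk -F '\t' -v n="$1" '$1 == n { print $2 }' tests/excluded/testlist)
    [ -n "$orig" ] || return 1
    mv "tests/excluded/$1" "$orig"
    # Drop the entry now that the test is back in place.
    awk -F '\t' -v n="$1" '$1 != n' tests/excluded/testlist \
        > tests/excluded/testlist.tmp
    mv tests/excluded/testlist.tmp tests/excluded/testlist
}
```

Because every excluded test lands in the same flat directory and is looked up
by its bare name, two tests with the same name would collide, which is the
reason test names must be unique.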
    353