This document describes the operation of the test scheduling framework in
the pounder30 package.  This document reflects pounder30 as of 2011-08-09.

Authors:
Darrick Wong <djwong@us.ibm.com>
Lucy Liang <lgliang@us.ibm.com>

Copyright (C) 2011 IBM.

Contents
========
1. Overview
2. Test Files
3. Build Scripts
4. Test Scripts
5. Scheduling Tests
6. Running Tests Repeatedly
7. The Provided Test Schedulers
8. Creating Your Own Test Schedulers
9. Including and Excluding Tests

Overview
========
The scheduler in the original pounder release was too simplistic--it would kick
off every test simultaneously.  There was no attempt to ramp up the
machine's stress levels test by test, or to run only certain combinations, or
even to run the tests one by one before beginning the real load testing.

In addition, the test scripts had a very simple pass/fail mechanism--failure
was defined by a kernel panic/oops/bug, and passing was defined by the lack of
that condition.  There was no attempt to find soft failures--situations where
a test program would fail, but without bringing the machine down.  The test
suite would not alert the user that these failures had occurred.

Consequently, Darrick Wong rewrote the test scheduling framework to achieve
several goals: to separate the test automation code from the tests
themselves, to allow for more intelligent scheduling of tests, to give better
summary reports of what passed (and what didn't), and to improve the
load testing that this suite could do.

Test Files
==========
Each test should only need to provide three files:

1) build_scripts/<testname>
	- The build_scripts/ directory contains scripts that take care of checking for
	system requirements, downloading the relevant packages and binaries, and building
	any code necessary to run the subtests. See the "Build Scripts" section below for
	more information.

2) test_scripts/<testname>
	- The test_scripts/ directory contains scripts that take care of running the actual tests.
	See the "Test Scripts" section below for more information.

3) tests/.../[T|D]XX<testname>
	- The tests/ directory represents our unpacked "test scheduler". If your tests/
	directory is empty, you haven't unpacked any test schedulers yet and will need to
	run "make install" to unpack one; see "The Provided Test Schedulers" section for
	more information. The test_repo/ directory also provides an example of what an
	unpacked test scheduler should look like. The files in the tests/ directory are
	usually symlinks that point to files in test_scripts/. The order in which the subtests are
	run upon starting pounder depends on how the files in tests/ are named and organized.
	See the "Scheduling Tests" section below for more information.

Note: <testname> should be the same in the build_scripts/, test_scripts/, and tests/ folders.
(Example: build_scripts/subtest1, test_scripts/subtest1, and tests/D99subtest1 would be valid.
build_scripts/subtest1, test_scripts/subtest1_different, and tests/D99subtest1 would not.)
See "Scheduling Tests" below for a detailed description of naming rules for files in the tests/
directory.

Build Scripts
=============
As the name implies, a script in build_scripts/ is in charge of downloading
and building whatever bits of code are necessary to make the test run.

Temporary files needed to run a test should go in $POUNDER_TMPDIR. Third-party source,
packages, and binaries should go in $POUNDER_OPTDIR. Third-party packages can be fetched
from the web or from a user-created cache: a web-accessible directory containing
cached tarballs and any other files you'll need for the build.
(See "$POUNDER_CACHE" in doc/CONFIGURATION for more information.)

Should a step that is essential to running the test fail in the build script,
the build script should exit with an error to halt the main build process
immediately.

Also, be aware that distributing pre-built binary tarballs is not always a good
idea. Though one could cache pre-built binary tarballs rather than source, distros
are not always good at ABI/library path compatibility, despite the efforts of LSB,
FHS, etc.  It is always safest to build your subtests from source on your target
system.

The build_scripts/ directory provides some examples.
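
As a rough illustration of the points above, a build script might be laid out
as follows. This is only a sketch: the test name "mytest", the helper function
names, and the list of tools passed in are hypothetical and do not come from
pounder itself. Only $POUNDER_TMPDIR and $POUNDER_OPTDIR are taken from this
document.

```shell
# Sketch of a hypothetical build_scripts/mytest. "mytest" and the tool
# names passed in are illustrative assumptions, not part of pounder.

# Return 0 if every named tool is on PATH; otherwise report each missing
# tool on stderr and return 1.
require_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "build: required tool not found: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Check requirements, then prepare the directories the build would use.
# Any nonzero return means the test cannot be built.
build_mytest() {
    if [ -z "${POUNDER_OPTDIR:-}" ] || [ -z "${POUNDER_TMPDIR:-}" ]; then
        echo "build: POUNDER_OPTDIR and POUNDER_TMPDIR must be set" >&2
        return 1
    fi
    require_tools "$@" || return 1
    mkdir -p "$POUNDER_OPTDIR/mytest" "$POUNDER_TMPDIR" || return 1
    # Fetching and compiling the third-party source would go here.
    return 0
}
```

A script built this way would presumably end with a line such as
"build_mytest gcc make || exit 1", so that a missing requirement halts the main
build process.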

Test Scripts
============
A script in test_scripts/ is in charge of running the actual test.

The requirements on test scripts are pretty light.  First, the building of the
test ought to go in the build script unless it's absolutely necessary to build
a test component at run time. Any checking for system requirements should also
go in the build script.

Second, the script must catch SIGTERM and clean up after itself.  SIGTERM is
used by the test scheduler to stop tests.

The third requirement is much more stringent: Return codes.  The script should
return 0 to indicate success, 1-254 to indicate failure (the common use is to
signify the number of failures), and -1 or 255 to indicate that there was
a failure that cannot be fixed.

Note: If a test is being run in a timed or infinite loop (see the
"Running Tests Repeatedly" section below for details), returning -1 or 255
has the effect of cancelling all subsequent loops.

Quick map of return codes to what gets reported:
0             = "PASS"
-1            = "ABORT"
255           = "ABORT"
anything else = "FAIL"

Also note: If a test is killed by an unhandled signal, the test is reported as
failing.
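
The mapping above can be written down as a small shell function. This is a
sketch for illustration only; status_label is a hypothetical name and is not
part of the scheduler. Note that an exit status is an 8-bit value, so a program
that returns -1 is seen by its parent as 255.

```shell
# Translate an exit status into the label described in the map above.
# status_label is a hypothetical helper, not a pounder function.
status_label() {
    case "$1" in
        0)      echo "PASS" ;;
        255|-1) echo "ABORT" ;;
        *)      echo "FAIL" ;;
    esac
}
```

For example, "status_label 3" prints FAIL and "status_label 255" prints ABORT.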

Put any temporary files created during test run in $POUNDER_TMPDIR.

The test_scripts/ directory provides some examples.
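
The requirements above might be combined in a test script along the following
lines. This is a sketch under stated assumptions: "mytest", the function names,
and the three placeholder checks are hypothetical, and the exit status used
after SIGTERM (1) is an assumption, since this document does not specify one.
The failure count is capped at 254 because exit statuses wrap at 256: an
uncapped count of 255 would be reported as ABORT, and 256 as PASS.

```shell
# Sketch of a hypothetical test_scripts/mytest.

# Map a failure count onto the return codes described above: 0 for no
# failures, otherwise the count capped at 254 so that it can never wrap
# around to 0 (PASS) or collide with 255 (ABORT).
failures_to_exit_code() {
    count=$1
    if [ "$count" -le 0 ]; then
        echo 0
    elif [ "$count" -gt 254 ]; then
        echo 254
    else
        echo "$count"
    fi
}

run_mytest() {
    workdir="${POUNDER_TMPDIR:-/tmp}/mytest.$$"
    mkdir -p "$workdir" || return 255   # cannot set up at all: abort

    # The scheduler stops tests with SIGTERM, so clean up before exiting.
    # Exiting with 1 here is an assumption.
    trap 'rm -rf "$workdir"; exit 1' TERM

    failures=0
    # Placeholder checks; a real test would run its own commands here.
    for check in true false true; do
        "$check" || failures=$((failures + 1))
    done

    rm -rf "$workdir"
    trap - TERM
    return "$(failures_to_exit_code "$failures")"
}
```

With the placeholder checks shown, run_mytest returns 1, which the scheduler
would report as a failure.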

Scheduling Tests
================
Everything under the tests/ directory is used for scheduling purposes. The current
test scheduler borrows a System V rc script-like structure for specifying how and
when tests should be run. Files under tests/ should have names that follow this
standard:

   [type][sequence number][name]

"type" is the type of test. Currently, there are two types, 'D' and 'T'.  'T'
signifies a test, which means that the scheduler starts the test, waits for the
test to complete, and reports on its exit status.  'D' signifies a daemon
"test", which is to say that the scheduler will start the test, let it run in
the background, and kill it when it's done running all the tests in that
directory.

The "sequence number" dictates the order in which the tests are run. 00 goes
first, 99 goes last.  Tests with the same number are started simultaneously,
regardless of the type.

"name" is just a convenient mnemonic to distinguish between tests. However,
it should be the same as the corresponding name used in build_scripts and
test_scripts. (A test with build script "build_scripts/subtest" and
test script "test_scripts/subtest" should be defined as something like
"tests/T00subtest" as opposed to "tests/T00whatever_i_feel_like".)

Test names must be unique!
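
One way to check this rule is to look for file names that occur more than once
anywhere under tests/. The following is a minimal sketch; duplicate_test_names
is a hypothetical helper, and it assumes that "unique" refers to the full file
name (such as T00hwinfo) of every file or symbolic link in the tree.

```shell
# Print every file name that appears more than once under the given
# directory. Directories themselves are not considered.
duplicate_test_names() {
    find "$1" \( -type f -o -type l \) | sed 's|.*/||' | sort | uniq -d
}
```

Running "duplicate_test_names tests" prints nothing when all names are unique.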

File system objects under the tests/ directory can be nearly anything--
directories, symbolic links, or files.  The test scheduler will not run
anything that doesn't have the execute bit set.  If a FS object is a
directory, then the contents of the directory are executed sequentially.

Example:

Let's examine the following test scheduler hierarchy:

tests/
    D00stats
    T01foo
    T01bar
    T02dir/
        T00gav -> ../../test_scripts/gav
        T01hic -> ../../test_scripts/hic
    T03lat

Let's see how the tests are run.  The test scheduler will start off by scanning
the tests/ directory.  First it spawns D00stats and lets it run in the
background.  Next, T01foo and T01bar are launched at the same time; the
scheduler will wait for both of them to complete before proceeding.  Since T01foo
is a file and not just a symbolic link, there is a fair chance that T01foo runs
some test in a loop for a certain amount of time.  In any case, the scheduler
next sees T02dir and proceeds into it.

In T02dir/, we find two test scripts.  First T00gav runs, followed by
T01hic.  Now there are no more tests to run in T02dir/, so the scheduler heads
back up to the parent directory.  T03lat is forked and allowed to run to
completion, after which D00stats is killed, and the test suite exits.
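
To make the naming standard concrete, the sketch below splits an entry name
into its three parts. parse_test_name is a hypothetical helper written for this
document, not the scheduler's own code, and it assumes that the sequence number
is always exactly two digits.

```shell
# Split a tests/ entry of the form [type][sequence number][name] and
# print "type sequence name". Return 1 if the name does not fit.
parse_test_name() {
    base=${1##*/}                    # strip any leading directories
    case "$base" in
        [DT][0-9][0-9]?*) ;;         # D or T, two digits, non-empty name
        *) return 1 ;;
    esac
    type=${base%"${base#?}"}         # first character
    rest=${base#?}                   # everything after the type
    seq=${rest%"${rest#??}"}         # next two characters
    name=${rest#??}                  # remainder
    echo "$type $seq $name"
}
```

For the hierarchy above, "parse_test_name tests/T02dir/T01hic" prints
"T 01 hic" and "parse_test_name tests/D00stats" prints "D 00 stats".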

Running Tests Repeatedly
========================
Two helper programs are provided to run tests repeatedly: timed_loop and infinite_loop.
(This version of pounder currently also includes a fancy_timed_loop.c file, but it's only
meant to be used for the random_syscall test and will most likely be merged with timed_loop.c
in the future, so we will ignore it here for now.)

1. timed_loop

    timed_loop [-m max_failures] duration_in_seconds command [arguments]

This program will run "command" with the given arguments repeatedly
until the number of seconds given as "duration" has passed or the
command has failed a total of "max_failures" times, whichever comes first.
If the $MAX_FAILURES variable is set (defined in config, see CONFIGURATION
for details), then the program will run until command has failed a total of
$MAX_FAILURES times (as long as it's not overridden by the -m option).

2. infinite_loop

    infinite_loop [-m max_failures] command [arguments]

This program runs "command" repeatedly until sent SIGTERM or the
command has failed a total of "max_failures" times. If the $MAX_FAILURES
variable is set (defined in config, see CONFIGURATION for details), then
the program will run until command has failed a total of $MAX_FAILURES times
(as long as it's not overridden by the -m option).

Examples:

1. test_repo/T90ramp/D02build_kernel contains the following line:

	"$POUNDER_HOME/infinite_loop $POUNDER_HOME/test_scripts/build_kernel"

	which will run the build_kernel test script repeatedly until sent SIGTERM
	or until it has failed a total of $MAX_FAILURES times.

	"$POUNDER_HOME/infinite_loop -m 10 $POUNDER_HOME/test_scripts/build_kernel"

	would run the build_kernel test script repeatedly until sent SIGTERM or
	until it has failed 10 times, regardless of what $MAX_FAILURES is.

2. test_scripts/time_drift contains the following line:

	"$POUNDER_HOME/timed_loop 900 "$POUNDER_SRCDIR/time_tests/drift-test.py" $NTP_SERVER $FREQ"

	which will run the drift-test.py script ($NTP_SERVER and $FREQ are some args passed to drift-test.py)
	for 15 minutes or until it has failed a total of $MAX_FAILURES times.

	"$POUNDER_HOME/timed_loop -m 10 900 "$POUNDER_SRCDIR/time_tests/drift-test.py" $NTP_SERVER $FREQ"

	would run the drift-test.py script for 15 minutes or until it has failed 10 times, regardless of
	what $MAX_FAILURES is.
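
The looping behavior described in this section can be approximated in shell as
shown below. This is not the timed_loop program itself (which is written in C),
only a sketch of the semantics described here. The function name, its argument
order, the "failures=N" output line, and its return value are all assumptions
made for illustration.

```shell
# sketch_timed_loop MAX_FAILURES DURATION_IN_SECONDS COMMAND [ARGS...]
# Run COMMAND repeatedly until the duration has passed or it has failed
# MAX_FAILURES times. A status of 255 (ABORT) cancels all further loops.
sketch_timed_loop() {
    max_failures=$1
    duration=$2
    shift 2
    failures=0
    end=$(( $(date +%s) + duration ))
    while [ "$(date +%s)" -lt "$end" ]; do
        rc=0
        "$@" || rc=$?
        if [ "$rc" -eq 255 ]; then
            return 255
        fi
        if [ "$rc" -ne 0 ]; then
            failures=$((failures + 1))
            if [ "$failures" -ge "$max_failures" ]; then
                break
            fi
        fi
    done
    echo "failures=$failures"
    # Assumption: report overall failure if any iteration failed.
    [ "$failures" -eq 0 ]
}
```

For example, "sketch_timed_loop 3 30 false" stops almost immediately after
three failed runs rather than waiting out the full 30 seconds.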

The Provided Test Schedulers
============================
This version of pounder provides 3 test schedulers: the "default," "fast," and "test" test schedulers.
The tarred versions can be found in the schedulers/ directory as default-tests.tar.gz, fast-tests.tar.gz,
and test-tests.tar.gz respectively.

To unpack a test scheduler, run "make install" in the pounder/ directory and enter the name of the
scheduler you would like to unpack at the first prompt.

Example of unpacking the "fast" test scheduler:

	# make install
	./Install
	Looking for tools...make g++ lex gcc python wget sudo diff patch egrep rm echo test which cp mkdir .
	All tools were found.
	WHICH TEST SCHEDULER SETUP DO YOU WANT TO UNPACK?
	[Choose from:
	default-tests.tar.gz
	fast-tests.tar.gz
	test-tests.tar.gz]
	[Or simply press ENTER for the default scheduler]
	Scheduler selection: fast

Descriptions of the provided test schedulers:

1. default - provides a general purpose stress test, runs for 48 hours unless the -d option
		is used when starting pounder.
2. fast - basically the same as default, except it runs for 12 hours by default.
3. test - provides a set of useless tests. Each test simply passes, fails, aborts, or sleeps for
		some period of time. They don't do anything useful but can be used to see how
		the test scheduling setup works.

Creating Your Own Test Schedulers
=================================
From the pounder directory, place the desired tests in the tests/ directory according to
the rules described in the "Scheduling Tests" section above. Then run the following command:

./pounder -c name_of_scheduler

to create a new test scheduler, which will be tarred as name_of_scheduler-tests.tar.gz and
placed in the schedulers/ directory.

Example Usage:

	# ls ./schedulers
	default-tests.tar.gz  fast-tests.tar.gz     test-tests.tar.gz

	# ls ./tests
	T00hwinfo

	# ./pounder -c new_sched

	# ls ./schedulers
	default-tests.tar.gz  fast-tests.tar.gz     new_sched-tests.tar.gz      test-tests.tar.gz

	After unpacking the "new_sched" test scheduler during install, the tests/ directory should
	contain the T00hwinfo subtest along with a tests/excluded/ directory (see the "Including and
	Excluding Tests" section below for details regarding the tests/excluded directory).
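
Placing tests in the tests/ directory usually means creating symlinks to
scripts in test_scripts/. The sketch below shows one way to do that while
respecting the naming rules and the execute-bit requirement. add_test is a
hypothetical helper, not part of pounder, and the use of absolute link targets
is an assumption.

```shell
# add_test ROOT SUBDIR TYPE SEQ NAME
# Link ROOT/test_scripts/NAME into ROOT/tests/SUBDIR as TYPE+SEQ+NAME.
# Pass an empty SUBDIR to place the link directly under tests/.
add_test() {
    root=$1
    subdir=$2
    type=$3
    seq=$4
    name=$5
    script="$root/test_scripts/$name"
    # The scheduler skips anything without the execute bit, so refuse
    # to link a script that is missing or not executable.
    if [ ! -x "$script" ]; then
        echo "add_test: $script is missing or not executable" >&2
        return 1
    fi
    mkdir -p "$root/tests/$subdir" || return 1
    ln -s "$script" "$root/tests/$subdir/$type$seq$name"
}
```

From the pounder directory, something like 'add_test "$PWD" "" T 00 hwinfo'
followed by "./pounder -c new_sched" would correspond to the example above,
assuming test_scripts/hwinfo exists.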

Including and Excluding Tests
=============================
After unpacking the test scheduler and building each individual test, running
"./pounder" will automatically run every test included in the tests folder. If you
would like to run only ONE test, run "./pounder ./tests/<some subtest>". If you would
like to run only a subset of the tests, you can use the "./pounder -e" option to exclude
certain subtests from subsequent pounder runs:

Example:

Suppose you have already run "make install" and unpacked the default test scheduler.
The tests/ directory should now contain the subtests to be run.

1) ./pounder -l
	- lists all of the subtests that came with the currently active test scheduler.
	The output should look something like:

	------------------
	#./pounder -l
	Included subtests:
	...
	.../ltp-full-xxxxxxxx/tools/pounder/tests/T10single/T00xterm_stress
	.../ltp-full-xxxxxxxx/tools/pounder/tests/T00hwinfo
	...

	Excluded subtests:
	[NONE]
	------------------

2) ./pounder -e "tests/T10single/T00xterm_stress tests/T00hwinfo"
	- will exclude T00xterm_stress and T00hwinfo from any subsequent pounder runs.
	This command essentially moves the two tests from the "tests" folder to the
	"tests/excluded" folder for temporary storage, where they will remain until
	re-included in the test scheduler (this is also why all test names
	should be unique). A file "tests/excluded/testlist" keeps track of which tests
	have been excluded from the test scheduler and what their original paths were.

3) ./pounder -l
	- should now output something like:

	------------------
	#./pounder -l
	Included subtests:
	...

	Excluded subtests:
	T00xterm_stress
	T00hwinfo
	------------------

4) ./pounder -i "T00xterm_stress T00hwinfo"
	- will re-include these subtests in the test scheduler. They will be moved
	from the tests/excluded folder back into the tests folder under their
	original paths.