
README.md

# benchmark
[![Build Status](https://travis-ci.org/google/benchmark.svg?branch=master)](https://travis-ci.org/google/benchmark)
[![Build status](https://ci.appveyor.com/api/projects/status/u0qsyp7t1tk7cpxs/branch/master?svg=true)](https://ci.appveyor.com/project/google/benchmark/branch/master)
[![Coverage Status](https://coveralls.io/repos/google/benchmark/badge.svg)](https://coveralls.io/r/google/benchmark)

A library to support the benchmarking of functions, similar to unit tests.

Discussion group: https://groups.google.com/d/forum/benchmark-discuss

IRC channel: https://freenode.net #googlebenchmark

[Known issues and common problems](#known-issues)

[Additional Tooling Documentation](docs/tools.md)

## Example usage
### Basic usage
Define a function that executes the code to be measured.

```c++
static void BM_StringCreation(benchmark::State& state) {
  while (state.KeepRunning())
    std::string empty_string;
}
// Register the function as a benchmark
BENCHMARK(BM_StringCreation);

// Define another benchmark
static void BM_StringCopy(benchmark::State& state) {
  std::string x = "hello";
  while (state.KeepRunning())
    std::string copy(x);
}
BENCHMARK(BM_StringCopy);

BENCHMARK_MAIN();
```

### Passing arguments
Sometimes a family of benchmarks can be implemented with just one routine that
takes an extra argument to specify which one of the family of benchmarks to
run. For example, the following code defines a family of benchmarks for
measuring the speed of `memcpy()` calls of different lengths:

```c++
static void BM_memcpy(benchmark::State& state) {
  char* src = new char[state.range(0)];
  char* dst = new char[state.range(0)];
  memset(src, 'x', state.range(0));
  while (state.KeepRunning())
    memcpy(dst, src, state.range(0));
  state.SetBytesProcessed(int64_t(state.iterations()) *
                          int64_t(state.range(0)));
  delete[] src;
  delete[] dst;
}
BENCHMARK(BM_memcpy)->Arg(8)->Arg(64)->Arg(512)->Arg(1<<10)->Arg(8<<10);
```

The preceding code is quite repetitive, and can be replaced with the following
short-hand. The following invocation will pick a few appropriate arguments in
the specified range and will generate a benchmark for each such argument.

```c++
BENCHMARK(BM_memcpy)->Range(8, 8<<10);
```

By default the arguments in the range are generated in multiples of eight and
the command above selects [ 8, 64, 512, 4k, 8k ]. In the following code the
range multiplier is changed to multiples of two.

```c++
BENCHMARK(BM_memcpy)->RangeMultiplier(2)->Range(8, 8<<10);
```
Now the generated arguments are [ 8, 16, 32, 64, 128, 256, 512, 1024, 2k, 4k, 8k ].

You might have a benchmark that depends on two or more inputs. For example, the
following code defines a family of benchmarks for measuring the speed of set
insertion.

```c++
static void BM_SetInsert(benchmark::State& state) {
  while (state.KeepRunning()) {
    state.PauseTiming();
    std::set<int> data = ConstructRandomSet(state.range(0));
    state.ResumeTiming();
    for (int j = 0; j < state.range(1); ++j)
      data.insert(RandomNumber());
  }
}
BENCHMARK(BM_SetInsert)
    ->Args({1<<10, 1})
    ->Args({1<<10, 8})
    ->Args({1<<10, 64})
    ->Args({1<<10, 512})
    ->Args({8<<10, 1})
    ->Args({8<<10, 8})
    ->Args({8<<10, 64})
    ->Args({8<<10, 512});
```

The preceding code is quite repetitive, and can be replaced with the following
short-hand. The following macro will pick a few appropriate arguments in the
product of the two specified ranges and will generate a benchmark for each such
pair.

```c++
BENCHMARK(BM_SetInsert)->Ranges({{1<<10, 8<<10}, {1, 512}});
```

For more complex patterns of inputs, passing a custom function to `Apply` allows
programmatic specification of an arbitrary set of arguments on which to run the
benchmark. The following example enumerates a dense range on one parameter,
and a sparse range on the second.

```c++
static void CustomArguments(benchmark::internal::Benchmark* b) {
  for (int i = 0; i <= 10; ++i)
    for (int j = 32; j <= 1024*1024; j *= 8)
      b->Args({i, j});
}
BENCHMARK(BM_SetInsert)->Apply(CustomArguments);
```

### Calculate asymptotic complexity (Big O)
Asymptotic complexity can be calculated for a family of benchmarks. The
following code will calculate the coefficient for the high-order term in the
running time and the normalized root-mean-square error of string comparison.

```c++
static void BM_StringCompare(benchmark::State& state) {
  std::string s1(state.range(0), '-');
  std::string s2(state.range(0), '-');
  while (state.KeepRunning()) {
    benchmark::DoNotOptimize(s1.compare(s2));
  }
  state.SetComplexityN(state.range(0));
}
BENCHMARK(BM_StringCompare)
    ->RangeMultiplier(2)->Range(1<<10, 1<<18)->Complexity(benchmark::oN);
```

As shown in the following invocation, asymptotic complexity can also be
calculated automatically.

```c++
BENCHMARK(BM_StringCompare)
    ->RangeMultiplier(2)->Range(1<<10, 1<<18)->Complexity();
```

The following code specifies asymptotic complexity with a lambda function,
which can be used to customize the high-order term calculation.

```c++
BENCHMARK(BM_StringCompare)->RangeMultiplier(2)
    ->Range(1<<10, 1<<18)->Complexity([](int n)->double{return n; });
```

### Templated benchmarks
Templated benchmarks work the same way. The following example produces and
consumes messages of size `sizeof(v)` `state.range(0)` times. It also outputs
throughput in the absence of multiprogramming.

```c++
template <class Q> void BM_Sequential(benchmark::State& state) {
  Q q;
  typename Q::value_type v;
  while (state.KeepRunning()) {
    for (int i = state.range(0); i--; )
      q.push(v);
    for (int e = state.range(0); e--; )
      q.Wait(&v);
  }
  // actually messages, not bytes:
  state.SetBytesProcessed(
      static_cast<int64_t>(state.iterations())*state.range(0));
}
BENCHMARK_TEMPLATE(BM_Sequential, WaitQueue<int>)->Range(1<<0, 1<<10);
```

Three macros are provided for adding benchmark templates.

```c++
#if __cplusplus >= 201103L // C++11 and greater.
#define BENCHMARK_TEMPLATE(func, ...) // Takes any number of parameters.
#else // C++ < C++11
#define BENCHMARK_TEMPLATE(func, arg1)
#endif
#define BENCHMARK_TEMPLATE1(func, arg1)
#define BENCHMARK_TEMPLATE2(func, arg1, arg2)
```
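
For instance, the `BM_Sequential` benchmark above could equally be registered
with the numbered form, which also works in pre-C++11 builds (a minimal
sketch, reusing `WaitQueue` from the previous example):

```c++
// Same registration as BENCHMARK_TEMPLATE(BM_Sequential, WaitQueue<int>),
// spelled with the single-argument form that predates variadic macros.
BENCHMARK_TEMPLATE1(BM_Sequential, WaitQueue<int>)->Range(1<<0, 1<<10);
```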

## Passing arbitrary arguments to a benchmark
In C++11 it is possible to define a benchmark that takes an arbitrary number
of extra arguments. The `BENCHMARK_CAPTURE(func, test_case_name, ...args)`
macro creates a benchmark that invokes `func` with the `benchmark::State` as
the first argument followed by the specified `args...`.
The `test_case_name` is appended to the name of the benchmark and
should describe the values passed.

```c++
template <class ...ExtraArgs>
void BM_takes_args(benchmark::State& state, ExtraArgs&&... extra_args) {
  [...]
}
// Registers a benchmark named "BM_takes_args/int_string_test" that passes
// the specified values to `extra_args`.
BENCHMARK_CAPTURE(BM_takes_args, int_string_test, 42, std::string("abc"));
```
Note that elements of `...args` may refer to global variables. Users should
avoid modifying global state inside of a benchmark.

## Using RegisterBenchmark(name, fn, args...)

The `RegisterBenchmark(name, func, args...)` function provides an alternative
way to create and register benchmarks.
`RegisterBenchmark(name, func, args...)` creates, registers, and returns a
pointer to a new benchmark with the specified `name` that invokes
`func(st, args...)` where `st` is a `benchmark::State` object.

Unlike the `BENCHMARK` registration macros, which can only be used at the global
scope, `RegisterBenchmark` can be called anywhere. This allows for
benchmark tests to be registered programmatically.

Additionally, `RegisterBenchmark` allows any callable object, including
capturing lambdas and function objects, to be registered as a benchmark.

For example:
```c++
auto BM_test = [](benchmark::State& st, auto Inputs) { /* ... */ };

int main(int argc, char** argv) {
  for (auto& test_input : { /* ... */ })
      benchmark::RegisterBenchmark(test_input.name(), BM_test, test_input);
  benchmark::Initialize(&argc, argv);
  benchmark::RunSpecifiedBenchmarks();
}
```

### Multithreaded benchmarks
In a multithreaded test (benchmark invoked by multiple threads simultaneously),
it is guaranteed that none of the threads will start until all have called
`KeepRunning`, and all will have finished before `KeepRunning` returns false. As
such, any global setup or teardown can be wrapped in a check against the thread
index:

```c++
static void BM_MultiThreaded(benchmark::State& state) {
  if (state.thread_index == 0) {
    // Setup code here.
  }
  while (state.KeepRunning()) {
    // Run the test as normal.
  }
  if (state.thread_index == 0) {
    // Teardown code here.
  }
}
BENCHMARK(BM_MultiThreaded)->Threads(2);
```

If the benchmarked code itself uses threads and you want to compare it to
single-threaded code, you may want to use real-time ("wallclock") measurements
for latency comparisons:

```c++
BENCHMARK(BM_test)->Range(8, 8<<10)->UseRealTime();
```

Without `UseRealTime`, CPU time is used by default.

## Manual timing
For benchmarking something for which neither CPU time nor real time is
correct or accurate enough, completely manual timing is supported using
the `UseManualTime` function.

When `UseManualTime` is used, the benchmarked code must call
`SetIterationTime` once per iteration of the `KeepRunning` loop to
report the manually measured time.

An example use case for this is benchmarking GPU execution (e.g. OpenCL
or CUDA kernels, OpenGL or Vulkan or Direct3D draw calls), which cannot
be accurately measured using CPU time or real time. Instead, they can be
measured accurately using a dedicated API, and these measurement results
can be reported back with `SetIterationTime`.

```c++
static void BM_ManualTiming(benchmark::State& state) {
  int microseconds = state.range(0);
  std::chrono::duration<double, std::micro> sleep_duration {
    static_cast<double>(microseconds)
  };

  while (state.KeepRunning()) {
    auto start = std::chrono::high_resolution_clock::now();
    // Simulate some useful workload with a sleep
    std::this_thread::sleep_for(sleep_duration);
    auto end   = std::chrono::high_resolution_clock::now();

    auto elapsed_seconds =
      std::chrono::duration_cast<std::chrono::duration<double>>(
        end - start);

    state.SetIterationTime(elapsed_seconds.count());
  }
}
BENCHMARK(BM_ManualTiming)->Range(1, 1<<17)->UseManualTime();
```

### Preventing optimisation
To prevent a value or expression from being optimized away by the compiler,
the `benchmark::DoNotOptimize(...)` and `benchmark::ClobberMemory()`
functions can be used.

```c++
static void BM_test(benchmark::State& state) {
  while (state.KeepRunning()) {
      int x = 0;
      for (int i=0; i < 64; ++i) {
        benchmark::DoNotOptimize(x += i);
      }
  }
}
```

`DoNotOptimize(<expr>)` forces the *result* of `<expr>` to be stored in either
memory or a register. For GNU based compilers it acts as a read/write barrier
for global memory. More specifically it forces the compiler to flush pending
writes to memory and reload any other values as necessary.

Note that `DoNotOptimize(<expr>)` does not prevent optimizations on `<expr>`
in any way. `<expr>` may even be removed entirely when the result is already
known. For example:

```c++
  /* Example 1: `<expr>` is removed entirely. */
  int foo(int x) { return x + 42; }
  while (...) DoNotOptimize(foo(0)); // Optimized to DoNotOptimize(42);

  /*  Example 2: Result of '<expr>' is only reused */
  int bar(int) __attribute__((const));
  while (...) DoNotOptimize(bar(0)); // Optimized to:
  // int __result__ = bar(0);
  // while (...) DoNotOptimize(__result__);
```

The second tool for preventing optimizations is `ClobberMemory()`. In essence
`ClobberMemory()` forces the compiler to perform all pending writes to global
memory. Memory managed by block scope objects must be "escaped" using
`DoNotOptimize(...)` before it can be clobbered. In the below example
`ClobberMemory()` prevents the call to `v.push_back(42)` from being optimized
away.

```c++
static void BM_vector_push_back(benchmark::State& state) {
  while (state.KeepRunning()) {
    std::vector<int> v;
    v.reserve(1);
    benchmark::DoNotOptimize(v.data()); // Allow v.data() to be clobbered.
    v.push_back(42);
    benchmark::ClobberMemory(); // Force 42 to be written to memory.
  }
}
```

Note that `ClobberMemory()` is only available for GNU or MSVC based compilers.

### Set time unit manually
If a benchmark runs for a few milliseconds it may be hard to visually compare
the measured times, since the output data is given in nanoseconds by default.
To set the time unit manually, specify it on the registered benchmark:

```c++
BENCHMARK(BM_test)->Unit(benchmark::kMillisecond);
```

## Controlling number of iterations
In all cases, the number of iterations for which the benchmark is run is
governed by the amount of time the benchmark takes. Concretely, the number of
iterations is at least one and at most 1e9; it is increased until the CPU time
exceeds the minimum time, or the wallclock time exceeds 5x the minimum time.
The minimum time is set as a flag `--benchmark_min_time` or per benchmark by
calling `MinTime` on the registered benchmark object.
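
For example, the following sketch raises the per-benchmark minimum running
time to two seconds (reusing the `BM_test` routine from earlier; `MinTime`
takes seconds as a `double`):

```c++
// Keep increasing the iteration count until this benchmark has
// accumulated at least 2 seconds of running time.
BENCHMARK(BM_test)->MinTime(2.0);
```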

## Reporting the mean and standard deviation of repeated benchmarks
By default each benchmark is run once and that single result is reported.
However, benchmarks are often noisy and a single result may not be
representative of the overall behavior. For this reason it's possible to
repeatedly rerun the benchmark.

The number of runs of each benchmark is specified globally by the
`--benchmark_repetitions` flag or on a per benchmark basis by calling
`Repetitions` on the registered benchmark object. When a benchmark is run
more than once the mean and standard deviation of the runs will be reported.

Additionally the `--benchmark_report_aggregates_only={true|false}` flag or
`ReportAggregatesOnly(bool)` function can be used to change how repeated tests
are reported. By default the result of each repeated run is reported. When this
option is `true` only the mean and standard deviation of the runs is reported.
Calling `ReportAggregatesOnly(bool)` on a registered benchmark object overrides
the value of the flag for that benchmark.
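
For example, the per-benchmark equivalent of those flags might look like this
(a minimal sketch, again reusing `BM_test`):

```c++
// Run BM_test 10 times; report only the mean and standard deviation
// of those runs rather than each individual run.
BENCHMARK(BM_test)->Repetitions(10)->ReportAggregatesOnly(true);
```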

## Fixtures
Fixture tests are created by first defining a type that derives from
`::benchmark::Fixture` and then creating/registering the tests using the
following macros:

* `BENCHMARK_F(ClassName, Method)`
* `BENCHMARK_DEFINE_F(ClassName, Method)`
* `BENCHMARK_REGISTER_F(ClassName, Method)`

For example:

```c++
class MyFixture : public benchmark::Fixture {};

BENCHMARK_F(MyFixture, FooTest)(benchmark::State& st) {
   while (st.KeepRunning()) {
     ...
  }
}

BENCHMARK_DEFINE_F(MyFixture, BarTest)(benchmark::State& st) {
   while (st.KeepRunning()) {
     ...
  }
}
/* BarTest is NOT registered */
BENCHMARK_REGISTER_F(MyFixture, BarTest)->Threads(2);
/* BarTest is now registered */
```

## User-defined counters

You can add your own counters with user-defined names. The example below
will add columns "Foo", "Bar" and "Baz" in its output:

```c++
static void UserCountersExample1(benchmark::State& state) {
  double numFoos = 0, numBars = 0, numBazs = 0;
  while (state.KeepRunning()) {
    // ... count Foo,Bar,Baz events
  }
  state.counters["Foo"] = numFoos;
  state.counters["Bar"] = numBars;
  state.counters["Baz"] = numBazs;
}
```

The `state.counters` object is a `std::map` with `std::string` keys
and `Counter` values. The latter is a `double`-like class, via an implicit
conversion to `double&`. Thus you can use all of the standard arithmetic
assignment operators (`=,+=,-=,*=,/=`) to change the value of each counter.
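
For instance, a counter can be built up incrementally with those operators
(a small sketch):

```c++
  state.counters["Foo"] = 0;    // plain assignment
  state.counters["Foo"] += 10;  // compound assignment also works, thanks to
  state.counters["Foo"] *= 2;   // the implicit conversion to double&
```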

In multithreaded benchmarks, each counter is set on the calling thread only.
When the benchmark finishes, the counters from each thread will be summed;
the resulting sum is the value which will be shown for the benchmark.
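
For instance, in the sketch below each of the four threads records its own
operation count via a hypothetical `CountOps()` helper, and the value reported
for `Ops` is the sum over all four threads:

```c++
static void BM_ThreadedCounters(benchmark::State& state) {
  double ops = 0;
  while (state.KeepRunning()) {
    ops += CountOps();  // hypothetical per-iteration workload
  }
  state.counters["Ops"] = ops;  // per-thread value; summed when threads finish
}
BENCHMARK(BM_ThreadedCounters)->Threads(4);
```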

The `Counter` constructor accepts two parameters: the value as a `double`
and a bit flag which allows you to show counters as rates and/or as
per-thread averages:

```c++
  // sets a simple counter
  state.counters["Foo"] = numFoos;

  // Set the counter as a rate. It will be presented divided
  // by the duration of the benchmark.
  state.counters["FooRate"] = Counter(numFoos, benchmark::Counter::kIsRate);

  // Set the counter as a thread-average quantity. It will
  // be presented divided by the number of threads.
  state.counters["FooAvg"] = Counter(numFoos, benchmark::Counter::kAvgThreads);

  // There's also a combined flag:
  state.counters["FooAvgRate"] = Counter(numFoos, benchmark::Counter::kAvgThreadsRate);
```

When you're compiling in C++11 mode or later you can use `insert()` with
`std::initializer_list`:

```c++
  // With C++11, this can be done:
  state.counters.insert({{"Foo", numFoos}, {"Bar", numBars}, {"Baz", numBazs}});
  // ... instead of:
  state.counters["Foo"] = numFoos;
  state.counters["Bar"] = numBars;
  state.counters["Baz"] = numBazs;
```

### Counter reporting

When using the console reporter, by default, user counters are printed at
the end after the table, the same way as ``bytes_processed`` and
``items_processed``. This is best for cases in which there are few counters,
or where there are only a couple of lines per benchmark. Here's an example of
the default output:

```
------------------------------------------------------------------------------
Benchmark                        Time           CPU Iterations UserCounters...
------------------------------------------------------------------------------
BM_UserCounter/threads:8      2248 ns      10277 ns      68808 Bar=16 Bat=40 Baz=24 Foo=8
BM_UserCounter/threads:1      9797 ns       9788 ns      71523 Bar=2 Bat=5 Baz=3 Foo=1024m
BM_UserCounter/threads:2      4924 ns       9842 ns      71036 Bar=4 Bat=10 Baz=6 Foo=2
BM_UserCounter/threads:4      2589 ns      10284 ns      68012 Bar=8 Bat=20 Baz=12 Foo=4
BM_UserCounter/threads:8      2212 ns      10287 ns      68040 Bar=16 Bat=40 Baz=24 Foo=8
BM_UserCounter/threads:16     1782 ns      10278 ns      68144 Bar=32 Bat=80 Baz=48 Foo=16
BM_UserCounter/threads:32     1291 ns      10296 ns      68256 Bar=64 Bat=160 Baz=96 Foo=32
BM_UserCounter/threads:4      2615 ns      10307 ns      68040 Bar=8 Bat=20 Baz=12 Foo=4
BM_Factorial                    26 ns         26 ns   26608979 40320
BM_Factorial/real_time          26 ns         26 ns   26587936 40320
BM_CalculatePiRange/1           16 ns         16 ns   45704255 0
BM_CalculatePiRange/8           73 ns         73 ns    9520927 3.28374
BM_CalculatePiRange/64         609 ns        609 ns    1140647 3.15746
BM_CalculatePiRange/512       4900 ns       4901 ns     142696 3.14355
```

If this doesn't suit you, you can print each counter as a table column by
passing the flag `--benchmark_counters_tabular=true` to the benchmark
application. This is best for cases in which there are a lot of counters, or
a lot of lines per individual benchmark. Note that this will trigger a
reprinting of the table header any time the counter set changes between
individual benchmarks. Here's an example of corresponding output when
`--benchmark_counters_tabular=true` is passed:

```
---------------------------------------------------------------------------------------
Benchmark                        Time           CPU Iterations    Bar   Bat   Baz   Foo
---------------------------------------------------------------------------------------
BM_UserCounter/threads:8      2198 ns       9953 ns      70688     16    40    24     8
BM_UserCounter/threads:1      9504 ns       9504 ns      73787      2     5     3     1
BM_UserCounter/threads:2      4775 ns       9550 ns      72606      4    10     6     2
BM_UserCounter/threads:4      2508 ns       9951 ns      70332      8    20    12     4
BM_UserCounter/threads:8      2055 ns       9933 ns      70344     16    40    24     8
BM_UserCounter/threads:16     1610 ns       9946 ns      70720     32    80    48    16
BM_UserCounter/threads:32     1192 ns       9948 ns      70496     64   160    96    32
BM_UserCounter/threads:4      2506 ns       9949 ns      70332      8    20    12     4
--------------------------------------------------------------
Benchmark                        Time           CPU Iterations
--------------------------------------------------------------
BM_Factorial                    26 ns         26 ns   26392245 40320
BM_Factorial/real_time          26 ns         26 ns   26494107 40320
BM_CalculatePiRange/1           15 ns         15 ns   45571597 0
BM_CalculatePiRange/8           74 ns         74 ns    9450212 3.28374
BM_CalculatePiRange/64         595 ns        595 ns    1173901 3.15746
BM_CalculatePiRange/512       4752 ns       4752 ns     147380 3.14355
BM_CalculatePiRange/4k       37970 ns      37972 ns      18453 3.14184
BM_CalculatePiRange/32k     303733 ns     303744 ns       2305 3.14162
BM_CalculatePiRange/256k   2434095 ns    2434186 ns        288 3.1416
BM_CalculatePiRange/1024k  9721140 ns    9721413 ns         71 3.14159
BM_CalculatePi/threads:8      2255 ns       9943 ns      70936
```
Note above the additional header printed when the benchmark changes from
``BM_UserCounter`` to ``BM_Factorial``. This is because ``BM_Factorial`` does
not have the same counter set as ``BM_UserCounter``.

## Exiting Benchmarks in Error

When errors caused by external influences, such as file I/O and network
communication, occur within a benchmark, the
`State::SkipWithError(const char* msg)` function can be used to skip that run
of the benchmark and report the error. Note that only future iterations of the
`KeepRunning()` loop are skipped. Users may explicitly return to exit the
benchmark immediately.

The `SkipWithError(...)` function may be used at any point within the benchmark,
including before and after the `KeepRunning()` loop.

For example:

```c++
static void BM_test(benchmark::State& state) {
  auto resource = GetResource();
  if (!resource.good()) {
      state.SkipWithError("Resource is not good!");
      // KeepRunning() loop will not be entered.
  }
  while (state.KeepRunning()) {
      auto data = resource.read_data();
      if (!resource.good()) {
        state.SkipWithError("Failed to read data!");
        break; // Needed to skip the rest of the iteration.
     }
     do_stuff(data);
  }
}
```

## Running a subset of the benchmarks

The `--benchmark_filter=<regex>` option can be used to only run the benchmarks
which match the specified `<regex>`. For example:

```bash
$ ./run_benchmarks.x --benchmark_filter=BM_memcpy/32
Run on (1 X 2300 MHz CPU )
2016-06-25 19:34:24
Benchmark              Time           CPU Iterations
----------------------------------------------------
BM_memcpy/32          11 ns         11 ns   79545455
BM_memcpy/32k       2181 ns       2185 ns     324074
BM_memcpy/32          12 ns         12 ns   54687500
BM_memcpy/32k       1834 ns       1837 ns     357143
```

## Output Formats
The library supports multiple output formats. Use the
`--benchmark_format=<console|json|csv>` flag to set the format type. `console`
is the default format.

The Console format is intended to be a human-readable format. By default
the format generates color output. Context is output on stderr and the
tabular data on stdout. Example tabular output looks like:
```
Benchmark                               Time(ns)    CPU(ns) Iterations
----------------------------------------------------------------------
BM_SetInsert/1024/1                        28928      29349      23853  133.097kB/s   33.2742k items/s
BM_SetInsert/1024/8                        32065      32913      21375  949.487kB/s   237.372k items/s
BM_SetInsert/1024/10                       33157      33648      21431  1.13369MB/s   290.225k items/s
```

The JSON format outputs human-readable JSON split into two top-level attributes.
The `context` attribute contains information about the run in general, including
information about the CPU and the date.
The `benchmarks` attribute contains a list of every benchmark run. Example JSON
output looks like:
```json
{
  "context": {
    "date": "2015/03/17-18:40:25",
    "num_cpus": 40,
    "mhz_per_cpu": 2801,
    "cpu_scaling_enabled": false,
    "build_type": "debug"
  },
  "benchmarks": [
    {
      "name": "BM_SetInsert/1024/1",
      "iterations": 94877,
      "real_time": 29275,
      "cpu_time": 29836,
      "bytes_per_second": 134066,
      "items_per_second": 33516
    },
    {
      "name": "BM_SetInsert/1024/8",
      "iterations": 21609,
      "real_time": 32317,
      "cpu_time": 32429,
      "bytes_per_second": 986770,
      "items_per_second": 246693
    },
    {
      "name": "BM_SetInsert/1024/10",
      "iterations": 21393,
      "real_time": 32724,
      "cpu_time": 33355,
      "bytes_per_second": 1199226,
      "items_per_second": 299807
    }
  ]
}
```

The CSV format outputs comma-separated values. The `context` is output on stderr
and the CSV itself on stdout. Example CSV output looks like:
```
name,iterations,real_time,cpu_time,bytes_per_second,items_per_second,label
"BM_SetInsert/1024/1",65465,17890.7,8407.45,475768,118942,
"BM_SetInsert/1024/8",116606,18810.1,9766.64,3.27646e+06,819115,
"BM_SetInsert/1024/10",106365,17238.4,8421.53,4.74973e+06,1.18743e+06,
```

## Output Files
The library supports writing the output of the benchmark to a file specified
by `--benchmark_out=<filename>`. The format of the output can be specified
using `--benchmark_out_format={json|console|csv}`. Specifying
`--benchmark_out` does not suppress the console output.

## Debug vs Release
By default, benchmark builds as a debug library. You will see a warning in the
output when this is the case. To build it as a release library instead, use:

```
cmake -DCMAKE_BUILD_TYPE=Release
```

To enable link-time optimisation, use

```
cmake -DCMAKE_BUILD_TYPE=Release -DBENCHMARK_ENABLE_LTO=true
```

## Linking against the library
When using gcc, it is necessary to link against pthread to avoid runtime
exceptions. This is due to how gcc implements `std::thread`.
See [issue #67](https://github.com/google/benchmark/issues/67) for more details.

## Compiler Support

Google Benchmark uses C++11 when building the library. As such we require
a modern C++ toolchain, both compiler and standard library.

The following minimum versions are strongly recommended to build the library:

* GCC 4.8
* Clang 3.4
* Visual Studio 2013
* Intel 2015 Update 1

Anything older *may* work.

Note: Using the library and its headers in C++03 is supported. C++11 is only
required to build the library.

# Known Issues

### Windows

* Users must manually link `shlwapi.lib`. Failure to do so may result
in unresolved symbols.

README.version

URL: https://github.com/google/benchmark
Version: 8da907c2c2786685c7da9f4759de052e3990f6f1
BugComponent: 119451
Owners: enh, android-bionic