Profiling
=========

OpenSWR contains built-in profiling that can be enabled
at build time to provide insight for performance tuning.

To enable this, uncomment the following line in ``rasterizer/core/knobs.h`` and rebuild: ::

  //#define KNOB_ENABLE_RDTSC

Running an application will then create an ``rdtsc.txt`` file in the
current working directory.  This file contains profile
information captured between the frames given by ``KNOB_BUCKETS_START_FRAME``
and ``KNOB_BUCKETS_END_FRAME`` (see knobs section).

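As a rough illustration of choosing the captured frame range, the sketch below launches an application with both knobs set. It assumes the knobs are read from environment variables of the same name (check the knobs section to confirm this for your build); ``profiling_env`` and ``run_profiled`` are hypothetical helper names, not part of OpenSWR.

```python
import os
import subprocess


def profiling_env(start_frame, end_frame, base=None):
    """Return a copy of the environment with the bucket frame range set.

    Assumption: the knobs are picked up from environment variables.
    """
    if start_frame < 0 or end_frame < start_frame:
        raise ValueError("expected 0 <= start_frame <= end_frame")
    env = dict(os.environ if base is None else base)
    env["KNOB_BUCKETS_START_FRAME"] = str(start_frame)
    env["KNOB_BUCKETS_END_FRAME"] = str(end_frame)
    return env


def run_profiled(argv, start_frame, end_frame, cwd=None):
    """Run argv and return the path where rdtsc.txt is expected to appear."""
    workdir = os.path.abspath(cwd or os.getcwd())
    subprocess.run(argv, env=profiling_env(start_frame, end_frame),
                   cwd=workdir, check=True)
    # The file is written to the working directory of the application.
    return os.path.join(workdir, "rdtsc.txt")
```

For example, ``run_profiled(["./my_app"], 100, 200)`` (with ``./my_app`` standing in for a real application built against a profiling-enabled OpenSWR) would run the program and return the expected location of ``rdtsc.txt``.
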
The resulting file will contain sections for each thread with a
hierarchical breakdown of the time spent in the various operations.
For example: ::

 Thread 0 (API)
  %Tot   %Par  Cycles     CPE        NumEvent   CPE2       NumEvent2  Bucket
   0.00   0.00 28370      2837       10         0          0          APIClearRenderTarget
   0.00  41.23 11698      1169       10         0          0          |-> APIDrawWakeAllThreads
   0.00  18.34 5202       520        10         0          0          |-> APIGetDrawContext
  98.72  98.72 12413773688 29957      414380     0          0          APIDraw
   0.36   0.36 44689364   107        414380     0          0          |-> APIDrawWakeAllThreads
  96.36  97.62 12117951562 9747       1243140    0          0          |-> APIGetDrawContext
   0.00   0.00 19904      995        20         0          0          APIStoreTiles
   0.00   7.88 1568       78         20         0          0          |-> APIDrawWakeAllThreads
   0.00  25.28 5032       251        20         0          0          |-> APIGetDrawContext
   1.28   1.28 161344902  64         2486370    0          0          APIGetDrawContext
   0.00   0.00 50368      2518       20         0          0          APISync
   0.00   2.70 1360       68         20         0          0          |-> APIDrawWakeAllThreads
   0.00  65.27 32876      1643       20         0          0          |-> APIGetDrawContext


 Thread 1 (WORKER)
  %Tot   %Par  Cycles     CPE        NumEvent   CPE2       NumEvent2  Bucket
  83.92  83.92 13198987522 96411      136902     0          0          FEProcessDraw
  24.91  29.69 3918184840 167        23410158   0          0          |-> FEFetchShader
  11.17  13.31 1756972646 75         23410158   0          0          |-> FEVertexShader
   8.89  10.59 1397902996 59         23410161   0          0          |-> FEPAAssemble
  19.06  22.71 2997794710 384        7803387    0          0          |-> FEClipTriangles
  11.67  61.21 1834958176 235        7803387    0          0              |-> FEBinTriangles
   0.00   0.00 0          0          187258     0          0                  |-> FECullZeroAreaAndBackface
   0.00   0.00 0          0          60051033   0          0                  |-> FECullBetweenCenters
   0.11   0.11 17217556   2869592    6          0          0          FEProcessStoreTiles
  15.97  15.97 2511392576 73665      34092      0          0          WorkerWorkOnFifoBE
  14.04  87.95 2208687340 9187       240408     0          0          |-> WorkerFoundWork
   0.06   0.43 9390536    13263      708        0          0              |-> BELoadTiles
   0.00   0.01 293020     182        1609       0          0              |-> BEClear
  12.63  89.94 1986508990 949        2093014    0          0              |-> BERasterizeTriangle
   2.37  18.75 372374596  177        2093014    0          0                  |-> BETriangleSetup
   0.42   3.35 66539016   31         2093014    0          0                  |-> BEStepSetup
   0.00   0.00 0          0          21766      0          0                  |-> BETrivialReject
   1.05   8.33 165410662  79         2071248    0          0                  |-> BERasterizePartial
   6.06  48.02 953847796  1260       756783     0          0                  |-> BEPixelBackend
   0.20   3.30 31521202   41         756783     0          0                      |-> BESetup
   0.16   2.69 25624304   33         756783     0          0                      |-> BEBarycentric
   0.18   2.92 27884986   36         756783     0          0                      |-> BEEarlyDepthTest
   0.19   3.20 30564174   41         744058     0          0                      |-> BEPixelShader
   0.26   4.30 41058646   55         744058     0          0                      |-> BEOutputMerger
   1.27  20.94 199750822  32         6054264    0          0                      |-> BEEndTile
   0.33   2.34 51758160   23687      2185       0          0              |-> BEStoreTiles
   0.20  60.22 31169500   28807      1082       0          0                  |-> B8G8R8A8_UNORM
   0.00   0.00 302752     302752     1          0          0          WorkerWaitForThreadEvent

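To find the dominant buckets without reading the file by eye, the output can be post-processed with a small script. The following is a minimal sketch that assumes the column layout shown in the example above; ``parse_rdtsc``, ``hottest`` and ``SAMPLE`` are hypothetical names introduced here, not part of OpenSWR. In the rows shown, ``CPE`` is consistent with ``Cycles`` divided by ``NumEvent`` and rounded down (for instance 28370 / 10 = 2837), which the usage lines recompute as a sanity check.

```python
import re
from collections import defaultdict

# "Thread 0 (API)" style section headers.
THREAD = re.compile(r"^\s*Thread\s+(\d+)\s+\((\w+)\)\s*$")
# Two percentages, five integer columns, an optional "|-> " marker, the bucket name.
ROW = re.compile(
    r"^\s*(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+(\|->\s+)?(\S+)\s*$"
)


def parse_rdtsc(text):
    """Return {thread label: [row dict, ...]} for text in the layout above."""
    threads = defaultdict(list)
    current = None
    for line in text.splitlines():
        m = THREAD.match(line)
        if m:
            current = f"Thread {m.group(1)} ({m.group(2)})"
            continue
        m = ROW.match(line)
        if m and current is not None:
            tot, par, cycles, cpe, events = m.group(1, 2, 3, 4, 5)
            threads[current].append({
                "bucket": m.group(9),
                "nested": m.group(8) is not None,  # row carries a "|->" marker
                "pct_total": float(tot),
                "pct_parent": float(par),
                "cycles": int(cycles),
                "cpe": int(cpe),
                "events": int(events),
            })
    return dict(threads)


def hottest(rows, n=3):
    """Top-level buckets of one thread, largest share of thread time first."""
    top = [r for r in rows if not r["nested"]]
    return sorted(top, key=lambda r: r["pct_total"], reverse=True)[:n]


# A few rows copied from the example above.
SAMPLE = """\
 Thread 0 (API)
  %Tot   %Par  Cycles     CPE        NumEvent   CPE2       NumEvent2  Bucket
   0.00   0.00 28370      2837       10         0          0          APIClearRenderTarget
   0.00  41.23 11698      1169       10         0          0          |-> APIDrawWakeAllThreads
  98.72  98.72 12413773688 29957      414380     0          0          APIDraw
   1.28   1.28 161344902  64         2486370    0          0          APIGetDrawContext
"""

if __name__ == "__main__":
    rows = parse_rdtsc(SAMPLE)["Thread 0 (API)"]
    for r in hottest(rows):
        # Recompute cycles per event from the raw columns.
        per_event = r["cycles"] // r["events"] if r["events"] else 0
        print(r["bucket"], r["pct_total"], per_event)
```

To analyze a real run, pass the contents of ``rdtsc.txt`` to ``parse_rdtsc`` instead of ``SAMPLE``. The sketch only distinguishes top-level rows from nested ones; it does not reconstruct the full hierarchy.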