## Command Line Interface Tutorials

* [Command Line Inputs](#command-line-inputs)
* [Start `tfprof`](#start-tfprof)
* [Examples](#examples)
  * [Profile Python Time](#profile-python-time)
  * [Profile Graph Time](#profile-graph-time)
  * [Profile Checkpoint Value](#profile-checkpoint-value)
  * [Profile Model Parameter](#profile-model-parameter)
  * [Profile Device Placement](#profile-device-placement)
  * [Define Customized Operation Type](#define-customized-operation-type)
  * [Non-interactive Mode](#non-interactive-mode)


### Command Line Inputs

The tfprof command line tool accepts the following inputs:

<b>--profile_path:</b> A ProfileProto binary proto file.
See the QuickStart for how to generate this file.

<b>THE OLD WAY BELOW IS DEPRECATED:</b>

<b>--graph_path:</b> GraphDef proto file (optional in eager execution).
Used to build the in-memory data structure of the model. For example,
the graph.pbtxt written by tf.Supervisor can be passed to --graph_path.
You can also get a GraphDef using
tf.get_default_graph().as_graph_def(add_shapes=True) or other APIs.

<b>--run_meta_path:</b> RunMetadata proto file (optional).
Used to get the memory consumption and execution time of
each op in the model.

The following code snippet writes a RunMetadata file:

```python
run_options = config_pb2.RunOptions(trace_level=config_pb2.RunOptions.FULL_TRACE)
run_metadata = config_pb2.RunMetadata()
_ = sess.run(..., options=run_options, run_metadata=run_metadata)
# SerializeToString() returns bytes, so open the file in binary mode.
with tf.gfile.Open(os.path.join(output_dir, "run_meta"), "wb") as f:
  f.write(run_metadata.SerializeToString())
```

<b>--op_log_path:</b>
tensorflow.tfprof.OpLogProto (optional). A proto used to provide extra operation
information: 1) float operations; 2) code traces; 3) customized operation
types for the `-account_type_regexes` option.

The following code snippet writes an OpLogProto file:

```python
tf.profiler.write_op_log(graph, log_dir, op_log=None)
```

<b>--checkpoint_path:</b> TensorFlow checkpoint (optional).
It defines the `_checkpoint_variables` op type and provides the values of
checkpointed tensors.
Note: this feature is not well maintained at the moment.


### Start `tfprof`

#### Build `tfprof`

```shell
# Build the tool.
bazel build --config opt tensorflow/core/profiler:profiler

# Print help information, including detailed 'option' instructions.
bazel-bin/tensorflow/core/profiler/profiler help
```

#### Start `tfprof` Interactive Mode
```shell
# The following commands start tfprof interactive mode.
#
# Recommended:
#
# The file contains the binary string of ProfileProto.
# It contains all the needed information in one file.
bazel-bin/tensorflow/core/profiler/profiler \
    --profile_path=profile_xxx
#
# Alternatively, users can pass separate files.
#
# --graph_path contains the model architecture and tensor shapes.
# --run_meta_path contains the memory and time information.
# --op_log_path contains float operations and code traces.
# --checkpoint_path contains the model checkpoint data.
#
# Only includes model architecture, parameters and shapes.
bazel-bin/tensorflow/core/profiler/profiler \
    --graph_path=graph.pbtxt

# For profiling eager execution, users can specify only run_meta_path
# and profile the execution info of each operation.
bazel-bin/tensorflow/core/profiler/profiler \
    --run_meta_path=run_meta
#
# Additionally profile ops' memory and timing.
bazel-bin/tensorflow/core/profiler/profiler \
    --graph_path=graph.pbtxt \
    --run_meta_path=run_meta
#
# tfprof_log is used to define customized op types, float ops and code traces.
# Use tfprof_logger.write_op_log() to create tfprof_log.
bazel-bin/tensorflow/core/profiler/profiler \
    --graph_path=graph.pbtxt \
    --run_meta_path=run_meta \
    --op_log_path=tfprof_log
#
# Additionally profile checkpoint statistics and values.
# Use '-account_type_regexes _checkpoint_variables' to select
# checkpoint tensors.
bazel-bin/tensorflow/core/profiler/profiler \
    --graph_path=graph.pbtxt \
    --run_meta_path=run_meta \
    --op_log_path=tfprof_log \
    --checkpoint_path=model.ckpt
```

#### Start `tfprof` Non-interactive Mode.

```shell
# Run tfprof in one-shot mode.
bazel-bin/tensorflow/core/profiler/profiler scope \
    --graph_path=graph.pbtxt \
    --max_depth=3
```

#### Press enter to show the default options

Refer to [Options](options.md) for option instructions.

```shell
tfprof>
-max_depth                  4
-min_bytes                  0
-min_micros                 0
-min_params                 0
-min_float_ops              0
-min_occurrence             0
-step                       -1
-order_by                   name
-account_type_regexes       Variable,VariableV2
-start_name_regexes         .*
-trim_name_regexes
-show_name_regexes          .*
-hide_name_regexes          IsVariableInitialized_[0-9]+,save\/.*,^zeros[0-9_]*
-account_displayed_op_only  false
# supported select fields. Availability depends on --[run_meta|checkpoint|op_log]_path.
# [bytes|micros|params|float_ops|occurrence|tensor_value|device|op_types]
-select                     params
# format: output_type:key=value,key=value...
# output_types: stdout (default), timeline, file.
# key=value pairs:
#   1. timeline: outfile=<filename>
#   2. file: outfile=<filename>
#   3. stdout: None.
# E.g. timeline:outfile=/tmp/timeline.json
-output
```

### Examples

#### Profile Python Time
```shell
# Requires --graph_path --op_log_path
tfprof> code -max_depth 1000 -show_name_regexes .*model_analyzer.*py.* -select micros -account_type_regexes .* -order_by micros
_TFProfRoot (0us/22.44ms)
  model_analyzer_test.py:149:run_filename_as_m...:none (0us/22.44ms)
    model_analyzer_test.py:33:_run_code_in_main:none (0us/22.44ms)
      model_analyzer_test.py:208:<module>:test.main() (0us/22.44ms)
        model_analyzer_test.py:132:testComplexCodeView:x = lib.BuildFull... (0us/22.44ms)
          model_analyzer_testlib.py:63:BuildFullModel:return sgd_op.min... (0us/21.83ms)
          model_analyzer_testlib.py:58:BuildFullModel:cell, array_ops.c... (0us/333us)
          model_analyzer_testlib.py:54:BuildFullModel:seq.append(array_... (0us/254us)
            model_analyzer_testlib.py:42:BuildSmallModel:x = nn_ops.conv2d... (0us/134us)
            model_analyzer_testlib.py:46:BuildSmallModel:initializer=init_... (0us/40us)
            ...
          model_analyzer_testlib.py:61:BuildFullModel:loss = nn_ops.l2_... (0us/28us)
          model_analyzer_testlib.py:60:BuildFullModel:target = array_op... (0us/0us)
        model_analyzer_test.py:134:testComplexCodeView:sess.run(variable... (0us/0us)
```

Set `-output timeline:outfile=<filename>` to generate a timeline file instead of
printing to stdout. The timeline file can be loaded in Chrome at chrome://tracing.
<left>
![CodeTimeline](code_timeline.png)
</left>

#### Profile Graph Time

```shell
# I defined an op named cost to calculate the loss. I want to know which of
# the ops it depends on take a long time to run.

# Requires --graph_path, --run_meta_path.
tfprof> graph -start_name_regexes cost.* -max_depth 100 -min_micros 10000 -select micros -account_type_regexes .*
_TFProfRoot (0us/3.61sec)
  init/init_conv/Conv2D (11.75ms/3.10sec)
    random_shuffle_queue_DequeueMany (3.09sec/3.09sec)
  unit_1_0/sub2/conv2/Conv2D (74.14ms/3.19sec)
  unit_1_3/sub2/conv2/Conv2D (60.75ms/3.34sec)
  unit_2_4/sub2/conv2/Conv2D (73.58ms/3.54sec)
  unit_3_3/sub2/conv2/Conv2D (10.26ms/3.60sec)
```
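
Each node above is annotated `(self_time/total_time)`. To post-process such
listings, the duration strings can be normalized to microseconds; a minimal
stdlib sketch, with the unit suffixes (`us`, `ms`, `sec`) assumed from the
output format shown above:

```python
import re

# Unit suffixes as they appear in the tfprof listing above.
_UNITS = {"us": 1, "ms": 1000, "sec": 1000000}

def to_micros(duration):
    """Convert a duration string like '11.75ms' or '3.10sec' to microseconds."""
    match = re.fullmatch(r"([\d.]+)(us|ms|sec)", duration)
    if not match:
        raise ValueError("unrecognized duration: %r" % duration)
    value, unit = match.groups()
    return int(round(float(value) * _UNITS[unit]))

def split_timing(annotation):
    """Split a '(self/total)' annotation into microsecond values."""
    self_part, total_part = annotation.strip("()").split("/")
    return to_micros(self_part), to_micros(total_part)

print(split_timing("(11.75ms/3.10sec)"))  # (11750, 3100000)
```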

#### Profile Checkpoint Value
```shell
# Requires --graph_path, --checkpoint_path.
tfprof> scope -show_name_regexes unit_1_0.*gamma -select tensor_value -max_depth 5
_TFProfRoot ()
  unit_1_0/shared_activation/init_bn/gamma ()
[1.80 2.10 2.06 1.91 2.26 1.86 1.81 1.37 1.78 1.85 1.96 1.54 2.04 2.34 2.22 1.99 ],
  unit_1_0/sub2/bn2/gamma ()
[1.57 1.83 1.30 1.25 1.59 1.14 1.26 0.82 1.19 1.10 1.48 1.01 0.82 1.23 1.21 1.14 ],
```

#### Profile Model Parameter

```shell
# Show the number of parameters of all `tf.trainable_variables()` in the model.
# Requires --graph_path --op_log_path.
# Store the option for future commands.
tfprof> set -account_type_regexes _trainable_variables
tfprof> scope -max_depth 4 -select params
_TFProfRoot (--/464.15k params)
  init/init_conv/DW (3x3x3x16, 432/432 params)
  pool_logit/DW (64x10, 640/640 params)
  pool_logit/biases (10, 10/10 params)
  unit_last/final_bn/beta (64, 64/64 params)
  unit_last/final_bn/gamma (64, 64/64 params)
```
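
The self params count for each variable is just the product of its shape
dimensions; a quick stdlib check against the shapes printed above:

```python
import math

def param_count(shape):
    """Multiply the dimensions of a tfprof shape string such as '3x3x3x16'."""
    return math.prod(int(dim) for dim in shape.split("x"))

print(param_count("3x3x3x16"))  # 432, matching init/init_conv/DW
print(param_count("64x10"))     # 640, matching pool_logit/DW
```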

Where does `_trainable_variables` come from? It is a customized operation type
defined through the OpLogProto file.
Users can [Define Customized Operation Type](#define-customized-operation-type)
themselves.

<b>The following example shows the importance of defining a customized
operation type.</b> In this example, extra `Variables` are created implicitly
by TensorFlow, and `/Momentum` is appended to their names. They shouldn't be
included in your model capacity calculation.

```shell
tfprof> scope -account_type_regexes VariableV2 -max_depth 4 -select params
_TFProfRoot (--/930.58k params)
  global_step (1/1 params)
  init/init_conv/DW (3x3x3x16, 432/864 params)
  pool_logit/DW (64x10, 640/1.28k params)
    pool_logit/DW/Momentum (64x10, 640/640 params)
  pool_logit/biases (10, 10/20 params)
    pool_logit/biases/Momentum (10, 10/10 params)
  unit_last/final_bn/beta (64, 64/128 params)
  unit_last/final_bn/gamma (64, 64/128 params)
  unit_last/final_bn/moving_mean (64, 64/64 params)
  unit_last/final_bn/moving_variance (64, 64/64 params)
```

#### Profile Device Placement

In this tutorial, a model is split across several GPUs on workers and
several parameter servers.

In tfprof, 'device' is an op_type. For example, if op1 and op2 are placed on
gpu:0, they share an operation type.

```shell
bazel-bin/tensorflow/core/profiler/profiler \
  --graph_path=/tmp/graph.pbtxt  \
  --run_meta_path=/tmp/run_meta

# It looks like ps task 1 is holding about twice as many parameters as task 0.
tfprof> scope -select device,params -account_type_regexes .*ps.*task:0.* -max_depth 1
_TFProfRoot (--/25.81m params)
tfprof> scope -select device,params -account_type_regexes .*ps.*task:1.* -max_depth 1
_TFProfRoot (--/58.84m params)
```
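
tfprof abbreviates large counts with `k`/`m` suffixes. A small stdlib sketch
(suffix meanings inferred from the output above) that parses the two totals
and confirms the roughly 2x imbalance:

```python
# Suffix multipliers as inferred from tfprof's printed counts.
_SUFFIXES = {"k": 1e3, "m": 1e6}

def parse_count(text):
    """Parse a tfprof-style count such as '25.81m' or '464.15k'."""
    if text[-1] in _SUFFIXES:
        return float(text[:-1]) * _SUFFIXES[text[-1]]
    return float(text)

task0 = parse_count("25.81m")
task1 = parse_count("58.84m")
print(round(task1 / task0, 2))  # 2.28: task 1 holds roughly twice task 0's parameters
```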

#### Define Customized Operation Type

First, in Python code, create an `OpLogProto` proto and add op type
information to it:

```python
op_log = tfprof_log_pb2.OpLogProto()
entry = op_log.log_entries.add()
entry.name = 'pool_logit/DW'
entry.types.append('pool_logit')
entry = op_log.log_entries.add()
entry.name = 'pool_logit/biases'
entry.types.append('pool_logit')
```

Second, call write_op_log to write the OpLogProto proto:

```python
tf.profiler.write_op_log(
    sess.graph, '/tmp/my_op_log_dir', op_log)

# Get run-time shape information in order to fill shapes and get flops.
tf.profiler.write_op_log(
    sess.graph, '/tmp/my_op_log_dir', op_log, run_meta)
```

Third, when starting the tfprof tool, specify
`--op_log_path=/tmp/my_op_log_dir/op_log`.

```shell
tfprof> scope -account_type_regexes pool_logit -max_depth 4 -select params
_TFProfRoot (--/650 params)
  pool_logit/DW (64x10, 640/640 params)
  pool_logit/biases (10, 10/10 params)
```

Note that `tf.profiler.write_op_log(...)` automatically
assigns all `Variables` inside `tf.trainable_variables()` a customized
operation type: `_trainable_variables`.


#### Non-interactive Mode
Run tfprof in one-shot mode and dump the result to a file.

```shell
# By default, output goes to stdout. Use the -output option to change the output type.
tfprof scope --graph_path=graph.pbtxt  \
             --max_depth=3 \
             --output="file:outfile=/tmp/dump"
Reading Files...
Parsing GraphDef...
Preparing Views...

cat /tmp/dump
_TFProfRoot (--/930.58k params)
  global_step (0/0 params)
  pool_logit/DW (64x10, 640/1.28k params)
  pool_logit/biases (10, 10/20 params)
```
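
Since the dump file uses the same fixed text layout as stdout, it can be
post-processed with a short script. A minimal sketch, assuming the
`name (shape, self/total params)` layout shown above (the root line's `--`
self-count is deliberately skipped):

```python
import re

# Matches lines like "  pool_logit/DW (64x10, 640/1.28k params)" or
# "  global_step (0/0 params)"; layout assumed from the dump above.
_LINE_RE = re.compile(r"^\s*(\S+) \((?:([\dx]+), )?([\d.]+k?)/([\d.]+k?) params\)$")

def parse_count(text):
    """Expand a 'k'-suffixed count such as '1.28k'."""
    return float(text[:-1]) * 1000 if text.endswith("k") else float(text)

def parse_dump(dump):
    """Return (name, shape, self_params, total_params) tuples from a dump."""
    rows = []
    for line in dump.splitlines():
        match = _LINE_RE.match(line)
        if match:
            name, shape, self_params, total_params = match.groups()
            rows.append((name, shape, parse_count(self_params),
                         parse_count(total_params)))
    return rows

dump = """\
_TFProfRoot (--/930.58k params)
  global_step (0/0 params)
  pool_logit/DW (64x10, 640/1.28k params)
  pool_logit/biases (10, 10/20 params)"""
print(parse_dump(dump))
```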
    337