# Output pipelines in gemmlowp

In gemmlowp, the "output pipeline" is the process that takes a final `int32`
accumulator value (the output of the compute/kernel stage) and processes it to
obtain the final value (typically a `uint8` value), which it writes to the
destination matrix.

Gemmlowp is generic in which arithmetic transformations take place in the
output pipeline, so that different users can implement different quantization
paradigms. See [low-precision.md](low-precision.md) and
[quantization.md](quantization.md).

Besides implementing a quantization paradigm, output pipelines are also useful
for implementing fused operations, where a matrix multiplication feeds into
other operations applied to its result without additional array traversals.
For instance, when implementing neural network inference, one might have a
convolutional layer with a bias-addition and an activation. One then wants to
feed the result of the matrix multiplication implementing the convolution
itself directly into the bias-addition and activation function. gemmlowp's
output pipelines allow this: the bias-addition and activation function are
just additional stages in the output pipeline.

## Usage

The gemmlowp entry point that accepts an arbitrary output pipeline is
`GemmWithOutputPipeline` in [public/gemmlowp.h](../public/gemmlowp.h).

The output pipeline is specified as a `std::tuple` of "output stages", each of
which defines an elementary arithmetic transformation.

All available output stages are defined in
[public/output_stages.h](../public/output_stages.h).
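
The idea of a pipeline expressed as a `std::tuple` of stages can be
illustrated with a minimal, self-contained sketch. It deliberately does not use
gemmlowp's own stage types or `GemmWithOutputPipeline`: the stage structs,
`PipelineRunner`, `RunPipeline` and `ProcessAccumulator` are hypothetical names
invented for this illustration, and the bias, shift and clamp values are
arbitrary. It only shows how one `int32` accumulator might flow through a
bias-addition, a scale-down, an activation-like clamp and a final cast to
`uint8`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <tuple>

// NOTE: every type and function below is a hypothetical stand-in written for
// this sketch. None of them are gemmlowp's own output stages.

// Stage: add a constant bias to the int32 accumulator.
struct AddBiasStage {
  std::int32_t bias;
  std::int32_t operator()(std::int32_t x) const { return x + bias; }
};

// Stage: divide by 2^shift, rounding to nearest (ties round up).
// Right-shifting a negative value is arithmetic on mainstream compilers
// (it is implementation-defined before C++20).
struct RoundingRightShiftStage {
  int shift;
  std::int32_t operator()(std::int32_t x) const {
    assert(shift > 0 && shift < 31);
    return (x + (std::int32_t{1} << (shift - 1))) >> shift;
  }
};

// Stage: clamp to [lo, hi], standing in for an activation function.
struct ClampStage {
  std::int32_t lo;
  std::int32_t hi;
  std::int32_t operator()(std::int32_t x) const {
    return std::min(std::max(x, lo), hi);
  }
};

// Stage: saturate to [0, 255] and narrow to uint8.
struct SaturatingCastToUint8Stage {
  std::uint8_t operator()(std::int32_t x) const {
    const std::int32_t clamped =
        std::min<std::int32_t>(std::max<std::int32_t>(x, 0), 255);
    return static_cast<std::uint8_t>(clamped);
  }
};

// Applies stage I, then recurses into stage I + 1, until I == N.
template <std::size_t I, std::size_t N>
struct PipelineRunner {
  template <typename Tuple, typename In, typename Out>
  static void Run(const Tuple& stages, In value, Out* out) {
    PipelineRunner<I + 1, N>::Run(stages, std::get<I>(stages)(value), out);
  }
};

// Terminal case: all stages have been applied, so store the result.
template <std::size_t N>
struct PipelineRunner<N, N> {
  template <typename Tuple, typename In, typename Out>
  static void Run(const Tuple&, In value, Out* out) { *out = value; }
};

// Runs one accumulator value through every stage of the tuple, in order.
template <typename Out, typename... Stages>
Out RunPipeline(const std::tuple<Stages...>& stages,
                std::int32_t accumulator) {
  Out result{};
  PipelineRunner<0, sizeof...(Stages)>::Run(stages, accumulator, &result);
  return result;
}

// Example pipeline: bias-addition, scale-down, clamp, cast to uint8.
inline std::uint8_t ProcessAccumulator(std::int32_t accumulator) {
  const auto pipeline = std::make_tuple(
      AddBiasStage{24}, RoundingRightShiftStage{3}, ClampStage{0, 200},
      SaturatingCastToUint8Stage{});
  return RunPipeline<std::uint8_t>(pipeline, accumulator);
}
```

With these arbitrary parameters, `ProcessAccumulator(1000)` adds the bias to
get 1024, scales that down to 128, passes it through the clamp unchanged and
returns 128. Because the stages are ordinary tuple elements, adding or
reordering a transformation only changes the `std::make_tuple` call; the
traversal code stays the same.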

## Example usage

The best place to find examples of using various output pipelines is the unit
test,

```
test/test.cc
```

specifically in this function:

```
TestOutputStages
```

Separately, a self-contained example showing how to use gemmlowp to compute a
quantized matrix multiplication with a sound quantization paradigm is here:

[doc/quantization_example.cc](quantization_example.cc)