Home | History | Annotate | only in /external/eigen/unsupported/Eigen/CXX11/src/Tensor
Up to higher level directory
NameDateSize
README.md06-Dec-201758.7K
Tensor.h06-Dec-201720.1K
TensorArgMax.h06-Dec-201710.8K
TensorAssign.h06-Dec-20177.5K
TensorBase.h06-Dec-201748.3K
TensorBroadcasting.h06-Dec-201714K
TensorChipping.h06-Dec-201714.4K
TensorConcatenation.h06-Dec-201714.3K
TensorContraction.h06-Dec-201726.1K
TensorContractionBlocking.h06-Dec-20171.6K
TensorContractionCuda.h06-Dec-201760.6K
TensorContractionMapper.h06-Dec-201718.1K
TensorContractionThreadPool.h06-Dec-201743.5K
TensorConversion.h06-Dec-201710.7K
TensorConvolution.h06-Dec-201746.5K
TensorCostModel.h06-Dec-20178.2K
TensorCustomOp.h06-Dec-201711.2K
TensorDevice.h06-Dec-20172.5K
TensorDeviceCuda.h06-Dec-201710.8K
TensorDeviceDefault.h06-Dec-20172.4K
TensorDeviceSycl.h06-Dec-20175.1K
TensorDeviceThreadPool.h06-Dec-20179.6K
TensorDimensionList.h06-Dec-20177.5K
TensorDimensions.h06-Dec-201715.2K
TensorEvalTo.h06-Dec-20176.4K
TensorEvaluator.h06-Dec-201724.7K
TensorExecutor.h06-Dec-201710K
TensorExpr.h06-Dec-201714.3K
TensorFFT.h06-Dec-201722.8K
TensorFixedSize.h06-Dec-201714.6K
TensorForcedEval.h06-Dec-20176.4K
TensorForwardDeclarations.h06-Dec-20175.3K
TensorFunctors.h06-Dec-201714.3K
TensorGenerator.h06-Dec-20176.2K
TensorGlobalFunctions.h06-Dec-20171.3K
TensorImagePatch.h06-Dec-201722.6K
TensorIndexList.h06-Dec-201725.2K
TensorInflation.h06-Dec-20178.2K
TensorInitializer.h06-Dec-20172.7K
TensorIntDiv.h06-Dec-20178.3K
TensorIO.h06-Dec-20172.5K
TensorLayoutSwap.h06-Dec-20177.2K
TensorMacros.h06-Dec-20171.3K
TensorMap.h06-Dec-201713.1K
TensorMeta.h06-Dec-20175.2K
TensorMorphing.h06-Dec-201733.5K
TensorPadding.h06-Dec-201715.4K
TensorPatch.h06-Dec-201710.4K
TensorRandom.h06-Dec-20179.1K
TensorReduction.h06-Dec-201733.1K
TensorReductionCuda.h06-Dec-201729.6K
TensorReductionSycl.h06-Dec-201713.7K
TensorRef.h06-Dec-201713.3K
TensorReverse.h06-Dec-201710.3K
TensorScan.h06-Dec-20179.7K
TensorShuffling.h06-Dec-20179.3K
TensorStorage.h06-Dec-20175K
TensorStriding.h06-Dec-201712.9K
TensorSycl.h06-Dec-20172.4K
TensorSyclConvertToDeviceExpression.h06-Dec-20174.9K
TensorSyclExprConstructor.h06-Dec-201711.3K
TensorSyclExtractAccessor.h06-Dec-201712.2K
TensorSyclExtractFunctors.h06-Dec-20179.5K
TensorSyclLeafCount.h06-Dec-20175.2K
TensorSyclPlaceHolderExpr.h06-Dec-20176.5K
TensorSyclRun.h06-Dec-20173K
TensorSyclTuple.h06-Dec-20179.4K
TensorTraits.h06-Dec-20179.2K
TensorUInt128.h06-Dec-20177.3K
TensorVolumePatch.h06-Dec-201727.5K

README.md

      1 # Eigen Tensors
      2 
      3 Tensors are multidimensional arrays of elements. Elements are typically scalars,
      4 but more complex types such as strings are also supported.
      5 
      6 [TOC]
      7 
      8 ## Tensor Classes
      9 
     10 You can manipulate a tensor with one of the following classes.  They all are in
     11 the namespace ```::Eigen.```
     12 
     13 
     14 ### Class Tensor<data_type, rank>
     15 
     16 This is the class to use to create a tensor and allocate memory for it.  The
     17 class is templatized with the tensor datatype, such as float or int, and the
     18 tensor rank.  The rank is the number of dimensions, for example rank 2 is a
     19 matrix.
     20 
     21 Tensors of this class are resizable.  For example, if you assign a tensor of a
     22 different size to a Tensor, that tensor is resized to match its new value.
     23 
     24 #### Constructor Tensor<data_type, rank>(size0, size1, ...)
     25 
     26 Constructor for a Tensor.  The constructor must be passed ```rank``` integers
     27 indicating the sizes of the instance along each of the the ```rank```
     28 dimensions.
     29 
     30     // Create a tensor of rank 3 of sizes 2, 3, 4.  This tensor owns
     31     // memory to hold 24 floating point values (24 = 2 x 3 x 4).
     32     Tensor<float, 3> t_3d(2, 3, 4);
     33 
     34     // Resize t_3d by assigning a tensor of different sizes, but same rank.
     35     t_3d = Tensor<float, 3>(3, 4, 3);
     36 
     37 #### Constructor Tensor<data_type, rank>(size_array)
     38 
     39 Constructor where the sizes for the constructor are specified as an array of
     40 values instead of an explicitly list of parameters.  The array type to use is
     41 ```Eigen::array<Eigen::Index>```.  The array can be constructed automatically
     42 from an initializer list.
     43 
     44     // Create a tensor of strings of rank 2 with sizes 5, 7.
     45     Tensor<string, 2> t_2d({5, 7});
     46 
     47 
     48 ### Class TensorFixedSize<data_type, Sizes<size0, size1, ...>>
     49 
     50 Class to use for tensors of fixed size, where the size is known at compile
     51 time.  Fixed sized tensors can provide very fast computations because all their
     52 dimensions are known by the compiler.  FixedSize tensors are not resizable.
     53 
     54 If the total number of elements in a fixed size tensor is small enough the
     55 tensor data is held onto the stack and does not cause heap allocation and free.
     56 
     57     // Create a 4 x 3 tensor of floats.
     58     TensorFixedSize<float, Sizes<4, 3>> t_4x3;
     59 
     60 ### Class TensorMap<Tensor<data_type, rank>>
     61 
     62 This is the class to use to create a tensor on top of memory allocated and
     63 owned by another part of your code.  It allows to view any piece of allocated
     64 memory as a Tensor.  Instances of this class do not own the memory where the
     65 data are stored.
     66 
     67 A TensorMap is not resizable because it does not own the memory where its data
     68 are stored.
     69 
     70 #### Constructor TensorMap<Tensor<data_type, rank>>(data, size0, size1, ...)
     71 
     72 Constructor for a Tensor.  The constructor must be passed a pointer to the
     73 storage for the data, and "rank" size attributes.  The storage has to be
     74 large enough to hold all the data.
     75 
     76     // Map a tensor of ints on top of stack-allocated storage.
     77     int storage[128];  // 2 x 4 x 2 x 8 = 128
     78     TensorMap<Tensor<int, 4>> t_4d(storage, 2, 4, 2, 8);
     79 
     80     // The same storage can be viewed as a different tensor.
     81     // You can also pass the sizes as an array.
     82     TensorMap<Tensor<int, 2>> t_2d(storage, 16, 8);
     83 
     84     // You can also map fixed-size tensors.  Here we get a 1d view of
     85     // the 2d fixed-size tensor.
     86     Tensor<float, Sizes<4, 5>> t_4x3;
     87     TensorMap<Tensor<float, 1>> t_12(t_4x3, 12);
     88 
     89 
     90 #### Class TensorRef
     91 
     92 See Assigning to a TensorRef below.
     93 
     94 ## Accessing Tensor Elements
     95 
     96 #### <data_type> tensor(index0, index1...)
     97 
     98 Return the element at position ```(index0, index1...)``` in tensor
     99 ```tensor```.  You must pass as many parameters as the rank of ```tensor```.
    100 The expression can be used as an l-value to set the value of the element at the
    101 specified position.  The value returned is of the datatype of the tensor.
    102 
    103     // Set the value of the element at position (0, 1, 0);
    104     Tensor<float, 3> t_3d(2, 3, 4);
    105     t_3d(0, 1, 0) = 12.0f;
    106 
    107     // Initialize all elements to random values.
    108     for (int i = 0; i < 2; ++i) {
    109       for (int j = 0; j < 3; ++j) {
    110         for (int k = 0; k < 4; ++k) {
    111           t_3d(i, j, k) = ...some random value...;
    112         }
    113       }
    114     }
    115 
    116     // Print elements of a tensor.
    117     for (int i = 0; i < 2; ++i) {
    118       LOG(INFO) << t_3d(i, 0, 0);
    119     }
    120 
    121 
    122 ## TensorLayout
    123 
    124 The tensor library supports 2 layouts: ```ColMajor``` (the default) and
    125 ```RowMajor```.  Only the default column major layout is currently fully
    126 supported, and it is therefore not recommended to attempt to use the row major
    127 layout at the moment.
    128 
    129 The layout of a tensor is optionally specified as part of its type. If not
    130 specified explicitly column major is assumed.
    131 
    132     Tensor<float, 3, ColMajor> col_major;  // equivalent to Tensor<float, 3>
    133     TensorMap<Tensor<float, 3, RowMajor> > row_major(data, ...);
    134 
    135 All the arguments to an expression must use the same layout. Attempting to mix
    136 different layouts will result in a compilation error.
    137 
    138 It is possible to change the layout of a tensor or an expression using the
    139 ```swap_layout()``` method.  Note that this will also reverse the order of the
    140 dimensions.
    141 
    142     Tensor<float, 2, ColMajor> col_major(2, 4);
    143     Tensor<float, 2, RowMajor> row_major(2, 4);
    144 
    145     Tensor<float, 2> col_major_result = col_major;  // ok, layouts match
    146     Tensor<float, 2> col_major_result = row_major;  // will not compile
    147 
    148     // Simple layout swap
    149     col_major_result = row_major.swap_layout();
    150     eigen_assert(col_major_result.dimension(0) == 4);
    151     eigen_assert(col_major_result.dimension(1) == 2);
    152 
    153     // Swap the layout and preserve the order of the dimensions
    154     array<int, 2> shuffle(1, 0);
    155     col_major_result = row_major.swap_layout().shuffle(shuffle);
    156     eigen_assert(col_major_result.dimension(0) == 2);
    157     eigen_assert(col_major_result.dimension(1) == 4);
    158 
    159 
    160 ## Tensor Operations
    161 
    162 The Eigen Tensor library provides a vast library of operations on Tensors:
    163 numerical operations such as addition and multiplication, geometry operations
    164 such as slicing and shuffling, etc.  These operations are available as methods
    165 of the Tensor classes, and in some cases as operator overloads.  For example
    166 the following code computes the elementwise addition of two tensors:
    167 
    168     Tensor<float, 3> t1(2, 3, 4);
    169     ...set some values in t1...
    170     Tensor<float, 3> t2(2, 3, 4);
    171     ...set some values in t2...
    172     // Set t3 to the element wise sum of t1 and t2
    173     Tensor<float, 3> t3 = t1 + t2;
    174 
    175 While the code above looks easy enough, it is important to understand that the
    176 expression ```t1 + t2``` is not actually adding the values of the tensors.  The
    177 expression instead constructs a "tensor operator" object of the class
    178 TensorCwiseBinaryOp<scalar_sum>, which has references to the tensors
    179 ```t1``` and ```t2```.  This is a small C++ object that knows how to add
    180 ```t1``` and ```t2```.  It is only when the value of the expression is assigned
    181 to the tensor ```t3``` that the addition is actually performed.  Technically,
    182 this happens through the overloading of ```operator=()``` in the Tensor class.
    183 
    184 This mechanism for computing tensor expressions allows for lazy evaluation and
    185 optimizations which are what make the tensor library very fast.
    186 
    187 Of course, the tensor operators do nest, and the expression ```t1 + t2 *
    188 0.3f``` is actually represented with the (approximate) tree of operators:
    189 
    190     TensorCwiseBinaryOp<scalar_sum>(t1, TensorCwiseUnaryOp<scalar_mul>(t2, 0.3f))
    191 
    192 
    193 ### Tensor Operations and C++ "auto"
    194 
    195 Because Tensor operations create tensor operators, the C++ ```auto``` keyword
    196 does not have its intuitive meaning.  Consider these 2 lines of code:
    197 
    198     Tensor<float, 3> t3 = t1 + t2;
    199     auto t4 = t1 + t2;
    200 
    201 In the first line we allocate the tensor ```t3``` and it will contain the
    202 result of the addition of ```t1``` and ```t2```.  In the second line, ```t4```
    203 is actually the tree of tensor operators that will compute the addition of
    204 ```t1``` and ```t2```.  In fact, ```t4``` is *not* a tensor and you cannot get
    205 the values of its elements:
    206 
    207     Tensor<float, 3> t3 = t1 + t2;
    208     cout << t3(0, 0, 0);  // OK prints the value of t1(0, 0, 0) + t2(0, 0, 0)
    209 
    210     auto t4 = t1 + t2;
    211     cout << t4(0, 0, 0);  // Compilation error!
    212 
    213 When you use ```auto``` you do not get a Tensor as a result but instead a
    214 non-evaluated expression.  So only use ```auto``` to delay evaluation.
    215 
    216 Unfortunately, there is no single underlying concrete type for holding
    217 non-evaluated expressions, hence you have to use auto in the case when you do
    218 want to hold non-evaluated expressions.
    219 
    220 When you need the results of set of tensor computations you have to assign the
    221 result to a Tensor that will be capable of holding onto them.  This can be
    222 either a normal Tensor, a fixed size Tensor, or a TensorMap on an existing
    223 piece of memory.  All the following will work:
    224 
    225     auto t4 = t1 + t2;
    226 
    227     Tensor<float, 3> result = t4;  // Could also be: result(t4);
    228     cout << result(0, 0, 0);
    229 
    230     TensorMap<float, 4> result(<a float* with enough space>, <size0>, ...) = t4;
    231     cout << result(0, 0, 0);
    232 
    233     TensorFixedSize<float, Sizes<size0, ...>> result = t4;
    234     cout << result(0, 0, 0);
    235 
    236 Until you need the results, you can keep the operation around, and even reuse
    237 it for additional operations.  As long as you keep the expression as an
    238 operation, no computation is performed.
    239 
    240     // One way to compute exp((t1 + t2) * 0.2f);
    241     auto t3 = t1 + t2;
    242     auto t4 = t3 * 0.2f;
    243     auto t5 = t4.exp();
    244     Tensor<float, 3> result = t5;
    245 
    246     // Another way, exactly as efficient as the previous one:
    247     Tensor<float, 3> result = ((t1 + t2) * 0.2f).exp();
    248 
    249 ### Controlling When Expression are Evaluated
    250 
    251 There are several ways to control when expressions are evaluated:
    252 
    253 *   Assignment to a Tensor, TensorFixedSize, or TensorMap.
    254 *   Use of the eval() method.
    255 *   Assignment to a TensorRef.
    256 
    257 #### Assigning to a Tensor, TensorFixedSize, or TensorMap.
    258 
    259 The most common way to evaluate an expression is to assign it to a Tensor.  In
    260 the example below, the ```auto``` declarations make the intermediate values
    261 "Operations", not Tensors, and do not cause the expressions to be evaluated.
    262 The assignment to the Tensor ```result``` causes the evaluation of all the
    263 operations.
    264 
    265     auto t3 = t1 + t2;             // t3 is an Operation.
    266     auto t4 = t3 * 0.2f;           // t4 is an Operation.
    267     auto t5 = t4.exp();            // t5 is an Operation.
    268     Tensor<float, 3> result = t5;  // The operations are evaluated.
    269 
    270 If you know the ranks and sizes of the Operation value you can assign the
    271 Operation to a TensorFixedSize instead of a Tensor, which is a bit more
    272 efficient.
    273 
    274     // We know that the result is a 4x4x2 tensor!
    275     TensorFixedSize<float, 4, 4, 2> result = t5;
    276 
    277 Simiarly, assigning an expression to a TensorMap causes its evaluation.  Like
    278 tensors of type TensorFixedSize, TensorMaps cannot be resized so they have to
    279 have the rank and sizes of the expression that are assigned to them.
    280 
    281 #### Calling eval().
    282 
    283 When you compute large composite expressions, you sometimes want to tell Eigen
    284 that an intermediate value in the expression tree is worth evaluating ahead of
    285 time.  This is done by inserting a call to the ```eval()``` method of the
    286 expression Operation.
    287 
    288     // The previous example could have been written:
    289     Tensor<float, 3> result = ((t1 + t2) * 0.2f).exp();
    290 
    291     // If you want to compute (t1 + t2) once ahead of time you can write:
    292     Tensor<float, 3> result = ((t1 + t2).eval() * 0.2f).exp();
    293 
    294 Semantically, calling ```eval()``` is equivalent to materializing the value of
    295 the expression in a temporary Tensor of the right size.  The code above in
    296 effect does:
    297 
    298     // .eval() knows the size!
    299     TensorFixedSize<float, 4, 4, 2> tmp = t1 + t2;
    300     Tensor<float, 3> result = (tmp * 0.2f).exp();
    301 
    302 Note that the return value of ```eval()``` is itself an Operation, so the
    303 following code does not do what you may think:
    304 
    305     // Here t3 is an evaluation Operation.  t3 has not been evaluated yet.
    306     auto t3 = (t1 + t2).eval();
    307 
    308     // You can use t3 in another expression.  Still no evaluation.
    309     auto t4 = (t3 * 0.2f).exp();
    310 
    311     // The value is evaluated when you assign the Operation to a Tensor, using
    312     // an intermediate tensor to represent t3.x
    313     Tensor<float, 3> result = t4;
    314 
    315 While in the examples above calling ```eval()``` does not make a difference in
    316 performance, in other cases it can make a huge difference.  In the expression
    317 below the ```broadcast()``` expression causes the ```X.maximum()``` expression
    318 to be evaluated many times:
    319 
    320     Tensor<...> X ...;
    321     Tensor<...> Y = ((X - X.maximum(depth_dim).reshape(dims2d).broadcast(bcast))
    322                      * beta).exp();
    323 
    324 Inserting a call to ```eval()``` between the ```maximum()``` and
    325 ```reshape()``` calls guarantees that maximum() is only computed once and
    326 greatly speeds-up execution:
    327 
    328     Tensor<...> Y =
    329       ((X - X.maximum(depth_dim).eval().reshape(dims2d).broadcast(bcast))
    330         * beta).exp();
    331 
    332 In the other example below, the tensor ```Y``` is both used in the expression
    333 and its assignment.  This is an aliasing problem and if the evaluation is not
    334 done in the right order Y will be updated incrementally during the evaluation
    335 resulting in bogus results:
    336 
    337      Tensor<...> Y ...;
    338      Y = Y / (Y.sum(depth_dim).reshape(dims2d).broadcast(bcast));
    339 
    340 Inserting a call to ```eval()``` between the ```sum()``` and ```reshape()```
    341 expressions ensures that the sum is computed before any updates to ```Y``` are
    342 done.
    343 
    344      Y = Y / (Y.sum(depth_dim).eval().reshape(dims2d).broadcast(bcast));
    345 
    346 Note that an eval around the full right hand side expression is not needed
    347 because the generated has to compute the i-th value of the right hand side
    348 before assigning it to the left hand side.
    349 
    350 However, if you were assigning the expression value to a shuffle of ```Y```
    351 then you would need to force an eval for correctness by adding an ```eval()```
    352 call for the right hand side:
    353 
    354      Y.shuffle(...) =
    355         (Y / (Y.sum(depth_dim).eval().reshape(dims2d).broadcast(bcast))).eval();
    356 
    357 
    358 #### Assigning to a TensorRef.
    359 
    360 If you need to access only a few elements from the value of an expression you
    361 can avoid materializing the value in a full tensor by using a TensorRef.
    362 
    363 A TensorRef is a small wrapper class for any Eigen Operation.  It provides
    364 overloads for the ```()``` operator that let you access individual values in
    365 the expression.  TensorRef is convenient, because the Operation themselves do
    366 not provide a way to access individual elements.
    367 
    368     // Create a TensorRef for the expression.  The expression is not
    369     // evaluated yet.
    370     TensorRef<Tensor<float, 3> > ref = ((t1 + t2) * 0.2f).exp();
    371 
    372     // Use "ref" to access individual elements.  The expression is evaluated
    373     // on the fly.
    374     float at_0 = ref(0, 0, 0);
    375     cout << ref(0, 1, 0);
    376 
    377 Only use TensorRef when you need a subset of the values of the expression.
    378 TensorRef only computes the values you access.  However note that if you are
    379 going to access all the values it will be much faster to materialize the
    380 results in a Tensor first.
    381 
    382 In some cases, if the full Tensor result would be very large, you may save
    383 memory by accessing it as a TensorRef.  But not always.  So don't count on it.
    384 
    385 
    386 ### Controlling How Expressions Are Evaluated
    387 
    388 The tensor library provides several implementations of the various operations
    389 such as contractions and convolutions.  The implementations are optimized for
    390 different environments: single threaded on CPU, multi threaded on CPU, or on a
    391 GPU using cuda.  Additional implementations may be added later.
    392 
    393 You can choose which implementation to use with the ```device()``` call.  If
    394 you do not choose an implementation explicitly the default implementation that
    395 uses a single thread on the CPU is used.
    396 
    397 The default implementation has been optimized for recent Intel CPUs, taking
    398 advantage of SSE, AVX, and FMA instructions.  Work is ongoing to tune the
    399 library on ARM CPUs.  Note that you need to pass compiler-dependent flags
    400 to enable the use of SSE, AVX, and other instructions.
    401 
    402 For example, the following code adds two tensors using the default
    403 single-threaded CPU implementation:
    404 
    405     Tensor<float, 2> a(30, 40);
    406     Tensor<float, 2> b(30, 40);
    407     Tensor<float, 2> c = a + b;
    408 
    409 To choose a different implementation you have to insert a ```device()``` call
    410 before the assignment of the result.  For technical C++ reasons this requires
    411 that the Tensor for the result be declared on its own.  This means that you
    412 have to know the size of the result.
    413 
    414     Eigen::Tensor<float, 2> c(30, 40);
    415     c.device(...) = a + b;
    416 
    417 The call to ```device()``` must be the last call on the left of the operator=.
    418 
    419 You must pass to the ```device()``` call an Eigen device object.  There are
    420 presently three devices you can use: DefaultDevice, ThreadPoolDevice and
    421 GpuDevice.
    422 
    423 
    424 #### Evaluating With the DefaultDevice
    425 
    426 This is exactly the same as not inserting a ```device()``` call.
    427 
    428     DefaultDevice my_device;
    429     c.device(my_device) = a + b;
    430 
    431 #### Evaluating with a Thread Pool
    432 
    433     // Create the Eigen ThreadPoolDevice.
    434     Eigen::ThreadPoolDevice my_device(4 /* number of threads to use */);
    435 
    436     // Now just use the device when evaluating expressions.
    437     Eigen::Tensor<float, 2> c(30, 50);
    438     c.device(my_device) = a.contract(b, dot_product_dims);
    439 
    440 
    441 #### Evaluating On GPU
    442 
    443 This is presently a bit more complicated than just using a thread pool device.
    444 You need to create a GPU device but you also need to explicitly allocate the
    445 memory for tensors with cuda.
    446 
    447 
    448 ## API Reference
    449 
    450 ### Datatypes
    451 
    452 In the documentation of the tensor methods and Operation we mention datatypes
    453 that are tensor-type specific:
    454 
    455 #### <Tensor-Type>::Dimensions
    456 
    457 Acts like an array of ints.  Has an ```int size``` attribute, and can be
    458 indexed like an array to access individual values.  Used to represent the
    459 dimensions of a tensor.  See ```dimensions()```.
    460 
    461 #### <Tensor-Type>::Index
    462 
    463 Acts like an ```int```.  Used for indexing tensors along their dimensions.  See
    464 ```operator()```, ```dimension()```, and ```size()```.
    465 
    466 #### <Tensor-Type>::Scalar
    467 
    468 Represents the datatype of individual tensor elements.  For example, for a
    469 ```Tensor<float>```, ```Scalar``` is the type ```float```.  See
    470 ```setConstant()```.
    471 
    472 #### <Operation>
    473 
    474 We use this pseudo type to indicate that a tensor Operation is returned by a
    475 method.  We indicate in the text the type and dimensions of the tensor that the
    476 Operation returns after evaluation.
    477 
    478 The Operation will have to be evaluated, for example by assigning it to a
    479 tensor, before you can access the values of the resulting tensor.  You can also
    480 access the values through a TensorRef.
    481 
    482 
    483 ## Built-in Tensor Methods
    484 
    485 These are usual C++ methods that act on tensors immediately.  They are not
    486 Operations which provide delayed evaluation of their results.  Unless specified
    487 otherwise, all the methods listed below are available on all tensor classes:
    488 Tensor, TensorFixedSize, and TensorMap.
    489 
    490 ## Metadata
    491 
    492 ### int NumDimensions
    493 
    494 Constant value indicating the number of dimensions of a Tensor.  This is also
    495 known as the tensor "rank".
    496 
    497       Eigen::Tensor<float, 2> a(3, 4);
    498       cout << "Dims " << a.NumDimensions;
    499       => Dims 2
    500 
    501 ### Dimensions dimensions()
    502 
    503 Returns an array-like object representing the dimensions of the tensor.
    504 The actual type of the dimensions() result is <Tensor-Type>::Dimensions.
    505 
    506     Eigen::Tensor<float, 2> a(3, 4);
    507     const Eigen::Tensor<float, 2>::Dimensions& d = a.dimensions();
    508     cout << "Dim size: " << d.size << ", dim 0: " << d[0]
    509          << ", dim 1: " << d[1];
    510     => Dim size: 2, dim 0: 3, dim 1: 4
    511 
    512 If you use a C++11 compiler, you can use ```auto``` to simplify the code:
    513 
    514     const auto& d = a.dimensions();
    515     cout << "Dim size: " << d.size << ", dim 0: " << d[0]
    516          << ", dim 1: " << d[1];
    517     => Dim size: 2, dim 0: 3, dim 1: 4
    518 
    519 ### Index dimension(Index n)
    520 
    521 Returns the n-th dimension of the tensor.  The actual type of the
    522 ```dimension()``` result is ```<Tensor-Type>::Index```, but you can
    523 always use it like an int.
    524 
    525       Eigen::Tensor<float, 2> a(3, 4);
    526       int dim1 = a.dimension(1);
    527       cout << "Dim 1: " << dim1;
    528       => Dim 1: 4
    529 
    530 ### Index size()
    531 
    532 Returns the total number of elements in the tensor.  This is the product of all
    533 the tensor dimensions.  The actual type of the ```size()``` result is
    534 ```<Tensor-Type>::Index```, but you can always use it like an int.
    535 
    536     Eigen::Tensor<float, 2> a(3, 4);
    537     cout << "Size: " << a.size();
    538     => Size: 12
    539 
    540 
    541 ### Getting Dimensions From An Operation
    542 
    543 A few operations provide ```dimensions()``` directly,
    544 e.g. ```TensorReslicingOp```.  Most operations defer calculating dimensions
    545 until the operation is being evaluated.  If you need access to the dimensions
    546 of a deferred operation, you can wrap it in a TensorRef (see Assigning to a
    547 TensorRef above), which provides ```dimensions()``` and ```dimension()``` as
    548 above.
    549 
    550 TensorRef can also wrap the plain Tensor types, so this is a useful idiom in
    551 templated contexts where the underlying object could be either a raw Tensor
    552 or some deferred operation (e.g. a slice of a Tensor).  In this case, the
    553 template code can wrap the object in a TensorRef and reason about its
    554 dimensionality while remaining agnostic to the underlying type.
    555 
    556 
    557 ## Constructors
    558 
    559 ### Tensor
    560 
    561 Creates a tensor of the specified size. The number of arguments must be equal
    562 to the rank of the tensor. The content of the tensor is not initialized.
    563 
    564     Eigen::Tensor<float, 2> a(3, 4);
    565     cout << "NumRows: " << a.dimension(0) << " NumCols: " << a.dimension(1) << endl;
    566     => NumRows: 3 NumCols: 4
    567 
    568 ### TensorFixedSize
    569 
    570 Creates a tensor of the specified size. The number of arguments in the Size<>
    571 template parameter determines the rank of the tensor. The content of the tensor
    572 is not initialized.
    573 
    574     Eigen::TensorFixedSize<float, Size<3, 4>> a;
    575     cout << "Rank: " << a.rank() << endl;
    576     => Rank: 2
    577     cout << "NumRows: " << a.dimension(0) << " NumCols: " << a.dimension(1) << endl;
    578     => NumRows: 3 NumCols: 4
    579 
    580 ### TensorMap
    581 
    582 Creates a tensor mapping an existing array of data. The data must not be freed
    583 until the TensorMap is discarded, and the size of the data must be large enough
    584 to accomodate of the coefficients of the tensor.
    585 
    586     float data[] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11};
    587     Eigen::TensorMap<float, 2> a(data, 3, 4);
    588     cout << "NumRows: " << a.dimension(0) << " NumCols: " << a.dimension(1) << endl;
    589     => NumRows: 3 NumCols: 4
    590     cout << "a(1, 2): " << a(1, 2) << endl;
    591     => a(1, 2): 9
    592 
    593 
    594 ## Contents Initialization
    595 
    596 When a new Tensor or a new TensorFixedSize are created, memory is allocated to
    597 hold all the tensor elements, but the memory is not initialized.  Similarly,
    598 when a new TensorMap is created on top of non-initialized memory the memory its
    599 contents are not initialized.
    600 
    601 You can use one of the methods below to initialize the tensor memory.  These
    602 have an immediate effect on the tensor and return the tensor itself as a
    603 result.  These are not tensor Operations which delay evaluation.
    604 
    605 ### <Tensor-Type> setConstant(const Scalar& val)
    606 
    607 Sets all elements of the tensor to the constant value ```val```.  ```Scalar```
    608 is the type of data stored in the tensor.  You can pass any value that is
    609 convertible to that type.
    610 
    611 Returns the tensor itself in case you want to chain another call.
    612 
    613     a.setConstant(12.3f);
    614     cout << "Constant: " << endl << a << endl << endl;
    615     =>
    616     Constant:
    617     12.3 12.3 12.3 12.3
    618     12.3 12.3 12.3 12.3
    619     12.3 12.3 12.3 12.3
    620 
    621 Note that ```setConstant()``` can be used on any tensor where the element type
    622 has a copy constructor and an ```operator=()```:
    623 
    624     Eigen::Tensor<string, 2> a(2, 3);
    625     a.setConstant("yolo");
    626     cout << "String tensor: " << endl << a << endl << endl;
    627     =>
    628     String tensor:
    629     yolo yolo yolo
    630     yolo yolo yolo
    631 
    632 
    633 ### <Tensor-Type> setZero()
    634 
    635 Fills the tensor with zeros.  Equivalent to ```setConstant(Scalar(0))```.
    636 Returns the tensor itself in case you want to chain another call.
    637 
    638     a.setZero();
    639     cout << "Zeros: " << endl << a << endl << endl;
    640     =>
    641     Zeros:
    642     0 0 0 0
    643     0 0 0 0
    644     0 0 0 0
    645 
    646 
    647 ### <Tensor-Type> setValues({..initializer_list})
    648 
    649 Fills the tensor with explicit values specified in a std::initializer_list.
    650 The type of the initializer list depends on the type and rank of the tensor.
    651 
    652 If the tensor has rank N, the initializer list must be nested N times.  The
    653 most deeply nested lists must contains P scalars of the Tensor type where P is
    654 the size of the last dimension of the Tensor.
    655 
    656 For example, for a ```TensorFixedSize<float, 2, 3>``` the initializer list must
    657 contains 2 lists of 3 floats each.
    658 
    659 ```setValues()``` returns the tensor itself in case you want to chain another
    660 call.
    661 
    662     Eigen::Tensor<float, 2> a(2, 3);
    663     a.setValues({{0.0f, 1.0f, 2.0f}, {3.0f, 4.0f, 5.0f}});
    664     cout << "a" << endl << a << endl << endl;
    665     =>
    666     a
    667     0 1 2
    668     3 4 5
    669 
    670 If a list is too short, the corresponding elements of the tensor will not be
    671 changed.  This is valid at each level of nesting.  For example the following
    672 code only sets the values of the first row of the tensor.
    673 
    674     Eigen::Tensor<int, 2> a(2, 3);
    675     a.setConstant(1000);
    676     a.setValues({{10, 20, 30}});
    677     cout << "a" << endl << a << endl << endl;
    678     =>
    679     a
    680     10   20   30
    681     1000 1000 1000
    682 
    683 ### <Tensor-Type> setRandom()
    684 
    685 Fills the tensor with random values.  Returns the tensor itself in case you
    686 want to chain another call.
    687 
    688     a.setRandom();
    689     cout << "Random: " << endl << a << endl << endl;
    690     =>
    691     Random:
    692       0.680375    0.59688  -0.329554    0.10794
    693      -0.211234   0.823295   0.536459 -0.0452059
    694       0.566198  -0.604897  -0.444451   0.257742
    695 
    696 You can customize ```setRandom()``` by providing your own random number
    697 generator as a template argument:
    698 
    699     a.setRandom<MyRandomGenerator>();
    700 
    701 Here, ```MyRandomGenerator``` must be a struct with the following member
    702 functions, where Scalar and Index are the same as ```<Tensor-Type>::Scalar```
    703 and ```<Tensor-Type>::Index```.
    704 
    705 See ```struct UniformRandomGenerator``` in TensorFunctors.h for an example.
    706 
    707     // Custom number generator for use with setRandom().
    708     struct MyRandomGenerator {
    709       // Default and copy constructors. Both are needed
    710       MyRandomGenerator() { }
    711       MyRandomGenerator(const MyRandomGenerator& ) { }
    712 
    713       // Return a random value to be used.  "element_location" is the
    714       // location of the entry to set in the tensor, it can typically
    715       // be ignored.
    716       Scalar operator()(Eigen::DenseIndex element_location,
    717                         Eigen::DenseIndex /*unused*/ = 0) const {
    718         return <randomly generated value of type T>;
    719       }
    720 
    721       // Same as above but generates several numbers at a time.
    722       typename internal::packet_traits<Scalar>::type packetOp(
    723           Eigen::DenseIndex packet_location, Eigen::DenseIndex /*unused*/ = 0) const {
    724         return <a packet of randomly generated values>;
    725       }
    726     };
    727 
    728 You can also use one of the 2 random number generators that are part of the
    729 tensor library:
    730 *   UniformRandomGenerator
    731 *   NormalRandomGenerator
    732 
    733 
    734 ## Data Access
    735 
    736 The Tensor, TensorFixedSize, and TensorRef classes provide the following
    737 accessors to access the tensor coefficients:
    738 
    739     const Scalar& operator()(const array<Index, NumIndices>& indices)
    740     const Scalar& operator()(Index firstIndex, IndexTypes... otherIndices)
    741     Scalar& operator()(const array<Index, NumIndices>& indices)
    742     Scalar& operator()(Index firstIndex, IndexTypes... otherIndices)
    743 
    744 The number of indices must be equal to the rank of the tensor. Moreover, these
    745 accessors are not available on tensor expressions. In order to access the
    746 values of a tensor expression, the expression must either be evaluated or
    747 wrapped in a TensorRef.
    748 
    749 
    750 ### Scalar* data() and const Scalar* data() const
    751 
    752 Returns a pointer to the storage for the tensor.  The pointer is const if the
    753 tensor was const.  This allows direct access to the data.  The layout of the
    754 data depends on the tensor layout: RowMajor or ColMajor.
    755 
    756 This access is usually only needed for special cases, for example when mixing
    757 Eigen Tensor code with other libraries.
    758 
    759 Scalar is the type of data stored in the tensor.
    760 
    761     Eigen::Tensor<float, 2> a(3, 4);
    762     float* a_data = a.data();
    763     a_data[0] = 123.45f;
    764     cout << "a(0, 0): " << a(0, 0);
    765     => a(0, 0): 123.45
    766 
    767 
    768 ## Tensor Operations
    769 
    770 All the methods documented below return non evaluated tensor ```Operations```.
    771 These can be chained: you can apply another Tensor Operation to the value
    772 returned by the method.
    773 
    774 The chain of Operation is evaluated lazily, typically when it is assigned to a
    775 tensor.  See "Controlling when Expression are Evaluated" for more details about
    776 their evaluation.
    777 
    778 ### <Operation> constant(const Scalar& val)
    779 
    780 Returns a tensor of the same type and dimensions as the original tensor but
    781 where all elements have the value ```val```.
    782 
    783 This is useful, for example, when you want to add or subtract a constant from a
    784 tensor, or multiply every element of a tensor by a scalar.
    785 
    786     Eigen::Tensor<float, 2> a(2, 3);
    787     a.setConstant(1.0f);
    788     Eigen::Tensor<float, 2> b = a + a.constant(2.0f);
    789     Eigen::Tensor<float, 2> c = b * b.constant(0.2f);
    790     cout << "a" << endl << a << endl << endl;
    791     cout << "b" << endl << b << endl << endl;
    792     cout << "c" << endl << c << endl << endl;
    793     =>
    794     a
    795     1 1 1
    796     1 1 1
    797 
    798     b
    799     3 3 3
    800     3 3 3
    801 
    802     c
    803     0.6 0.6 0.6
    804     0.6 0.6 0.6
    805 
    806 ### <Operation> random()
    807 
    808 Returns a tensor of the same type and dimensions as the current tensor
    809 but where all elements have random values.
    810 
    811 This is for example useful to add random values to an existing tensor.
    812 The generation of random values can be customized in the same manner
    813 as for ```setRandom()```.
    814 
    815     Eigen::Tensor<float, 2> a(2, 3);
    816     a.setConstant(1.0f);
    817     Eigen::Tensor<float, 2> b = a + a.random();
    818     cout << "a" << endl << a << endl << endl;
    819     cout << "b" << endl << b << endl << endl;
    820     =>
    821     a
    822     1 1 1
    823     1 1 1
    824 
    825     b
    826     1.68038   1.5662  1.82329
    827     0.788766  1.59688 0.395103
    828 
    829 
    830 ## Unary Element Wise Operations
    831 
    832 All these operations take a single input tensor as argument and return a tensor
    833 of the same type and dimensions as the tensor to which they are applied.  The
    834 requested operations are applied to each element independently.
    835 
    836 ### <Operation> operator-()
    837 
    838 Returns a tensor of the same type and dimensions as the original tensor
    839 containing the opposite values of the original tensor.
    840 
    841     Eigen::Tensor<float, 2> a(2, 3);
    842     a.setConstant(1.0f);
    843     Eigen::Tensor<float, 2> b = -a;
    844     cout << "a" << endl << a << endl << endl;
    845     cout << "b" << endl << b << endl << endl;
    846     =>
    847     a
    848     1 1 1
    849     1 1 1
    850 
    851     b
    852     -1 -1 -1
    853     -1 -1 -1
    854 
    855 ### <Operation> sqrt()
    856 
    857 Returns a tensor of the same type and dimensions as the original tensor
    858 containing the square roots of the original tensor.
    859 
    860 ### <Operation> rsqrt()
    861 
    862 Returns a tensor of the same type and dimensions as the original tensor
    863 containing the inverse square roots of the original tensor.
    864 
    865 ### <Operation> square()
    866 
    867 Returns a tensor of the same type and dimensions as the original tensor
    868 containing the squares of the original tensor values.
    869 
    870 ### <Operation> inverse()
    871 
    872 Returns a tensor of the same type and dimensions as the original tensor
    873 containing the inverse of the original tensor values.
    874 
    875 ### <Operation> exp()
    876 
    877 Returns a tensor of the same type and dimensions as the original tensor
    878 containing the exponential of the original tensor.
    879 
    880 ### <Operation> log()
    881 
    882 Returns a tensor of the same type and dimensions as the original tensor
    883 containing the natural logarithms of the original tensor.
    884 
    885 ### <Operation> abs()
    886 
    887 Returns a tensor of the same type and dimensions as the original tensor
    888 containing the absolute values of the original tensor.
    889 
    890 ### <Operation> pow(Scalar exponent)
    891 
    892 Returns a tensor of the same type and dimensions as the original tensor
    893 containing the coefficients of the original tensor to the power of the
    894 exponent.
    895 
    896 The type of the exponent, Scalar, is always the same as the type of the
    897 tensor coefficients.  For example, only integer exponents can be used in
    898 conjuntion with tensors of integer values.
    899 
    900 You can use cast() to lift this restriction.  For example this computes
    901 cubic roots of an int Tensor:
    902 
    903     Eigen::Tensor<int, 2> a(2, 3);
    904     a.setValues({{0, 1, 8}, {27, 64, 125}});
    905     Eigen::Tensor<double, 2> b = a.cast<double>().pow(1.0 / 3.0);
    906     cout << "a" << endl << a << endl << endl;
    907     cout << "b" << endl << b << endl << endl;
    908     =>
    909     a
    910     0   1   8
    911     27  64 125
    912 
    913     b
    914     0 1 2
    915     3 4 5
    916 
    917 ### <Operation>  operator * (Scalar scale)
    918 
    919 Multiplies all the coefficients of the input tensor by the provided scale.
    920 
    921 ### <Operation>  cwiseMax(Scalar threshold)
    922 TODO
    923 
    924 ### <Operation>  cwiseMin(Scalar threshold)
    925 TODO
    926 
    927 ### <Operation>  unaryExpr(const CustomUnaryOp& func)
    928 TODO
    929 
    930 
    931 ## Binary Element Wise Operations
    932 
    933 These operations take two input tensors as arguments. The 2 input tensors should
    934 be of the same type and dimensions. The result is a tensor of the same
    935 dimensions as the tensors to which they are applied, and unless otherwise
    936 specified it is also of the same type. The requested operations are applied to
    937 each pair of elements independently.
    938 
    939 ### <Operation> operator+(const OtherDerived& other)
    940 
    941 Returns a tensor of the same type and dimensions as the input tensors
    942 containing the coefficient wise sums of the inputs.
    943 
    944 ### <Operation> operator-(const OtherDerived& other)
    945 
    946 Returns a tensor of the same type and dimensions as the input tensors
    947 containing the coefficient wise differences of the inputs.
    948 
    949 ### <Operation> operator*(const OtherDerived& other)
    950 
    951 Returns a tensor of the same type and dimensions as the input tensors
    952 containing the coefficient wise products of the inputs.
    953 
    954 ### <Operation> operator/(const OtherDerived& other)
    955 
    956 Returns a tensor of the same type and dimensions as the input tensors
    957 containing the coefficient wise quotients of the inputs.
    958 
    959 This operator is not supported for integer types.
    960 
    961 ### <Operation> cwiseMax(const OtherDerived& other)
    962 
    963 Returns a tensor of the same type and dimensions as the input tensors
    964 containing the coefficient wise maximums of the inputs.
    965 
    966 ### <Operation> cwiseMin(const OtherDerived& other)
    967 
    968 Returns a tensor of the same type and dimensions as the input tensors
    969 containing the coefficient wise mimimums of the inputs.
    970 
    971 ### <Operation> Logical operators
    972 
    973 The following logical operators are supported as well:
    974 
    975 *   operator&&(const OtherDerived& other)
    976 *   operator||(const OtherDerived& other)
    977 *   operator<(const OtherDerived& other)
    978 *   operator<=(const OtherDerived& other)
    979 *   operator>(const OtherDerived& other)
    980 *   operator>=(const OtherDerived& other)
    981 *   operator==(const OtherDerived& other)
    982 *   operator!=(const OtherDerived& other)
    983 
    984 They all return a tensor of boolean values.
    985 
    986 
    987 ## Selection (select(const ThenDerived& thenTensor, const ElseDerived& elseTensor)
    988 
    989 Selection is a coefficient-wise ternary operator that is the tensor equivalent
    990 to the if-then-else operation.
    991 
    992     Tensor<bool, 3> if = ...;
    993     Tensor<float, 3> then = ...;
    994     Tensor<float, 3> else = ...;
    995     Tensor<float, 3> result = if.select(then, else);
    996 
    997 The 3 arguments must be of the same dimensions, which will also be the dimension
    998 of the result.  The 'if' tensor must be of type boolean, the 'then' and the
    999 'else' tensor must be of the same type, which will also be the type of the
   1000 result.
   1001 
   1002 Each coefficient in the result is equal to the corresponding coefficient in the
   1003 'then' tensor if the corresponding value in the 'if' tensor is true. If not, the
   1004 resulting coefficient will come from the 'else' tensor.
   1005 
   1006 
   1007 ## Contraction
   1008 
   1009 Tensor *contractions* are a generalization of the matrix product to the
   1010 multidimensional case.
   1011 
   1012     // Create 2 matrices using tensors of rank 2
   1013     Eigen::Tensor<int, 2> a(2, 3);
   1014     a.setValues({{1, 2, 3}, {6, 5, 4}});
   1015     Eigen::Tensor<int, 2> b(3, 2);
   1016     a.setValues({{1, 2}, {4, 5}, {5, 6}});
   1017 
   1018     // Compute the traditional matrix product
   1019     array<IndexPair<int>, 1> product_dims = { IndexPair(1, 0) };
   1020     Eigen::Tensor<int, 2> AB = a.contract(b, product_dims);
   1021 
   1022     // Compute the product of the transpose of the matrices
   1023     array<IndexPair<int>, 1> transpose_product_dims = { IndexPair(0, 1) };
   1024     Eigen::Tensor<int, 2> AtBt = a.contract(b, transposed_product_dims);
   1025 
   1026 
   1027 ## Reduction Operations
   1028 
   1029 A *Reduction* operation returns a tensor with fewer dimensions than the
   1030 original tensor.  The values in the returned tensor are computed by applying a
   1031 *reduction operator* to slices of values from the original tensor.  You specify
   1032 the dimensions along which the slices are made.
   1033 
   1034 The Eigen Tensor library provides a set of predefined reduction operators such
   1035 as ```maximum()``` and ```sum()``` and lets you define additional operators by
   1036 implementing a few methods from a reductor template.
   1037 
   1038 ### Reduction Dimensions
   1039 
   1040 All reduction operations take a single parameter of type
   1041 ```<TensorType>::Dimensions``` which can always be specified as an array of
   1042 ints.  These are called the "reduction dimensions."  The values are the indices
   1043 of the dimensions of the input tensor over which the reduction is done.  The
   1044 parameter can have at most as many element as the rank of the input tensor;
   1045 each element must be less than the tensor rank, as it indicates one of the
   1046 dimensions to reduce.
   1047 
   1048 Each dimension of the input tensor should occur at most once in the reduction
   1049 dimensions as the implementation does not remove duplicates.
   1050 
   1051 The order of the values in the reduction dimensions does not affect the
   1052 results, but the code may execute faster if you list the dimensions in
   1053 increasing order.
   1054 
   1055 Example: Reduction along one dimension.
   1056 
   1057     // Create a tensor of 2 dimensions
   1058     Eigen::Tensor<int, 2> a(2, 3);
   1059     a.setValues({{1, 2, 3}, {6, 5, 4}});
   1060     // Reduce it along the second dimension (1)...
   1061     Eigen::array<int, 1> dims({1 /* dimension to reduce */});
   1062     // ...using the "maximum" operator.
   1063     // The result is a tensor with one dimension.  The size of
   1064     // that dimension is the same as the first (non-reduced) dimension of a.
   1065     Eigen::Tensor<int, 1> b = a.maximum(dims);
   1066     cout << "a" << endl << a << endl << endl;
   1067     cout << "b" << endl << b << endl << endl;
   1068     =>
   1069     a
   1070     1 2 3
   1071     6 5 4
   1072 
   1073     b
   1074     3
   1075     6
   1076 
   1077 Example: Reduction along two dimensions.
   1078 
   1079     Eigen::Tensor<float, 3, Eigen::ColMajor> a(2, 3, 4);
   1080     a.setValues({{{0.0f, 1.0f, 2.0f, 3.0f},
   1081                   {7.0f, 6.0f, 5.0f, 4.0f},
   1082                   {8.0f, 9.0f, 10.0f, 11.0f}},
   1083                  {{12.0f, 13.0f, 14.0f, 15.0f},
   1084                   {19.0f, 18.0f, 17.0f, 16.0f},
   1085                   {20.0f, 21.0f, 22.0f, 23.0f}}});
   1086     // The tensor a has 3 dimensions.  We reduce along the
   1087     // first 2, resulting in a tensor with a single dimension
   1088     // of size 4 (the last dimension of a.)
   1089     // Note that we pass the array of reduction dimensions
   1090     // directly to the maximum() call.
   1091     Eigen::Tensor<float, 1, Eigen::ColMajor> b =
   1092         a.maximum(Eigen::array<int, 2>({0, 1}));
   1093     cout << "b" << endl << b << endl << endl;
   1094     =>
   1095     b
   1096     20
   1097     21
   1098     22
   1099     23
   1100 
   1101 #### Reduction along all dimensions
   1102 
   1103 As a special case, if you pass no parameter to a reduction operation the
   1104 original tensor is reduced along *all* its dimensions.  The result is a
   1105 scalar, represented as a zero-dimension tensor.
   1106 
   1107     Eigen::Tensor<float, 3> a(2, 3, 4);
   1108     a.setValues({{{0.0f, 1.0f, 2.0f, 3.0f},
   1109                   {7.0f, 6.0f, 5.0f, 4.0f},
   1110                   {8.0f, 9.0f, 10.0f, 11.0f}},
   1111                  {{12.0f, 13.0f, 14.0f, 15.0f},
   1112                   {19.0f, 18.0f, 17.0f, 16.0f},
   1113                   {20.0f, 21.0f, 22.0f, 23.0f}}});
   1114     // Reduce along all dimensions using the sum() operator.
   1115     Eigen::Tensor<float, 0> b = a.sum();
   1116     cout << "b" << endl << b << endl << endl;
   1117     =>
   1118     b
   1119     276
   1120 
   1121 
   1122 ### <Operation> sum(const Dimensions& new_dims)
   1123 ### <Operation> sum()
   1124 
   1125 Reduce a tensor using the sum() operator.  The resulting values
   1126 are the sum of the reduced values.
   1127 
   1128 ### <Operation> mean(const Dimensions& new_dims)
   1129 ### <Operation> mean()
   1130 
   1131 Reduce a tensor using the mean() operator.  The resulting values
   1132 are the mean of the reduced values.
   1133 
   1134 ### <Operation> maximum(const Dimensions& new_dims)
   1135 ### <Operation> maximum()
   1136 
   1137 Reduce a tensor using the maximum() operator.  The resulting values are the
   1138 largest of the reduced values.
   1139 
   1140 ### <Operation> minimum(const Dimensions& new_dims)
   1141 ### <Operation> minimum()
   1142 
   1143 Reduce a tensor using the minimum() operator.  The resulting values
   1144 are the smallest of the reduced values.
   1145 
   1146 ### <Operation> prod(const Dimensions& new_dims)
   1147 ### <Operation> prod()
   1148 
   1149 Reduce a tensor using the prod() operator.  The resulting values
   1150 are the product of the reduced values.
   1151 
   1152 ### <Operation> all(const Dimensions& new_dims)
   1153 ### <Operation> all()
   1154 Reduce a tensor using the all() operator.  Casts tensor to bool and then checks
   1155 whether all elements are true.  Runs through all elements rather than
   1156 short-circuiting, so may be significantly inefficient.
   1157 
   1158 ### <Operation> any(const Dimensions& new_dims)
   1159 ### <Operation> any()
   1160 Reduce a tensor using the any() operator.  Casts tensor to bool and then checks
   1161 whether any element is true.  Runs through all elements rather than
   1162 short-circuiting, so may be significantly inefficient.
   1163 
   1164 
   1165 ### <Operation> reduce(const Dimensions& new_dims, const Reducer& reducer)
   1166 
   1167 Reduce a tensor using a user-defined reduction operator.  See ```SumReducer```
   1168 in TensorFunctors.h for information on how to implement a reduction operator.
   1169 
   1170 
   1171 ## Scan Operations
   1172 
   1173 A *Scan* operation returns a tensor with the same dimensions as the original
   1174 tensor. The operation performs an inclusive scan along the specified
   1175 axis, which means it computes a running total along the axis for a given
   1176 reduction operation.
   1177 If the reduction operation corresponds to summation, then this computes the
   1178 prefix sum of the tensor along the given axis.
   1179 
   1180 Example:
   1181 dd a comment to this line
   1182 
   1183     // Create a tensor of 2 dimensions
   1184     Eigen::Tensor<int, 2> a(2, 3);
   1185     a.setValues({{1, 2, 3}, {4, 5, 6}});
   1186     // Scan it along the second dimension (1) using summation
   1187     Eigen::Tensor<int, 2> b = a.cumsum(1);
   1188     // The result is a tensor with the same size as the input
   1189     cout << "a" << endl << a << endl << endl;
   1190     cout << "b" << endl << b << endl << endl;
   1191     =>
   1192     a
   1193     1 2 3
   1194     6 5 4
   1195 
   1196     b
   1197     1  3  6
   1198     4  9 15
   1199 
   1200 ### <Operation> cumsum(const Index& axis)
   1201 
   1202 Perform a scan by summing consecutive entries.
   1203 
   1204 ### <Operation> cumprod(const Index& axis)
   1205 
   1206 Perform a scan by multiplying consecutive entries.
   1207 
   1208 
   1209 ## Convolutions
   1210 
   1211 ### <Operation> convolve(const Kernel& kernel, const Dimensions& dims)
   1212 
   1213 Returns a tensor that is the output of the convolution of the input tensor with the kernel,
   1214 along the specified dimensions of the input tensor. The dimension size for dimensions of the output tensor
   1215 which were part of the convolution will be reduced by the formula:
   1216 output_dim_size = input_dim_size - kernel_dim_size + 1 (requires: input_dim_size >= kernel_dim_size).
   1217 The dimension sizes for dimensions that were not part of the convolution will remain the same.
   1218 Performance of the convolution can depend on the length of the stride(s) of the input tensor dimension(s) along which the
   1219 convolution is computed (the first dimension has the shortest stride for ColMajor, whereas RowMajor's shortest stride is
   1220 for the last dimension).
   1221 
   1222     // Compute convolution along the second and third dimension.
   1223     Tensor<float, 4, DataLayout> input(3, 3, 7, 11);
   1224     Tensor<float, 2, DataLayout> kernel(2, 2);
   1225     Tensor<float, 4, DataLayout> output(3, 2, 6, 11);
   1226     input.setRandom();
   1227     kernel.setRandom();
   1228 
   1229     Eigen::array<ptrdiff_t, 2> dims({1, 2});  // Specify second and third dimension for convolution.
   1230     output = input.convolve(kernel, dims);
   1231 
   1232     for (int i = 0; i < 3; ++i) {
   1233       for (int j = 0; j < 2; ++j) {
   1234         for (int k = 0; k < 6; ++k) {
   1235           for (int l = 0; l < 11; ++l) {
   1236             const float result = output(i,j,k,l);
   1237             const float expected = input(i,j+0,k+0,l) * kernel(0,0) +
   1238                                    input(i,j+1,k+0,l) * kernel(1,0) +
   1239                                    input(i,j+0,k+1,l) * kernel(0,1) +
   1240                                    input(i,j+1,k+1,l) * kernel(1,1);
   1241             VERIFY_IS_APPROX(result, expected);
   1242           }
   1243         }
   1244       }
   1245     }
   1246 
   1247 
   1248 ## Geometrical Operations
   1249 
   1250 These operations return a Tensor with different dimensions than the original
   1251 Tensor.  They can be used to access slices of tensors, see them with different
   1252 dimensions, or pad tensors with additional data.
   1253 
   1254 ### <Operation> reshape(const Dimensions& new_dims)
   1255 
   1256 Returns a view of the input tensor that has been reshaped to the specified
   1257 new dimensions.  The argument new_dims is an array of Index values.  The
   1258 rank of the resulting tensor is equal to the number of elements in new_dims.
   1259 
   1260 The product of all the sizes in the new dimension array must be equal to
   1261 the number of elements in the input tensor.
   1262 
   1263     // Increase the rank of the input tensor by introducing a new dimension
   1264     // of size 1.
   1265     Tensor<float, 2> input(7, 11);
   1266     array<int, 3> three_dims{{7, 11, 1}};
   1267     Tensor<float, 3> result = input.reshape(three_dims);
   1268 
   1269     // Decrease the rank of the input tensor by merging 2 dimensions;
   1270     array<int, 1> one_dim{{7 * 11}};
   1271     Tensor<float, 1> result = input.reshape(one_dim);
   1272 
   1273 This operation does not move any data in the input tensor, so the resulting
   1274 contents of a reshaped Tensor depend on the data layout of the original Tensor.
   1275 
   1276 For example this is what happens when you ```reshape()``` a 2D ColMajor tensor
   1277 to one dimension:
   1278 
   1279     Eigen::Tensor<float, 2, Eigen::ColMajor> a(2, 3);
   1280     a.setValues({{0.0f, 100.0f, 200.0f}, {300.0f, 400.0f, 500.0f}});
   1281     Eigen::array<Eigen::DenseIndex, 1> one_dim({3 * 2});
   1282     Eigen::Tensor<float, 1, Eigen::ColMajor> b = a.reshape(one_dim);
   1283     cout << "b" << endl << b << endl;
   1284     =>
   1285     b
   1286       0
   1287     300
   1288     100
   1289     400
   1290     200
   1291     500
   1292 
   1293 This is what happens when the 2D Tensor is RowMajor:
   1294 
   1295     Eigen::Tensor<float, 2, Eigen::RowMajor> a(2, 3);
   1296     a.setValues({{0.0f, 100.0f, 200.0f}, {300.0f, 400.0f, 500.0f}});
   1297     Eigen::array<Eigen::DenseIndex, 1> one_dim({3 * 2});
   1298     Eigen::Tensor<float, 1, Eigen::RowMajor> b = a.reshape(one_dim);
   1299     cout << "b" << endl << b << endl;
   1300     =>
   1301     b
   1302       0
   1303     100
   1304     200
   1305     300
   1306     400
   1307     500
   1308 
   1309 The reshape operation is a lvalue. In other words, it can be used on the left
   1310 side of the assignment operator.
   1311 
   1312 The previous example can be rewritten as follow:
   1313 
   1314     Eigen::Tensor<float, 2, Eigen::ColMajor> a(2, 3);
   1315     a.setValues({{0.0f, 100.0f, 200.0f}, {300.0f, 400.0f, 500.0f}});
   1316     Eigen::array<Eigen::DenseIndex, 2> two_dim({2, 3});
   1317     Eigen::Tensor<float, 1, Eigen::ColMajor> b;
   1318     b.reshape(two_dim) = a;
   1319     cout << "b" << endl << b << endl;
   1320     =>
   1321     b
   1322       0
   1323     300
   1324     100
   1325     400
   1326     200
   1327     500
   1328 
   1329 Note that "b" itself was not reshaped but that instead the assignment is done to
   1330 the reshape view of b.
   1331 
   1332 
   1333 ### <Operation> shuffle(const Shuffle& shuffle)
   1334 
   1335 Returns a copy of the input tensor whose dimensions have been
   1336 reordered according to the specified permutation. The argument shuffle
   1337 is an array of Index values. Its size is the rank of the input
   1338 tensor. It must contain a permutation of 0, 1, ..., rank - 1. The i-th
   1339 dimension of the output tensor equals to the size of the shuffle[i]-th
   1340 dimension of the input tensor. For example:
   1341 
   1342     // Shuffle all dimensions to the left by 1.
   1343     Tensor<float, 3> input(20, 30, 50);
   1344     // ... set some values in input.
   1345     Tensor<float, 3> output = input.shuffle({1, 2, 0})
   1346 
   1347     eigen_assert(output.dimension(0) == 30);
   1348     eigen_assert(output.dimension(1) == 50);
   1349     eigen_assert(output.dimension(2) == 20);
   1350 
   1351 Indices into the output tensor are shuffled accordingly to formulate
   1352 indices into the input tensor. For example, one can assert in the above
   1353 code snippet that:
   1354 
   1355     eigen_assert(output(3, 7, 11) == input(11, 3, 7));
   1356 
   1357 In general, one can assert that
   1358 
   1359     eigen_assert(output(..., indices[shuffle[i]], ...) ==
   1360                  input(..., indices[i], ...))
   1361 
   1362 The shuffle operation results in a lvalue, which means that it can be assigned
   1363 to. In other words, it can be used on the left side of the assignment operator.
   1364 
   1365 Let's rewrite the previous example to take advantage of this feature:
   1366 
   1367     // Shuffle all dimensions to the left by 1.
   1368     Tensor<float, 3> input(20, 30, 50);
   1369     // ... set some values in input.
   1370     Tensor<float, 3> output(30, 50, 20);
   1371     output.shuffle({2, 0, 1}) = input;
   1372 
   1373 
   1374 ### <Operation> stride(const Strides& strides)
   1375 
   1376 Returns a view of the input tensor that strides (skips stride-1
   1377 elements) along each of the dimensions.  The argument strides is an
   1378 array of Index values.  The dimensions of the resulting tensor are
   1379 ceil(input_dimensions[i] / strides[i]).
   1380 
   1381 For example this is what happens when you ```stride()``` a 2D tensor:
   1382 
   1383     Eigen::Tensor<int, 2> a(4, 3);
   1384     a.setValues({{0, 100, 200}, {300, 400, 500}, {600, 700, 800}, {900, 1000, 1100}});
   1385     Eigen::array<Eigen::DenseIndex, 2> strides({3, 2});
   1386     Eigen::Tensor<int, 2> b = a.stride(strides);
   1387     cout << "b" << endl << b << endl;
   1388     =>
   1389     b
   1390        0   200
   1391      900  1100
   1392 
   1393 It is possible to assign a tensor to a stride:
   1394     Tensor<float, 3> input(20, 30, 50);
   1395     // ... set some values in input.
   1396     Tensor<float, 3> output(40, 90, 200);
   1397     output.stride({2, 3, 4}) = input;
   1398 
   1399 
   1400 ### <Operation> slice(const StartIndices& offsets, const Sizes& extents)
   1401 
   1402 Returns a sub-tensor of the given tensor. For each dimension i, the slice is
   1403 made of the coefficients stored between offset[i] and offset[i] + extents[i] in
   1404 the input tensor.
   1405 
   1406     Eigen::Tensor<int, 2> a(4, 3);
   1407     a.setValues({{0, 100, 200}, {300, 400, 500},
   1408                  {600, 700, 800}, {900, 1000, 1100}});
   1409     Eigen::array<int, 2> offsets = {1, 0};
   1410     Eigen::array<int, 2> extents = {2, 2};
   1411     Eigen::Tensor<int, 1> slice = a.slice(offsets, extents);
   1412     cout << "a" << endl << a << endl;
   1413     =>
   1414     a
   1415        0   100   200
   1416      300   400   500
   1417      600   700   800
   1418      900  1000  1100
   1419     cout << "slice" << endl << slice << endl;
   1420     =>
   1421     slice
   1422      300   400
   1423      600   700
   1424 
   1425 
   1426 ### <Operation> chip(const Index offset, const Index dim)
   1427 
   1428 A chip is a special kind of slice. It is the subtensor at the given offset in
   1429 the dimension dim. The returned tensor has one fewer dimension than the input
   1430 tensor: the dimension dim is removed.
   1431 
   1432 For example, a matrix chip would be either a row or a column of the input
   1433 matrix.
   1434 
   1435     Eigen::Tensor<int, 2> a(4, 3);
   1436     a.setValues({{0, 100, 200}, {300, 400, 500},
   1437                  {600, 700, 800}, {900, 1000, 1100}});
   1438     Eigen::Tensor<int, 1> row_3 = a.chip(2, 0);
   1439     Eigen::Tensor<int, 1> col_2 = a.chip(1, 1);
   1440     cout << "a" << endl << a << endl;
   1441     =>
   1442     a
   1443        0   100   200
   1444      300   400   500
   1445      600   700   800
   1446      900  1000  1100
   1447     cout << "row_3" << endl << row_3 << endl;
   1448     =>
   1449     row_3
   1450        600   700   800
   1451     cout << "col_2" << endl << col_2 << endl;
   1452     =>
   1453     col_2
   1454        100   400   700    1000
   1455 
   1456 It is possible to assign values to a tensor chip since the chip operation is a
   1457 lvalue. For example:
   1458 
   1459     Eigen::Tensor<int, 1> a(3);
   1460     a.setValues({{100, 200, 300}});
   1461     Eigen::Tensor<int, 2> b(2, 3);
   1462     b.setZero();
   1463     b.chip(0, 0) = a;
   1464     cout << "a" << endl << a << endl;
   1465     =>
   1466     a
   1467      100
   1468      200
   1469      300
   1470     cout << "b" << endl << b << endl;
   1471     =>
   1472     b
   1473        100   200   300
   1474          0     0     0
   1475 
   1476 
   1477 ### <Operation> reverse(const ReverseDimensions& reverse)
   1478 
   1479 Returns a view of the input tensor that reverses the order of the coefficients
   1480 along a subset of the dimensions.  The argument reverse is an array of boolean
   1481 values that indicates whether or not the order of the coefficients should be
   1482 reversed along each of the dimensions.  This operation preserves the dimensions
   1483 of the input tensor.
   1484 
   1485 For example this is what happens when you ```reverse()``` the first dimension
   1486 of a 2D tensor:
   1487 
   1488     Eigen::Tensor<int, 2> a(4, 3);
   1489     a.setValues({{0, 100, 200}, {300, 400, 500},
   1490                 {600, 700, 800}, {900, 1000, 1100}});
   1491     Eigen::array<bool, 2> reverse({true, false});
   1492     Eigen::Tensor<int, 2> b = a.reverse(reverse);
   1493     cout << "a" << endl << a << endl << "b" << endl << b << endl;
   1494     =>
   1495     a
   1496        0   100   200
   1497      300   400   500
   1498      600   700   800
   1499      900  1000  1100
   1500     b
   1501      900  1000  1100
   1502      600   700   800
   1503      300   400   500
   1504        0   100   200
   1505 
   1506 
   1507 ### <Operation> broadcast(const Broadcast& broadcast)
   1508 
   1509 Returns a view of the input tensor in which the input is replicated one to many
   1510 times.
   1511 The broadcast argument specifies how many copies of the input tensor need to be
   1512 made in each of the dimensions.
   1513 
   1514     Eigen::Tensor<int, 2> a(2, 3);
   1515     a.setValues({{0, 100, 200}, {300, 400, 500}});
   1516     Eigen::array<int, 2> bcast({3, 2});
   1517     Eigen::Tensor<int, 2> b = a.broadcast(bcast);
   1518     cout << "a" << endl << a << endl << "b" << endl << b << endl;
   1519     =>
   1520     a
   1521        0   100   200
   1522      300   400   500
   1523     b
   1524        0   100   200    0   100   200
   1525      300   400   500  300   400   500
   1526        0   100   200    0   100   200
   1527      300   400   500  300   400   500
   1528        0   100   200    0   100   200
   1529      300   400   500  300   400   500
   1530 
   1531 ### <Operation> concatenate(const OtherDerived& other, Axis axis)
   1532 
   1533 TODO
   1534 
   1535 ### <Operation>  pad(const PaddingDimensions& padding)
   1536 
   1537 Returns a view of the input tensor in which the input is padded with zeros.
   1538 
   1539     Eigen::Tensor<int, 2> a(2, 3);
   1540     a.setValues({{0, 100, 200}, {300, 400, 500}});
   1541     Eigen::array<pair<int, int>, 2> paddings;
   1542     paddings[0] = make_pair(0, 1);
   1543     paddings[1] = make_pair(2, 3);
   1544     Eigen::Tensor<int, 2> b = a.pad(paddings);
   1545     cout << "a" << endl << a << endl << "b" << endl << b << endl;
   1546     =>
   1547     a
   1548        0   100   200
   1549      300   400   500
   1550     b
   1551        0     0     0    0
   1552        0     0     0    0
   1553        0   100   200    0
   1554      300   400   500    0
   1555        0     0     0    0
   1556        0     0     0    0
   1557        0     0     0    0
   1558 
   1559 
   1560 ### <Operation>  extract_patches(const PatchDims& patch_dims)
   1561 
   1562 Returns a tensor of coefficient patches extracted from the input tensor, where
   1563 each patch is of dimension specified by 'patch_dims'. The returned tensor has
   1564 one greater dimension than the input tensor, which is used to index each patch.
   1565 The patch index in the output tensor depends on the data layout of the input
   1566 tensor: the patch index is the last dimension ColMajor layout, and the first
   1567 dimension in RowMajor layout.
   1568 
   1569 For example, given the following input tensor:
   1570 
   1571   Eigen::Tensor<float, 2, DataLayout> tensor(3,4);
   1572   tensor.setValues({{0.0f, 1.0f, 2.0f, 3.0f},
   1573                     {4.0f, 5.0f, 6.0f, 7.0f},
   1574                     {8.0f, 9.0f, 10.0f, 11.0f}});
   1575 
   1576   cout << "tensor: " << endl << tensor << endl;
   1577 =>
   1578 tensor:
   1579  0   1   2   3
   1580  4   5   6   7
   1581  8   9  10  11
   1582 
   1583 Six 2x2 patches can be extracted and indexed using the following code:
   1584 
   1585   Eigen::Tensor<float, 3, DataLayout> patch;
   1586   Eigen::array<ptrdiff_t, 2> patch_dims;
   1587   patch_dims[0] = 2;
   1588   patch_dims[1] = 2;
   1589   patch = tensor.extract_patches(patch_dims);
   1590   for (int k = 0; k < 6; ++k) {
   1591     cout << "patch index: " << k << endl;
   1592     for (int i = 0; i < 2; ++i) {
   1593       for (int j = 0; j < 2; ++j) {
   1594         if (DataLayout == ColMajor) {
   1595           cout << patch(i, j, k) << " ";
   1596         } else {
   1597           cout << patch(k, i, j) << " ";
   1598         }
   1599       }
   1600       cout << endl;
   1601     }
   1602   }
   1603 
   1604 This code results in the following output when the data layout is ColMajor:
   1605 
   1606 patch index: 0
   1607 0 1
   1608 4 5
   1609 patch index: 1
   1610 4 5
   1611 8 9
   1612 patch index: 2
   1613 1 2
   1614 5 6
   1615 patch index: 3
   1616 5 6
   1617 9 10
   1618 patch index: 4
   1619 2 3
   1620 6 7
   1621 patch index: 5
   1622 6 7
   1623 10 11
   1624 
   1625 This code results in the following output when the data layout is RowMajor:
   1626 (NOTE: the set of patches is the same as in ColMajor, but are indexed differently).
   1627 
   1628 patch index: 0
   1629 0 1
   1630 4 5
   1631 patch index: 1
   1632 1 2
   1633 5 6
   1634 patch index: 2
   1635 2 3
   1636 6 7
   1637 patch index: 3
   1638 4 5
   1639 8 9
   1640 patch index: 4
   1641 5 6
   1642 9 10
   1643 patch index: 5
   1644 6 7
   1645 10 11
   1646 
   1647 ### <Operation>  extract_image_patches(const Index patch_rows, const Index patch_cols,
   1648                           const Index row_stride, const Index col_stride,
   1649                           const PaddingType padding_type)
   1650 
   1651 Returns a tensor of coefficient image patches extracted from the input tensor,
   1652 which is expected to have dimensions ordered as follows (depending on the data
   1653 layout of the input tensor, and the number of additional dimensions 'N'):
   1654 
   1655 *) ColMajor
   1656 1st dimension: channels (of size d)
   1657 2nd dimension: rows (of size r)
   1658 3rd dimension: columns (of size c)
   1659 4th-Nth dimension: time (for video) or batch (for bulk processing).
   1660 
   1661 *) RowMajor (reverse order of ColMajor)
   1662 1st-Nth dimension: time (for video) or batch (for bulk processing).
   1663 N+1'th dimension: columns (of size c)
   1664 N+2'th dimension: rows (of size r)
   1665 N+3'th dimension: channels (of size d)
   1666 
   1667 The returned tensor has one greater dimension than the input tensor, which is
   1668 used to index each patch. The patch index in the output tensor depends on the
   1669 data layout of the input tensor: the patch index is the 4'th dimension in
   1670 ColMajor layout, and the 4'th from the last dimension in RowMajor layout.
   1671 
   1672 For example, given the following input tensor with the following dimension
   1673 sizes:
   1674  *) depth:   2
   1675  *) rows:    3
   1676  *) columns: 5
   1677  *) batch:   7
   1678 
   1679   Tensor<float, 4> tensor(2,3,5,7);
   1680   Tensor<float, 4, RowMajor> tensor_row_major = tensor.swap_layout();
   1681 
   1682 2x2 image patches can be extracted and indexed using the following code:
   1683 
   1684 *) 2D patch: ColMajor (patch indexed by second-to-last dimension)
   1685   Tensor<float, 5> twod_patch;
   1686   twod_patch = tensor.extract_image_patches<2, 2>();
   1687   // twod_patch.dimension(0) == 2
   1688   // twod_patch.dimension(1) == 2
   1689   // twod_patch.dimension(2) == 2
   1690   // twod_patch.dimension(3) == 3*5
   1691   // twod_patch.dimension(4) == 7
   1692 
   1693 *) 2D patch: RowMajor (patch indexed by the second dimension)
   1694   Tensor<float, 5, RowMajor> twod_patch_row_major;
   1695   twod_patch_row_major = tensor_row_major.extract_image_patches<2, 2>();
   1696   // twod_patch_row_major.dimension(0) == 7
   1697   // twod_patch_row_major.dimension(1) == 3*5
   1698   // twod_patch_row_major.dimension(2) == 2
   1699   // twod_patch_row_major.dimension(3) == 2
   1700   // twod_patch_row_major.dimension(4) == 2
   1701 
   1702 ## Special Operations
   1703 
   1704 ### <Operation> cast<T>()
   1705 
   1706 Returns a tensor of type T with the same dimensions as the original tensor.
   1707 The returned tensor contains the values of the original tensor converted to
   1708 type T.
   1709 
   1710     Eigen::Tensor<float, 2> a(2, 3);
   1711     Eigen::Tensor<int, 2> b = a.cast<int>();
   1712 
   1713 This can be useful for example if you need to do element-wise division of
   1714 Tensors of integers.  This is not currently supported by the Tensor library
   1715 but you can easily cast the tensors to floats to do the division:
   1716 
   1717     Eigen::Tensor<int, 2> a(2, 3);
   1718     a.setValues({{0, 1, 2}, {3, 4, 5}});
   1719     Eigen::Tensor<int, 2> b =
   1720         (a.cast<float>() / a.constant(2).cast<float>()).cast<int>();
   1721     cout << "a" << endl << a << endl << endl;
   1722     cout << "b" << endl << b << endl << endl;
   1723     =>
   1724     a
   1725     0 1 2
   1726     3 4 5
   1727 
   1728     b
   1729     0 0 1
   1730     1 2 2
   1731 
   1732 
   1733 ### <Operation>     eval()
   1734 
   1735 TODO
   1736 
   1737 
   1738 ## Representation of scalar values
   1739 
   1740 Scalar values are often represented by tensors of size 1 and rank 1. It would be
   1741 more logical and user friendly to use tensors of rank 0 instead. For example
   1742 Tensor<T, N>::maximum() currently returns a Tensor<T, 1>. Similarly, the inner
   1743 product of 2 1d tensors (through contractions) returns a 1d tensor. In the
   1744 future these operations might be updated to return 0d tensors instead.
   1745 
   1746 ## Limitations
   1747 
   1748 *   The number of tensor dimensions is currently limited to 250 when using a
   1749     compiler that supports cxx11. It is limited to only 5 for older compilers.
   1750 *   The IndexList class requires a cxx11 compliant compiler. You can use an
   1751     array of indices instead if you don't have access to a modern compiler.
   1752 *   On GPUs only floating point values are properly tested and optimized for.
   1753 *   Complex and integer values are known to be broken on GPUs. If you try to use
   1754     them you'll most likely end up triggering a static assertion failure such as
   1755     EIGEN_STATIC_ASSERT(packetSize > 1, YOU_MADE_A_PROGRAMMING_MISTAKE)
   1756 
   1757 
   1758