Cross Reference: /external/llvm/lib/Target/X86/X86ISelLowering.cpp

Lines Matching refs:Lanes
6773 /// \brief Test whether there are elements crossing 128-bit lanes in this
6795 /// non-trivial to compute in the face of undef lanes. The representation is
6808       // This entry crosses lanes, so there is no way to model this shuffle.
6862 /// shuffling 4 lanes. It can be used with most of the PSHUF instructions for
6887 /// zero. Many x86 shuffles can zero lanes cheaply and we often want to handle
6888 /// as many lanes with this technique as possible to simplify the remaining
7139       // We can lower these with PBLENDW which is mirrored across 128-bit lanes.
7286 /// rotate* of the vector lanes.
7702   // The SSE4A EXTRQ instruction can efficiently extend the first 2 lanes
7783 /// inputs to explicitly zero-extend and undef-lanes (sometimes undef due to
8025     // If we have 4 or fewer lanes we can cheaply shuffle the element into
8511   assert(Mask[0] != -1 && "No undef lanes in multi-input v2 shuffles!");
8512   assert(Mask[1] != -1 && "No undef lanes in multi-input v2 shuffles!");
8631       // This will only ever happen in the high lanes because we commute the
8657       // Handle the easy case where we have V1 in the low lanes and V2 in the
8658       // high lanes.
8670       // We have a mixture of V1 and V2 in both low and high lanes. Rather than
8883 /// The lowering strategy is to try to form pairs of input lanes which are
8885 /// to place them onto the right half, and finally unpack the paired lanes into
8892 /// This code also handles repeated 128-bit lanes of v8i16 shuffles, but each
9523 /// Any of these lanes can of course be undef.
9549     // Ignore undef lanes, we'll optimistically collapse them to the pattern we
9734   // PSHUFB approach because of its ability to zero lanes.
9837   // Check if any of the odd lanes in the v16i8 are used. If not, we can mask
9929     // When zeroing, we need to spread the zeroing across both lanes to widen.
10141 /// \brief Lower a vector shuffle crossing multiple 128-bit lanes as
10142 /// a permutation and blend of those lanes.
10276 /// \brief Lower a vector shuffle by first fixing the 128-bit lanes and then
10279 /// This will only succeed when the result of fixing the 128-bit lanes results
10281 /// each 128-bit lanes. This handles many cases where we can quickly blend away
10301   SmallVector<int, 4> Lanes;
10302   Lanes.resize(NumLanes, -1);
10311     if (Lanes[j] < 0) {
10313       Lanes[j] = Mask[i] / LaneSize;
10314     } else if (Lanes[j] != Mask[i] / LaneSize) {
10329   // First shuffle the lanes into place.
10335     if (Lanes[i] >= 0) {
10336       LaneMask[2 * i + 0] = 2*Lanes[i] + 0;
10337       LaneMask[2 * i + 1] = 2*Lanes[i] + 1;
10467   // Try to simplify this by merging 128-bit lanes to enable a lane-based
10469   // we will be able to shuffle even across lanes the other input in a single
10516   // When the shuffle is mirrored between the 128-bit lanes of the unit, we can
10517 lanes.
10536   // lanes.
10551   // Try to simplify this by merging 128-bit lanes to enable a lane-based
10553   // we will be able to shuffle even across lanes the other input in a single
10621   // two 128-bit lanes use the variable mask to VPERMILPS.
10642   // Try to simplify this by merging 128-bit lanes to enable a lane-based
10691   // lanes.
10726   // Try to simplify this by merging 128-bit lanes to enable a lane-based
10818   // Try to simplify this by merging 128-bit lanes to enable a lane-based
10894   // Try to simplify this by merging 128-bit lanes to enable a lane-based
11236 /// to involve fewer lanes of wider elements, consolidate symmetric patterns
11378   // There are 2 lanes if (NumElems > 8), and 1 lane otherwise.
11379   // We don't handle the >2 lanes case right now.
11386   // Blend for v16i16 should be symmetric for the both lanes.
15160   // pre-SSE41 targets unpack lower lanes and then sign-extend using SRAI.
15409     // lanes.
18278 // The shift amount is a variable, but it is the same for all vector lanes.
18762       // all lanes and the upper i64 is ignored. These shuffle masks
18763       // optimally zero-extend each lanes on SSE2/SSE41/AVX targets.
18796       // zero - a negative value will set all bits of the lanes to true
18934       // set all bits of the lanes to true and VSELECT uses that in
23024       // The incoming lanes are zero or undef, it doesn't matter which ones we
23030     // Ok, we have non-zero lanes, map them through.
23096                "Mask doesn't repeat in high 128-bit lanes!");
23459 /// the operands which explicitly discard the lanes which are unused by this
24316             // don't rely on particular values of undef lanes.
26423   // operate independently on 128-bit lanes.
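
The matches at 10301-10337 show only fragments of the lane-fixing step, so a minimal standalone sketch may help in reading them. It is not the LLVM code: the function names computeSourceLanes and buildLaneMask, the use of std::vector and std::optional, and the skipping of undef (negative) mask entries are assumptions made for illustration. Only the Lanes bookkeeping and the two-entries-per-lane LaneMask arithmetic are taken from the matched lines.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper (not an LLVM API): for each destination lane, record
// the single source lane that feeds it. Returns std::nullopt when one
// destination lane draws elements from two different source lanes, in which
// case this "fix the lanes first" strategy does not apply.
std::optional<std::vector<int>>
computeSourceLanes(const std::vector<int> &Mask, int LaneSize) {
  assert(LaneSize > 0 && Mask.size() % LaneSize == 0 &&
         "mask must cover a whole number of lanes");
  const int NumLanes = static_cast<int>(Mask.size()) / LaneSize;
  std::vector<int> Lanes(NumLanes, -1); // -1 means "no source lane seen yet"

  for (int i = 0; i < static_cast<int>(Mask.size()); ++i) {
    if (Mask[i] < 0)
      continue; // assumption: negative entries are undef and impose nothing
    const int j = i / LaneSize;         // destination lane of element i
    const int Src = Mask[i] / LaneSize; // source lane of element i
    if (Lanes[j] < 0)
      Lanes[j] = Src;
    else if (Lanes[j] != Src)
      return std::nullopt; // two source lanes feed one destination lane
  }
  return Lanes;
}

// Hypothetical helper: expand the per-lane choice into a mask with two
// entries per lane, mirroring the arithmetic at 10336 and 10337.
std::vector<int> buildLaneMask(const std::vector<int> &Lanes) {
  std::vector<int> LaneMask(2 * Lanes.size(), -1);
  for (std::size_t i = 0; i < Lanes.size(); ++i) {
    if (Lanes[i] >= 0) {
      LaneMask[2 * i + 0] = 2 * Lanes[i] + 0;
      LaneMask[2 * i + 1] = 2 * Lanes[i] + 1;
    }
  }
  return LaneMask;
}
```

With a four-element mask {2, 3, 0, 1} and LaneSize 2, each destination lane reads from exactly one source lane, so the sketch yields Lanes {1, 0}. A mask such as {0, 2, 1, 3} mixes both source lanes within one destination lane and is rejected.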