Lines Matching refs:Lanes
6773 /// \brief Test whether there are elements crossing 128-bit lanes in this
6795 /// non-trivial to compute in the face of undef lanes. The representation is
6808 // This entry crosses lanes, so there is no way to model this shuffle.
6862 /// shuffling 4 lanes. It can be used with most of the PSHUF instructions for
6887 /// zero. Many x86 shuffles can zero lanes cheaply and we often want to handle
6888 /// as many lanes with this technique as possible to simplify the remaining
7139 // We can lower these with PBLENDW which is mirrored across 128-bit lanes.
7286 /// rotate* of the vector lanes.
7702 // The SSE4A EXTRQ instruction can efficiently extend the first 2 lanes
7783 /// inputs to explicitly zero-extend and undef-lanes (sometimes undef due to
8025 // If we have 4 or fewer lanes we can cheaply shuffle the element into
8511 assert(Mask[0] != -1 && "No undef lanes in multi-input v2 shuffles!");
8512 assert(Mask[1] != -1 && "No undef lanes in multi-input v2 shuffles!");
8631 // This will only ever happen in the high lanes because we commute the
8657 // Handle the easy case where we have V1 in the low lanes and V2 in the
8658 // high lanes.
8670 // We have a mixture of V1 and V2 in both low and high lanes. Rather than
8883 /// The lowering strategy is to try to form pairs of input lanes which are
8885 /// to place them onto the right half, and finally unpack the paired lanes into
8892 /// This code also handles repeated 128-bit lanes of v8i16 shuffles, but each
9523 /// Any of these lanes can of course be undef.
9549 // Ignore undef lanes, we'll optimistically collapse them to the pattern we
9734 // PSHUFB approach because of its ability to zero lanes.
9837 // Check if any of the odd lanes in the v16i8 are used. If not, we can mask
9929 // When zeroing, we need to spread the zeroing across both lanes to widen.
10141 /// \brief Lower a vector shuffle crossing multiple 128-bit lanes as
10142 /// a permutation and blend of those lanes.
10276 /// \brief Lower a vector shuffle by first fixing the 128-bit lanes and then
10279 /// This will only succeed when the result of fixing the 128-bit lanes results
10281 /// each 128-bit lane. This handles many cases where we can quickly blend away
10301 SmallVector<int, 4> Lanes;
10302 Lanes.resize(NumLanes, -1);
10311 if (Lanes[j] < 0) {
10313 Lanes[j] = Mask[i] / LaneSize;
10314 } else if (Lanes[j] != Mask[i] / LaneSize) {
10329 // First shuffle the lanes into place.
10335 if (Lanes[i] >= 0) {
10336 LaneMask[2 * i + 0] = 2*Lanes[i] + 0;
10337 LaneMask[2 * i + 1] = 2*Lanes[i] + 1;
10467 // Try to simplify this by merging 128-bit lanes to enable a lane-based
10469 // we will be able to shuffle even across lanes the other input in a single
10516 // When the shuffle is mirrored between the 128-bit lanes of the unit, we can
10517 lanes.
10536 // lanes.
10551 // Try to simplify this by merging 128-bit lanes to enable a lane-based
10553 // we will be able to shuffle even across lanes the other input in a single
10621 // two 128-bit lanes use the variable mask to VPERMILPS.
10642 // Try to simplify this by merging 128-bit lanes to enable a lane-based
10691 // lanes.
10726 // Try to simplify this by merging 128-bit lanes to enable a lane-based
10818 // Try to simplify this by merging 128-bit lanes to enable a lane-based
10894 // Try to simplify this by merging 128-bit lanes to enable a lane-based
11236 /// to involve fewer lanes of wider elements, consolidate symmetric patterns
11378 // There are 2 lanes if (NumElems > 8), and 1 lane otherwise.
11379 // We don't handle the >2 lanes case right now.
11386 // Blend for v16i16 should be symmetric for both lanes.
15160 // pre-SSE41 targets unpack lower lanes and then sign-extend using SRAI.
15409 // lanes.
18278 // The shift amount is a variable, but it is the same for all vector lanes.
18762 // all lanes and the upper i64 is ignored. These shuffle masks
18763 // optimally zero-extend each lane on SSE2/SSE41/AVX targets.
18796 // zero - a negative value will set all bits of the lanes to true
18934 // set all bits of the lanes to true and VSELECT uses that in
23024 // The incoming lanes are zero or undef, it doesn't matter which ones we
23030 // Ok, we have non-zero lanes, map them through.
23096 "Mask doesn't repeat in high 128-bit lanes!");
23459 /// the operands which explicitly discard the lanes which are unused by this
24316 // don't rely on particular values of undef lanes.
26423 // operate independently on 128-bit lanes.
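
The matches at 6773 and 10301-10337 only show fragments of the lane bookkeeping. The following is a minimal standalone sketch of that logic, not the code from the file itself. It assumes a shuffle mask is a vector of ints with -1 meaning undef, that LaneSize is the number of elements per 128-bit lane, and that LaneMask starts out all undef. The function names crossesLanes, computeSourceLanes and buildLaneMask are hypothetical, and so is the definition of the index j, which the listing does not show.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns true if any defined mask element reads from a
// different 128-bit lane than the one it is written to.
// Mask[i] == -1 marks an undef element; indices >= Size refer to the second
// input, so they are reduced modulo Size before the lane comparison.
bool crossesLanes(const std::vector<int> &Mask, int LaneSize) {
  int Size = static_cast<int>(Mask.size());
  assert(LaneSize > 0 && Size % LaneSize == 0 && "Mask must cover whole lanes");
  for (int i = 0; i < Size; ++i) {
    if (Mask[i] < 0)
      continue; // Undef elements never cross a lane.
    if ((Mask[i] % Size) / LaneSize != i / LaneSize)
      return true;
  }
  return false;
}

// Hypothetical helper mirroring the Lanes[j] fragment in the listing: for each
// destination lane j, record the single source lane it reads from (-1 if the
// lane is entirely undef). Returns false when one destination lane would need
// elements from two different source lanes.
bool computeSourceLanes(const std::vector<int> &Mask, int LaneSize,
                        std::vector<int> &Lanes) {
  int Size = static_cast<int>(Mask.size());
  assert(LaneSize > 0 && Size % LaneSize == 0 && "Mask must cover whole lanes");
  int NumLanes = Size / LaneSize;
  Lanes.assign(NumLanes, -1);
  for (int i = 0; i < Size; ++i) {
    if (Mask[i] < 0)
      continue;
    int j = i / LaneSize; // Assumed: destination lane of element i.
    if (Lanes[j] < 0)
      Lanes[j] = Mask[i] / LaneSize;
    else if (Lanes[j] != Mask[i] / LaneSize)
      return false;
  }
  return true;
}

// Hypothetical helper mirroring the LaneMask fragment: each 128-bit lane is
// treated as two 64-bit elements, so lane L maps to elements 2*L and 2*L+1.
// Entries for unused lanes stay at -1 (assumed initial value).
std::vector<int> buildLaneMask(const std::vector<int> &Lanes) {
  std::vector<int> LaneMask(2 * Lanes.size(), -1);
  for (std::size_t i = 0; i < Lanes.size(); ++i) {
    if (Lanes[i] >= 0) {
      LaneMask[2 * i + 0] = 2 * Lanes[i] + 0;
      LaneMask[2 * i + 1] = 2 * Lanes[i] + 1;
    }
  }
  return LaneMask;
}
```

As a usage example, an 8-element mask {4, 5, 6, 7, 0, 1, 2, 3} with LaneSize 4 swaps the two lanes: crossesLanes reports a crossing, computeSourceLanes yields Lanes = {1, 0}, and buildLaneMask turns that into the 64-bit element mask {2, 3, 0, 1}.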