Lines Matching full:pshufb

3489   case X86ISD::PSHUFB:
4285 case X86ISD::PSHUFB: {
6777 // pshufb when available. We can only use more than 2 unpack instructions
6778 // when zero extending i8 elements which also makes it easier to use pshufb.
6787 DAG.getNode(X86ISD::PSHUFB, DL, MVT::v16i8, InputV,
8263 /// \brief Helper to form a PSHUFB-based shuffle+blend.
8296 V1 = DAG.getNode(X86ISD::PSHUFB, DL, MVT::v16i8,
8300 V2 = DAG.getNode(X86ISD::PSHUFB, DL, MVT::v16i8,
8426 // If we can't directly blend but can use PSHUFB, that will be better as it
8657 // with PSHUFB. It is important to do this before we attempt to generate any
8659 // lowerings can find an instruction sequence that is faster than a PSHUFB, we
8661 // a PSHUFB in the end. But once we start blending from multiple inputs,
8662 // the complexity of DAG combining bad patterns back into PSHUFB is too high,
8664 // PSHUFB approach because of its ability to zero lanes.
8673 SDValue PSHUFB = lowerVectorShuffleAsPSHUFB(DL, MVT::v16i8, V1, V2, Mask,
8678 // important as a single pshufb is significantly faster for that.
8691 // shuffles will both be pshufb, in which case we shouldn't bother with
8698 return PSHUFB;
9760 X86ISD::PSHUFB, DL, MVT::v32i8,
9849 X86ISD::PSHUFB, DL, MVT::v32i8, V1,
12088 // On AVX2, v8i32 -> v8i16 becomes PSHUFB.
12106 In = DAG.getNode(X86ISD::PSHUFB, DL, MVT::v32i8, In, BV);
12126 // The PSHUFB mask:
17537 case X86ISD::PSHUFB: return "X86ISD::PSHUFB";
19544 /// for this operation, or into a PSHUFB instruction which is a fully general
19680 // If we have 3 or more shuffle instructions or a chain involving PSHUFB, we
19681 // can replace them with a single PSHUFB instruction profitably. Intel's
19682 // manuals suggest only using PSHUFB if doing so replaces 5 instructions, but
19683 // in practice PSHUFB tends to be *very* fast so we're more aggressive.
19704 Op = DAG.getNode(X86ISD::PSHUFB, DL, ByteVT, Op, PSHUFBMaskOp);
19729 /// PSHUFB instruction if available. We do this as the last combining step
19730 /// to ensure we avoid using PSHUFB if we can implement the shuffle with
19741 /// would simplify under the threshold for PSHUFB formation because of
19822 case X86ISD::PSHUFB:
23913 case X86ISD::PSHUFB:
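
Several of the matched comments (for example at 8664 and 19544) rely on PSHUFB being a fully general byte shuffle that can also zero lanes. As a rough illustration only, the per-byte behaviour of the 128-bit form can be modelled in scalar C++. The names `Bytes16` and `emulatePshufb` are hypothetical and exist only for this sketch; they are not part of LLVM, and the lowering code above builds `X86ISD::PSHUFB` DAG nodes rather than calling anything like this.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical type alias for this sketch: one 128-bit vector of 16 bytes.
using Bytes16 = std::array<std::uint8_t, 16>;

// Hypothetical helper (not an LLVM API): scalar model of 128-bit PSHUFB.
// For each output byte i:
//   - if bit 7 of mask[i] is set, the output byte is zeroed;
//   - otherwise the low 4 bits of mask[i] select a byte from src.
Bytes16 emulatePshufb(const Bytes16 &src, const Bytes16 &mask) {
  Bytes16 out{};
  for (std::size_t i = 0; i < out.size(); ++i) {
    if (mask[i] & 0x80)
      out[i] = 0;                    // high bit set: zero this lane
    else
      out[i] = src[mask[i] & 0x0F];  // low nibble indexes the source
  }
  return out;
}
```

A mask byte of `0x80` therefore clears a lane without a separate blend or AND, which is the property the comment at 8664 refers to. The 256-bit form used with `MVT::v32i8` applies the same rule independently within each 128-bit half.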