
Lines Matching defs:Lower

10 // This file defines the interfaces that X86 uses to lower LLVM code into a
376 // If we don't have F16C support, then lower half float conversions
542 // Lower this to FGETSIGNx86 plus an AND.
876 // Custom lower build_vector, vector_shuffle, and extract_vector_elt.
928 // Custom lower v2i64 and v2f64 selects.
1265 // Custom lower several nodes for 256-bit types.
1276 // Do not attempt to custom lower other non-256-bit vectors
1574 // Custom lower several nodes.
1594 // Do not attempt to custom lower other non-512-bit vectors
1738 // We want to custom lower some of our intrinsics.
1747 // Only custom-lower 64-bit SADDO and friends on 64-bit because we don't
1993 // Do not use f64 to lower memcpy if source is string constant. It's
2373 /// Lower the result values of a call into the
3092 // that we can lower this successfully without moving the return address
3136 // Lower arguments at fp - stackoffset + fpdiff.
3314 // For tail calls lower the arguments to the 'real' stack slots. Sibcalls
3608 // Mask out lower bits, add stackalignment once plus the 12 bytes.
4594 // 1. Subvector should be inserted in the lower part (IdxVal == 0)
4618 // Zero lower bits of the Vec
4646 /// BUILD_VECTORS returns a larger BUILD_VECTOR while we're trying to lower
5086 /// Custom lower build_vector of v16i8.
5156 /// Custom lower build_vector of v8i16.
5187 /// Custom lower build_vector of v4i32 or v4f32.
5252 // See if we can lower this build_vector to a INSERTPS.
5778 // Lower BUILD_VECTOR operation for v8i1 and v16i1 types.
5975 /// the lower 128-bit of V0 and the upper 128-bit of V0. The second
5976 /// horizontal binop dag node would take as input the lower 128-bit of V1
5982 /// Otherwise, the first horizontal binop dag node takes as input the lower
5983 /// 128-bit of V0 and the lower 128-bit of V1, and the second horizontal binop
5989 /// If \p isUndefLO is set, then the algorithm propagates UNDEF to the lower
6133 /// Lower BUILD_VECTOR to a horizontal add/sub operation if possible.
6468 // Build both the lower and upper subvector.
6469 SDValue Lower = DAG.getNode(ISD::BUILD_VECTOR, dl, HVT,
6474 // Recreate the wider vector with the lower and upper part.
6476 return Concat128BitVectors(Lower, Upper, VT, NumElems, DAG, dl);
6477 return Concat256BitVectors(Lower, Upper, VT, NumElems, DAG, dl);
6739 // shuffles and lower them to optimal instruction patterns without leaving
7139 // We can lower these with PBLENDW which is mirrored across 128-bit lanes.
7155 // Attempt to lower to a bitmask if we can. VPAND is faster than VPBLENDVB.
7198 /// \brief Try to lower as a blend of elements from two inputs followed by
7256 // Try to lower with the simpler initial blend strategy unless one of the
7271 /// \brief Try to lower a vector shuffle as a byte rotation.
7276 // try to generically lower a vector shuffle through such a pattern. It
7278 /// PSRLDQ/PSLLDQ/POR, only whether the mask is valid to lower in that form.
7292 assert(!isNoopShuffleMask(Mask) && "We shouldn't lower no-op shuffles!");
7400 /// \brief Try to lower a vector shuffle as a bit shift (shifts in zeros).
7490 /// \brief Try to lower a vector shuffle using SSE4a EXTRQ/INSERTQ.
7505 // EXTRQ: Extract Len elements from lower half of source, starting at Idx.
7506 // Remainder of lower half result is zero and upper half is all undef.
7509 // lower half that isn't zeroable.
7516 // Attempt to match first Len sequential elements from the lower half.
7527 // elements must be in the lower half.
7553 // INSERTQ: Extract lowest Len elements from lower half of second source and
7586 // Match the remaining elements of the lower half.
7621 /// \brief Lower a vector shuffle as a zero or any extension.
7776 /// \brief Try to lower a vector shuffle as a zero extension on any microarch.
7778 /// This routine will try to do everything in its power to cleverly lower
7801 // Define a helper function to check a particular ext-scale and lower to it if
7803 auto Lower = [&](int Scale) -> SDValue {
7874 if (SDValue V = Lower(NumElements / NumExtElements))
7883 // MOVQ, copying the lower 64-bits and zero-extending to the upper 64-bits.
7943 /// \brief Try to lower insertion of a single element into a zero vector.
7945 /// This is a common pattern that we have especially efficient patterns to lower
7993 // If V1 can't be treated as a zero vector we have fewer options to lower
8046 /// \brief Try to lower broadcast of a single - truncated - integer element,
8055 "We can only lower integer broadcasts with AVX2!");
8099 /// \brief Try to lower broadcast of a single element.
8278 /// \brief Try to lower a shuffle as a permute of the inputs followed by an
8478 /// Tries to lower a 2-lane 64-bit shuffle using shuffle operations provided by
8572 // If we have direct support for blends, we should lower by decomposing into
8596 // To lower with a single SHUFPS we need to have the low half and high half
8606 /// \brief Lower a vector shuffle using the SHUFPS instruction.
8696 /// \brief Lower 4-lane 32-bit floating point shuffles.
8742 // There are special ways we can lower some single-element blends. However, we
8743 // have custom ways we can lower more complex single-element blends below that
8776 /// \brief Lower 4-lane i32 vector shuffles.
8791 // Whenever we can lower this as a zext, that instruction is strictly faster
8828 // There are special ways we can lower some single-element blends.
8858 // If we have direct support for blends, we should lower by decomposing into
8864 // Try to lower by permuting the inputs into an unpack instruction.
9409 // Whenever we can lower this as a zext, that instruction is strictly faster
9460 // There are special ways we can lower some single-element blends.
9580 /// This is a hybrid strategy to lower v16i8 vectors. It first attempts to
9726 // Check for SSSE3 which lets us lower all v16i8 shuffles much more directly
9771 // There are special ways we can lower some single-element blends.
9874 /// \brief Dispatching routine to lower various 128-bit x86 vector shuffles.
10095 "lower single-input shuffles as it "
10141 /// \brief Lower a vector shuffle crossing multiple 128-bit lanes as
10147 /// is lower than any other fully general cross-lane shuffle strategy I'm aware
10276 /// \brief Lower a vector shuffle by first fixing the 128-bit lanes and then
10477 // If we have AVX2 then we always want to lower with a blend because at v4 we
10500 assert(Subtarget->hasAVX2() && "We can only lower v4i64 with AVX2!");
10517 // use lower
10590 // options to efficiently lower the shuffle.
10648 // If we have AVX2 then we always want to lower with a blend because at v8 we
10671 assert(Subtarget->hasAVX2() && "We can only lower v8i32 with AVX2!");
10673 // Whenever we can lower this as a zext, that instruction is strictly faster
10750 assert(Subtarget->hasAVX2() && "We can only lower v16i16 with AVX2!");
10752 // Whenever we can lower this as a zext, that instruction is strictly faster
10841 assert(Subtarget->hasAVX2() && "We can only lower v32i8 with AVX2!");
10843 // Whenever we can lower this as a zext, that instruction is strictly faster
10904 /// \brief High-level routine to lower various 256-bit x86 vector shuffles.
10966 /// \brief Try to lower a vector shuffle as a 128-bit shuffles.
11107 assert(Subtarget->hasBWI() && "We can only lower v32i16 with AVX-512-BWI!");
11122 assert(Subtarget->hasBWI() && "We can only lower v64i8 with AVX-512-BWI!");
11128 /// \brief High-level routine to lower various 512-bit x86 vector shuffles.
11140 "Cannot lower 512-bit vectors w/ basic ISA!");
11149 // lower them. Each lowering routine of a given type is allowed to assume that
11177 // Lower vXi1 vector shuffles.
11188 "Cannot lower 512-bit vectors w/o basic ISA!");
11250 "Can't lower MMX shuffles");
11278 // simple ones. Directly lower these as a buildvector of zeros.
11321 // ensure that the sum of indices for V1 is equal to or lower than the sum
11323 // indices for V1 is lower than the number of odd indices for V2.
11417 /// \brief Try to lower a VSELECT instruction to a vector shuffle.
11451 // Try to lower this to a blend-style vector shuffle. This can handle all
11481 // FIXME: We should custom lower this by fixing the condition and using i8
11680 // Note if the lower 64 bits of the result of the UNPCKHPD is then stored
11872 // Lower a node with an EXTRACT_SUBVECTOR opcode. This may result in
11898 // Lower a node with an INSERT_SUBVECTOR opcode. This may result in a
12189 // Lower ISD::GlobalTLSAddress using the "general dynamic" model, 32 bit
12203 // Lower ISD::GlobalTLSAddress using the "general dynamic" model, 64 bit
12250 // Lower ISD::GlobalTLSAddress using the "initial exec" or "local exec" model.
12341 // Darwin only has one model of TLS. Lower to that.
12457 /// LowerShiftParts - Lower SRA_PARTS and friends, which return two i32 values
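LowerShiftParts splits a 64-bit shift across two i32 halves. A minimal scalar sketch of the SRA_PARTS case, not LLVM code (`sra_parts` is a hypothetical helper; the SHRD/SAR comments name the x86 instructions the real lowering would select):

```python
MASK = 0xFFFFFFFF

def sra_parts(lo: int, hi: int, amt: int):
    """Arithmetic shift right of the 64-bit value hi:lo by amt (0..63),
    returning the two 32-bit result halves."""
    if amt == 0:
        return lo, hi
    sign = MASK if hi & 0x80000000 else 0
    ext_hi = hi | (sign << 32)                  # hi sign-extended to 64 bits
    if amt < 32:
        new_lo = ((lo >> amt) | (hi << (32 - amt))) & MASK   # SHRD
        new_hi = (ext_hi >> amt) & MASK                      # SAR
    else:
        new_lo = (ext_hi >> (amt - 32)) & MASK               # SAR of hi only
        new_hi = sign                                        # splat the sign
    return new_lo, new_hi
```

For amt == 32 the second branch reduces to new_lo = hi, new_hi = sign, i.e. a plain move plus sign fill.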
12533 "Unknown SINT_TO_FP to lower!");
12976 // to i16, i32 or i64, and we lower it to a legal sequence.
12978 // Otherwise we lower it to a sequence ending with a FIST, return a
13013 "Unknown FP_TO_INT to lower!");
13024 // We lower FP->int64 into FISTP64 followed by a load from a temporary
13033 default: llvm_unreachable("Invalid FP_TO_SINT to lower!");
13093 assert(DstTy == MVT::i64 && "Invalid FP_TO_SINT to lower!");
13174 // Use vpunpcklwd for 4 lower elements v8i16 -> v4i32.
13176 // Concat upper and lower parts.
13179 // Use vpunpckldq for 4 lower elements v4i32 -> v2i64.
13181 // Concat upper and lower parts.
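The zero-extension lowering sketched in the lines above interleaves the narrow lanes with a zero vector so each adjacent pair reads as one doubled-width lane. A scalar sketch under that scheme (`unpack_with_zero` is an illustrative name; the real code uses vpunpcklwd/vpunpckhwd and then vpunpckldq):

```python
def unpack_with_zero(v, lane_bits, lo_half=True):
    """Interleave one half of v with zeros, then view adjacent pairs as
    doubled-width lanes: the narrow value lands in the low bits and the
    zero in the high bits, i.e. a zero extension of that half."""
    half = len(v) // 2
    src = v[:half] if lo_half else v[half:]
    interleaved = []
    for x in src:
        interleaved += [x, 0]          # punpckl/hwd against a zero vector
    return [interleaved[2 * i] | (interleaved[2 * i + 1] << lane_bits)
            for i in range(half)]

v8i16 = [1, 2, 3, 4, 5, 6, 7, 8]
lo = unpack_with_zero(v8i16, 16, lo_half=True)    # v4i32 from the lower half
hi = unpack_with_zero(v8i16, 16, lo_half=False)   # v4i32 from the upper half
v8i32 = lo + hi                                   # concat upper and lower parts
assert v8i32 == v8i16                             # values unchanged, lanes widened
```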
13478 // into an FNABS. We'll lower the FABS after that if it is still in use.
13642 // Lower ISD::FGETSIGN to (AND (X86ISD::FGETSIGNx86 ...) 1).
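The FGETSIGN lowering noted above moves the float's sign bit into an integer register and then ANDs it with 1. A scalar sketch (`fgetsign` is a hypothetical helper; FGETSIGNx86 itself corresponds to a MOVMSK-style sign-bit move):

```python
import struct

def fgetsign(x: float) -> int:
    """Sign bit of a single-precision float, via bit move then AND 1."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]  # raw f32 bits
    movmsk = bits >> 31          # FGETSIGNx86: sign bit down to bit 0
    return movmsk & 1            # the AND with 1

assert fgetsign(-0.0) == 1      # distinguishes -0.0 from +0.0
assert fgetsign(3.5) == 0
```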
14484 // Lower using XOP integer comparisons.
14592 assert(Subtarget->hasSSE2() && "Don't know how to lower!");
14599 // bits of the inputs before performing those operations. The lower
14636 assert(Subtarget->hasSSE2() && !FlipSigns && "Don't know how to lower!");
14645 // Make sure the lower and upper halves are both all-ones.
14697 // Lower (X & (1 << N)) == 0 to BT(X, N).
14698 // Lower ((X >>u N) & 1) != 0 to BT(X, N).
14699 // Lower ((X >>s N) & 1) != 0 to BT(X, N).
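All three patterns in the lines above reduce to a single bit test of bit N. A scalar check of the equivalences (`bt` is a hypothetical stand-in for the BT instruction; `>>u` is modeled by masking to 32 bits, `>>s` by Python's arithmetic shift):

```python
def bt(x: int, n: int) -> int:
    """Bit N of X, as BT would report it in CF."""
    return (x >> n) & 1

BITS = 32
for x in (0, 1, -1, -12345, 0x55AA55AA):
    for n in (0, 1, 7, 30):
        ux = x & ((1 << BITS) - 1)                        # unsigned view of x
        assert ((x & (1 << n)) == 0) == (bt(x, n) == 0)   # (X & (1 << N)) == 0
        assert (((ux >> n) & 1) != 0) == (bt(x, n) != 0)  # ((X >>u N) & 1) != 0
        assert (((x >> n) & 1) != 0) == (bt(x, n) != 0)   # ((X >>s N) & 1) != 0
```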
14815 // Lower FP selects into a CMP/AND/ANDN/OR sequence when the necessary SSE ops
15160 // pre-SSE41 targets unpack lower lanes and then sign-extend using SRAI.
15245 // Lower vector extended loads using a shuffle. If SSSE3 is not available we
15255 assert(RegVT.isVector() && "We only custom lower vector sext loads.");
15257 "We only custom lower integer vector sext loads.");
15260 assert(Subtarget->hasSSE2() && "We only custom lower sext loads with SSE2.");
15281 // integer 256-bit operations needed to directly lower a sextload is if we
15345 "Can only lower sext loads with a single scalar load!");
15367 "We only lower types that form legal widened vector types");
15728 // Lower dynamic stack allocation to _alloca call for Cygwin/Mingw targets.
15738 bool Lower = (Subtarget->isOSWindows() && !Subtarget->isTargetMachO()) ||
15757 if (!Lower) {
16078 // SSE/AVX packed shifts only use the lower 64 bits of the shift count.
16747 default: return SDValue(); // Don't custom lower most intrinsics.
16757 // return an integer value, not just an instruction so lower it to the ptest
17082 // also used to custom lower READCYCLECOUNTER nodes.
17161 /// \brief Lower intrinsics for TRUNCATE_TO_MEM case
17748 /// \brief Lower a vector CTLZ using native supported vector CTLZ instruction.
17993 // Lower v16i8/v32i8 mul as promotion to v8i16/v16i16 vector
18058 // Multiply, mask the lower 8bits of the lo/hi results and pack
18066 // Lower v4i32 mul as 2x shuffle, 2x pmuludq, 2x shuffle.
18069 "Should not custom lower when pmuldq is available!");
18091 "Only know how to lower V2I64/V4I64/V8I64 multiply");
18211 // Emit two multiplies, one for the lower 2 ints and one for the higher 2
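The v4i32 multiply lowering noted above (2x shuffle, 2x pmuludq, 2x shuffle) can be simulated lane by lane: PMULUDQ multiplies only the even 32-bit lanes into 64-bit products, so the odd lanes are shuffled into even positions, multiplied separately, and the low halves shuffled back together. A scalar sketch (`pmuludq` and `mul_v4i32` are illustrative names, not LLVM APIs):

```python
MASK32 = (1 << 32) - 1

def pmuludq(a, b):
    """Multiply the even 32-bit lanes of a and b into 64-bit products."""
    return [(a[i] * b[i]) & ((1 << 64) - 1) for i in (0, 2)]

def mul_v4i32(a, b):
    even = pmuludq(a, b)                  # products of lanes 0 and 2
    a_odd = [a[1], 0, a[3], 0]            # shuffle odd lanes to even slots
    b_odd = [b[1], 0, b[3], 0]
    odd = pmuludq(a_odd, b_odd)           # products of lanes 1 and 3
    # shuffle the low 32 bits of each 64-bit product back into place
    return [even[0] & MASK32, odd[0] & MASK32,
            even[1] & MASK32, odd[1] & MASK32]

assert mul_v4i32([3, 5, 7, 9], [4, 6, 8, 10]) == [12, 30, 56, 90]
```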
18322 // Splat sign to upper i32 dst, and SRA upper i32 src to lower i32.
18325 SDValue Lower = getTargetVShiftByConstNode(X86ISD::VSRAI, dl, ExVT, Ex,
18328 Ex = DAG.getVectorShuffle(ExVT, dl, Upper, Lower, {5, 1, 7, 3});
18330 Ex = DAG.getVectorShuffle(ExVT, dl, Upper, Lower,
18333 // SRA upper i32, SHL whole i64 and select lower i32.
18336 SDValue Lower =
18338 Lower = DAG.getBitcast(ExVT, Lower);
18340 Ex = DAG.getVectorShuffle(ExVT, dl, Upper, Lower, {4, 1, 6, 3});
18342 Ex = DAG.getVectorShuffle(ExVT, dl, Upper, Lower,
18566 assert(Subtarget->hasSSE2() && "Only custom lower when we have SSE2!");
18616 // If possible, lower this packed shift into a vector multiply instead of
18649 // Lower SHL with variable shift amount.
18660 // If possible, lower this shift as a sequence of two shifts by
18735 // immediate shifts, else we need to zero-extend each lane to the lower i64
18761 // The SSE2 shifts use the lower i64 as the same shift amount for
18805 // the 3 lower bits of each byte.
18835 // lower byte.
18877 // Logical shift the result back to the lower byte, leaving a zero upper
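The byte-shift trick in the lines above widens each byte into the upper byte of a 16-bit lane, shifts at 16-bit granularity, then logically shifts the result back down, leaving a zero upper byte. A per-byte arithmetic-shift sketch under that scheme (`sra_bytes` is a hypothetical helper, not LLVM code):

```python
def sra_bytes(v, amt):
    """Per-byte arithmetic shift right built from 16-bit shifts."""
    out = []
    for b in v:
        w = b << 8                                   # byte in the upper half of an i16
        if w & 0x8000:                               # 16-bit arithmetic shift right
            w = (w >> amt) | (~(0xFFFF >> amt) & 0xFFFF)
        else:
            w >>= amt
        out.append((w >> 8) & 0xFF)                  # logical shift back to the lower byte
    return out

# -128 >>s 1 = -64, 127 >>s 1 = 63, -1 >>s 1 = -1, 1 >>s 1 = 0
assert sra_bytes([0x80, 0x7F, 0xFF, 0x01], 1) == [0xC0, 0x3F, 0xFF, 0x00]
```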
19052 // Lower the "add/sub/mul with overflow" instruction into a regular ins plus
19468 // The general idea is that every lower byte nibble in the input vector is an
19471 // higher nibbles for each byte and (2) a vector with the lower nibbles (and
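The CTPOP lowering described in the two lines above treats each nibble as an index into an in-register 16-entry popcount table (looked up with PSHUFB) and sums the two lookups per byte. A scalar sketch (`ctpop_bytes` is an illustrative name):

```python
# popcounts of 0..15: the 16-entry table the PSHUFB lookup indexes into
LUT = [bin(i).count("1") for i in range(16)]

def ctpop_bytes(v):
    """Per-byte population count via two nibble lookups."""
    lo = [b & 0x0F for b in v]        # vector of lower nibbles
    hi = [b >> 4 for b in v]          # vector of higher nibbles
    return [LUT[l] + LUT[h] for l, h in zip(lo, hi)]

assert ctpop_bytes([0x00, 0xFF, 0x0F, 0xA5]) == [0, 8, 4, 4]
```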
20015 // require special handling for these nodes), lower them as literal NOOPs for
20036 // require special handling for these nodes), lower them as literal NOOPs for
20055 default: llvm_unreachable("Should not custom lower this!");
20672 // If lower 4G is not available, then we must use rip-relative addressing.
21439 // In this case we can lower all the CMOVs using a single inserted BB, and
21460 // Case 2, we lower cascaded CMOVs such as
21503 // If we lower both CMOVs in a single step, we can instead generate:
21562 // If we have a cascaded CMOV, we lower it to two successive branches to
23337 // moves upper half elements into the lower half part. For example:
24405 // subtarget. We custom lower VSELECT nodes with constant conditions and
24407 // lower, so we both check the operation's status and explicitly handle the
25021 // as above SHIFTs (only SHIFT on 1 has lower code size).
25619 // SHLD/SHRD instructions have lower register pressure, but on some
26306 // If we are a 64-bit capable x86, lower to a single movq load/store pair.
26326 // Otherwise, lower to two pairs of 32-bit loads / stores.
26817 // should be able to lower to FMAX/FMIN alone.
27909 // lower so don't worry about this.
28131 /// LowerAsmOperandForConstraint - Lower the specified operand into the Ops