
Lines Matching defs:Lower

10 // This file defines the interfaces that X86 uses to lower LLVM code into a
376 // If we don't have F16C support, then lower half float conversions
542 // Lower this to FGETSIGNx86 plus an AND.
876 // Custom lower build_vector, vector_shuffle, and extract_vector_elt.
928 // Custom lower v2i64 and v2f64 selects.
1265 // Custom lower several nodes for 256-bit types.
1276 // Do not attempt to custom lower other non-256-bit vectors
1574 // Custom lower several nodes.
1594 // Do not attempt to custom lower other non-512-bit vectors
1738 // We want to custom lower some of our intrinsics.
1747 // Only custom-lower 64-bit SADDO and friends on 64-bit because we don't
1993 // Do not use f64 to lower memcpy if source is string constant. It's
2373 /// Lower the result values of a call into the
3092 // that we can lower this successfully without moving the return address
3136 // Lower arguments at fp - stackoffset + fpdiff.
3314 // For tail calls lower the arguments to the 'real' stack slots. Sibcalls
3608 // Mask out lower bits, add stackalignment once plus the 12 bytes.
4594 // 1. Subvector should be inserted in the lower part (IdxVal == 0)
4618 // Zero lower bits of the Vec
4646 /// BUILD_VECTORS returns a larger BUILD_VECTOR while we're trying to lower
5086 /// Custom lower build_vector of v16i8.
5156 /// Custom lower build_vector of v8i16.
5187 /// Custom lower build_vector of v4i32 or v4f32.
5252 // See if we can lower this build_vector to a INSERTPS.
5778 // Lower BUILD_VECTOR operation for v8i1 and v16i1 types.
5975 /// the lower 128-bit of V0 and the upper 128-bit of V0. The second
5976 /// horizontal binop dag node would take as input the lower 128-bit of V1
5982 /// Otherwise, the first horizontal binop dag node takes as input the lower
5983 /// 128-bit of V0 and the lower 128-bit of V1, and the second horizontal binop
5989 /// If \p isUndefLO is set, then the algorithm propagates UNDEF to the lower
6133 /// Lower BUILD_VECTOR to a horizontal add/sub operation if possible.
6468 // Build both the lower and upper subvector.
6469 SDValue Lower = DAG.getNode(ISD::BUILD_VECTOR, dl, HVT,
6474 // Recreate the wider vector with the lower and upper part.
6476 return Concat128BitVectors(Lower, Upper, VT, NumElems, DAG, dl);
6477 return Concat256BitVectors(Lower, Upper, VT, NumElems, DAG, dl);
6739 // shuffles and lower them to optimal instruction patterns without leaving
7139 // We can lower these with PBLENDW which is mirrored across 128-bit lanes.
7155 // Attempt to lower to a bitmask if we can. VPAND is faster than VPBLENDVB.
7198 /// \brief Try to lower as a blend of elements from two inputs followed by
7256 // Try to lower with the simpler initial blend strategy unless one of the
7271 /// \brief Try to lower a vector shuffle as a byte rotation.
7276 // try to generically lower a vector shuffle through such a pattern. It
7278 /// PSRLDQ/PSLLDQ/POR, only whether the mask is valid to lower in that form.
7292 assert(!isNoopShuffleMask(Mask) && "We shouldn't lower no-op shuffles!");
7400 /// \brief Try to lower a vector shuffle as a bit shift (shifts in zeros).
7490 /// \brief Try to lower a vector shuffle using SSE4a EXTRQ/INSERTQ.
7505 // EXTRQ: Extract Len elements from lower half of source, starting at Idx.
7506 // Remainder of lower half result is zero and upper half is all undef.
7509 // lower half that isn't zeroable.
7516 // Attempt to match first Len sequential elements from the lower half.
7527 // elements must be in the lower half.
7553 // INSERTQ: Extract lowest Len elements from lower half of second source and
7586 // Match the remaining elements of the lower half.
7621 /// \brief Lower a vector shuffle as a zero or any extension.
7776 /// \brief Try to lower a vector shuffle as a zero extension on any microarch.
7778 /// This routine will try to do everything in its power to cleverly lower
7801 // Define a helper function to check a particular ext-scale and lower to it if
7803 auto Lower = [&](int Scale) -> SDValue {
7874 if (SDValue V = Lower(NumElements / NumExtElements))
7883 // MOVQ, copying the lower 64-bits and zero-extending to the upper 64-bits.
7943 /// \brief Try to lower insertion of a single element into a zero vector.
7945 /// This is a common pattern that we have especially efficient patterns to lower
7993 // If V1 can't be treated as a zero vector we have fewer options to lower
8046 /// \brief Try to lower broadcast of a single - truncated - integer element,
8055 "We can only lower integer broadcasts with AVX2!");
8099 /// \brief Try to lower broadcast of a single element.
8278 /// \brief Try to lower a shuffle as a permute of the inputs followed by an
8478 /// Tries to lower a 2-lane 64-bit shuffle using shuffle operations provided by
8572 // If we have direct support for blends, we should lower by decomposing into
8596 // To lower with a single SHUFPS we need to have the low half and high half
8606 /// \brief Lower a vector shuffle using the SHUFPS instruction.
8696 /// \brief Lower 4-lane 32-bit floating point shuffles.
8742 // There are special ways we can lower some single-element blends. However, we
8743 // have custom ways we can lower more complex single-element blends below that
8776 /// \brief Lower 4-lane i32 vector shuffles.
8791 // Whenever we can lower this as a zext, that instruction is strictly faster
8828 // There are special ways we can lower some single-element blends.
8858 // If we have direct support for blends, we should lower by decomposing into
8864 // Try to lower by permuting the inputs into an unpack instruction.
9409 // Whenever we can lower this as a zext, that instruction is strictly faster
9460 // There are special ways we can lower some single-element blends.
9580 /// This is a hybrid strategy to lower v16i8 vectors. It first attempts to
9726 // Check for SSSE3 which lets us lower all v16i8 shuffles much more directly
9771 // There are special ways we can lower some single-element blends.
9874 /// \brief Dispatching routine to lower various 128-bit x86 vector shuffles.
10095 "lower single-input shuffles as it "
10141 /// \brief Lower a vector shuffle crossing multiple 128-bit lanes as
10147 /// is lower than any other fully general cross-lane shuffle strategy I'm aware
10276 /// \brief Lower a vector shuffle by first fixing the 128-bit lanes and then
10477 // If we have AVX2 then we always want to lower with a blend because at v4 we
10500 assert(Subtarget->hasAVX2() && "We can only lower v4i64 with AVX2!");
10517 // use lower
10590 // options to efficiently lower the shuffle.
10648 // If we have AVX2 then we always want to lower with a blend because at v8 we
10671 assert(Subtarget->hasAVX2() && "We can only lower v8i32 with AVX2!");
10673 // Whenever we can lower this as a zext, that instruction is strictly faster
10750 assert(Subtarget->hasAVX2() && "We can only lower v16i16 with AVX2!");
10752 // Whenever we can lower this as a zext, that instruction is strictly faster
10841 assert(Subtarget->hasAVX2() && "We can only lower v32i8 with AVX2!");
10843 // Whenever we can lower this as a zext, that instruction is strictly faster
10904 /// \brief High-level routine to lower various 256-bit x86 vector shuffles.
10966 /// \brief Try to lower a vector shuffle as a 128-bit shuffles.
11107 assert(Subtarget->hasBWI() && "We can only lower v32i16 with AVX-512-BWI!");
11122 assert(Subtarget->hasBWI() && "We can only lower v64i8 with AVX-512-BWI!");
11128 /// \brief High-level routine to lower various 512-bit x86 vector shuffles.
11140 "Cannot lower 512-bit vectors w/ basic ISA!");
11149 // lower them. Each lowering routine of a given type is allowed to assume that
11177 // Lower vXi1 vector shuffles.
11188 "Cannot lower 512-bit vectors w/o basic ISA!");
11250 "Can't lower MMX shuffles");
11278 // simple ones. Directly lower these as a buildvector of zeros.
11321 // ensure that the sum of indices for V1 is equal to or lower than the sum
11323 // indices for V1 is lower than the number of odd indices for V2.
11417 /// \brief Try to lower a VSELECT instruction to a vector shuffle.
11451 // Try to lower this to a blend-style vector shuffle. This can handle all
11481 // FIXME: We should custom lower this by fixing the condition and using i8
11680 // Note if the lower 64 bits of the result of the UNPCKHPD is then stored
11872 // Lower a node with an EXTRACT_SUBVECTOR opcode. This may result in
11898 // Lower a node with an INSERT_SUBVECTOR opcode. This may result in a
12189 // Lower ISD::GlobalTLSAddress using the "general dynamic" model, 32 bit
12203 // Lower ISD::GlobalTLSAddress using the "general dynamic" model, 64 bit
12250 // Lower ISD::GlobalTLSAddress using the "initial exec" or "local exec" model.
12341 // Darwin only has one model of TLS. Lower to that.
12457 /// LowerShiftParts - Lower SRA_PARTS and friends, which return two i32 values
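LowerShiftParts splits a 64-bit shift across two i32 halves. A minimal scalar sketch of the SRA_PARTS case, not LLVM code (`sra_parts` is a hypothetical helper; the SHRD/SAR comments name the x86 instructions the real lowering would select):

```python
MASK = 0xFFFFFFFF

def sra_parts(lo: int, hi: int, amt: int):
    """Arithmetic shift right of the 64-bit value hi:lo by amt (0..63),
    returning the two 32-bit result halves."""
    if amt == 0:
        return lo, hi
    sign = MASK if hi & 0x80000000 else 0
    ext_hi = hi | (sign << 32)                  # hi sign-extended to 64 bits
    if amt < 32:
        new_lo = ((lo >> amt) | (hi << (32 - amt))) & MASK   # SHRD
        new_hi = (ext_hi >> amt) & MASK                      # SAR
    else:
        new_lo = (ext_hi >> (amt - 32)) & MASK               # SAR of hi only
        new_hi = sign                                        # splat the sign
    return new_lo, new_hi
```

For amt == 32 the second branch reduces to new_lo = hi, new_hi = sign, i.e. a plain move plus sign fill.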
12533 "Unknown SINT_TO_FP to lower!");
12976 // to i16, i32 or i64, and we lower it to a legal sequence.
12978 // Otherwise we lower it to a sequence ending with a FIST, return a
13013 "Unknown FP_TO_INT to lower!");
13024 // We lower FP->int64 into FISTP64 followed by a load from a temporary
13033 default: llvm_unreachable("Invalid FP_TO_SINT to lower!");
13093 assert(DstTy == MVT::i64 && "Invalid FP_TO_SINT to lower!");
13174 // Use vpunpcklwd for 4 lower elements v8i16 -> v4i32.
13176 // Concat upper and lower parts.
13179 // Use vpunpckldq for 4 lower elements v4i32 -> v2i64.
13181 // Concat upper and lower parts.
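The zero-extension lowering sketched in the lines above interleaves the narrow lanes with a zero vector so each adjacent pair reads as one doubled-width lane. A scalar sketch under that scheme (`unpack_with_zero` is an illustrative name; the real code uses vpunpcklwd/vpunpckhwd and then vpunpckldq):

```python
def unpack_with_zero(v, lane_bits, lo_half=True):
    """Interleave one half of v with zeros, then view adjacent pairs as
    doubled-width lanes: the narrow value lands in the low bits and the
    zero in the high bits, i.e. a zero extension of that half."""
    half = len(v) // 2
    src = v[:half] if lo_half else v[half:]
    interleaved = []
    for x in src:
        interleaved += [x, 0]          # punpckl/hwd against a zero vector
    return [interleaved[2 * i] | (interleaved[2 * i + 1] << lane_bits)
            for i in range(half)]

v8i16 = [1, 2, 3, 4, 5, 6, 7, 8]
lo = unpack_with_zero(v8i16, 16, lo_half=True)    # v4i32 from the lower half
hi = unpack_with_zero(v8i16, 16, lo_half=False)   # v4i32 from the upper half
v8i32 = lo + hi                                   # concat upper and lower parts
assert v8i32 == v8i16                             # values unchanged, lanes widened
```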
13478 // into an FNABS. We'll lower the FABS after that if it is still in use.
13642 // Lower ISD::FGETSIGN to (AND (X86ISD::FGETSIGNx86 ...) 1).
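The FGETSIGN lowering noted above moves the float's sign bit into an integer register and then ANDs it with 1. A scalar sketch (`fgetsign` is a hypothetical helper; FGETSIGNx86 itself corresponds to a MOVMSK-style sign-bit move):

```python
import struct

def fgetsign(x: float) -> int:
    """Sign bit of a single-precision float, via bit move then AND 1."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]  # raw f32 bits
    movmsk = bits >> 31          # FGETSIGNx86: sign bit down to bit 0
    return movmsk & 1            # the AND with 1

assert fgetsign(-0.0) == 1      # distinguishes -0.0 from +0.0
assert fgetsign(3.5) == 0
```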
14484 // Lower using XOP integer comparisons.
14592 assert(Subtarget->hasSSE2() && "Don't know how to lower!");
14599 // bits of the inputs before performing those operations. The lower
14636 assert(Subtarget->hasSSE2() && !FlipSigns && "Don't know how to lower!");
14645 // Make sure the lower and upper halves are both all-ones.
14697 // Lower (X & (1 << N)) == 0 to BT(X, N).
14698 // Lower ((X >>u N) & 1) != 0 to BT(X, N).
14699 // Lower ((X >>s N) & 1) != 0 to BT(X, N).
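All three patterns in the lines above reduce to a single bit test of bit N. A scalar check of the equivalences (`bt` is a hypothetical stand-in for the BT instruction; `>>u` is modeled by masking to 32 bits, `>>s` by Python's arithmetic shift):

```python
def bt(x: int, n: int) -> int:
    """Bit N of X, as BT would report it in CF."""
    return (x >> n) & 1

BITS = 32
for x in (0, 1, -1, -12345, 0x55AA55AA):
    for n in (0, 1, 7, 30):
        ux = x & ((1 << BITS) - 1)                        # unsigned view of x
        assert ((x & (1 << n)) == 0) == (bt(x, n) == 0)   # (X & (1 << N)) == 0
        assert (((ux >> n) & 1) != 0) == (bt(x, n) != 0)  # ((X >>u N) & 1) != 0
        assert (((x >> n) & 1) != 0) == (bt(x, n) != 0)   # ((X >>s N) & 1) != 0
```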
14815 // Lower FP selects into a CMP/AND/ANDN/OR sequence when the necessary SSE ops
15160 // pre-SSE41 targets unpack lower lanes and then sign-extend using SRAI.
15245 // Lower vector extended loads using a shuffle. If SSSE3 is not available we
15255 assert(RegVT.isVector() && "We only custom lower vector sext loads.");
15257 "We only custom lower integer vector sext loads.");
15260 assert(Subtarget->hasSSE2() && "We only custom lower sext loads with SSE2.");
15281 // integer 256-bit operations needed to directly lower a sextload is if we
15345 "Can only lower sext loads with a single scalar load!");
15367 "We only lower types that form legal widened vector types");
15728 // Lower dynamic stack allocation to _alloca call for Cygwin/Mingw targets.
15738 bool Lower = (Subtarget->isOSWindows() && !Subtarget->isTargetMachO()) ||
15757 if (!Lower) {
16078 // SSE/AVX packed shifts only use the lower 64 bits of the shift count.
16747 default: return SDValue(); // Don't custom lower most intrinsics.
16757 // return an integer value, not just an instruction so lower it to the ptest
17082 // also used to custom lower READCYCLECOUNTER nodes.
17161 /// \brief Lower intrinsics for TRUNCATE_TO_MEM case
17748 /// \brief Lower a vector CTLZ using native supported vector CTLZ instruction.
17993 // Lower v16i8/v32i8 mul as promotion to v8i16/v16i16 vector
18058 // Multiply, mask the lower 8bits of the lo/hi results and pack
18066 // Lower v4i32 mul as 2x shuffle, 2x pmuludq, 2x shuffle.
18069 "Should not custom lower when pmuldq is available!");
18091 "Only know how to lower V2I64/V4I64/V8I64 multiply");
18211 // Emit two multiplies, one for the lower 2 ints and one for the higher 2
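The v4i32 multiply lowering noted above (2x shuffle, 2x pmuludq, 2x shuffle) can be simulated lane by lane: PMULUDQ multiplies only the even 32-bit lanes into 64-bit products, so the odd lanes are shuffled into even positions, multiplied separately, and the low halves shuffled back together. A scalar sketch (`pmuludq` and `mul_v4i32` are illustrative names, not LLVM APIs):

```python
MASK32 = (1 << 32) - 1

def pmuludq(a, b):
    """Multiply the even 32-bit lanes of a and b into 64-bit products."""
    return [(a[i] * b[i]) & ((1 << 64) - 1) for i in (0, 2)]

def mul_v4i32(a, b):
    even = pmuludq(a, b)                  # products of lanes 0 and 2
    a_odd = [a[1], 0, a[3], 0]            # shuffle odd lanes to even slots
    b_odd = [b[1], 0, b[3], 0]
    odd = pmuludq(a_odd, b_odd)           # products of lanes 1 and 3
    # shuffle the low 32 bits of each 64-bit product back into place
    return [even[0] & MASK32, odd[0] & MASK32,
            even[1] & MASK32, odd[1] & MASK32]

assert mul_v4i32([3, 5, 7, 9], [4, 6, 8, 10]) == [12, 30, 56, 90]
```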
18322 // Splat sign to upper i32 dst, and SRA upper i32 src to lower i32.
18325 SDValue Lower = getTargetVShiftByConstNode(X86ISD::VSRAI, dl, ExVT, Ex,
18328 Ex = DAG.getVectorShuffle(ExVT, dl, Upper, Lower, {5, 1, 7, 3});
18330 Ex = DAG.getVectorShuffle(ExVT, dl, Upper, Lower,
18333 // SRA upper i32, SHL whole i64 and select lower i32.
18336 SDValue Lower =
18338 Lower = DAG.getBitcast(ExVT, Lower);
18340 Ex = DAG.getVectorShuffle(ExVT, dl, Upper, Lower, {4, 1, 6, 3});
18342 Ex = DAG.getVectorShuffle(ExVT, dl, Upper, Lower,
18566 assert(Subtarget->hasSSE2() && "Only custom lower when we have SSE2!");
18616 // If possible, lower this packed shift into a vector multiply instead of
18649 // Lower SHL with variable shift amount.
18660 // If possible, lower this shift as a sequence of two shifts by
18735 // immediate shifts, else we need to zero-extend each lane to the lower i64
18761 // The SSE2 shifts use the lower i64 as the same shift amount for
18805 // the 3 lower bits of each byte.
18835 // lower byte.
18877 // Logical shift the result back to the lower byte, leaving a zero upper
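The byte-shift trick in the lines above widens each byte into the upper byte of a 16-bit lane, shifts at 16-bit granularity, then logically shifts the result back down, leaving a zero upper byte. A per-byte arithmetic-shift sketch under that scheme (`sra_bytes` is a hypothetical helper, not LLVM code):

```python
def sra_bytes(v, amt):
    """Per-byte arithmetic shift right built from 16-bit shifts."""
    out = []
    for b in v:
        w = b << 8                                   # byte in the upper half of an i16
        if w & 0x8000:                               # 16-bit arithmetic shift right
            w = (w >> amt) | (~(0xFFFF >> amt) & 0xFFFF)
        else:
            w >>= amt
        out.append((w >> 8) & 0xFF)                  # logical shift back to the lower byte
    return out

# -128 >>s 1 = -64, 127 >>s 1 = 63, -1 >>s 1 = -1, 1 >>s 1 = 0
assert sra_bytes([0x80, 0x7F, 0xFF, 0x01], 1) == [0xC0, 0x3F, 0xFF, 0x00]
```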
19052 // Lower the "add/sub/mul with overflow" instruction into a regular ins plus
19468 // The general idea is that every lower byte nibble in the input vector is an
19471 // higher nibbles for each byte and (2) a vector with the lower nibbles (and
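The CTPOP lowering described in the two lines above treats each nibble as an index into an in-register 16-entry popcount table (looked up with PSHUFB) and sums the two lookups per byte. A scalar sketch (`ctpop_bytes` is an illustrative name):

```python
# popcounts of 0..15: the 16-entry table the PSHUFB lookup indexes into
LUT = [bin(i).count("1") for i in range(16)]

def ctpop_bytes(v):
    """Per-byte population count via two nibble lookups."""
    lo = [b & 0x0F for b in v]        # vector of lower nibbles
    hi = [b >> 4 for b in v]          # vector of higher nibbles
    return [LUT[l] + LUT[h] for l, h in zip(lo, hi)]

assert ctpop_bytes([0x00, 0xFF, 0x0F, 0xA5]) == [0, 8, 4, 4]
```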
20015 // require special handling for these nodes), lower them as literal NOOPs for
20036 // require special handling for these nodes), lower them as literal NOOPs for
20055 default: llvm_unreachable("Should not custom lower this!");
20672 // If lower 4G is not available, then we must use rip-relative addressing.
21439 // In this case we can lower all the CMOVs using a single inserted BB, and
21460 // Case 2, we lower cascaded CMOVs such as
21503 // If we lower both CMOVs in a single step, we can instead generate:
21562 // If we have a cascaded CMOV, we lower it to two successive branches to
23337 // moves upper half elements into the lower half part. For example:
24405 // subtarget. We custom lower VSELECT nodes with constant conditions and
24407 // lower, so we both check the operation's status and explicitly handle the
25021 // as above SHIFTs (only SHIFT on 1 has lower code size).
25619 // SHLD/SHRD instructions have lower register pressure, but on some
26306 // If we are a 64-bit capable x86, lower to a single movq load/store pair.
26326 // Otherwise, lower to two pairs of 32-bit loads / stores.
26817 // should be able to lower to FMAX/FMIN alone.
27909 // lower so don't worry about this.
28131 /// LowerAsmOperandForConstraint - Lower the specified operand into the Ops