LLVM Study Notes (62)
4.4.3.3. The X86TargetLowering subobject

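At line 314 of the X86Subtarget constructor, the X86TargetLowering constructor is called next to build TLInfo, the subobject of that type inside X86Subtarget.

This class derived from TargetLowering is used by the SelectionDAG-based instruction selector to describe how LLVM code is lowered to SelectionDAG operations. Among other things, the class indicates:
- an initial register class to use for the various ValueTypes,
- which operations the target machine supports natively,
- the return type of setcc operations,
- the type to use for shift amounts, and
- various high-level characteristics, such as whether it is profitable to turn a division by a constant into a sequence of multiplications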
4.4.3.3.1. TargetLowering

We first look at the constructor of the base class TargetLowering.

40       TargetLowering::TargetLowering(const TargetMachine &tm)

41       : TargetLoweringBase(tm) {}

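The TargetLoweringBase constructor is defined as follows. It provides baseline settings for all targets; each target can reset the relevant parameters in the constructor of its own TargetLowering-derived class.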

532     TargetLoweringBase::TargetLoweringBase(const TargetMachine &tm) : TM(tm) {

533     initActions();

534

535     // Perform these initializations only once.

536     MaxStoresPerMemset = MaxStoresPerMemcpy = MaxStoresPerMemmove =

537     MaxLoadsPerMemcmp = 8;

538     MaxGluedStoresPerMemcpy = 0;

539     MaxStoresPerMemsetOptSize = MaxStoresPerMemcpyOptSize

540         = MaxStoresPerMemmoveOptSize = MaxLoadsPerMemcmpOptSize = 4;

541     UseUnderscoreSetJmp = false;

542     UseUnderscoreLongJmp = false;

543     HasMultipleConditionRegisters = false;

544     HasExtractBitsInsn = false;

545     JumpIsExpensive = JumpIsExpensiveOverride;

546     PredictableSelectIsExpensive = false;

547     EnableExtLdPromotion = false;

548     HasFloatingPointExceptions = true;

549     StackPointerRegisterToSaveRestore = 0;

550     BooleanContents = UndefinedBooleanContent;

551     BooleanFloatContents = UndefinedBooleanContent;

552     BooleanVectorContents = UndefinedBooleanContent;

553     SchedPreferenceInfo = Sched::ILP;

554     JumpBufSize = 0;

555     JumpBufAlignment = 0;

556     MinFunctionAlignment = 0;

557     PrefFunctionAlignment = 0;

558     PrefLoopAlignment = 0;

559     GatherAllAliasesMaxDepth = 18;

560     MinStackArgumentAlignment = 1;

561     // TODO: the default will be switched to 0 in the next commit, along

562     // with the Target-specific changes necessary.

563     MaxAtomicSizeInBitsSupported = 1024;

564

565     MinCmpXchgSizeInBits = 0;

566     SupportsUnalignedAtomics = false;

567

568     std::fill(std::begin(LibcallRoutineNames), std::end(LibcallRoutineNames), nullptr);

569

570     InitLibcalls(TM.getTargetTriple());

571     InitCmpLibcallCCs(CmpLibcallCCs);

572     }

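Line 533 calls initActions() to initialize the various action tables. As the code further down shows, these tables are OpActions, LoadExtActions, TruncStoreActions, IndexedModeActions and CondCodeActions. They are arrays whose elements hold LegalizeAction values, either directly or packed into integers. A LegalizeAction says whether a given operation is legal for a target and, if it is not, what should be done to make it legal. Note that the listing that follows is the GlobalISel enumeration LegalizeActions::LegalizeAction; the tables filled by initActions() hold values of TargetLoweringBase::LegalizeAction, whose enumerators are Legal, Promote, Expand, LibCall and Custom, which is why Expand appears in initActions() but not in this listing: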

43       namespace LegalizeActions {

44       enum LegalizeAction : std::uint8_t {

45       /// The operation is expected to be selectable directly by the target, and

46       /// no transformation is necessary.

47       Legal,

48

49       /// The operation should be synthesized from multiple instructions acting on

50       /// a narrower scalar base-type. For example a 64-bit add might be

51       /// implemented in terms of 32-bit add-with-carry.

52       NarrowScalar,

53

54       /// The operation should be implemented in terms of a wider scalar

55       /// base-type. For example a <2 x s8> add could be implemented as a <2

56       /// x s32> add (ignoring the high bits).

57       WidenScalar,

58

59       /// The (vector) operation should be implemented by splitting it into

60       /// sub-vectors where the operation is legal. For example a <8 x s64> add

61       /// might be implemented as 4 separate <2 x s64> adds.

62       FewerElements,

63

64       /// The (vector) operation should be implemented by widening the input

65       /// vector and ignoring the lanes added by doing so. For example <2 x i8> is

66       /// rarely legal, but you might perform an <8 x i8> and then only look at

67       /// the first two results.

68       MoreElements,

69

70       /// The operation itself must be expressed in terms of simpler actions on

71       /// this target. E.g. a SREM replaced by an SDIV and subtraction.

72       Lower,

73

74       /// The operation should be implemented as a call to some kind of runtime

75       /// support library. For example this usually happens on machines that don't

76       /// support floating-point operations natively.

77       Libcall,

78

79       /// The target wants to do something special with this combination of

80       /// operand and type. A callback will be issued when it is needed.

81       Custom,

82

83       /// This operation is completely unsupported on the target. A programming

84       /// error has occurred.

85       Unsupported,

86

87       /// Sentinel value for when no action was found in the specified table.

88       NotFound,

89

90       /// Fall back onto the old rules.

91       /// TODO: Remove this once we've migrated

92       UseLegacyRules,

93       };

94       } // end namespace LegalizeActions

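The arrays mentioned above are therefore defined as follows:
- LegalizeAction OpActions[MVT::LAST_VALUETYPE][ISD::BUILTIN_OP_END]
For each operation and each value type, holds a LegalizeAction that tells instruction selection how to handle the operation. Most operations are legal (that is, natively supported by the target), but unsupported operations must be described. Operations on illegal value types are not considered here.
- uint16_t LoadExtActions[MVT::LAST_VALUETYPE][MVT::LAST_VALUETYPE]
For each load-extension type and each value type, holds a LegalizeAction that tells instruction selection how to handle a load involving the given value type and its extended type. Each action takes 4 bits, and the actions for the 4 load-extension types are packed into one element.
- LegalizeAction TruncStoreActions[MVT::LAST_VALUETYPE][MVT::LAST_VALUETYPE]
For each pair of value types, holds a LegalizeAction that indicates whether a truncating store from the given value type to the given truncated type is legal.
- uint8_t IndexedModeActions[MVT::LAST_VALUETYPE][ISD::LAST_INDEXED_MODE]
Here ISD::LAST_INDEXED_MODE is the number of indexed addressing modes. For each indexed mode and each value type, holds a pair of LegalizeAction values that tell instruction selection how to handle the store and the load. The first dimension is the value type being accessed; the second dimension is the indexed mode of the load or store.
- uint32_t CondCodeActions[ISD::SETCC_INVALID][(MVT::LAST_VALUETYPE + 7) / 8]
Here ISD::SETCC_INVALID is the number of condition codes. For each condition code (ISD::CondCode), holds a LegalizeAction per value type that tells instruction selection how to handle that condition code. Each action takes 4 bits, so one uint32_t element covers 8 value types, which is why the second dimension is (MVT::LAST_VALUETYPE + 7) / 8.
- In addition, TargetDAGCombineArray is one more array. Its type is:
unsigned char TargetDAGCombineArray[(ISD::BUILTIN_OP_END+CHAR_BIT-1)/CHAR_BIT]

It is a bitmap with one bit per ISD opcode. A bit set to 1 means the target wants its PerformDAGCombine() callback to be invoked when combining nodes of that opcode.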
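The two packed layouts described above (4-bit action fields inside an integer, and one bit per opcode) can be illustrated with a small standalone sketch. It does not use any LLVM header: `Action`, `PackedActions` and `OpcodeBitmap` are hypothetical names chosen for this illustration, and only the packing idea itself is taken from the text.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Hypothetical stand-in for the action enumeration; this is not LLVM code.
enum Action : std::uint8_t { Legal = 0, Promote, Expand, LibCall, Custom };

// One 16-bit word holds four 4-bit actions (slots 0..3), the packing scheme
// the text describes for LoadExtActions.
struct PackedActions {
  std::uint16_t Word = 0;

  void set(unsigned Slot, Action A) {
    assert(Slot < 4 && "only four 4-bit slots fit in 16 bits");
    unsigned Shift = 4 * Slot;
    // Clear the old 4-bit field, then write the new action into it.
    Word = static_cast<std::uint16_t>((Word & ~(0xFu << Shift)) |
                                      (static_cast<unsigned>(A) << Shift));
  }

  Action get(unsigned Slot) const {
    assert(Slot < 4);
    return static_cast<Action>((Word >> (4 * Slot)) & 0xFu);
  }
};

// A bitmap with one bit per opcode, sized with the same rounding-up formula
// as TargetDAGCombineArray: (NumOpcodes + CHAR_BIT - 1) / CHAR_BIT bytes.
template <unsigned NumOpcodes> struct OpcodeBitmap {
  unsigned char Bits[(NumOpcodes + CHAR_BIT - 1) / CHAR_BIT] = {};

  void set(unsigned Opc) {
    assert(Opc < NumOpcodes);
    Bits[Opc / CHAR_BIT] |= static_cast<unsigned char>(1u << (Opc % CHAR_BIT));
  }

  bool test(unsigned Opc) const {
    assert(Opc < NumOpcodes);
    return (Bits[Opc / CHAR_BIT] >> (Opc % CHAR_BIT)) & 1u;
  }
};
```

Because `Legal` is 0 in this sketch, a zero-filled word or bitmap reads back as "all Legal" and "no opcode marked", which matches how the real tables are zeroed before targets customize them.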

574     void TargetLoweringBase::initActions() {

575     // All operations default to being supported.

576     memset(OpActions, 0, sizeof(OpActions));

577     memset(LoadExtActions, 0, sizeof(LoadExtActions));

578     memset(TruncStoreActions, 0, sizeof(TruncStoreActions));

579     memset(IndexedModeActions, 0, sizeof(IndexedModeActions));

580     memset(CondCodeActions, 0, sizeof(CondCodeActions));

581     std::fill(std::begin(RegClassForVT), std::end(RegClassForVT), nullptr);

582     std::fill(std::begin(TargetDAGCombineArray),

583                 std::end(TargetDAGCombineArray), 0);

584

585     // Set default actions for various operations.

586     for (MVT VT : MVT::all_valuetypes()) {

587         // Default all indexed load / store to expand.

588         for (unsigned IM = (unsigned)ISD::PRE_INC;

589              IM != (unsigned)ISD::LAST_INDEXED_MODE; ++IM) {

590           setIndexedLoadAction(IM, VT, Expand);

591           setIndexedStoreAction(IM, VT, Expand);

592         }

593

594         // Most backends expect to see the node which just returns the value loaded.

595        setOperationAction(ISD::ATOMIC_CMP_SWAP_WITH_SUCCESS, VT, Expand);

596

597         // These operations default to expand.

598         setOperationAction(ISD::FGETSIGN, VT, Expand);

599         setOperationAction(ISD::CONCAT_VECTORS, VT, Expand);

600         setOperationAction(ISD::FMINNUM, VT, Expand);

601         setOperationAction(ISD::FMAXNUM, VT, Expand);

602         setOperationAction(ISD::FMINNAN, VT, Expand);

603         setOperationAction(ISD::FMAXNAN, VT, Expand);

604         setOperationAction(ISD::FMAD, VT, Expand);

605         setOperationAction(ISD::SMIN, VT, Expand);

606         setOperationAction(ISD::SMAX, VT, Expand);

607         setOperationAction(ISD::UMIN, VT, Expand);

608         setOperationAction(ISD::UMAX, VT, Expand);

609         setOperationAction(ISD::ABS, VT, Expand);

610

611         // Overflow operations default to expand

612         setOperationAction(ISD::SADDO, VT, Expand);

613         setOperationAction(ISD::SSUBO, VT, Expand);

614         setOperationAction(ISD::UADDO, VT, Expand);

615         setOperationAction(ISD::USUBO, VT, Expand);

616         setOperationAction(ISD::SMULO, VT, Expand);

617         setOperationAction(ISD::UMULO, VT, Expand);

618

619         // ADDCARRY operations default to expand

620         setOperationAction(ISD::ADDCARRY, VT, Expand);

621         setOperationAction(ISD::SUBCARRY, VT, Expand);

622         setOperationAction(ISD::SETCCCARRY, VT, Expand);

623

624         // ADDC/ADDE/SUBC/SUBE default to expand.

625         setOperationAction(ISD::ADDC, VT, Expand);

626         setOperationAction(ISD::ADDE, VT, Expand);

627         setOperationAction(ISD::SUBC, VT, Expand);

628         setOperationAction(ISD::SUBE, VT, Expand);

629

630         // These default to Expand so they will be expanded to CTLZ/CTTZ by default.

631         setOperationAction(ISD::CTLZ_ZERO_UNDEF, VT, Expand);

632         setOperationAction(ISD::CTTZ_ZERO_UNDEF, VT, Expand);

633

634         setOperationAction(ISD::BITREVERSE, VT, Expand);

635

636         // These library functions default to expand.

637         setOperationAction(ISD::FROUND, VT, Expand);

638         setOperationAction(ISD::FPOWI, VT, Expand);

639

640         // These operations default to expand for vector types.

641         if (VT.isVector()) {

642           setOperationAction(ISD::FCOPYSIGN, VT, Expand);

643           setOperationAction(ISD::ANY_EXTEND_VECTOR_INREG, VT, Expand);

644           setOperationAction(ISD::SIGN_EXTEND_VECTOR_INREG, VT, Expand);

645           setOperationAction(ISD::ZERO_EXTEND_VECTOR_INREG, VT, Expand);

646         }

647

648         // For most targets @llvm.get.dynamic.area.offset just returns 0.

649         setOperationAction(ISD::GET_DYNAMIC_AREA_OFFSET, VT, Expand);

650     }

651

652     // Most targets ignore the @llvm.prefetch intrinsic.

653     setOperationAction(ISD::PREFETCH, MVT::Other, Expand);

654

655     // Most targets also ignore the @llvm.readcyclecounter intrinsic.

656     setOperationAction(ISD::READCYCLECOUNTER, MVT::i64, Expand);

657

658     // ConstantFP nodes default to expand. Targets can either change this to

659     // Legal, in which case all fp constants are legal, or use isFPImmLegal()

660     // to optimize expansions for certain constants.

661     setOperationAction(ISD::ConstantFP, MVT::f16, Expand);

662     setOperationAction(ISD::ConstantFP, MVT::f32, Expand);

663     setOperationAction(ISD::ConstantFP, MVT::f64, Expand);

664     setOperationAction(ISD::ConstantFP, MVT::f80, Expand);

665     setOperationAction(ISD::ConstantFP, MVT::f128, Expand);

666

667     // These library functions default to expand.

668     for (MVT VT : {MVT::f32, MVT::f64, MVT::f128}) {

669         setOperationAction(ISD::FLOG ,      VT, Expand);

670         setOperationAction(ISD::FLOG2,      VT, Expand);

671         setOperationAction(ISD::FLOG10,     VT, Expand);

672         setOperationAction(ISD::FEXP ,      VT, Expand);

673         setOperationAction(ISD::FEXP2,      VT, Expand);

674         setOperationAction(ISD::FFLOOR,     VT, Expand);

675         setOperationAction(ISD::FNEARBYINT, VT, Expand);

676         setOperationAction(ISD::FCEIL,      VT, Expand);

677         setOperationAction(ISD::FRINT,      VT, Expand);

678         setOperationAction(ISD::FTRUNC,     VT, Expand);

679         setOperationAction(ISD::FROUND,     VT, Expand);

680     }

681

682     // Default ISD::TRAP to expand (which turns it into abort).

683     setOperationAction(ISD::TRAP, MVT::Other, Expand);

684

685     // On most systems, DEBUGTRAP and TRAP have no difference. The "Expand"

686     // here is to inform DAG Legalizer to replace DEBUGTRAP with TRAP.

687     setOperationAction(ISD::DEBUGTRAP, MVT::Other, Expand);

688     }

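Lines 576~580 zero all the action tables, which makes every action Legal (Legal is the enumerator with value 0), and lines 582~583 clear TargetDAGCombineArray, so no operation requests the PerformDAGCombine callback. The code that follows then sets individual operations to Expand; as we will see later, the X86 derived class applies its own overrides on top of these defaults.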
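The "zero everything, then override" pattern can be sketched in a few lines. This is a miniature under stated assumptions, not LLVM code: `MiniLowering`, `MiniTargetLowering`, the three opcodes and the two value types are all hypothetical, and the tables are far smaller than the real ones.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical miniature of the tables; sizes and names are made up.
enum LegalizeAction : std::uint8_t { Legal = 0, Promote, Expand, LibCall, Custom };
enum Opcode : unsigned { ADD, FMAD, SMIN, NumOpcodes };
enum ValueType : unsigned { i32, f32, NumValueTypes };

struct MiniLowering {
  LegalizeAction OpActions[NumValueTypes][NumOpcodes];

  MiniLowering() {
    // Zero-filling means "everything is Legal" only because Legal == 0.
    std::memset(OpActions, 0, sizeof(OpActions));
    // A few operations are then switched to Expand for every value type.
    for (unsigned VT = 0; VT != NumValueTypes; ++VT) {
      setOperationAction(FMAD, static_cast<ValueType>(VT), Expand);
      setOperationAction(SMIN, static_cast<ValueType>(VT), Expand);
    }
  }

  void setOperationAction(Opcode Op, ValueType VT, LegalizeAction A) {
    OpActions[VT][Op] = A;
  }

  LegalizeAction getOperationAction(Opcode Op, ValueType VT) const {
    return OpActions[VT][Op];
  }
};

// A hypothetical target that overrides one base default in its constructor,
// after the base-class constructor has installed the generic defaults.
struct MiniTargetLowering : MiniLowering {
  MiniTargetLowering() { setOperationAction(SMIN, i32, Legal); }
};
```

The ordering matters: the base constructor runs first, so whatever the derived constructor sets is what the legalizer eventually sees.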

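After initActions() returns, the TargetLoweringBase constructor goes on to initialize the following parameter members.
- MaxStoresPerMemset
- MaxLoadsPerMemcmp
- MaxStoresPerMemcpy
- MaxStoresPerMemmove
When lowering @llvm.memset/@llvm.memcpy/@llvm.memmove, these fields give the maximum number of stores that may replace a call to memset/memcpy/memmove (MaxLoadsPerMemcmp is the corresponding limit on loads for memcmp). Targets must set these values based on a cost threshold. It should be assumed that the target will first use as many of the largest store operations as the alignment restrictions allow, and then smaller operations if needed. For example, storing 9 bytes on a 32-bit machine with 16-bit alignment results in four 2-byte stores and one 1-byte store. This only applies to setting a constant array of a constant size.
- MaxStoresPerMemcpyOptSize
- MaxLoadsPerMemcmpOptSize
- MaxStoresPerMemmoveOptSize
The same limits as above, used for functions that carry the OptSize attribute.
- UseUnderscoreSetJmp, UseUnderscoreLongJmp
Indicate whether _setjmp or _longjmp should be used to implement llvm.setjmp or llvm.longjmp.
- MaxGluedStoresPerMemcpy
When memcpy is inlined based on MaxStoresPerMemcpy, gives the maximum number of store instructions to keep together. This helps later pairing and vectorization.
- HasMultipleConditionRegisters
Tells the code generator whether the target has multiple (allocatable) condition registers for holding comparison results. With multiple condition registers, the code generator will not aggressively sink comparisons into the basic blocks of their users.
- HasExtractBitsInsn
Tells the code generator whether the target has BitExtract instructions. If it has, the code generator will aggressively sink shifts into the basic blocks of their users when those users generate "and" instructions that can be combined with the shift into a BitExtract instruction.
- JumpIsExpensive
Tells the code generator not to generate extra flow-control instructions, and to try to combine flow-control instructions via predication instead.
- PredictableSelectIsExpensive
Tells the code generator that a select is more expensive than a branch when the branch is usually predicted correctly.
- EnableExtLdPromotion
Indicates whether the target wants the optimization that turns ext(promotableInst1(...(promotableInstN(load)))) into promotedInst1(...(promotedInstN(ext(load)))).
- HasFloatingPointExceptions
Indicates whether the target supports, or cares about preserving, floating-point exception behavior.
- StackPointerRegisterToSaveRestore
If set to a physical register, specifies the register that llvm.stacksave and llvm.stackrestore should save and restore.
- BooleanContents
- BooleanFloatContents
- BooleanVectorContents
All three are of the BooleanContent enumeration type, defined as follows: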

140     enum BooleanContent {

141         UndefinedBooleanContent,    // Only bit 0 counts, the rest can hold garbage.

142         ZeroOrOneBooleanContent,        // All bits zero except for bit 0.

143         ZeroOrNegativeOneBooleanContent // All bits equal to bit 0.

144     };

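Each of them describes the contents of the high bits of a boolean value held in a type wider than i1.
- SchedPreferenceInfo
The target's scheduling preference, usually aiming at either the shortest total cycle count or the lowest register pressure. Its type is Sched::Preference, an enumeration listing the scheduling preferences LLVM currently supports.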

95           enum Preference {

96             None,             // No preference

97             Source,           // Follow source order.

98             RegPressure,      // Scheduling for lowest register pressure.

99             Hybrid,           // Scheduling for both latency and register pressure.

100           ILP,              // Scheduling for ILP in low register pressure mode.

101           VLIW              // Scheduling for VLIW targets.

102         };

103     }
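- JumpBufSize
- JumpBufAlignment
The size in bytes and the alignment requirement of the target's jmp_buf buffer.
- MinFunctionAlignment
- PrefFunctionAlignment
- PrefLoopAlignment
Respectively, the minimum function alignment (used when optimizing for size, and to prevent an explicitly provided alignment from producing incorrect code), the preferred function alignment (used when no alignment is specified and when optimizing for speed), and the preferred loop alignment.
- MinStackArgumentAlignment
The minimum alignment required for any argument on the stack.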
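The 9-byte example given for MaxStoresPerMemset in the list above can be reproduced with a simplified greedy split. This is only a sketch of the counting argument, not LLVM's actual memory-operation lowering; `planStores` and its parameters are hypothetical names, and widths are assumed to be powers of two.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Simplified greedy split, not LLVM's real lowering logic.
// Returns the store widths in bytes. An empty vector means either Size is 0
// or more than MaxStores stores would be needed, in which case the library
// call would be kept instead of being expanded inline.
// Assumes MaxWidth and Align are powers of two and at least 1.
std::vector<unsigned> planStores(unsigned Size, unsigned MaxWidth,
                                 unsigned Align, unsigned MaxStores) {
  assert(MaxWidth >= 1 && Align >= 1);
  std::vector<unsigned> Widths;
  // The widest usable store is limited by register width and by alignment.
  unsigned Width = std::min(MaxWidth, Align);
  while (Size != 0) {
    // Fall back to narrower stores for the tail.
    while (Width > Size)
      Width /= 2;
    Widths.push_back(Width);
    Size -= Width;
    if (Widths.size() > MaxStores)
      return {};
  }
  return Widths;
}
```

With `planStores(9, 4, 2, 8)` (9 bytes, 4-byte registers, 2-byte alignment, limit 8) the plan is four 2-byte stores followed by one 1-byte store, five stores in total. Lowering the limit to 4, the value the constructor assigns to the OptSize variants, makes the same request exceed the budget.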

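The container LibcallRoutineNames used at line 568 is defined as: const char *LibcallRoutineNames[RTLIB::UNKNOWN_LIBCALL]. Here RTLIB::UNKNOWN_LIBCALL is the number of runtime library calls the backend can emit. These library functions are described by the RTLIB::Libcall enumeration. The table is filled from a definition file by the following method: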

118     void TargetLoweringBase::InitLibcalls(const Triple &TT) {

119     #define HANDLE_LIBCALL(code, name) \

120     setLibcallName(RTLIB::code, name);

121     #include "llvm/IR/RuntimeLibcalls.def"

122     #undef HANDLE_LIBCALL

123     // Initialize calling conventions to their default.

124     for (int LC = 0; LC < RTLIB::UNKNOWN_LIBCALL; ++LC)

125         setLibcallCallingConv((RTLIB::Libcall)LC, CallingConv::C);

126

127     // A few names are different on particular architectures or environments.

128     if (TT.isOSDarwin()) {

129         // For f16/f32 conversions, Darwin uses the standard naming scheme, instead

130         // of the gnueabi-style __gnu_*_ieee.

131         // FIXME: What about other targets?

132         setLibcallName(RTLIB::FPEXT_F16_F32, "__extendhfsf2");

133         setLibcallName(RTLIB::FPROUND_F32_F16, "__truncsfhf2");

134

135         // Some darwins have an optimized __bzero/bzero function.

136         switch (TT.getArch()) {

137         case Triple::x86:

138         case Triple::x86_64:

139           if (TT.isMacOSX() && !TT.isMacOSXVersionLT(10, 6))

140             setLibcallName(RTLIB::BZERO, "__bzero");

141           break;

142         case Triple::aarch64:

143           setLibcallName(RTLIB::BZERO, "bzero");

144           break;

145         default:

146           break;

147         }

148

149         if (darwinHasSinCos(TT)) {

150           setLibcallName(RTLIB::SINCOS_STRET_F32, "__sincosf_stret");

151           setLibcallName(RTLIB::SINCOS_STRET_F64, "__sincos_stret");

152           if (TT.isWatchABI()) {

153             setLibcallCallingConv(RTLIB::SINCOS_STRET_F32,

154                                   CallingConv::ARM_AAPCS_VFP);

155             setLibcallCallingConv(RTLIB::SINCOS_STRET_F64,

156                                   CallingConv::ARM_AAPCS_VFP);

157           }

158         }

159     } else {

160         setLibcallName(RTLIB::FPEXT_F16_F32, "__gnu_h2f_ieee");

161         setLibcallName(RTLIB::FPROUND_F32_F16, "__gnu_f2h_ieee");

162     }

163

164     if (TT.isGNUEnvironment() || TT.isOSFuchsia()) {

165         setLibcallName(RTLIB::SINCOS_F32, "sincosf");

166         setLibcallName(RTLIB::SINCOS_F64, "sincos");

167         setLibcallName(RTLIB::SINCOS_F80, "sincosl");

168         setLibcallName(RTLIB::SINCOS_F128, "sincosl");

169         setLibcallName(RTLIB::SINCOS_PPCF128, "sincosl");

170     }

171

172     if (TT.isOSOpenBSD()) {

173         setLibcallName(RTLIB::STACKPROTECTOR_CHECK_FAIL, nullptr);

174     }

175     }

The definition file RuntimeLibcalls.def lists the runtime library calls the backend can generate. Its entries look like this:

HANDLE_LIBCALL(SHL_I16, "__ashlhi3")

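The HANDLE_LIBCALL macro defined at the start of InitLibcalls() forms enumerator names such as RTLIB::SHL_I16 from each entry. The enumerators themselves are also generated from the contents of RuntimeLibcalls.def (in RuntimeLibcalls.h). setLibcallName() therefore records the function name for each library call, using these enumerators as indices.

setLibcallCallingConv() likewise uses these enumerators as indices to record the calling convention of each function. The default for all of them is CallingConv::C, LLVM's default calling convention, which is compatible with the C calling convention.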
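The way one list of entries yields both the enumeration and the name table is the classic X-macro technique. The sketch below shows it in standalone form. Since a self-contained example cannot include a .def file, the list is written as a macro instead; `MY_LIBCALLS`, `MiniRTLIB`, `CallConv` and `MiniLibcallInfo` are hypothetical names, and only two entries are listed.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for RuntimeLibcalls.def: the list is written once and expanded
// twice with different definitions of X.
#define MY_LIBCALLS(X)                                                         \
  X(SHL_I16, "__ashlhi3")                                                      \
  X(SHL_I32, "__ashlsi3")

namespace MiniRTLIB {
// First expansion: build the enumeration. UNKNOWN_LIBCALL comes last, so its
// value equals the number of entries and can be used as an array bound.
enum Libcall {
#define X(code, name) code,
  MY_LIBCALLS(X)
#undef X
  UNKNOWN_LIBCALL
};
} // namespace MiniRTLIB

// Hypothetical calling-convention tag; the real code uses CallingConv::ID.
enum class CallConv { C, Other };

struct MiniLibcallInfo {
  const char *Names[MiniRTLIB::UNKNOWN_LIBCALL] = {};
  CallConv Convs[MiniRTLIB::UNKNOWN_LIBCALL] = {};

  void setLibcallName(MiniRTLIB::Libcall LC, const char *Name) {
    Names[LC] = Name;
  }
  void setLibcallCallingConv(MiniRTLIB::Libcall LC, CallConv CC) {
    Convs[LC] = CC;
  }

  MiniLibcallInfo() {
    // Second expansion: fill the name table, indexed by enumerator.
#define X(code, name) setLibcallName(MiniRTLIB::code, name);
    MY_LIBCALLS(X)
#undef X
    // Every libcall starts out with the C calling convention.
    for (int LC = 0; LC < MiniRTLIB::UNKNOWN_LIBCALL; ++LC)
      setLibcallCallingConv(static_cast<MiniRTLIB::Libcall>(LC), CallConv::C);
  }
};
```

Keeping the list in one place means a new library call needs a single added line, and the enumeration and the table cannot drift apart.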

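CmpLibcallCCs, used at line 571, is defined as: ISD::CondCode CmpLibcallCCs[RTLIB::UNKNOWN_LIBCALL]. InitCmpLibcallCCs() thus uses CmpLibcallCCs to associate each comparison function in RTLIB::Libcall with the ISD::CondCode value that reflects its boolean result.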
Original article: https://blog.csdn.net/wuhui_gdnt/article/details/134462064