• LLVM Study Notes (62)


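    4.4.3.3. The X86TargetLowering Sub-object

    At line 314 of the X86Subtarget constructor, the X86TargetLowering constructor is invoked next to build TLInfo, the sub-object of that type inside X86Subtarget.

    This class, derived from TargetLowering, is used by the SelectionDAG-based instruction selector to describe how LLVM code is lowered to SelectionDAG operations. Among other things, it exposes:

    • an initial register class to use for each ValueType,
    • which operations the target machine supports natively,
    • the return type of setcc operations,
    • the type to use for shift amounts, and
    • various high-level characteristics, such as whether it is profitable to turn a division by a constant into a sequence of multiplications.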

    4.4.3.3.1. TargetLowering

    We start with the constructor of the base class, TargetLowering.

    40       TargetLowering::TargetLowering(const TargetMachine &tm)

    41         : TargetLoweringBase(tm) {}

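    The TargetLoweringBase constructor is defined as follows. It establishes baseline settings for all targets; each target can then override the relevant parameters in the constructor of its own TargetLowering subclass.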

    532     TargetLoweringBase::TargetLoweringBase(const TargetMachine &tm) : TM(tm) {

    533       initActions();

    534    

    535       // Perform these initializations only once.

    536       MaxStoresPerMemset = MaxStoresPerMemcpy = MaxStoresPerMemmove =

    537       MaxLoadsPerMemcmp = 8;

    538       MaxGluedStoresPerMemcpy = 0;

    539       MaxStoresPerMemsetOptSize = MaxStoresPerMemcpyOptSize

    540         = MaxStoresPerMemmoveOptSize = MaxLoadsPerMemcmpOptSize = 4;

    541       UseUnderscoreSetJmp = false;

    542       UseUnderscoreLongJmp = false;

    543       HasMultipleConditionRegisters = false;

    544       HasExtractBitsInsn = false;

    545       JumpIsExpensive = JumpIsExpensiveOverride;

    546       PredictableSelectIsExpensive = false;

    547       EnableExtLdPromotion = false;

    548       HasFloatingPointExceptions = true;

    549       StackPointerRegisterToSaveRestore = 0;

    550       BooleanContents = UndefinedBooleanContent;

    551       BooleanFloatContents = UndefinedBooleanContent;

    552       BooleanVectorContents = UndefinedBooleanContent;

    553       SchedPreferenceInfo = Sched::ILP;

    554       JumpBufSize = 0;

    555       JumpBufAlignment = 0;

    556       MinFunctionAlignment = 0;

    557       PrefFunctionAlignment = 0;

    558       PrefLoopAlignment = 0;

    559       GatherAllAliasesMaxDepth = 18;

    560       MinStackArgumentAlignment = 1;

    561       // TODO: the default will be switched to 0 in the next commit, along

    562      // with the Target-specific changes necessary.

    563       MaxAtomicSizeInBitsSupported = 1024;

    564    

    565       MinCmpXchgSizeInBits = 0;

    566       SupportsUnalignedAtomics = false;

    567    

    568       std::fill(std::begin(LibcallRoutineNames), std::end(LibcallRoutineNames), nullptr);

    569    

    570       InitLibcalls(TM.getTargetTriple());

    571       InitCmpLibcallCCs(CmpLibcallCCs);

    572     }

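    Line 533 calls initActions() to initialize the various action tables. As the code below shows, these are OpActions, LoadExtActions, TruncStoreActions, IndexedModeActions and CondCodeActions. They are all arrays of integer type whose elements hold LegalizeAction enumeration values. This enumeration states whether a given operation is legal for a target and, if it is not, what action should be taken to make it legal: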

    43       namespace LegalizeActions {

    44       enum LegalizeAction : std::uint8_t {

    45         /// The operation is expected to be selectable directly by the target, and

    46         /// no transformation is necessary.

    47         Legal,

    48      

    49         /// The operation should be synthesized from multiple instructions acting on

    50         /// a narrower scalar base-type. For example a 64-bit add might be

    51         /// implemented in terms of 32-bit add-with-carry.

    52         NarrowScalar,

    53      

    54         /// The operation should be implemented in terms of a wider scalar

    55         /// base-type. For example a <2 x s8> add could be implemented as a <2

    56         /// x s32> add (ignoring the high bits).

    57         WidenScalar,

    58      

    59         /// The (vector) operation should be implemented by splitting it into

    60         /// sub-vectors where the operation is legal. For example a <8 x s64> add

    61         /// might be implemented as 4 separate <2 x s64> adds.

    62         FewerElements,

    63      

    64         /// The (vector) operation should be implemented by widening the input

    65         /// vector and ignoring the lanes added by doing so. For example <2 x i8> is

    66         /// rarely legal, but you might perform an <8 x i8> and then only look at

    67         /// the first two results.

    68         MoreElements,

    69      

    70         /// The operation itself must be expressed in terms of simpler actions on

    71         /// this target. E.g. a SREM replaced by an SDIV and subtraction.

    72         Lower,

    73      

    74         /// The operation should be implemented as a call to some kind of runtime

    75         /// support library. For example this usually happens on machines that don't

    76         /// support floating-point operations natively.

    77         Libcall,

    78      

    79         /// The target wants to do something special with this combination of

    80         /// operand and type. A callback will be issued when it is needed.

    81         Custom,

    82      

    83         /// This operation is completely unsupported on the target. A programming

    84         /// error has occurred.

    85         Unsupported,

    86      

    87         /// Sentinel value for when no action was found in the specified table.

    88         NotFound,

    89      

    90         /// Fall back onto the old rules.

    91         /// TODO: Remove this once we've migrated

    92         UseLegacyRules,

    93       };

    94       } // end namespace LegalizeActions

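    The arrays are therefore defined as follows. Note that the listing above appears to be the GlobalISel enumeration of the same name; the values that initActions() stores below, such as Expand, belong to TargetLoweringBase::LegalizeAction (Legal, Promote, Expand, LibCall, Custom).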

    • LegalizeAction OpActions[MVT::LAST_VALUETYPE][ISD::BUILTIN_OP_END]

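    For every operation and every value type, this holds a LegalizeAction that tells instruction selection how to handle the operation. Most operations are legal (that is, natively supported by the target), but those that are not must be described. Operations on illegal value types are not considered here.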

    • uint16_t LoadExtActions[MVT::LAST_VALUETYPE][MVT::LAST_VALUETYPE]

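    For every value type and every load-extension type, this holds a LegalizeAction that tells instruction selection how to handle a load involving that value type and its extension type. Each action occupies 4 bits, and the actions for the 4 load-extension types are packed into one uint16_t.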

    • LegalizeAction TruncStoreActions[MVT::LAST_VALUETYPE][MVT::LAST_VALUETYPE]

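    For every pair of value types, this holds a LegalizeAction that indicates whether a truncating store from the first value type to the second (truncated) type is legal.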

    • uint8_t IndexedModeActions[MVT::LAST_VALUETYPE][ISD::LAST_INDEXED_MODE]

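    Here ISD::LAST_INDEXED_MODE is the number of memory-address indexed modes. For every indexed mode and every value type, this holds a pair of LegalizeAction values that tell instruction selection how to handle the load and the store. The first dimension is the value type being referenced; the second is the indexed mode of the load or store.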

    • uint32_t CondCodeActions[ISD::SETCC_INVALID][(MVT::LAST_VALUETYPE + 7) / 8]

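    Here ISD::SETCC_INVALID is the number of condition codes. For every condition code (ISD::CondCode) and every value type, this holds a LegalizeAction that tells instruction selection how to handle that condition code. Each action occupies 4 bits, so one uint32_t covers eight value types.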

    • In addition, TargetDAGCombineArray is another array, declared as:

    unsigned char TargetDAGCombineArray[(ISD::BUILTIN_OP_END+CHAR_BIT-1)/CHAR_BIT]

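    It is a bitmap with one bit per SelectionDAG node type (ISD opcode). A bit set to 1 means that the target wants its PerformDAGCombine() callback to be invoked for nodes of that type during DAG combining.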
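    The two packing schemes described above (a 4-bit action per value type inside a uint32_t, and a one-bit-per-opcode bitmap) can be sketched in isolation. This is only an illustration and not the LLVM code itself: the table sizes, the Action enumeration and the array name CombineBits are assumptions made for the example.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Hypothetical stand-ins for the LLVM enumerations and limits.
enum Action : uint8_t { Legal = 0, Promote, Expand, LibCall, Custom };
constexpr unsigned NumCondCodes = 4;   // assumed; stands in for ISD::SETCC_INVALID
constexpr unsigned NumValueTypes = 20; // assumed; stands in for MVT::LAST_VALUETYPE
constexpr unsigned NumOpcodes = 300;   // assumed; stands in for ISD::BUILTIN_OP_END

// One 4-bit action per value type, eight value types per 32-bit word.
static uint32_t CondCodeActions[NumCondCodes][(NumValueTypes + 7) / 8] = {};

// One bit per opcode, rounded up to whole bytes.
static unsigned char CombineBits[(NumOpcodes + CHAR_BIT - 1) / CHAR_BIT] = {};

void setCondCodeAction(unsigned CC, unsigned VT, Action A) {
  assert(CC < NumCondCodes && VT < NumValueTypes && "index out of range");
  uint32_t Shift = 4 * (VT & 0x7);               // nibble position inside the word
  uint32_t &Word = CondCodeActions[CC][VT >> 3]; // word that holds this value type
  Word &= ~(static_cast<uint32_t>(0xF) << Shift); // clear the old nibble
  Word |= static_cast<uint32_t>(A) << Shift;      // store the new action
}

Action getCondCodeAction(unsigned CC, unsigned VT) {
  assert(CC < NumCondCodes && VT < NumValueTypes && "index out of range");
  uint32_t Shift = 4 * (VT & 0x7);
  return static_cast<Action>((CondCodeActions[CC][VT >> 3] >> Shift) & 0xF);
}

// Request a combine callback for the given opcode by setting its bit.
void setTargetDAGCombine(unsigned Opc) {
  assert(Opc < NumOpcodes && "opcode out of range");
  CombineBits[Opc / CHAR_BIT] |=
      static_cast<unsigned char>(1u << (Opc % CHAR_BIT));
}

// Query whether a combine callback was requested for the given opcode.
bool hasTargetDAGCombine(unsigned Opc) {
  assert(Opc < NumOpcodes && "opcode out of range");
  return ((CombineBits[Opc / CHAR_BIT] >> (Opc % CHAR_BIT)) & 1u) != 0;
}
```

    Because both tables start out as all zeros, every condition code is Legal and no opcode requests a combine callback until a setter is called.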

    574     void TargetLoweringBase::initActions() {

    575       // All operations default to being supported.

    576       memset(OpActions, 0, sizeof(OpActions));

    577       memset(LoadExtActions, 0, sizeof(LoadExtActions));

    578       memset(TruncStoreActions, 0, sizeof(TruncStoreActions));

    579       memset(IndexedModeActions, 0, sizeof(IndexedModeActions));

    580       memset(CondCodeActions, 0, sizeof(CondCodeActions));

    581       std::fill(std::begin(RegClassForVT), std::end(RegClassForVT), nullptr);

    582       std::fill(std::begin(TargetDAGCombineArray),

    583                 std::end(TargetDAGCombineArray), 0);

    584    

    585       // Set default actions for various operations.

    586       for (MVT VT : MVT::all_valuetypes()) {

    587         // Default all indexed load / store to expand.

    588         for (unsigned IM = (unsigned)ISD::PRE_INC;

    589              IM != (unsigned)ISD::LAST_INDEXED_MODE; ++IM) {

    590           setIndexedLoadAction(IM, VT, Expand);

    591           setIndexedStoreAction(IM, VT, Expand);

    592         }

    593    

    594         // Most backends expect to see the node which just returns the value loaded.

    595        setOperationAction(ISD::ATOMIC_CMP_SWAP_WITH_SUCCESS, VT, Expand);

    596    

    597         // These operations default to expand.

    598         setOperationAction(ISD::FGETSIGN, VT, Expand);

    599         setOperationAction(ISD::CONCAT_VECTORS, VT, Expand);

    600         setOperationAction(ISD::FMINNUM, VT, Expand);

    601         setOperationAction(ISD::FMAXNUM, VT, Expand);

    602         setOperationAction(ISD::FMINNAN, VT, Expand);

    603         setOperationAction(ISD::FMAXNAN, VT, Expand);

    604         setOperationAction(ISD::FMAD, VT, Expand);

    605         setOperationAction(ISD::SMIN, VT, Expand);

    606         setOperationAction(ISD::SMAX, VT, Expand);

    607         setOperationAction(ISD::UMIN, VT, Expand);

    608         setOperationAction(ISD::UMAX, VT, Expand);

    609         setOperationAction(ISD::ABS, VT, Expand);

    610    

    611         // Overflow operations default to expand

    612         setOperationAction(ISD::SADDO, VT, Expand);

    613         setOperationAction(ISD::SSUBO, VT, Expand);

    614         setOperationAction(ISD::UADDO, VT, Expand);

    615         setOperationAction(ISD::USUBO, VT, Expand);

    616         setOperationAction(ISD::SMULO, VT, Expand);

    617         setOperationAction(ISD::UMULO, VT, Expand);

    618    

    619         // ADDCARRY operations default to expand

    620         setOperationAction(ISD::ADDCARRY, VT, Expand);

    621         setOperationAction(ISD::SUBCARRY, VT, Expand);

    622         setOperationAction(ISD::SETCCCARRY, VT, Expand);

    623    

    624         // ADDC/ADDE/SUBC/SUBE default to expand.

    625         setOperationAction(ISD::ADDC, VT, Expand);

    626         setOperationAction(ISD::ADDE, VT, Expand);

    627         setOperationAction(ISD::SUBC, VT, Expand);

    628         setOperationAction(ISD::SUBE, VT, Expand);

    629    

    630         // These default to Expand so they will be expanded to CTLZ/CTTZ by default.

    631         setOperationAction(ISD::CTLZ_ZERO_UNDEF, VT, Expand);

    632         setOperationAction(ISD::CTTZ_ZERO_UNDEF, VT, Expand);

    633    

    634         setOperationAction(ISD::BITREVERSE, VT, Expand);

    635    

    636         // These library functions default to expand.

    637         setOperationAction(ISD::FROUND, VT, Expand);

    638         setOperationAction(ISD::FPOWI, VT, Expand);

    639    

    640         // These operations default to expand for vector types.

    641         if (VT.isVector()) {

    642           setOperationAction(ISD::FCOPYSIGN, VT, Expand);

    643           setOperationAction(ISD::ANY_EXTEND_VECTOR_INREG, VT, Expand);

    644           setOperationAction(ISD::SIGN_EXTEND_VECTOR_INREG, VT, Expand);

    645           setOperationAction(ISD::ZERO_EXTEND_VECTOR_INREG, VT, Expand);

    646         }

    647    

    648         // For most targets @llvm.get.dynamic.area.offset just returns 0.

    649         setOperationAction(ISD::GET_DYNAMIC_AREA_OFFSET, VT, Expand);

    650       }

    651    

    652       // Most targets ignore the @llvm.prefetch intrinsic.

    653       setOperationAction(ISD::PREFETCH, MVT::Other, Expand);

    654    

    655      // Most targets also ignore the @llvm.readcyclecounter intrinsic.

    656       setOperationAction(ISD::READCYCLECOUNTER, MVT::i64, Expand);

    657    

    658       // ConstantFP nodes default to expand.  Targets can either change this to

    659       // Legal, in which case all fp constants are legal, or use isFPImmLegal()

    660       // to optimize expansions for certain constants.

    661       setOperationAction(ISD::ConstantFP, MVT::f16, Expand);

    662       setOperationAction(ISD::ConstantFP, MVT::f32, Expand);

    663       setOperationAction(ISD::ConstantFP, MVT::f64, Expand);

    664       setOperationAction(ISD::ConstantFP, MVT::f80, Expand);

    665       setOperationAction(ISD::ConstantFP, MVT::f128, Expand);

    666    

    667       // These library functions default to expand.

    668       for (MVT VT : {MVT::f32, MVT::f64, MVT::f128}) {

    669         setOperationAction(ISD::FLOG ,      VT, Expand);

    670         setOperationAction(ISD::FLOG2,      VT, Expand);

    671         setOperationAction(ISD::FLOG10,     VT, Expand);

    672         setOperationAction(ISD::FEXP ,      VT, Expand);

    673         setOperationAction(ISD::FEXP2,      VT, Expand);

    674         setOperationAction(ISD::FFLOOR,     VT, Expand);

    675         setOperationAction(ISD::FNEARBYINT, VT, Expand);

    676         setOperationAction(ISD::FCEIL,      VT, Expand);

    677         setOperationAction(ISD::FRINT,      VT, Expand);

    678         setOperationAction(ISD::FTRUNC,     VT, Expand);

    679         setOperationAction(ISD::FROUND,     VT, Expand);

    680       }

    681    

    682       // Default ISD::TRAP to expand (which turns it into abort).

    683       setOperationAction(ISD::TRAP, MVT::Other, Expand);

    684    

    685       // On most systems, DEBUGTRAP and TRAP have no difference. The "Expand"

    686       // here is to inform DAG Legalizer to replace DEBUGTRAP with TRAP.

    687       setOperationAction(ISD::DEBUGTRAP, MVT::Other, Expand);

    688     }

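    Lines 576 to 580 zero all of these tables, which makes every action Legal, and lines 582 and 583 clear TargetDAGCombineArray so that no operation calls back into PerformDAGCombine. The rest of the function sets individual operations to Expand; as shown later, the X86 subclass then applies its own overrides.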
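    The pattern of "zero-fill means Legal, then the base class and the target override individual entries" can be sketched as follows. The class names ToyLowering and ToyX86Lowering, the table sizes and the opcode numbers are hypothetical; the only property the sketch relies on is that Legal has the value 0.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Assumed enumerator order; what matters here is that Legal == 0.
enum LegalizeAction : uint8_t { Legal = 0, Promote, Expand, LibCall, Custom };

constexpr unsigned NumValueTypes = 8; // assumed
constexpr unsigned NumOpcodes = 16;   // assumed

struct ToyLowering {
  LegalizeAction OpActions[NumValueTypes][NumOpcodes];

  ToyLowering() {
    // Zero-filling marks every (type, opcode) pair as Legal.
    std::memset(OpActions, 0, sizeof(OpActions));
    // The base class then switches selected operations to Expand
    // for all value types (opcode 5 is an arbitrary example).
    for (unsigned VT = 0; VT != NumValueTypes; ++VT)
      setOperationAction(/*Op=*/5, VT, Expand);
  }

  void setOperationAction(unsigned Op, unsigned VT, LegalizeAction A) {
    assert(Op < NumOpcodes && VT < NumValueTypes && "index out of range");
    OpActions[VT][Op] = A;
  }

  LegalizeAction getOperationAction(unsigned Op, unsigned VT) const {
    assert(Op < NumOpcodes && VT < NumValueTypes && "index out of range");
    return OpActions[VT][Op];
  }
};

// A derived "target" refines the defaults in its own constructor,
// which runs after the base-class constructor.
struct ToyX86Lowering : ToyLowering {
  ToyX86Lowering() { setOperationAction(/*Op=*/3, /*VT=*/2, Custom); }
};
```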

    After initActions() returns, the TargetLoweringBase constructor goes on to initialize the following members.

    • MaxStoresPerMemset
    • MaxLoadsPerMemcmp
    • MaxStoresPerMemcpy
    • MaxStoresPerMemmove

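    When lowering @llvm.memset, @llvm.memcpy and @llvm.memmove, these fields give the maximum number of store operations that may be used to replace a call to memset, memcpy or memmove; MaxLoadsPerMemcmp is the corresponding limit on loads when expanding memcmp. A target must set these values according to its cost threshold. It should be assumed that the target will first use as many of the largest store operations as the alignment restrictions allow and then, if necessary, smaller ones. For example, storing 9 bytes on a 32-bit machine with 16-bit alignment results in four 2-byte stores and one 1-byte store. This applies only to setting a constant array of constant size.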
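    The 9-byte example can be reproduced with a small greedy splitter. This is a simplified sketch, not LLVM's actual selection logic (which weighs more factors); planStores and shouldInline are hypothetical helper names, and the limits 8 and 4 used in the note below are the defaults from the constructor listing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Greedily split Size bytes into stores, widest first.
// MaxStoreBytes is the widest store that alignment and register width
// allow; it must be a power of two.
std::vector<std::size_t> planStores(std::size_t Size,
                                    std::size_t MaxStoreBytes) {
  assert(MaxStoreBytes != 0 &&
         (MaxStoreBytes & (MaxStoreBytes - 1)) == 0 &&
         "store width must be a power of two");
  std::vector<std::size_t> Stores;
  for (std::size_t Width = MaxStoreBytes; Width != 0; Width /= 2) {
    while (Size >= Width) {
      Stores.push_back(Width);
      Size -= Width;
    }
  }
  return Stores;
}

// The call is expanded inline only if the plan fits within the limit.
bool shouldInline(std::size_t Size, std::size_t MaxStoreBytes,
                  std::size_t Limit) {
  return planStores(Size, MaxStoreBytes).size() <= Limit;
}
```

    With 16-bit alignment the widest store is 2 bytes, so planStores(9, 2) yields 2, 2, 2, 2, 1: five stores. That fits the default limit of 8 but exceeds the OptSize limit of 4, so under these assumptions the call would be kept when optimizing for size.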

    • MaxStoresPerMemsetOptSize
    • MaxStoresPerMemcpyOptSize
    • MaxLoadsPerMemcmpOptSize
    • MaxStoresPerMemmoveOptSize

    The same limits on stores (and, for memcmp, on loads), used for functions that carry the OptSize attribute.

    • UseUnderscoreSetJmp, UseUnderscoreLongJmp

    Indicate whether _setjmp or _longjmp should be used to implement llvm.setjmp or llvm.longjmp.

    • MaxGluedStoresPerMemcpy

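    When memcpy is inlined under the MaxStoresPerMemcpy limit, this gives the maximum number of store instructions to keep glued together. This helps later pairing and vectorization.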

    • HasMultipleConditionRegisters

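    Tells the code generator whether the target has multiple (allocatable) condition registers that can hold comparison results. If it does, the code generator will not aggressively sink comparisons into the basic blocks of their users.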

    • HasExtractBitsInsn

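    Tells the code generator whether the target has BitExtract instructions. If it does, the code generator aggressively sinks a shift into the basic block of its users whenever those users would generate an and instruction that can be combined with the shift into a BitExtract instruction.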

    • JumpIsExpensive

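    Tells the code generator that it should not generate extra flow-control instructions and should instead try to combine flow-control instructions through predication.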

    • PredictableSelectIsExpensive

    Tells the code generator that a select is more expensive than a branch when the branch is usually predicted correctly.

    • EnableExtLdPromotion

    Indicates whether the target wants the optimization that turns ext(promotableInst1(...(promotableInstN(load)))) into promotedInst1(...(promotedInstN(ext(load)))).

    • HasFloatingPointExceptions

    Indicates whether the target supports, or cares about preserving, floating-point exception behavior.

    • StackPointerRegisterToSaveRestore

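    If set to a physical register, this specifies the register that llvm.savestack and llvm.restorestack should save and restore (in LLVM IR these intrinsics are named llvm.stacksave and llvm.stackrestore).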

    • BooleanContents
    • BooleanFloatContents
    • BooleanVectorContents

    All three are of the BooleanContent enumeration type, defined as follows:

    140       enum BooleanContent {

    141         UndefinedBooleanContent,    // Only bit 0 counts, the rest can hold garbage.

    142         ZeroOrOneBooleanContent,        // All bits zero except for bit 0.

    143         ZeroOrNegativeOneBooleanContent // All bits equal to bit 0.

    144       };

    Each of them describes the contents of the high bits of boolean values held in a type wider than i1.
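    To make the three conventions concrete, the sketch below shows what a 32-bit register would hold for "true" under each one and how a consumer may test it. The enumeration is copied from the listing above; materializeTrue and isTrue are hypothetical helpers written for this illustration.

```cpp
#include <cassert>
#include <cstdint>

enum BooleanContent {
  UndefinedBooleanContent,        // Only bit 0 counts, the rest can hold garbage.
  ZeroOrOneBooleanContent,        // All bits zero except for bit 0.
  ZeroOrNegativeOneBooleanContent // All bits equal to bit 0.
};

// One valid 32-bit encoding of "true" under each convention.
uint32_t materializeTrue(BooleanContent BC) {
  switch (BC) {
  case ZeroOrNegativeOneBooleanContent:
    return 0xFFFFFFFFu; // every bit mirrors bit 0, i.e. the value -1
  case ZeroOrOneBooleanContent:
    return 1u;
  case UndefinedBooleanContent:
    return 1u; // any value with bit 0 set would be equally valid
  }
  assert(false && "unknown BooleanContent");
  return 1u;
}

// Decide whether Reg holds "true" without relying on undefined bits.
bool isTrue(uint32_t Reg, BooleanContent BC) {
  if (BC == UndefinedBooleanContent)
    return (Reg & 1u) != 0; // high bits may be garbage, so mask them off
  return Reg != 0;          // high bits are well defined, so a plain test works
}
```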

    • SchedPreferenceInfo

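    Indicates the target's scheduling preference, typically chosen to minimize either the total cycle count or the register pressure. Its type is Sched::Preference, an enumeration listing the scheduling strategies LLVM currently supports.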

    95           enum Preference {

    96             None,             // No preference

    97             Source,           // Follow source order.

    98             RegPressure,      // Scheduling for lowest register pressure.

    99             Hybrid,           // Scheduling for both latency and register pressure.

    100           ILP,              // Scheduling for ILP in low register pressure mode.

    101           VLIW              // Scheduling for VLIW targets.

    102         };

    103       }

    • JumpBufSize
    • JumpBufAlignment

    The size in bytes and the alignment requirement of the target's jmp_buf buffer.

    • MinFunctionAlignment
    • PrefFunctionAlignment
    • PrefLoopAlignment

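    Respectively: the minimum alignment any function must have (used when optimizing for size, and to keep an explicitly provided alignment from producing incorrect code), the preferred function alignment (used when no alignment is specified and the code is optimized for speed), and the preferred loop alignment.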

    • MinStackArgumentAlignment

    The minimum alignment required for any argument passed on the stack.

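    The container LibcallRoutineNames at line 568 is declared as const char *LibcallRoutineNames[RTLIB::UNKNOWN_LIBCALL], where RTLIB::UNKNOWN_LIBCALL is the number of runtime library calls the backend can emit. These library functions are described by the RTLIB::Libcall enumeration. The table is filled in from a definition file by the following method: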

    118     void TargetLoweringBase::InitLibcalls(const Triple &TT) {

    119     #define HANDLE_LIBCALL(code, name) \

    120       setLibcallName(RTLIB::code, name);

    121     #include "llvm/IR/RuntimeLibcalls.def"

    122     #undef HANDLE_LIBCALL

    123       // Initialize calling conventions to their default.

    124       for (int LC = 0; LC < RTLIB::UNKNOWN_LIBCALL; ++LC)

    125         setLibcallCallingConv((RTLIB::Libcall)LC, CallingConv::C);

    126    

    127       // A few names are different on particular architectures or environments.

    128       if (TT.isOSDarwin()) {

    129         // For f16/f32 conversions, Darwin uses the standard naming scheme, instead

    130         // of the gnueabi-style __gnu_*_ieee.

    131         // FIXME: What about other targets?

    132         setLibcallName(RTLIB::FPEXT_F16_F32, "__extendhfsf2");

    133         setLibcallName(RTLIB::FPROUND_F32_F16, "__truncsfhf2");

    134    

    135         // Some darwins have an optimized __bzero/bzero function.

    136         switch (TT.getArch()) {

    137         case Triple::x86:

    138         case Triple::x86_64:

    139           if (TT.isMacOSX() && !TT.isMacOSXVersionLT(10, 6))

    140             setLibcallName(RTLIB::BZERO, "__bzero");

    141           break;

    142         case Triple::aarch64:

    143           setLibcallName(RTLIB::BZERO, "bzero");

    144           break;

    145         default:

    146           break;

    147         }

    148    

    149         if (darwinHasSinCos(TT)) {

    150           setLibcallName(RTLIB::SINCOS_STRET_F32, "__sincosf_stret");

    151           setLibcallName(RTLIB::SINCOS_STRET_F64, "__sincos_stret");

    152           if (TT.isWatchABI()) {

    153             setLibcallCallingConv(RTLIB::SINCOS_STRET_F32,

    154                                   CallingConv::ARM_AAPCS_VFP);

    155             setLibcallCallingConv(RTLIB::SINCOS_STRET_F64,

    156                                   CallingConv::ARM_AAPCS_VFP);

    157           }

    158         }

    159       } else {

    160         setLibcallName(RTLIB::FPEXT_F16_F32, "__gnu_h2f_ieee");

    161         setLibcallName(RTLIB::FPROUND_F32_F16, "__gnu_f2h_ieee");

    162       }

    163    

    164       if (TT.isGNUEnvironment() || TT.isOSFuchsia()) {

    165         setLibcallName(RTLIB::SINCOS_F32, "sincosf");

    166         setLibcallName(RTLIB::SINCOS_F64, "sincos");

    167         setLibcallName(RTLIB::SINCOS_F80, "sincosl");

    168         setLibcallName(RTLIB::SINCOS_F128, "sincosl");

    169         setLibcallName(RTLIB::SINCOS_PPCF128, "sincosl");

    170       }

    171    

    172       if (TT.isOSOpenBSD()) {

    173         setLibcallName(RTLIB::STACKPROTECTOR_CHECK_FAIL, nullptr);

    174       }

    175     }

    The definition file RuntimeLibcalls.def lists the runtime library calls that the backend can generate. Its entries look like this:

    HANDLE_LIBCALL(SHL_I16, "__ashlhi3")

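    The macro defined at the top of InitLibcalls() expands each entry into a reference to an enumerator such as RTLIB::SHL_I16. The enumerators themselves are also generated from RuntimeLibcalls.def (in RuntimeLibcalls.h). setLibcallName() therefore records each function name using the corresponding enumerator as the index.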

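    setLibcallCallingConv() likewise records, indexed by the same enumerators, the calling convention each function uses. The default is always CallingConv::C, LLVM's default calling convention, which is compatible with the C calling convention.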
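    The way one list of entries drives both the enumeration and the name table is the X-macro technique. The sketch below imitates it without a separate .def file: the list macro TOY_LIBCALLS, the struct ToyLibcallTable and the two-value CallingConv enumeration are assumptions made so that the example is self-contained. Only the SHL_I16 entry is taken from the text above.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for RuntimeLibcalls.def: one X(code, name) entry per libcall.
#define TOY_LIBCALLS(X)   \
  X(SHL_I16, "__ashlhi3") \
  X(SHL_I32, "__ashlsi3")

namespace RTLIB {
// First expansion: each entry contributes one enumerator.
enum Libcall {
#define HANDLE_LIBCALL(code, name) code,
  TOY_LIBCALLS(HANDLE_LIBCALL)
#undef HANDLE_LIBCALL
  UNKNOWN_LIBCALL // one past the last entry, i.e. the number of libcalls
};
} // namespace RTLIB

enum class CallingConv { C, Other }; // simplified, hypothetical

struct ToyLibcallTable {
  const char *Names[RTLIB::UNKNOWN_LIBCALL];
  CallingConv Convs[RTLIB::UNKNOWN_LIBCALL];

  ToyLibcallTable() {
    // Second expansion: each entry becomes one setLibcallName() call.
#define HANDLE_LIBCALL(code, name) setLibcallName(RTLIB::code, name);
    TOY_LIBCALLS(HANDLE_LIBCALL)
#undef HANDLE_LIBCALL
    // Every libcall starts out with the C calling convention.
    for (int LC = 0; LC < RTLIB::UNKNOWN_LIBCALL; ++LC)
      Convs[LC] = CallingConv::C;
  }

  void setLibcallName(RTLIB::Libcall LC, const char *Name) {
    assert(LC < RTLIB::UNKNOWN_LIBCALL && "libcall out of range");
    Names[LC] = Name;
  }

  bool hasName(RTLIB::Libcall LC, const char *Expected) const {
    assert(LC < RTLIB::UNKNOWN_LIBCALL && "libcall out of range");
    return std::strcmp(Names[LC], Expected) == 0;
  }
};
```

    Adding a libcall then means adding a single entry to the list; the enumerator and the table slot follow automatically.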

    CmpLibcallCCs at line 571 is declared as ISD::CondCode CmpLibcallCCs[RTLIB::UNKNOWN_LIBCALL]. InitCmpLibcallCCs() uses it to associate each comparison function in RTLIB::Libcall with the ISD::CondCode value that reflects its boolean result.
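    A rough sketch of such a table follows. It assumes that a comparison libcall returns an integer and that the stored condition code says how to compare that integer against zero to obtain the boolean result. The reduced enumerations, the struct ToyCmpTable and the interpret() helper are hypothetical and cover only a few entries.

```cpp
#include <cassert>

// Reduced, hypothetical versions of ISD::CondCode and RTLIB::Libcall.
enum CondCode { SETEQ, SETNE, SETLT, SETGE, SETCC_INVALID };
enum Libcall { OEQ_F32, UNE_F32, OLT_F32, ADD_F32, UNKNOWN_LIBCALL };

struct ToyCmpTable {
  CondCode CmpLibcallCCs[UNKNOWN_LIBCALL];

  ToyCmpTable() {
    // Libcalls that are not comparisons keep the invalid marker.
    for (CondCode &CC : CmpLibcallCCs)
      CC = SETCC_INVALID;
    CmpLibcallCCs[OEQ_F32] = SETEQ; // "equal" holds if result == 0
    CmpLibcallCCs[UNE_F32] = SETNE; // "not equal" holds if result != 0
    CmpLibcallCCs[OLT_F32] = SETLT; // "less than" holds if result < 0
  }

  // Turn the integer returned by a comparison libcall into a boolean.
  bool interpret(Libcall LC, int Result) const {
    assert(LC < UNKNOWN_LIBCALL && "libcall out of range");
    switch (CmpLibcallCCs[LC]) {
    case SETEQ: return Result == 0;
    case SETNE: return Result != 0;
    case SETLT: return Result < 0;
    case SETGE: return Result >= 0;
    case SETCC_INVALID: break;
    }
    assert(false && "not a comparison libcall");
    return false;
  }
};
```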

     

  • Original article: https://blog.csdn.net/wuhui_gdnt/article/details/134462064