• ARM64 inline assembly


    As stated in "Using the GNU Compiler Collection (For GCC version 12.2.0)":
    The asm keyword allows you to embed assembler instructions within C code. GCC provides two forms of inline asm statements. A basic asm statement is one with no operands (see Section 6.47.1 [Basic Asm], page 652), while an extended asm statement (see Section 6.47.2 [Extended Asm], page 653) includes one or more operands. The extended form is preferred for mixing C and assembly language within a function, but to include assembly language at top level you must use basic asm.
    You can also use the asm keyword to override the assembler name for a C symbol, or to place a C variable in a specific register.
    Note: the manual can be downloaded as: Using the GNU Compiler Collection

    1 Basic Asm — Assembler Instructions Without Operands

    1.1 A basic asm statement has the following syntax:

    asm asm-qualifiers ( AssemblerInstructions )
    

    For the C language, the asm keyword is a GNU extension. When writing C code that can be compiled with ‘-ansi’ and the ‘-std’ options that select C dialects without GNU extensions, use __asm__ instead of asm (see Section 6.48 [Alternate Keywords], page 708). For the C++ language, asm is a standard keyword, but __asm__ can be used for code compiled with ‘-fno-asm’.
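    As a minimal sketch of the alternate keyword, the function below uses __asm__ so that it still compiles under ‘-std=c99’ or ‘-ansi’, where plain asm is not recognized. The name run_nop is hypothetical, and the example assumes a target whose assembler accepts a nop mnemonic (this holds for AArch64 and x86).

```c
/* Compiles with strict ISO dialects because __asm__ is always available. */
static int run_nop(void)
{
    __asm__ ("nop");   /* basic asm: no operands, implicitly volatile */
    return 1;          /* reached once the nop has executed */
}
```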

    1.2 Qualifiers

    volatile The optional volatile qualifier has no effect. All basic asm blocks are implicitly volatile.
    inline If you use the inline qualifier, then for inlining purposes the size of the asm statement is taken as the smallest size possible (see Section 6.47.6 [Size of an asm], page 708).

    1.3 examples

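    Code path: arch/arm64/include/asm/barrier.h
    Strictly speaking, these macros use the extended asm syntax with empty operand lists, because the colons introduce a "memory" clobber; the assembler templates themselves take no operands.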

    #define sev()           asm volatile("sev" : : : "memory")
    #define wfe()           asm volatile("wfe" : : : "memory")
    #define wfi()           asm volatile("wfi" : : : "memory")
    
    #define isb()           asm volatile("isb" : : : "memory")
    #define dmb(opt)        asm volatile("dmb " #opt : : : "memory")
    #define dsb(opt)        asm volatile("dsb " #opt : : : "memory")
    
    #define psb_csync()     asm volatile("hint #17" : : : "memory")
    #define __tsb_csync()   asm volatile("hint #18" : : : "memory")
    #define csdb()          asm volatile("hint #20" : : : "memory")
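
    The dmb(opt) and dsb(opt) macros above rely on the preprocessor's # operator to paste the barrier option into the instruction string. The sketch below isolates that mechanism so it can be checked on any host; DMB_TEXT, compiler_barrier and dmb_text_is are hypothetical names, not kernel macros.

```c
#include <string.h>

/* #opt turns the macro argument into a string literal, and adjacent
 * string literals are concatenated: DMB_TEXT(ish) becomes "dmb ish". */
#define DMB_TEXT(opt) "dmb " #opt

/* An empty template with a "memory" clobber emits no instruction but
 * stops the compiler from reordering memory accesses across it. */
#define compiler_barrier() __asm__ __volatile__("" : : : "memory")

/* Returns 1 when the text built for the "ish" option equals expected. */
static int dmb_text_is(const char *expected)
{
    compiler_barrier();
    return strcmp(DMB_TEXT(ish), expected) == 0;
}
```

    The same string, placed inside asm volatile(...), is what the assembler finally sees for dmb(ish).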
    

    2 Extended Asm - Assembler Instructions with C Expression Operands

    2.1 Extended Asm - syntax

    With extended asm you can read and write C variables from assembler and perform jumps from assembler code to C labels. Extended asm syntax uses colons (‘:’) to delimit the operand parameters after the assembler template:

    asm asm-qualifiers ( AssemblerTemplate
    				: OutputOperands
    				[ : InputOperands
    				[ : Clobbers ] ])
    

    or:

    asm asm-qualifiers ( AssemblerTemplate
    					: OutputOperands
    					: InputOperands
    					: Clobbers
    					: GotoLabels)
    

    2.2 Qualifiers

    volatile The typical use of extended asm statements is to manipulate input values to produce output values. However, your asm statements may also produce side effects. If so, you may need to use the volatile qualifier to disable certain optimizations. See [Volatile], page 655.
    inline If you use the inline qualifier, then for inlining purposes the size of the asm statement is taken as the smallest size possible (see Section 6.47.6 [Size of an asm], page 708).
    goto This qualifier informs the compiler that the asm statement may perform a jump to one of the labels listed in the GotoLabels. See [GotoLabels], page 667.

    2.3 Parameters

    2.3.1 AssemblerTemplate

    This is a literal string that is the template for the assembler code. It is a combination of fixed text and tokens that refer to the input, output, and goto parameters. See [AssemblerTemplate], page 657.

    2.3.2 OutputOperands

    int add(int a, int b)
    {
         int sum;
         __asm__ volatile (
                 "add %w0, %w1, %w2"	/* %w prints the 32-bit W register names for int operands */
                 :"=r"(sum)		/* OutputOperands */
                 :"r"(a), "r"(b)
                 :"cc"
         );
    
         return sum;
    }
    
    2.3.2.1 explanation

    A comma-separated list of the C variables modified by the instructions in the AssemblerTemplate. An empty list is permitted.
    An asm statement has zero or more output operands indicating the names of C variables modified by the assembler code.

    Operands are separated by commas. Each operand has this format:

    [ [asmSymbolicName] ] constraint (cvariablename)
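
    The optional asmSymbolicName lets the template refer to operands by name instead of by position. The following is a sketch only: add_named and the operand names sum, lhs and rhs are hypothetical, and the plain C fallback is an assumption added so the function also builds on non-AArch64 hosts.

```c
/* Adds two ints, naming each operand instead of using %0, %1, %2. */
static int add_named(int a, int b)
{
    int sum;
#if defined(__aarch64__)
    __asm__ ("add %w[sum], %w[lhs], %w[rhs]"
             : [sum] "=r" (sum)                 /* output operand */
             : [lhs] "r" (a), [rhs] "r" (b));   /* input operands */
#else
    sum = a + b;   /* portable fallback for non-AArch64 builds */
#endif
    return sum;
}
```

    Named operands keep the template readable when operands are added or reordered later.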
    
    2.3.2.2 Flag Output Operands

    Some targets have a special register that holds the “flags” for the result of an operation or comparison. Normally, the contents of that register are either unmodified by the asm, or the asm statement is considered to clobber the contents.
    On some targets, a special form of output operand exists by which conditions in the flags register may be outputs of the asm. The set of conditions supported are target specific, but the general rule is that the output variable must be a scalar integer, and the value is boolean. When supported, the target defines the preprocessor symbol __GCC_ASM_FLAG_OUTPUTS__.
    Because of the special nature of the flag output operands, the constraint may not include alternatives.
    Most often, the target has only one flags register, and thus is an implied operand of many instructions. In this case, the operand should not be referenced within the assembler template via %0 etc, as there’s no corresponding text in the assembly language.
    ARM, AArch64
    The flag output constraints for the ARM family are of the form ‘=@cccond’ where cond is one of the standard conditions defined in the ARM ARM for ConditionHolds.
    eq      Z flag set, or equal
    ne      Z flag clear, or not equal
    cs/hs   C flag set, or unsigned greater than or equal
    cc/lo   C flag clear, or unsigned less than
    mi      N flag set, or “minus”
    pl      N flag clear, or “plus”
    vs      V flag set, or signed overflow
    vc      V flag clear
    hi      unsigned greater than
    ls      unsigned less than or equal
    ge      signed greater than or equal
    lt      signed less than
    gt      signed greater than
    le      signed less than or equal
    The flag output constraints are not supported in thumb1 mode.
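
    A flag output operand can be sketched as follows, assuming an AArch64 compiler that defines __GCC_ASM_FLAG_OUTPUTS__. The function name is_equal is hypothetical, and the C fallback is an assumption added for toolchains without flag outputs.

```c
/* Returns 1 when a == b, reading the result straight from the Z flag. */
static int is_equal(long a, long b)
{
#if defined(__aarch64__) && defined(__GCC_ASM_FLAG_OUTPUTS__)
    int eq;
    /* cmp sets NZCV; "=@cceq" asks the compiler to turn the eq condition
     * into 0 or 1. The flag operand is never referenced in the template. */
    __asm__ ("cmp %1, %2" : "=@cceq" (eq) : "r" (a), "r" (b));
    return eq;
#else
    return a == b;   /* fallback when flag outputs are unavailable */
#endif
}
```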

    2.3.3 InputOperands

    int add(int a, int b)
    {
         int sum;
         __asm__ volatile (
                 "add %w0, %w1, %w2"	/* %w prints the 32-bit W register names for int operands */
                 :"=r"(sum)	
                 :"r"(a), "r"(b)	/* InputOperands */
                 :"cc"
         );
    
         return sum;
    }
    

    A comma-separated list of C expressions read by the instructions in the AssemblerTemplate. An empty list is permitted.
    Input operands make values from C variables and expressions available to the assembly code.
    Operands are separated by commas. Each operand has this format:

    [ [asmSymbolicName] ] constraint (cexpression)
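
    Because an input operand is a C expression and not only a variable name, the compiler can evaluate it into a register before the asm runs. A small sketch, with the hypothetical name twice_plus_one and an assumed C fallback for other targets:

```c
/* Passes the value of an arithmetic expression to the asm as one input. */
static int twice_plus_one(int a)
{
    int out;
#if defined(__aarch64__)
    /* The compiler computes a * 2 + 1 first and hands the result to %w1. */
    __asm__ ("mov %w0, %w1" : "=r" (out) : "r" (a * 2 + 1));
#else
    out = a * 2 + 1;   /* portable fallback for non-AArch64 builds */
#endif
    return out;
}
```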
    

    2.3.4 Clobbers and Scratch Registers

    2.3.4.1 example
    int add(int a, int b)
    {
         int sum;
         __asm__ volatile (
                 "add %w0, %w1, %w2"	/* %w prints the 32-bit W register names for int operands */
                 :"=r"(sum)	
                 :"r"(a), "r"(b)
                 :"cc"	/* Clobbers */
         );
    
         return sum;
    }
    
    2.3.4.2 explanation

    A comma-separated list of registers or other values changed by the AssemblerTemplate, beyond those listed as outputs. An empty list is permitted.
    While the compiler is aware of changes to entries listed in the output operands, the inline asm code may modify more than just the outputs. For example, calculations may require additional registers, or the processor may overwrite a register as a side effect of a particular assembler instruction. In order to inform the compiler of these changes, list them in the clobber list. Clobber list items are either register names or the special clobbers (listed below). Each clobber list item is a string constant enclosed in double quotes and separated by commas.

    2.3.4.3 two special clobber arguments:

    “cc” The “cc” clobber indicates that the assembler code modifies the flags register. On some machines, GCC represents the condition codes as a specific hardware register; “cc” serves to name this register. On other machines, condition code handling is different, and specifying “cc” has no effect. But it is valid no matter what the target.
    “memory” The “memory” clobber tells the compiler that the assembly code performs memory reads or writes to items other than those listed in the input and output operands (for example, accessing the memory pointed to by one of the input parameters). To ensure memory contains correct values, GCC may need to flush specific register values to memory before executing the asm. Further, the compiler does not assume that any values read from memory before an asm remain unchanged after that asm; it reloads them as needed. Using the “memory” clobber effectively forms a read/write memory barrier for the compiler.
    Note that this clobber does not prevent the processor from doing speculative reads past the asm statement. To prevent that, you need processor-specific fence instructions.
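
    The three kinds of clobber can be illustrated with the sketch below. The names add3 and store_int are hypothetical, x9 is an arbitrary choice of scratch register, and the C fallbacks are assumptions for non-AArch64 builds.

```c
/* Uses w9 as a scratch register and adds (which updates NZCV), so both
 * the register and the flags must appear in the clobber list. */
static int add3(int a, int b, int c)
{
    int out;
#if defined(__aarch64__)
    __asm__ ("add  w9, %w1, %w2\n\t"
             "adds %w0, w9, %w3"
             : "=r" (out)
             : "r" (a), "r" (b), "r" (c)
             : "x9", "cc");
#else
    out = a + b + c;   /* portable fallback */
#endif
    return out;
}

/* Writes through a pointer that is passed only as an address, so the
 * compiler must be told that memory changes behind its back. */
static void store_int(int *p, int v)
{
#if defined(__aarch64__)
    __asm__ volatile ("str %w1, [%0]" : : "r" (p), "r" (v) : "memory");
#else
    *p = v;            /* portable fallback */
#endif
}
```

    Without the "memory" clobber in store_int, the compiler would be free to keep a stale copy of *p in a register after the asm.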

    2.3.5 GotoLabels

    2.3.5.1 explanation

    When you are using the goto form of asm, this section contains the list of all C labels to which the code in the AssemblerTemplate may jump.
    asm statements may not perform jumps into other asm statements, only to the listed GotoLabels. GCC’s optimizers do not know about other jumps; therefore they cannot take account of them when deciding how to optimize.
    asm goto allows assembly code to jump to one or more C labels. The GotoLabels section in an asm goto statement contains a comma-separated list of all C labels to which the assembler code may jump. GCC assumes that asm execution falls through to the next statement (if this is not the case, consider using the __builtin_unreachable intrinsic after the asm statement). Optimization of asm goto may be improved by using the hot and cold label attributes.
    If the assembler code does modify anything, use the “memory” clobber to force the optimizers to flush all register values to memory and reload them if necessary after the asm statement.
    Also note that an asm goto statement is always implicitly considered volatile.

    2.3.5.2 example

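    The following example shows an asm goto that uses a memory clobber. The frob instruction is a placeholder taken from the GCC manual, so this fragment illustrates the syntax and is not meant to assemble as written.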

    int frob(int x)
    {
    	int y;
    	
    	asm goto ("frob %%r5, %1; jc %l[error]; mov (%2), %%r5"
    				: /* No outputs. */
    				: "r"(x), "r"(&y)
    				: "r5", "memory"
    				: error);
    	return y;
    	
    error:
    	return -1;
    }
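
    For a variant that does assemble on AArch64, the sketch below branches to a C label with cbz. The function name is_zero_goto and the label name zero are hypothetical, and the C fallback is an assumption for other targets.

```c
/* Returns 1 when x is zero by letting the asm jump to a C label. */
static int is_zero_goto(int x)
{
#if defined(__aarch64__)
    /* No outputs, one input, no clobbers, one goto label.
     * %l[zero] expands to the assembler name of the C label. */
    __asm__ goto ("cbz %w0, %l[zero]"
                  : /* no outputs */
                  : "r" (x)
                  : /* no clobbers */
                  : zero);
    return 0;          /* fall-through path: x was non-zero */
zero:
    return 1;
#else
    return x == 0;     /* portable fallback */
#endif
}
```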
    

    2.4 examples

    2.4.1 The add inline assembly function

    [Figure: the add inline assembly function]

    2.4.2 arch_local_save_flags and arch_irqs_disabled_flags

    Code path: arch/arm64/include/asm/irqflags.h

    2.4.2.1 arch_local_save_flags
    /*
     * Save the current interrupt enable state.
     */
    static inline unsigned long arch_local_save_flags(void)
    {
            unsigned long flags;
    
            asm volatile(ALTERNATIVE(
                    "mrs    %0, daif",
                    __mrs_s("%0", SYS_ICC_PMR_EL1),
                    ARM64_HAS_IRQ_PRIO_MASKING)
                    : "=&r" (flags)
                    :
                    : "memory");                                                                                                                            
    
            return flags;
    }
    
    2.4.2.2 arch_irqs_disabled_flags
    static inline int arch_irqs_disabled_flags(unsigned long flags)
    {
            int res;
    
            asm volatile(ALTERNATIVE(
                    "and    %w0, %w1, #" __stringify(PSR_I_BIT),
                    "eor    %w0, %w1, #" __stringify(GIC_PRIO_IRQON),
                    ARM64_HAS_IRQ_PRIO_MASKING)
                    : "=&r" (res)
                    : "r" ((int) flags)
                    : "memory");
    
            return res;
    }
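
    What the two alternatives in arch_irqs_disabled_flags compute can be modelled in plain C. This is a sketch under stated assumptions: DEMO_PSR_I_BIT is assumed to be 0x80 (the DAIF I bit), the irqon argument stands in for GIC_PRIO_IRQON, whose value depends on the kernel version, and both function names are hypothetical.

```c
#define DEMO_PSR_I_BIT 0x80   /* assumed value of PSR_I_BIT (DAIF.I) */

/* Models "and %w0, %w1, #PSR_I_BIT": non-zero when the I bit is set,
 * meaning IRQs are masked. */
static int irqs_disabled_daif(unsigned long flags)
{
    return (int)flags & DEMO_PSR_I_BIT;
}

/* Models "eor %w0, %w1, #GIC_PRIO_IRQON": zero only when the saved
 * priority mask equals the "IRQs on" value. */
static int irqs_disabled_pmr(unsigned long pmr, int irqon)
{
    return (int)pmr ^ irqon;
}
```

    In both cases a non-zero result means interrupts are disabled, which is why the kernel can patch either instruction into the same slot.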
    

    3 Constraints for asm Operands

    Here are specific details on what constraint letters you can use with asm operands. Constraints can say whether an operand may be in a register, and which kinds of register; whether the operand can be a memory reference, and which kinds of address; whether the operand may be an immediate constant, and which possible values it may have. Constraints can also require two operands to match. Side-effects aren’t allowed in operands of inline asm, unless ‘<’ or ‘>’ constraints are used, because there is no guarantee that the side effects will happen exactly once in an instruction that can update the addressing register.

    3.1 Simple Constraints

    The simplest kind of constraint is a string full of letters, each of which describes one kind of operand that is permitted. Here are the letters that are allowed:

    3.1.1 whitespace

    Whitespace characters are ignored and can be inserted at any position except the first. This enables each alternative for different operands to be visually aligned in the machine description even if they have different number of constraints and modifiers.

    3.1.2 ‘m’

    A memory operand is allowed, with any kind of address that the machine supports in general. Note that the letter used for the general memory constraint can be re-defined by a back end using the TARGET_MEM_CONSTRAINT macro.
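
    As a sketch of the ‘m’ constraint on AArch64, the function below lets the compiler choose the addressing mode for a load. The name load_int is hypothetical and the C fallback is an assumption for other targets.

```c
/* Loads *p with ldr, describing the source as a memory operand. */
static int load_int(const int *p)
{
    int v;
#if defined(__aarch64__)
    /* "m"(*p) names the exact object that is read, so no "memory"
     * clobber is needed; %1 is printed as an address such as [x0]. */
    __asm__ ("ldr %w0, %1" : "=r" (v) : "m" (*p));
#else
    v = *p;   /* portable fallback */
#endif
    return v;
}
```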

    3.1.3 ‘o’

    A memory operand is allowed, but only if the address is offsettable. This means that a small integer (actually, the width in bytes of the operand, as determined by its machine mode) may be added to the address and the result is also a valid memory address.
    For example, an address which is constant is offsettable; so is an address that is the sum of a register and a constant (as long as a slightly larger constant is also within the range of address-offsets supported by the machine); but an autoincrement or autodecrement address is not offsettable. More complicated indirect/indexed addresses may or may not be offsettable depending on the other addressing modes that the machine supports.
    Note that in an output operand which can be matched by another operand, the constraint letter ‘o’ is valid only when accompanied by both ‘<’ (if the target machine has predecrement addressing) and ‘>’ (if the target machine has preincrement addressing).

    3.1.4 ‘V’

    A memory operand that is not offsettable. In other words, anything that would fit the ‘m’ constraint but not the ‘o’ constraint.

    3.1.5 ‘<’

    A memory operand with autodecrement addressing (either predecrement or postdecrement) is allowed. In inline asm this constraint is only allowed if the operand is used exactly once in an instruction that can handle the side effects.
    Not using an operand with ‘<’ in constraint string in the inline asm pattern at all or using it in multiple instructions isn’t valid, because the side effects wouldn’t be performed or would be performed more than once. Furthermore, on some targets the operand with ‘<’ in constraint string must be accompanied by special instruction suffixes like %U0 instruction suffix on PowerPC or %P0 on IA-64.

    3.1.6 ‘>’

    A memory operand with autoincrement addressing (either preincrement or postincrement) is allowed. In inline asm the same restrictions as for ‘<’ apply.

    3.1.7 ‘r’

    A register operand is allowed provided that it is in a general register.

    3.1.8 ‘i’

    An immediate integer operand (one with constant value) is allowed. This includes symbolic constants whose values will be known only at assembly time or later.
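
    An immediate operand can be sketched like this on AArch64. The name add_sixteen and the constant 16 are hypothetical choices, and the C fallback is an assumption for other targets.

```c
/* Adds a compile-time constant without spending a register on it. */
static int add_sixteen(int x)
{
    int out;
#if defined(__aarch64__)
    /* "i"(16) is substituted into the template as a literal, so the
     * third operand of add is an immediate, not a register. */
    __asm__ ("add %w0, %w1, %2" : "=r" (out) : "r" (x), "i" (16));
#else
    out = x + 16;   /* portable fallback */
#endif
    return out;
}
```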

    3.1.9 ‘n’

    An immediate integer operand with a known numeric value is allowed. Many systems cannot support assembly-time constants for operands less than a word wide. Constraints for these operands should use ‘n’ rather than ‘i’.

    3.1.10 ‘I’, ‘J’, ‘K’, . . . ‘P’

    Other letters in the range ‘I’ through ‘P’ may be defined in a machine-dependent fashion to permit immediate integer operands with explicit integer values in specified ranges. For example, on the 68000, ‘I’ is defined to stand for the range of values 1 to 8. This is the range permitted as a shift count in the shift instructions.

    3.1.11 ‘E’

    An immediate floating operand (expression code const_double) is allowed, but only if the target floating point format is the same as that of the host machine (on which the compiler is running).

    3.1.12 ‘F’

    An immediate floating operand (expression code const_double or const_vector) is allowed.

    3.1.13 ‘G’, ‘H’

    ‘G’ and ‘H’ may be defined in a machine-dependent fashion to permit immediate floating operands in particular ranges of values.

    3.1.14 ‘s’

    An immediate integer operand whose value is not an explicit integer is allowed. This might appear strange; if an insn allows a constant operand with a value not known at compile time, it certainly must allow any known value. So why use ‘s’ instead of ‘i’? Sometimes it allows better code to be generated.
    For example, on the 68000 in a fullword instruction it is possible to use an immediate operand; but if the immediate value is between −128 and 127, better code results from loading the value into a register and using the register. This is because the load into the register can be done with a ‘moveq’ instruction. We arrange for this to happen by defining the letter ‘K’ to mean “any integer outside the range −128 to 127”, and then specifying ‘Ks’ in the operand constraints.

    3.1.15 ‘g’

    Any register, memory or immediate integer operand is allowed, except for registers that are not general registers.

    3.1.16 ‘X’

    Any operand whatsoever is allowed.

    3.1.17 ‘0’, ‘1’, ‘2’, . . . ‘9’

    An operand that matches the specified operand number is allowed. If a digit is used together with letters within the same alternative, the digit should come last.
    This number is allowed to be more than a single digit. If multiple digits are encountered consecutively, they are interpreted as a single decimal integer. There is scant chance for ambiguity, since to-date it has never been desirable that ‘10’ be interpreted as matching either operand 1 or operand 0. Should this be desired, one can use multiple alternatives instead.
    This is called a matching constraint and what it really means is that the assembler has only a single operand that fills two roles which asm distinguishes. For example, an add instruction uses two input operands and an output operand, but on most CISC machines an add instruction really has only two operands, one of them an input-output operand:

    addl #35,r12
    

    Matching constraints are used in these circumstances. More precisely, the two operands that match must include one input-only operand and one output-only operand. Moreover, the digit must be a smaller number than the number of the operand that uses it in the constraint.
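
    A matching constraint can be sketched as follows. The name accumulate is hypothetical and the C fallback is an assumption for non-AArch64 builds.

```c
/* Adds delta to total in place using a matching constraint. */
static int accumulate(int total, int delta)
{
#if defined(__aarch64__)
    /* Operand 1 uses "0", so it is placed in the same register as
     * output operand 0; operand 2 is delta. */
    __asm__ ("add %w0, %w0, %w2"
             : "=r" (total)
             : "0" (total), "r" (delta));
#else
    total += delta;   /* portable fallback */
#endif
    return total;
}
```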

    3.1.18 ‘p’

    An operand that is a valid memory address is allowed. This is for “load address” and “push address” instructions. ‘p’ in the constraint must be accompanied by address_operand as the predicate in the match_operand. This predicate interprets the mode specified in the match_operand as the mode of the memory reference for which the address would be valid.

    3.2 Constraint Modifier Characters

    Here are constraint modifier characters.

    3.2.1 ‘=’

    Means that this operand is written to by this instruction: the previous value is discarded and replaced by new data.

    3.2.2 ‘+’

    Means that this operand is both read and written by the instruction.
    When the compiler fixes up the operands to satisfy the constraints, it needs to know which operands are read by the instruction and which are written by it.
    ‘=’ identifies an operand which is only written; ‘+’ identifies an operand that is both read and written; all other operands are assumed to only be read.
    If you specify ‘=’ or ‘+’ in a constraint, you put it in the first character of the constraint string.
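
    The difference between ‘=’ and ‘+’ shows up when the asm updates a value in place. A sketch with the hypothetical name increment_by and an assumed C fallback:

```c
/* Reads and writes value through a single "+r" operand. */
static int increment_by(int value, int step)
{
#if defined(__aarch64__)
    /* "+r" tells the compiler that %w0 holds a live value on entry and
     * a new value on exit; "=r" would let it discard the old value. */
    __asm__ ("add %w0, %w0, %w1" : "+r" (value) : "r" (step));
#else
    value += step;   /* portable fallback */
#endif
    return value;
}
```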

    3.2.3 ‘&’

    Means (in a particular alternative) that this operand is an earlyclobber operand, which is written before the instruction is finished using the input operands. Therefore, this operand may not lie in a register that is read by the instruction or as part of any memory address.
    ‘&’ applies only to the alternative in which it is written. In constraints with multiple alternatives, sometimes one alternative requires ‘&’ while others do not. See, for example, the ‘movdf’ insn of the 68000.
    An operand which is read by the instruction can be tied to an earlyclobber operand if its only use as an input occurs before the early result is written. Adding alternatives of this form often allows GCC to produce better code when only some of the read operands can be affected by the earlyclobber. See, for example, the ‘mulsi3’ insn of the ARM.
    Furthermore, if the earlyclobber operand is also a read/write operand, then that operand is written only after it’s used.
    ‘&’ does not obviate the need to write ‘=’ or ‘+’. As earlyclobber operands are always written, a read-only earlyclobber operand is ill-formed and will be rejected by the compiler.
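
    The need for ‘&’ can be seen in a two-instruction template where the output is written before the last input is read. This is a sketch: copy_then_add is a hypothetical name and the C fallback is an assumption for non-AArch64 builds.

```c
/* Computes a + b in two steps, writing out before b has been read. */
static int copy_then_add(int a, int b)
{
    int out;
#if defined(__aarch64__)
    /* mov writes out first, then add reads b. Without '&' the compiler
     * may give out and b the same register, and add would then see the
     * copied value of a instead of b. */
    __asm__ ("mov %w0, %w1\n\t"
             "add %w0, %w0, %w2"
             : "=&r" (out)
             : "r" (a), "r" (b));
#else
    out = a + b;   /* portable fallback */
#endif
    return out;
}
```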

    3.2.4 ‘%’

    Declares the instruction to be commutative for this operand and the following operand. This means that the compiler may interchange the two operands if that is the cheapest way to make all operands fit the constraints. ‘%’ applies to all alternatives and must appear as the first character in the constraint. Only read-only operands can use ‘%’.

    4 General purpose registers and AAPCS64 usage

    [Figure: general purpose registers and AAPCS64 usage]

  • Original article: https://blog.csdn.net/u014100559/article/details/126942875