Chapter 5. Machine Code Insertions

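Package Machine_Code provides machine code support as described in the Ada 95 Reference Manual in two separate forms: code statements, formed from calls to the function Asm, and calls to the intrinsic procedure Asm, which can be used wherever a procedure call is valid.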

The two features are similar, and both are closely related to the mechanism provided by the asm instruction in the GNU C compiler. Full understanding and use of the facilities in this package requires understanding the asm instruction as described in Using and Porting GNU CC by Richard Stallman. Calls to the function Asm and the procedure Asm have identical semantic restrictions and effects, as described below. Both are provided so that the procedure call can be used as a statement, and the function call can be used to form a code_statement.

The first example given in the GNU CC documentation is the C asm instruction:


asm ("fsinx %1,%0" : "=f" (result) : "f" (angle));

The equivalent can be written in Ada as:


Asm ("fsinx %1,%0",
     My_Float'Asm_Output ("=f", result),
     My_Float'Asm_Input  ("f",  angle));

The first argument to Asm is the assembler template, and is identical to what is used in GNU CC. This string must be a static expression. The second argument is the output operand list. It is either a single Asm_Output attribute reference, or a list of such references enclosed in parentheses (technically an array aggregate of such references).

The Asm_Output attribute denotes a function that takes two parameters. The first is a string, the second is the name of a variable of the type designated by the attribute prefix. The first (string) argument is required to be a static expression and designates the constraint for the parameter (e.g. what kind of register is required). The second argument is the variable to be updated with the result. The possible values for the constraint are the same as those used in the RTL, and are dependent on the configuration file used to build the GCC back end. If there are no output operands, then this argument may either be omitted, or explicitly given as No_Output_Operands.

The second argument of My_Float'Asm_Output behaves as though it were an out parameter. This is a little curious, since functions would not normally be permitted out parameters, but all names have the form of expressions, so there is no syntactic irregularity.

The third argument is the list of input operands. It is either a single Asm_Input attribute reference, or a list of such references enclosed in parentheses (technically an array aggregate of such references).

The Asm_Input attribute denotes a function that takes two parameters. The first is a string, the second is an expression of the type designated by the prefix. The first (string) argument is required to be a static expression, and is the constraint for the parameter (e.g. what kind of register is required). The second argument is the value to be used as the input argument. The possible values for the constraint are the same as those used in the RTL, and are dependent on the configuration file used to build the GCC back end.

If there are no input operands, this argument may either be omitted, or explicitly given as No_Input_Operands.

The fourth argument, not present in the above example, is a list of register names, called the clobber argument. This argument, if given, must be a static string expression, and is a space- or comma-separated list of names of registers that must be considered destroyed as a result of the Asm call. If this argument is the null string (the default value), then the code generator assumes that no additional registers are destroyed.

The fifth argument, not present in the above example, called the volatile argument, is by default False. It can be set to the literal value True to indicate to the code generator that all optimizations with respect to the instruction specified should be suppressed, and that in particular, for an instruction that has outputs, the instruction will still be generated, even if none of the outputs are used. See the full description in the GCC manual for further details.
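
As a minimal sketch of a call that supplies all five arguments, the following assumes an x86 or x86-64 target with AT&T assembler syntax and GNAT's System.Machine_Code package; the procedure and variable names are hypothetical. Because the template names the eax register directly, that register is listed in the clobber argument, and Volatile is set so that the instruction sequence is kept even if Result is never read.

```ada
with Ada.Characters.Latin_1; use Ada.Characters.Latin_1;
with Interfaces;             use Interfaces;
with System.Machine_Code;    use System.Machine_Code;

--  Hypothetical example; assumes an x86 or x86-64 target.
procedure Increment_Demo is
   Input  : constant Unsigned_32 := 41;
   Result : Unsigned_32;
begin
   --  The template is a static string built by concatenation.
   --  %0 is the output operand, %1 the input operand; %%eax is a
   --  literal register name, so it is declared as clobbered.
   Asm (Template => "movl %1, %%eax" & LF & HT &
                    "incl %%eax"     & LF & HT &
                    "movl %%eax, %0",
        Outputs  => Unsigned_32'Asm_Output ("=r", Result),
        Inputs   => Unsigned_32'Asm_Input  ("r", Input),
        Clobber  => "eax",
        Volatile => True);
end Increment_Demo;
```

With named notation, as here, the Clobber and Volatile arguments could be given in any order.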

The Asm subprograms may be used in two ways. First, the procedure forms can be used anywhere a procedure call would be valid, and correspond to what the RM calls "intrinsic" routines. Such calls can be used to intersperse machine instructions with other Ada statements. Second, the function forms, which return a dummy value of the limited private type Asm_Insn, can be used in code statements, and indeed this is the only context where such calls are allowed. Code statements appear as aggregates of the form:


Asm_Insn'(Asm (...));
Asm_Insn'(Asm_Volatile (...));

In accordance with RM rules, such code statements are allowed only within subprograms whose entire body consists of such statements. It is not permissible to intermix such statements with other Ada statements.
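
As a hedged illustration of the code statement form, the sketch below assumes GNAT's System.Machine_Code package and a target whose assembler accepts a "nop" mnemonic (x86 does); the procedure name is hypothetical. The body consists solely of code statements, as the rule above requires.

```ada
with System.Machine_Code; use System.Machine_Code;

--  Hypothetical example; assumes the target assembler accepts "nop".
procedure Two_Nops is
begin
   --  Each code statement is a qualified expression of type Asm_Insn
   --  whose operand is a call to the function form of Asm.
   Asm_Insn'(Asm ("nop"));
   Asm_Insn'(Asm ("nop", Volatile => True));
end Two_Nops;
```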

Typically the form using intrinsic procedure calls is more convenient and more flexible. The code statement form is provided to meet the RM suggestion that such a facility should be made available. The following is the exact syntax of the call to Asm (if named notation is used, the arguments may of course be given in arbitrary order, following the normal rules for use of positional and named arguments):


ASM_CALL ::= Asm (
                [Template =>] static_string_EXPRESSION
              [,[Outputs  =>] OUTPUT_OPERAND_LIST      ]
              [,[Inputs   =>] INPUT_OPERAND_LIST       ]
              [,[Clobber  =>] static_string_EXPRESSION ]
              [,[Volatile =>] static_boolean_EXPRESSION] )

OUTPUT_OPERAND_LIST ::=
    No_Output_Operands
  | OUTPUT_OPERAND_ATTRIBUTE
  | (OUTPUT_OPERAND_ATTRIBUTE {,OUTPUT_OPERAND_ATTRIBUTE})

OUTPUT_OPERAND_ATTRIBUTE ::=
    SUBTYPE_MARK'Asm_Output (static_string_EXPRESSION, NAME)

INPUT_OPERAND_LIST ::=
    No_Input_Operands
  | INPUT_OPERAND_ATTRIBUTE
  | (INPUT_OPERAND_ATTRIBUTE {,INPUT_OPERAND_ATTRIBUTE})

INPUT_OPERAND_ATTRIBUTE ::=
    SUBTYPE_MARK'Asm_Input (static_string_EXPRESSION, EXPRESSION)
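
To illustrate the parenthesized list form of INPUT_OPERAND_LIST, here is a small sketch that passes two input operands and one output operand. It assumes an x86-64 target with AT&T syntax and GNAT's System.Machine_Code package; the procedure and variable names are hypothetical.

```ada
with Interfaces;          use Interfaces;
with System.Machine_Code; use System.Machine_Code;

--  Hypothetical example; assumes an x86-64 target.
procedure Sum_Demo is
   A   : constant Unsigned_64 := 40;
   B   : constant Unsigned_64 := 2;
   Sum : Unsigned_64;
begin
   --  Operands are numbered outputs first: %0 is Sum, %1 is A, %2 is B.
   --  A single leaq reads both inputs and writes A + B into Sum.
   Asm ("leaq (%1,%2), %0",
        Outputs => Unsigned_64'Asm_Output ("=r", Sum),
        Inputs  => (Unsigned_64'Asm_Input ("r", A),
                    Unsigned_64'Asm_Input ("r", B)));
end Sum_Demo;
```

The template is given positionally and the operand lists by name, which the normal Ada rules for mixing positional and named arguments permit.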

5.1. Constraints for Operands

Here are specific details on what constraint letters you can use with Asm statement operands. Constraints can say whether an operand may be in a register, and which kinds of register; whether the operand can be a memory reference, and which kinds of address; whether the operand may be an immediate constant, and which possible values it may have. Constraints can also require two operands to match.

5.1.1. Simple Constraints

The simplest kind of constraint is a string full of letters, each of which describes one kind of operand that is permitted. Here are the letters that are allowed:

"m"

A memory operand is allowed, with any kind of address that the target computer supports in general.

"o"

A memory operand is allowed, but only if the address is offsettable. This means that adding a small integer (actually, the width in bytes of the operand, as determined by its machine mode) to the address yields another valid memory address.

For example, an address which is constant is offsettable; so is an address that is the sum of a register and a constant (as long as a slightly larger constant is also within the range of address-offsets supported by the machine); but an auto-increment or auto-decrement address is not offsettable. More complicated indirect/indexed addresses may or may not be offsettable depending on the other addressing modes that the machine supports.

Note that in an output operand which can be matched by another operand, the constraint letter "o" is valid only when accompanied by both "<" (if the target machine has pre-decrement addressing) and ">" (if the target machine has pre-increment addressing).

"V"

A memory operand that is not offsettable. In other words, anything that would fit the "m" constraint but not the "o" constraint.

"<"

A memory operand with auto-decrement addressing (either pre-decrement or post-decrement) is allowed.

">"

A memory operand with auto-increment addressing (either pre-increment or post-increment) is allowed.

"r"

A register operand is allowed provided that it is in a general register.

"d", "a", "f", ...

Other letters can be defined in machine-dependent fashion to stand for particular classes of registers. "d", "a" and "f" are defined on the 68000/68020 to stand for data, address and floating point registers.

"i"

An immediate integer operand (one with constant value) is allowed. This includes symbolic constants whose values will be known only at assembly time.

"n"

An immediate integer operand with a known numeric value is allowed. Many systems cannot support assembly-time constants for operands less than a word wide. Constraints for these operands should use "n" rather than "i".

"I", "J", "K", ... "P"

Other letters in the range "I" through "P" may be defined in a machine-dependent fashion to permit immediate integer operands with explicit integer values in specified ranges. For example, on the 68000, "I" is defined to stand for the range of values 1 to 8. This is the range permitted as a shift count in the shift instructions.

"E"

An immediate floating operand (expression code const_double) is allowed, but only if the target floating point format is the same as that of the host machine (on which the compiler is running).

"F"

An immediate floating operand (expression code const_double) is allowed.

"G", "H"

"G" and "H" may be defined in a machine-dependent fashion to permit immediate floating operands in particular ranges of values.

"s"

An immediate integer operand whose value is not an explicit integer is allowed.

This might appear strange; if an insn allows a constant operand with a value not known at compile time, it certainly must allow any known value. So why use "s" instead of "i"? Sometimes it allows better code to be generated.

For example, on the 68000 in a fullword instruction it is possible to use an immediate operand; but if the immediate value is between -128 and 127, better code results from loading the value into a register and using the register. This is because the load into the register can be done with a "moveq" instruction. We arrange for this to happen by defining the letter "K" to mean "any integer outside the range -128 to 127", and then specifying "Ks" in the operand constraints.

"g"

Any register, memory or immediate integer operand is allowed, except for registers that are not general registers.

"X"

Any operand whatsoever is allowed.

"0", "1", "2", ... "9"

An operand that matches the specified operand number is allowed. If a digit is used together with letters within the same alternative, the digit should come last.

This is called a matching constraint and what it really means is that the assembler has only a single operand that fills two roles which asm distinguishes. For example, an add instruction uses two input operands and an output operand, but on most CISC machines an add instruction really has only two operands, one of them an input-output operand:

addl #35,r12

Matching constraints are used in these circumstances. More precisely, the two operands that match must include one input-only operand and one output-only operand. Moreover, the digit must be a smaller number than the number of the operand that uses it in the constraint.

"p"

An operand that is a valid memory address is allowed. This is for "load address" and "push address" instructions.

"p" in the constraint must be accompanied by address_operand as the predicate in the match_operand. This predicate interprets the mode specified in the match_operand as the mode of the memory reference for which the address would be valid.

"Q", "R", "S", ... "U"

Letters in the range "Q" through "U" may be defined in a machine-dependent fashion to stand for arbitrary operand types.
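
As an illustration of the matching constraints ("0" through "9") described in this section, the following sketch expresses a two-operand add in which one operand is both read and written. It assumes an x86-64 target with AT&T syntax and GNAT's System.Machine_Code package; the procedure and variable names are hypothetical.

```ada
with Interfaces;          use Interfaces;
with System.Machine_Code; use System.Machine_Code;

--  Hypothetical example; assumes an x86-64 target.
procedure Accumulate_Demo is
   Total : Unsigned_64 := 40;
   Step  : constant Unsigned_64 := 2;
begin
   --  Operand 0 is the output Total. The first input uses the matching
   --  constraint "0", so it is placed in the same register as operand 0.
   --  The add instruction then updates that register in place.
   Asm ("addq %2, %0",
        Outputs => Unsigned_64'Asm_Output ("=r", Total),
        Inputs  => (Unsigned_64'Asm_Input ("0", Total),
                    Unsigned_64'Asm_Input ("r", Step)),
        Clobber => "cc");
end Accumulate_Demo;
```

The digit 0 is smaller than the number of the operand that uses it (operand 1), and the matched pair consists of one output-only and one input-only operand, as the matching rules require.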

5.1.2. Multiple Alternative Constraints

Sometimes a single instruction has multiple alternative sets of possible operands. For example, on the 68000, a logical-or instruction can combine a register or an immediate value into memory, or it can combine any kind of operand into a register; but it cannot combine one memory location into another.

These constraints are represented as multiple alternatives. An alternative can be described by a series of letters for each operand. The overall constraint for an operand is made from the letters for this operand from the first alternative, a comma, the letters for this operand from the second alternative, a comma, and so on until the last alternative.

If all the operands fit any one alternative, the instruction is valid. Otherwise, for each alternative, the compiler counts how many instructions must be added to copy the operands so that that alternative applies. The alternative requiring the least copying is chosen. If two alternatives need the same amount of copying, the one that comes first is chosen. These choices can be altered with the "?" and "!" characters:

"?"

Disparage slightly the alternative that the "?" appears in, as a choice when no alternative applies exactly. The compiler regards this alternative as one unit more costly for each "?" that appears in it.

"!"

Disparage severely the alternative that the "!" appears in. This alternative can still be used if it fits without reloading, but if reloading is needed, some other alternative will be used.

5.1.3. Constraint Modifier Characters

Here are the constraint modifier characters.

"="

Means that this operand is write-only for this instruction: the previous value is discarded and replaced by output data.

"+"

Means that this operand is both read and written by the instruction.

When the compiler fixes up the operands to satisfy the constraints, it needs to know which operands are inputs to the instruction and which are outputs from it. "=" identifies an output; "+" identifies an operand that is both input and output; all other operands are assumed to be input only.

"&"

Means (in a particular alternative) that this operand is an earlyclobber operand, which is modified before the instruction is finished using the input operands. Therefore, this operand may not lie in a register that is used as an input operand or as part of any memory address.

"&" applies only to the alternative in which it is written. In constraints with multiple alternatives, sometimes one alternative requires "&" while others do not. See, for example, the "movdf" insn of the 68000.

An input operand can be tied to an earlyclobber operand if its only use as an input occurs before the early result is written. Adding alternatives of this form often allows GCC to produce better code when only some of the inputs can be affected by the earlyclobber. See, for example, the "mulsi3" insn of the ARM.

"&" does not obviate the need to write "=".

"%"

Declares the instruction to be commutative for this operand and the following operand. This means that the compiler may interchange the two operands if that is the cheapest way to make all operands fit the constraints.

"#"

Says that all following characters, up to the next comma, are to be ignored as a constraint. They are significant only for choosing register preferences.
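
To make the "&" (earlyclobber) modifier concrete, the sketch below uses a two-instruction template in which the output is written before the last input has been read. It assumes an x86-64 target with AT&T syntax and GNAT's System.Machine_Code package; the procedure and variable names are hypothetical.

```ada
with Ada.Characters.Latin_1; use Ada.Characters.Latin_1;
with Interfaces;             use Interfaces;
with System.Machine_Code;    use System.Machine_Code;

--  Hypothetical example; assumes an x86-64 target.
procedure Early_Clobber_Demo is
   A   : constant Unsigned_64 := 40;
   B   : constant Unsigned_64 := 2;
   Sum : Unsigned_64;
begin
   --  The first instruction writes %0 while %2 is still needed by the
   --  second instruction. "=&r" tells the compiler not to place Sum in
   --  a register that also holds an input operand.
   Asm ("movq %1, %0" & LF & HT &
        "addq %2, %0",
        Outputs => Unsigned_64'Asm_Output ("=&r", Sum),
        Inputs  => (Unsigned_64'Asm_Input ("r", A),
                    Unsigned_64'Asm_Input ("r", B)),
        Clobber => "cc");
end Early_Clobber_Demo;
```

Without the "&", the compiler would be free to assign Sum and B to the same register, in which case the movq would overwrite B before the addq reads it.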