Description
Suppose I have an AArch64 ELF file containing:
- a MOV/MOVK instruction sequence to load a 64-bit constant
- all four instructions marked with appropriate R_AARCH64_MOVW_UABS relocations to the same symbol
- the relocation section is SHT_REL type, not SHT_RELA, so the addends are taken from the immediate fields of the instructions in the code section
- and all of those immediate fields are nonzero.
For example:
mov x0, #0x123 ; R_AARCH64_MOVW_UABS_G0_NC(mySymbol)
movk x0, #0x123, lsl #16 ; R_AARCH64_MOVW_UABS_G1_NC(mySymbol)
movk x0, #0x123, lsl #32 ; R_AARCH64_MOVW_UABS_G2_NC(mySymbol)
movk x0, #0x123, lsl #48 ; R_AARCH64_MOVW_UABS_G3(mySymbol)
What should the output be? There are two plausible interpretations:
- In all four relocations, the addend 0x123 is consistently treated as unshifted, so that the linker calculates the same value mySymbol+0x123 four times, and each time extracts a different 16-bit chunk of it to put into x0.
- In each relocation, the addend is treated as shifted by a multiple of 16 bits, as specified by the relocation. So I get the low 16 bits of mySymbol+0x123; the next 16 bits of mySymbol+0x1230000; the next 16 bits of mySymbol+0x12300000000; and finally the high 16 bits of mySymbol+0x123000000000000.
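To make the difference concrete, here is a minimal Python sketch of the two interpretations. The symbol value is made up for illustration, the helper names are hypothetical, and the sketch ignores overflow checking and sign extension (0x123 is positive either way); it is not taken from any linker.

```python
MASK64 = (1 << 64) - 1

def movw_uabs_chunk(value, group):
    """Bits [16*group+15 : 16*group] of value, as written by MOVW_UABS_G<group>."""
    return (value >> (16 * group)) & 0xFFFF

def relocate_unshifted(symbol, addends):
    # Interpretation 1: each implicit addend is added to the symbol as-is.
    return [movw_uabs_chunk((symbol + a) & MASK64, g)
            for g, a in enumerate(addends)]

def relocate_shifted(symbol, addends):
    # Interpretation 2: the addend found in group g is shifted left by 16*g first.
    return [movw_uabs_chunk((symbol + (a << (16 * g))) & MASK64, g)
            for g, a in enumerate(addends)]

def assemble(chunks):
    # Value that ends up in x0 after the MOV/MOVK sequence runs.
    return sum(c << (16 * g) for g, c in enumerate(chunks))

my_symbol = 0x0000_1122_3344_5566   # assumed address, for illustration only
addends = [0x123] * 4               # the four immediate fields from the example

unshifted = assemble(relocate_unshifted(my_symbol, addends))
shifted = assemble(relocate_shifted(my_symbol, addends))
print(f"unshifted: {unshifted:#018x}")
print(f"shifted:   {shifted:#018x}")
```

With this assumed symbol value, the first interpretation loads mySymbol+0x123 exactly, while the second loads a value in which 0x123 has been added separately to each 16-bit chunk.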
For the analogous case in AAELF32, this is reasonably clear. §5.6.1.1 "Addends and PC-bias compensation" says (my emphasis):
For relocations processing MOVW and MOVT instructions (in both Arm and Thumb state), the initial addend is formed by interpreting the 16-bit literal field of the instruction as a 16-bit signed value in the range -32768 <= A < 32768. The interpretation is the same whether the relocated place contains a MOVW instruction or a MOVT instruction.
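As a small illustration of that rule, the initial addend could be derived from the 16-bit literal as below. This is only a sketch: the function name is hypothetical, and it assumes the 16-bit literal has already been reassembled from the instruction's split immediate fields.

```python
def movw_movt_initial_addend(imm16):
    """Interpret a 16-bit literal as a signed value in -32768 <= A < 32768.

    The same interpretation is used whether the literal came from a MOVW
    or a MOVT; no shift is applied in the MOVT case.
    """
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

print(movw_movt_initial_addend(0x0123))
print(movw_movt_initial_addend(0xFFFF))
```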
So in AAELF32, I'd expect that the analogous code would consistently compute mySymbol+0x123, and deliver the bottom and top 16 bits of that constant. This also seems sensible because that might plausibly be a thing I'd want.

The other interpretation, of shifting each constant, would be useful if it allowed you to consistently relocate the whole instruction sequence to get mySymbol plus an arbitrary full-width constant, but it doesn't allow that, because the relocations are processed independently, so a carry when adding to the low word can't be accounted for in the high word. So the way AAELF32 has defined it, it's possible to add a small constant to a symbol and load the result via MOVW+MOVT, without having to resort to SHT_RELA to specify a larger addend.
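The carry problem with the shifted interpretation can be shown with a short sketch. It models a two-instruction MOVW+MOVT style sequence where each relocation sees only its own 16-bit addend; the function name and the values are assumptions chosen to trigger a carry.

```python
def shifted_sequence(symbol, constant, groups=2):
    """Result of relocating each instruction independently, treating the
    addend in group g as shifted left by 16*g (the 'shifted' interpretation)."""
    width_mask = (1 << (16 * groups)) - 1
    result = 0
    for g in range(groups):
        addend = (constant >> (16 * g)) & 0xFFFF       # what this instruction holds
        value = (symbol + (addend << (16 * g))) & width_mask
        result |= ((value >> (16 * g)) & 0xFFFF) << (16 * g)
    return result

symbol = 0x0000FFFF      # assumed symbol value whose low half is all ones
constant = 0x00000001    # the full-width constant we would like to add

intended = (symbol + constant) & 0xFFFFFFFF
got = shifted_sequence(symbol, constant)
print(f"intended: {intended:#010x}")
print(f"got:      {got:#010x}")
```

The low half wraps to zero, but the high-half relocation only ever sees its own addend of 0, so the carry is lost and the loaded value differs from symbol+constant.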
But AAELF64 is not so clear. It says this about addends, in §5.7.2 "Addends and PC-bias" (again my emphasis):
If the relocation relocates an instruction the immediate field of the instruction is extracted, scaled as required by the instruction field encoding, and sign-extended to 64 bits.
It's clear that "scaled as required" should apply to things like branch offsets being interpreted as a multiple of 4 bytes. But it's not clear whether it also means you should scale up implicit addends in MOVK by 16, 32 or 48 bits, or in ADRP by 12 bits.
I think the most useful answer would be no: those addends should be taken to be applied unshifted to the symbol value, for the same reason as in AAELF32. But either way, I think the current wording is unclear.
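To illustrate where the wording is and isn't ambiguous, the sketch below extracts implicit addends from two instruction encodings. The branch case follows the uncontroversial reading (imm26 scaled by 4 and sign-extended). For MOVK it simply prints both candidate readings; which one the ABI intends is exactly the open question, and the function names here are hypothetical.

```python
def sign_extend(value, bits):
    """Sign-extend a 'bits'-wide field to a Python int."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)

def branch_implicit_addend(insn):
    # B/BL: imm26 in bits [25:0], a word offset, so scaled by 4 bytes.
    return sign_extend(insn & 0x03FFFFFF, 26) * 4

def movk_implicit_addend_candidates(insn):
    # MOVK: imm16 in bits [20:5], hw (shift / 16) in bits [22:21].
    imm16 = (insn >> 5) & 0xFFFF
    hw = (insn >> 21) & 0x3
    unshifted = sign_extend(imm16, 16)   # reading 1: addend applied as-is
    shifted = unshifted << (16 * hw)     # reading 2: addend scaled by the shift
    return unshifted, shifted

b_back_one_insn = 0x17FFFFFF             # b .-4
movk_g1 = 0xF2A02460                     # movk x0, #0x123, lsl #16

print(branch_implicit_addend(b_back_one_insn))
print([hex(a) for a in movk_implicit_addend_candidates(movk_g1)])
```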