@smithp35
Contributor

@smithp35 smithp35 commented Jul 2, 2024

The GDAT(S + A) relocation operation requires a static linker to create a GOT entry for (S + A), which means at least one GOT entry for each unique tuple (S, A). Unfortunately, no known static linker has implemented this correctly; one of two forms is implemented instead:

  • GDAT(S) with the addend ignored.
  • GDAT(S) + A with a single GOT entry per S, and A added to the value of GDAT(S).

These implementations are correct and consistent only for an addend (A) of zero.
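
To make the difference between the three behaviours concrete, here is a minimal sketch of a toy GOT model. Everything in it is hypothetical (the `Got` struct, the `resolve*` function names, the 8-byte slot size, and the example addresses); it is not taken from any linker's source and only illustrates which slot address each interpretation resolves to.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Addr = std::uint64_t;

// Toy GOT: 8-byte slots starting at `base`, keyed by (symbol, addend).
struct Got {
  Addr base;
  std::vector<Addr> contents;  // value stored in each slot
  std::map<std::pair<std::string, std::int64_t>, std::size_t> index;

  // Returns the address of the slot for `key`, creating it on first use.
  Addr slotAddress(const std::string &sym, std::int64_t addendKey, Addr value) {
    auto key = std::make_pair(sym, addendKey);
    auto it = index.find(key);
    if (it == index.end()) {
      it = index.emplace(key, contents.size()).first;
      contents.push_back(value);
    }
    return base + 8 * it->second;
  }
};

// GDAT(S + A) as originally specified: one slot per (S, A), holding S + A.
Addr resolveSpec(Got &got, const std::string &sym, Addr s, std::int64_t a) {
  return got.slotAddress(sym, a, s + static_cast<Addr>(a));
}

// First observed form: GDAT(S), addend ignored.
Addr resolveIgnoreAddend(Got &got, const std::string &sym, Addr s,
                         std::int64_t /*a*/) {
  return got.slotAddress(sym, 0, s);
}

// Second observed form: GDAT(S) + A, one slot per S, A added afterwards.
Addr resolveAddAfter(Got &got, const std::string &sym, Addr s,
                     std::int64_t a) {
  return got.slotAddress(sym, 0, s) + static_cast<Addr>(a);
}
```

In this model all three functions agree when the addend is 0. With a non-zero addend, `resolveSpec` allocates a second slot, `resolveIgnoreAddend` returns the slot for S unchanged, and `resolveAddAfter` returns an address that is A bytes past that slot.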

No known compiler uses non-zero addends in relocations that use the GDAT(S+A) operation, although it is possible to generate them using assembly language.

This change synchronizes the ABI with the behavior of existing static linker implementations. The benefit of permitting code generators [*] to use a non-zero addend in GDAT(S + A) is judged to be lower than the cost of implementing GDAT(S + A) correctly in existing static linkers, many of which assume that there is a single GOT entry per unique symbol S.

It is a quality-of-implementation (QoI) matter whether a static linker gives an error if a non-zero addend is used for a relocation that uses the GDAT(S) operation.

Fixes #217. Also resolves #247.

[*] The most common use case for a non-zero addend is in constructing a C++ object with a vtable. The first two entries in the vtable are the offset to top and a pointer to RTTI, so the vtable pointer stored in the object points to offset 0x10 within the vtable. This offset can be encoded in the relocation addend. We would save an add instruction for each construction of a C++ object with a vtable if addends were permitted.
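
As a rough illustration of where the 0x10 comes from, the sketch below models the two leading vtable entries with fixed-width 64-bit fields. The struct and function names are hypothetical and the layout is an assumption for a 64-bit target; it only shows the extra add that is needed when the GOT entry holds the vtable symbol address rather than the address point.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout of the first two vtable entries on a 64-bit target.
struct VTablePrefix {
  std::int64_t offsetToTop;   // first entry
  std::uint64_t rttiPointer;  // second entry
};

// The vtable pointer stored in an object points just past this prefix.
constexpr std::uint64_t kAddressPointOffset = sizeof(VTablePrefix);
static_assert(kAddressPointOffset == 0x10, "two 8-byte entries");

// With GDAT(S): the GOT slot holds S, so the constructor needs one add.
constexpr std::uint64_t vptrFromGotWithoutAddend(std::uint64_t gotSlotValue) {
  return gotSlotValue + kAddressPointOffset;
}

// With a hypothetical GDAT(S + 0x10): the slot would already hold the
// address point, so no add would be needed.
constexpr std::uint64_t vptrFromGotWithAddend(std::uint64_t gotSlotValue) {
  return gotSlotValue;
}
```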

Contributor

@MaskRay MaskRay left a comment

LGTM

LLVM before https://reviews.llvm.org/D158577 could produce a non-zero addend for hand-written assembly, but to the best of my knowledge its code generator does not produce such assembly. D158577 was a corner case that was incompatible with static linkers.

@MaskRay
Contributor

MaskRay commented Aug 9, 2024

Looks like this issue is still pending :)

@smithp35
Contributor Author

smithp35 commented Aug 9, 2024

> Looks like this issue is still pending :)

Yes, my apologies, I am a bit behind at the moment. Will hopefully get this merged next week.

@smithp35 smithp35 merged commit 201a7cb into ARM-software:main Aug 20, 2024
smithp35 added a commit to smithp35/abi-aa that referenced this pull request Feb 25, 2025
Bring TLS GOT generating relocations in line with non GOT generating
relocations in ARM-software#272.

The ABI rule is that static linkers should generate a GOT entry for
each unique tuple of (S,A). However, static linkers such as GNU ld
and lld only generate a unique entry per unique S, and handle A
inconsistently: GNU ld ignores A and lld adds it afterwards.
The only consistent behaviour between implementations is when
A is 0.
@smeenai
Contributor

smeenai commented Aug 7, 2025

Is it okay for an implementation to accept an addend as an extension? LLD still does so, as far as I can tell, with the G(GDAT(S)) + A interpretation.

I'm asking because I've been looking into reducing the number of dynamic relocations in our Android app libraries, to improve load times during startup. Relative vtables have reduced those greatly, but we still have many RTTI-related ones left, and I've been idly contemplating a relative RTTI implementation to tackle those. The first entry in each RTTI struct is a pointer to a vtable address point, which currently looks something like:

.xword _ZTVN10__cxxabiv117__class_type_infoE + 8 // 16 if not using relative vtables

The addend is to convert the vtable symbol address to its address point. LLD's current addend implementation for GOTPCREL relocations would work perfectly for this, so I'm hoping that's not forbidden completely.

@smithp35
Contributor Author

smithp35 commented Aug 7, 2025

Yes, LLD does still accept an addend. As I understand it, for GOTPCREL32 lld uses R_GOT_PC:

  case R_GOT_PC:
  case RE_AARCH64_AUTH_GOT_PC:
  case R_RELAX_TLS_GD_TO_IE:
    return r.sym->getGotVA(ctx) + a - p;

For an addend of 8 this would give an offset to an address 8 bytes above the GOT slot containing the address of S. As I understand it, that is effectively the next GOT slot, and it wouldn't contain a predictable value (unless it is a double GOT slot generating relocation like those created by GTLSIDX (general dynamic TLS)).
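
The arithmetic can be checked with a small standalone sketch. The function names and the example addresses are hypothetical; `gotPcRel` simply mirrors the quoted expression `getGotVA(ctx) + a - p` using plain integers and is not lld's API.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the quoted expression: address of the GOT slot for S, plus the
// addend, minus the place P being relocated.
constexpr std::int64_t gotPcRel(std::uint64_t gotVA, std::int64_t a,
                                std::uint64_t p) {
  return static_cast<std::int64_t>(gotVA + static_cast<std::uint64_t>(a) - p);
}

// Address reached by adding the stored offset back onto P.
constexpr std::uint64_t resolvedAddress(std::uint64_t gotVA, std::int64_t a,
                                        std::uint64_t p) {
  return p + static_cast<std::uint64_t>(gotPcRel(gotVA, a, p));
}
```

With an assumed GOT slot at 0x20000 and P at 0x10000, an addend of 0 resolves to the slot itself, while an addend of 8 resolves to 0x20008, which is 8 bytes past the slot for S.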

Is that what you need for your use case? I would have expected that you wanted G(GDAT(S+A)) - P, where you get an offset to the GOT entry containing the address of S+A.

At the moment, the relocation behaviour with a non-zero addend is in the area of implementation-defined behaviour; effectively an unofficial extension. I couldn't find a useful use case for the current LLD behaviour, although you may have found one. If we were to make a change I'd err on the side of saying something like "The static linker behaviour when a non-zero addend is used is implementation defined."

@MaskRay
Contributor

MaskRay commented Aug 7, 2025

% cat a.cc
struct A {
  virtual void f();
  virtual void g();
};

void A::f() {}
void A::g() {}

A *newA() { return new A; }
% clang++ -fexperimental-relative-c++-abi-vtables -S a.cc -o - --target=aarch64 -fpic
_ZTV1A.local:
        .word   0                               // 0x0
        .word   _ZTI1A@GOTPCREL-4  /////// @gotpcrel is temporary syntax, will change in LLVM
        .word   _ZN1A1fEv@PLT-_ZTV1A.local-8      /// @plt is temporary syntax
        .word   _ZN1A1gEv@PLT-_ZTV1A.local-8
        .size   _ZTV1A.local, 16

.word _ZTI1A@GOTPCREL-4 is relative to the vtable start instead of the current entry within the vtable, hence the -4 addend.

This interpretation can be seen as either G(GDAT(S))+A - P or G(GDAT(S)) - (P-A).
While the first is not a meaningful expression, the second accurately describes a GOT entry's location relative to a specific point near the current location.

Updating R_AARCH64_GOTPCREL32 to G(GDAT(S))-P+A looks good to me.
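
A small numeric sketch of the second reading, G(GDAT(S)) - (P - A), using the layout from the listing above. The addresses and function names are made up for illustration; the only facts taken from the listing are that the RTTI entry is the second 4-byte word of the vtable and that its addend is -4.

```cpp
#include <cassert>
#include <cstdint>

// The point the offset is measured from: P - A.
constexpr std::uint64_t anchor(std::uint64_t p, std::int64_t a) {
  return p - static_cast<std::uint64_t>(a);
}

// G(GDAT(S)) - P + A, truncated to the 32-bit field that is written.
constexpr std::int32_t gotPcRel32(std::uint64_t g, std::uint64_t p,
                                  std::int64_t a) {
  return static_cast<std::int32_t>(
      static_cast<std::int64_t>(g - p + static_cast<std::uint64_t>(a)));
}
```

Assuming `_ZTV1A.local` is at 0x30000, the RTTI word is at P = 0x30004, so with A = -4 the anchor is 0x30008. That is the same `_ZTV1A.local + 8` point that the `@PLT` entries in the listing subtract, and adding the stored value to it gives back the address of the GOT slot.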

@smithp35
Contributor Author

smithp35 commented Aug 7, 2025

Thanks for the example. Given that GNU ld doesn't currently implement R_AARCH64_GOTPCREL32, I think that there is scope to define R_AARCH64_GOTPCREL32 as G(GDAT(S))-P+A, especially if there is existing code that uses a non-zero addend.

I'll see what I can do. Hopefully will have a PR up soon.

@smeenai
Contributor

smeenai commented Aug 7, 2025

Thank you both of you! @smithp35 you're correct that the addend wouldn't do what I wanted in the case I'd brought up, plus I realized that making vtable pointers relative would be pretty hard to do efficiently anyway. @MaskRay's case is what relative vtables are relying on though, and we'd definitely want to keep that working.

smithp35 added a commit to smithp35/abi-aa that referenced this pull request Aug 7, 2025
Define the expression for R_AARCH64_GOTPCREL32 as GDAT(S)-P+A.
This matches the only implementation in clang and lld.

The relocation is used to calculate the offset from the start of
the vtable to a GOT entry that contains the address of the RTTI
object. As the table entry for the RTTI pointer is at an offset
from the start of the vtable the relocation addend contains
-offset to cancel out.

Previously, in ARM-software#272, the definitions of relocations
using GDAT(S+A) were changed to require A to be 0, as lld and
GNU ld were implementing GDAT(S+A) as GDAT(S) + A and
GDAT(S) + 0 respectively.

As this specific relocation is only implemented in clang and lld
it is safe to update the description to match the implementation
without affecting portability.

We use GDAT(S)-P+A rather than GDAT(S) + A - P as the latter
implies that we are calculating an offset to a different GOT
slot to GDAT(S) rather than an offset from P.

Discussion and example: ARM-software#272
@smithp35
Contributor Author

smithp35 commented Aug 7, 2025

#342 created to update description for R_AARCH64_GOTPCREL32

Development

Successfully merging this pull request may close these issues.

static linkers (lld and GNU ld) out of sync with aaelf64 for GOT relocations with addends.