-
Notifications
You must be signed in to change notification settings - Fork 105
[linux-6.6.y] LoongArch: Fix objtool issues and add jump table feature #609
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[linux-6.6.y] LoongArch: Fix objtool issues and add jump table feature #609
Conversation
commit c5b1184 upstream. Currently, there is an assembler message when generating kernel/bpf/core.o under CONFIG_OBJTOOL with LoongArch compiler toolchain: Warning: setting incorrect section attributes for .rodata..c_jump_table This is because the section ".rodata..c_jump_table" should be readonly, but there is a "W" (writable) part of the flags: $ readelf -S kernel/bpf/core.o | grep -A 1 "rodata..c" [34] .rodata..c_j[...] PROGBITS 0000000000000000 0000d2e0 0000000000000800 0000000000000000 WA 0 0 8 There is no above issue on x86 due to the generated section flag is only "A" (allocatable). In order to silence the warning on LoongArch, specify the attribute like ".rodata..c_jump_table,\"a\",@progbits #" explicitly, then the section attribute of ".rodata..c_jump_table" must be readonly in the kernel/bpf/core.o file. Before: $ objdump -h kernel/bpf/core.o | grep -A 1 "rodata..c" 21 .rodata..c_jump_table 00000800 0000000000000000 0000000000000000 0000d2e0 2**3 CONTENTS, ALLOC, LOAD, RELOC, DATA After: $ objdump -h kernel/bpf/core.o | grep -A 1 "rodata..c" 21 .rodata..c_jump_table 00000800 0000000000000000 0000000000000000 0000d2e0 2**3 CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA By the way, AFAICT, maybe the root cause is related with the different compiler behavior of various archs, so to some extent this change is a workaround for LoongArch, and also there is no effect for x86 which is the only port supported by objtool before LoongArch with this patch. Link: https://lkml.kernel.org/r/20240924062710.1243-1-yangtiezhu@loongson.cn Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> [6.9+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
commit da5b2ad upstream. After commit a0f7085 ("LoongArch: Add RANDOMIZE_KSTACK_OFFSET support"), there are three new instructions "addi.d $fp, $sp, 32", "sub.d $sp, $sp, $t0" and "addi.d $sp, $fp, -32" for the secondary stack in do_syscall(), then there is a objtool warning "return with modified stack frame" and no handle_syscall() which is the previous frame of do_syscall() in the call trace when executing the command "echo l > /proc/sysrq-trigger". objdump shows something like this: 0000000000000000 <do_syscall>: 0: 02ff8063 addi.d $sp, $sp, -32 4: 29c04076 st.d $fp, $sp, 16 8: 29c02077 st.d $s0, $sp, 8 c: 29c06061 st.d $ra, $sp, 24 10: 02c08076 addi.d $fp, $sp, 32 ... 74: 0011b063 sub.d $sp, $sp, $t0 ... a8: 4c000181 jirl $ra, $t0, 0 ... dc: 02ff82c3 addi.d $sp, $fp, -32 e0: 28c06061 ld.d $ra, $sp, 24 e4: 28c04076 ld.d $fp, $sp, 16 e8: 28c02077 ld.d $s0, $sp, 8 ec: 02c08063 addi.d $sp, $sp, 32 f0: 4c000020 jirl $zero, $ra, 0 The instruction "sub.d $sp, $sp, $t0" changes the stack bottom and the new stack size is a random value, in order to find the return address of do_syscall() which is stored in the original stack frame after executing "jirl $ra, $t0, 0", it should use fp which points to the original stack top. At the beginning, the thought is tended to decode the secondary stack instruction "sub.d $sp, $sp, $t0" and set it as a label, then check this label for the two frame pointer instructions to change the cfa base and cfa offset during the period of secondary stack in update_cfi_state(). This is valid for GCC but invalid for Clang due to there are different secondary stack instructions for ClangBuiltLinux on LoongArch, something like this: 0000000000000000 <do_syscall>: ... 88: 00119064 sub.d $a0, $sp, $a0 8c: 00150083 or $sp, $a0, $zero ... Actually, it equals to a single instruction "sub.d $sp, $sp, $a0", but there is no proper condition to check it as a label like GCC, and so the beginning thought is not a good way. Essentially, there are two special frame pointer instructions which are "addi.d $fp, $sp, imm" and "addi.d $sp, $fp, imm", the first one points fp to the original stack top and the second one restores the original stack bottom from fp. Based on the above analysis, in order to avoid adding an arch-specific update_cfi_state(), we just add a member "frame_pointer" in the "struct symbol" as a label to avoid affecting the current normal case, then set it as true only if there is "addi.d $sp, $fp, imm". The last is to check this label for the two frame pointer instructions to change the cfa base and cfa offset in update_cfi_state(). Tested with the following two configs: (1) CONFIG_RANDOMIZE_KSTACK_OFFSET=y && CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT=n (2) CONFIG_RANDOMIZE_KSTACK_OFFSET=y && CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT=y By the way, there is no effect for x86 with this patch, tested on the x86 machine with Fedora 40 system. Cc: stable@vger.kernel.org # 6.9+ Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
commit 8037632 upstream. There exist some warnings when building kernel if CONFIG_CPU_HAS_LBT is set but CONFIG_CPU_HAS_LSX and CONFIG_CPU_HAS_LASX are not set. In this case, there are no definitions of _restore_lsx & _restore_lasx in fpu.S, just add some ifdefs to fix these warnings. AS arch/loongarch/kernel/fpu.o arch/loongarch/kernel/fpu.o: warning: objtool: unexpected relocation symbol type in .rela.discard.func_stack_frame_non_standard: 0 arch/loongarch/kernel/fpu.o: warning: objtool: unexpected relocation symbol type in .rela.discard.func_stack_frame_non_standard: 0 Cc: stable@vger.kernel.org # 6.9+ Fixes: cb8a2ef ("LoongArch: Add ORC stack unwinder support") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202408120955.qls5oNQY-lkp@intel.com/ Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
commit b8468bd upstream. For now, it can enable objtool for Clang, just remove !CC_IS_CLANG for HAVE_OBJTOOL in arch/loongarch/Kconfig. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
commit a7e0837 upstream. When building kernel with "make CC=clang defconfig", LLVM Assembler is used due to LLVM_IAS=0 is not specified, then AS_HAS_THIN_ADD_SUB is not set, thus objtool can not be built after enable it for Clang. config AS_HAS_THIN_ADD_SUB is to check whether -mthin-add-sub option is available to know R_LARCH_{32,64}_PCREL are supported for GNU Assembler, there is no such an option for LLVM Assembler. The minimal version of Clang is 18 for building LoongArch kernel, and Clang >= 17 has already supported R_LARCH_{32,64}_PCREL, that is to say, there is no need to depend on AS_HAS_THIN_ADD_SUB for Clang, so just set AS_HAS_THIN_ADD_SUB as y if AS_IS_LLVM. Fixes: 120dd41 ("LoongArch: Only allow OBJTOOL & ORC unwinder if toolchain supports -mthin-add-sub") Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
In the relocation section ".rela.rodata" of each .o file compiled with LoongArch toolchain, there are various symbol types such as STT_NOTYPE, STT_OBJECT, STT_FUNC in addition to the usual STT_SECTION, it needs to use reloc symbol offset instead of reloc addend to find the destination instruction in find_jump_table() and add_jump_table(). This is preparation for later patch on LoongArch, there is no effect for the other archs with this patch. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
There are some "unreachable instruction" objtool warnings when compiling with Clang on LoongArch, this is because the "break" instruction is set as dead end due to its type is INSN_BUG in decode_instructions() at the beginning, and it does not set insn->dead_end of the "break" instruction as false after checking ".rela.discard.reachable" in add_dead_ends(), so the next instruction of "break" is marked as unreachable. Actually, it can find the reachable instruction after parsing the section ".rela.discard.reachable", in some cases, the "break" instruction may not be the first previous instruction with scheduling by Machine Instruction Scheduler of LLVM, it should find more times and then set insn->dead_end of the "break" instruction as false. This is preparation for later patch on LoongArch, there is no effect for the other archs with this patch. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
In the most cases, the entry size of rodata is 8 bytes because the relocation type is 64 bit, but when compiling with Clang on LoongArch, there exists 32 bit relocation type, the entry size of rodata should be 4 bytes in this case. This is preparation for later patch on LoongArch, there is no effect for the other archs with this patch. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
When compiling with Clang on LoongArch, there exists 32 bit PC relative relocation type, it needs to get the offset with "S + A - PC" according to the spec of "ELF for the LoongArch Architecture". This is preparation for later patch on LoongArch, there is no effect for the other archs with this patch. Link: https://github.com/loongson/la-abi-specs/blob/release/laelf.adoc Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
When compiling with Clang on LoongArch, there exists unreachable entry of rodata which points to a position after the function return instruction, this is generated by compiler to fill the non-existent switch case, just skip the entry when parsing the relocation section of rodata. This is preparation for later patch on LoongArch, there is no effect for the other archs with this patch. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
When compiling with Clang on LoongArch, there are unsorted table offset of rodata if there exist many jump tables, it will get the wrong table end and find the wrong jump destination instructions in add_jump_table(), so it needs to check the table size of rodata to break the process when parsing the relocation section of rodata. This is preparation for later patch on LoongArch, there is no effect for the other archs with this patch. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
When compiling with Clang on LoongArch, there are unsorted table offset of rodata if there exist many jump tables, it will get the wrong table end and find the wrong jump destination instructions in add_jump_table(). Sort the rodata table offset by parsing ".rela.discard.tablejump_annotate" and then get each table size of rodata corresponded with each table jump instruction, it is used to check the table end and will break the process when parsing ".rela.rodata" to avoid getting the wrong jump destination instructions. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
The objtool program need to analysis the control flow of each object file generated by compiler toolchain, it needs to know all the locations that a branch instruction may jump into, if a jump table is used, objtool has to correlate the jump instruction with the table. On x86 (which is the only port supported by objtool before LoongArch), there is a relocation type on the jump instruction and directly points to the table. But on LoongArch, the relocation is on another kind of instruction prior to the jump instruction, and also with scheduling it is not very easy to tell the offset of that instruction from the jump instruction. Furthermore, because LoongArch has -fsection-anchors (often enabled at -O1 or above) the relocation may actually points to a section anchor instead of the table itself. The good news is that after continuous analysis and discussion, at last a GCC patch "LoongArch: Add support to annotate tablejump" and a Clang patch "[LoongArch] Add options for annotate tablejump" have been merged into the upstream mainline, the compiler changes make life much easier for switch table support of objtool on LoongArch. By now, there is an additional section ".discard.tablejump_annotate" to store the jump info as pairs of addresses, each pair contains the address of jump instruction and the address of jump table. In order to find switch table, it is easy to parse the relocation section ".rela.discard.tablejump_annotate" to get table_sec and table_offset, the rest process is somehow like x86. Link: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=0ee028f55640 Link: llvm/llvm-project@4c2c17756739 Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
The objtool program need to analysis the control flow of each object file generated by compiler toolchain, it needs to know all the locations that a branch instruction may jump into, if a jump table is used, objtool has to correlate the jump instruction with the table. On x86 (which is the only port supported by objtool before LoongArch), there is a relocation type on the jump instruction and directly points to the table. But on LoongArch, the relocation is on another kind of instruction prior to the jump instruction, and also with scheduling it is not very easy to tell the offset of that instruction from the jump instruction. Furthermore, because LoongArch has -fsection-anchors (often enabled at -O1 or above) the relocation may actually points to a section anchor instead of the table itself. For the jump table of switch cases, a GCC patch "LoongArch: Add support to annotate tablejump" and a Clang patch "[LoongArch] Add options for annotate tablejump" have been merged into the upstream mainline, it can parse the additional section ".discard.tablejump_annotate" which stores the jump info as pairs of addresses, each pair contains the address of jump instruction and the address of jump table. For the jump table of computed gotos, it is indeed not easy to implement in the compiler, especially if there is more than one computed goto in a function such as ___bpf_prog_run(). objdump kernel/bpf/core.o shows that there are many table jump instructions in ___bpf_prog_run(), but there are no relocations on the table jump instructions and to the table directly on LoongArch. Without the help of compiler, in order to figure out the address of goto table for the special case of ___bpf_prog_run(), since the instruction sequence is relatively single and stable, it makes sense to add a helper find_reloc_of_rodata_c_jump_table() to find the relocation which points to the section ".rodata..c_jump_table". If find_reloc_by_table_annotate() failed, it means there is no relocation info of switch table address in ".rela.discard.tablejump_annotate", then objtool may find the relocation info of goto table ".rodata..c_jump_table" with find_reloc_of_rodata_c_jump_table(). Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
For now, it is time to remove -fno-jump-tables to enable jump table for objtool if the compiler has -mannotate-tablejump, otherwise it is better to remain -fno-jump-tables to keep compatibility with older compilers. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Enable ORC stack unwinder. Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Reviewer's Guide by SourceryThis pull request backports objtool fixes, adds jump table support for LoongArch, and enables the ORC stack unwinder. The jump table support involves changes to how objtool identifies and processes jump tables, especially those compiled with Clang. The ORC stack unwinder is enabled through modifications to the FPU context restoration code. Updated class diagram for instruction structclassDiagram
class instruction {
+section *sec
+unsigned long offset
+unsigned long immediate
+unsigned long table_size
+u8 len
+u8 prev_len
+bool dead_end
}
Updated class diagram for symbol structclassDiagram
class symbol {
+u8 warned
+u8 embedded_insn
+u8 local_label
+u8 frame_pointer
+struct list_head pv_target
+struct reloc *relocs
}
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
|
Hi @AaronDot. Thanks for your PR. I'm waiting for a deepin-community member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @AaronDot - I've reviewed your changes - here's some feedback:
Overall Comments:
- Consider adding comments to explain the logic in
get_rodata_table_size_by_table_annotateandfind_reloc_by_table_annotate. - It might be helpful to add a comment explaining why CONFIG_CC_HAS_ANNOTATE_TABLEJUMP is needed.
Here's what I looked at during the review
- 🟡 General issues: 1 issue found
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟡 Complexity: 1 issue found
- 🟢 Documentation: all looks good
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| if (reloc->sym->type != STT_SECTION) | ||
| return; | ||
|
|
||
| orig_table = malloc(sizeof(struct table_info)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (bug_risk): Potential memory leak with allocated table_info objects.
The allocated table_info structures are added into the local list but never freed after computing table_size. Consider freeing these allocations before the function returns.
| unsigned long rodata_offset; | ||
| }; | ||
|
|
||
| static void get_rodata_table_size_by_table_annotate(struct objtool_file *file, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (complexity): Consider extracting the nested sorting logic into a dedicated helper function.
Consider extracting the nested sorting logic into its own routine to eliminate the duplicated pointer arithmetic and deeply nested loops. For example, if you have a list sort helper available (or can add one), you could write a dedicated comparator and sort routine:
static int table_info_cmp(void *priv, struct list_head *a, struct list_head *b)
{
struct table_info *ta = list_entry(a, struct table_info, jump_info);
struct table_info *tb = list_entry(b, struct table_info, jump_info);
return (ta->rodata_offset < tb->rodata_offset) ? -1 : (ta->rodata_offset > tb->rodata_offset);
}
static void sort_table_list(struct list_head *head)
{
/* Assuming list_sort is available; otherwise, implement your own sort using the comparator */
list_sort(NULL, head, table_info_cmp);
}Then update the function to delegate the sorting:
static void get_rodata_table_size_by_table_annotate(struct objtool_file *file,
struct instruction *insn)
{
struct section *rsec;
struct reloc *reloc;
struct list_head table_list;
struct table_info *table;
rsec = find_section_by_name(file->elf, ".rela.discard.tablejump_annotate");
if (!rsec)
return;
INIT_LIST_HEAD(&table_list);
for_each_reloc(rsec, reloc) {
if (reloc->sym->type != STT_SECTION)
return;
table = malloc(sizeof(struct table_info));
if (!table) {
WARN("malloc failed");
return;
}
table->insn_offset = reloc_addend(reloc);
reloc++;
table->rodata_offset = reloc_addend(reloc);
list_add_tail(&table->jump_info, &table_list);
if (reloc_idx(reloc) + 1 == sec_num_entries(rsec))
break;
}
/* Sort the list to simplify subsequent operations */
sort_table_list(&table_list);
list_for_each_entry(table, &table_list, jump_info) {
if (insn->offset == table->insn_offset) {
struct table_info *next_table = list_next_entry(table, jump_info);
while (&next_table->jump_info != &table_list &&
next_table->rodata_offset == table->rodata_offset)
next_table = list_next_entry(next_table, jump_info);
insn->table_size = next_table->rodata_offset - table->rodata_offset;
}
}
}This separation increases clarity, isolates pointer arithmetic, and makes the control flow easier to follow while keeping the functionality intact.
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: sourcery-ai[bot] The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
/lgtm |
|
This pull request looks fine to me, but it's not going to resolve the clang compilation failures for loongarch in this repo. @opsiff |
9e905e8
into
deepin-community:linux-6.6.y
Summary:
Summary by Sourcery
This pull request enhances objtool support for the LoongArch architecture by adding jump table support, backporting upstream fixes, and enabling the ORC stack unwinder. It also improves CFI state updates and handles special cases for Clang-compiled code.
New Features:
Bug Fixes:
Enhancements:
addi.d fp,sp,immandaddi.d sp,fp,imminstructions on LoongArch, ensuring accurate call frame information.Build:
-mannotate-tablejumpcompiler flag, or fallback to-fno-jump-tablesfor older compilers.