Skip to content

Conversation

@kamalesh-babulal
Copy link
Contributor

Commit 5888f31 ("create-klp-module: support unbundled symbols")
breaks the livepatch modules build on ppc64le, with stalls after
module load:

INFO: rcu_sched self-detected stall on CPU
5-...: (21002 ticks this GP) idle=7ba/140000000000001/0
softirq=97524/97524 fqs=10499
(t=21003 jiffies g=23955 c=23954 q=2203)
NMI backtrace for cpu 5
CPU: 5 PID: 22188 Comm: cat Tainted: G OE K 4.14.0-rc7+ #3
Call Trace:
dump_stack+0xb0/0xf0 (unreliable)
nmi_cpu_backtrace+0x208/0x210
nmi_trigger_cpumask_backtrace+0x1c0/0x200
arch_trigger_cpumask_backtrace+0x28/0x40
rcu_dump_cpu_stacks+0xec/0x14c
rcu_check_callbacks+0x908/0xb10
update_process_times+0x48/0x90
tick_sched_handle.isra.5+0x4c/0x80
tick_sched_timer+0x60/0xe0
__hrtimer_run_queues+0xf8/0x360
hrtimer_interrupt+0xf8/0x330
__timer_interrupt+0x94/0x270
timer_interrupt+0xa4/0xe0
decrementer_common+0x114/0x120
--- interrupt: 901 at meminfo_proc_show+0x4c/0xd40
[livepatch_meminfo_string]
LR = meminfo_proc_show+0x50/0xd40 [livepatch_meminfo_string]
meminfo_proc_show+0x44/0xd40 [livepatch_meminfo_string] (unreliable)
klp_stub_insn_end+0x4/0x38
proc_reg_read+0x88/0xd0
__vfs_read+0x44/0x1b0
vfs_read+0xbc/0x1b0
SyS_read+0x68/0x110
system_call+0x58/0x6c

With GCC6+, every local function symbol has the value of 0x8 and
appending the symbol value to destination symbol miscalculates
the offset.

Fix the offset calculation, by subtracting symbol value 0x8 for
every destination functions, those are local functions as well.

Fixes: 5888f31 ("create-klp-module: support unbundled symbols")
Cc: Josh Poimboeuf jpoimboe@redhat.com
Signed-off-by: Kamalesh Babulal kamalesh@linux.vnet.ibm.com

Commit 5888f31 ("create-klp-module: support unbundled symbols")
breaks the livepatch modules build on ppc64le, with stalls after
module load:

INFO: rcu_sched self-detected stall on CPU
   5-...: (21002 ticks this GP) idle=7ba/140000000000001/0
softirq=97524/97524 fqs=10499
    (t=21003 jiffies g=23955 c=23954 q=2203)
 NMI backtrace for cpu 5
 CPU: 5 PID: 22188 Comm: cat Tainted: G           OE K 4.14.0-rc7+ dynup#3
 Call Trace:
 dump_stack+0xb0/0xf0 (unreliable)
 nmi_cpu_backtrace+0x208/0x210
 nmi_trigger_cpumask_backtrace+0x1c0/0x200
 arch_trigger_cpumask_backtrace+0x28/0x40
 rcu_dump_cpu_stacks+0xec/0x14c
 rcu_check_callbacks+0x908/0xb10
 update_process_times+0x48/0x90
 tick_sched_handle.isra.5+0x4c/0x80
 tick_sched_timer+0x60/0xe0
 __hrtimer_run_queues+0xf8/0x360
 hrtimer_interrupt+0xf8/0x330
 __timer_interrupt+0x94/0x270
 timer_interrupt+0xa4/0xe0
 decrementer_common+0x114/0x120
 --- interrupt: 901 at meminfo_proc_show+0x4c/0xd40
[livepatch_meminfo_string]
     LR = meminfo_proc_show+0x50/0xd40 [livepatch_meminfo_string]
 meminfo_proc_show+0x44/0xd40 [livepatch_meminfo_string] (unreliable)
 klp_stub_insn_end+0x4/0x38
 proc_reg_read+0x88/0xd0
 __vfs_read+0x44/0x1b0
  vfs_read+0xbc/0x1b0
 SyS_read+0x68/0x110
 system_call+0x58/0x6c

With GCC6+, every local function symbol has the value of 0x8 and
appending the symbol value to destination symbol, mis-calculates
the offset.

Fix the offset calculation, by subtracting symbol value 0x8 for
every destination functions, those are local functions as well.

Fixes: 5888f31 ("create-klp-module: support unbundled symbols")
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
@jpoimboe
Copy link
Member

jpoimboe commented Nov 6, 2017

Sorry for breaking this. I compiled on ppc, but I neglected to run it. We're working on some automated regression testing which should catch these types of regressions in the future.

At first glance, I think this patch will break x86 for cases like #700, where st_value can be nonzero on x86.

I need to look into this a little deeper, as I think I may see a way to simplify create_klp_relasecs_and_syms() a little bit for more commonality between x86 and ppc. I might post a followup pull request.

@jpoimboe
Copy link
Member

jpoimboe commented Nov 6, 2017

@kamalesh-babulal One question: looking at create_klp_relasecs_and_syms(), it appears to have some GCC 6 specific code (in two different places, including this patch). Does that mean that GCC 5 and older won't work?

@kamalesh-babulal
Copy link
Contributor Author

My bad. Yes, it will break #700, you might need some like below for powerpc. Where st_other is 0x60 for localentry for both sym->bind (global and local) and if it been compiled with gcc6, then the st_value is 0x8

+++ b/kpatch-build/create-klp-module.c
@@ -239,11 +239,13 @@ static void create_klp_relasecs_and_syms(struct kpatch_elf *kelf, struct section
                ALLOC_LINK(rela, &klp_relasec->relas);
                rela->sym = sym;
                rela->type = krelas[index].type;
-               if (!strcmp(dest->sec->name, ".toc"))
+               if (!strcmp(dest->sec->name, ".toc")) {
                        rela->offset = toc_offset;
-               else
+               } else {
                        rela->offset = krelas[index].offset + dest->sym.st_value;
-
+                       if (dest->sym.st_other == 0x60 && dest->sym.st_value == 0x8)
+                               rela->offset -= 0x8;
+               }
                /*
                 * GCC 6+ adds 0x8 to the offset of every local function entry
                 * in the .toc section, for avoiding the setup of the toc when

@kamalesh-babulal
Copy link
Contributor Author

Sorry for breaking this. I compiled on ppc, but I neglected to run it. We're working on some automated regression testing which should catch these types of regressions in the future.

Thanks for working on it. This issue is not seen, with just livepatch module load but only after issuing cat /proc/meminfo incase of meminfo-string.patch. I generally run one iteration of kernbench after loading every livepatch module, like a smoke test.

One question: looking at create_klp_relasecs_and_syms(), it appears to have some GCC 6 specific code (in two different places, including this patch). Does that mean that GCC 5 and older won't work?

It will just work for GCC 5 and lesser versions, but the adjuments are required with GCC 6, where the st_value is 0x8 for all localentry. Second place where the check placed for .toc section, the st_value is zero for GCC 5 and at the places where the offset is computed, we need to ensure the modules created using GCC5 and GCC6 are works without any issues.

Consider the readelf -S meminfo.o for both version of GCC:
$ gcc --version
gcc (Ubuntu 6.3.0-12ubuntu2) 6.3.0 20170406

27: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS meminfo.c
28: 0000000000000008  1552 FUNC    LOCAL  DEFAULT [<localentry>: 8]    10 meminfo_proc_show
29: 0000000000000008    68 FUNC    LOCAL  DEFAULT [<localentry>: 8]     4 meminfo_proc_open
30: 0000000000000008   216 FUNC    LOCAL  DEFAULT [<localentry>: 8]     6 show_val_kb
31: 0000000000000000   240 OBJECT  LOCAL  DEFAULT   21 meminfo_proc_fops
32: 0000000000000000     8 OBJECT  LOCAL  DEFAULT   23 __initcall_proc_meminfo_init5
33: 0000000000000000     7 OBJECT  LOCAL  DEFAULT   12 blanks.35336
34: 0000000000000008    80 FUNC    LOCAL  DEFAULT [<localentry>: 8]     2 proc_meminfo_init
35: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND proc_create

$ gcc --version
gcc (Ubuntu/IBM 5.4.1-2ubuntu2) 5.4.1 20160929

25: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS meminfo.c
26: 0000000000000000  1784 FUNC    LOCAL  DEFAULT [<localentry>: 8]    10 meminfo_proc_show
27: 0000000000000000    72 FUNC    LOCAL  DEFAULT [<localentry>: 8]     4 meminfo_proc_open
28: 0000000000000000   224 FUNC    LOCAL  DEFAULT [<localentry>: 8]     6 show_val_kb
29: 0000000000000000     7 OBJECT  LOCAL  DEFAULT   12 blanks.36525
30: 0000000000000000   240 OBJECT  LOCAL  DEFAULT   19 meminfo_proc_fops
31: 0000000000000000     8 OBJECT  LOCAL  DEFAULT   21 __initcall_proc_meminfo_init5
32: 0000000000000000    84 FUNC    LOCAL  DEFAULT [<localentry>: 8]     2 proc_meminfo_init
33: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND proc_create

@jpoimboe
Copy link
Member

jpoimboe commented Nov 7, 2017

I think your new patch isn't quite right, as I think it breaks #700 for ppc64le. In #700, the one-function-per-section assumption is broken. There are two total_mapping_size() functions in the .text.unlikely.total_mapping_size section. With GCC 6, the first function will have st_value of 8, but the second function will not. So we can't rely on the st_value there.

I need to think some more about how to get it to work for both GCC5 and GCC 6.

jpoimboe added a commit to jpoimboe/kpatch that referenced this pull request Nov 8, 2017
When creating .kpatch.relocations, there's no reason to convert the
relocation destinations to symbols.  In fact, it's actively harmful
because it makes it harder for create-klp-module to deal with the GCC 6+
8-byte localentry gap.

Fixes dynup#754.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
jpoimboe added a commit to jpoimboe/kpatch that referenced this pull request Nov 8, 2017
When creating .kpatch.relocations, there's no reason to convert the
relocation destinations to symbols.  In fact, it's actively harmful
because it makes it harder for create-klp-module to deal with the GCC 6+
8-byte localentry gap.

This also fixes a regression which was introduced in 5888f31, which
broke ppc64le relocations.

Fixes dynup#754.

Fixes: 5888f31 ("create-klp-module: support unbundled symbols")
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
@jpoimboe
Copy link
Member

jpoimboe commented Nov 8, 2017

Let's continue the discussion in #757.

@jpoimboe jpoimboe closed this Nov 8, 2017
jpoimboe added a commit to jpoimboe/kpatch that referenced this pull request Nov 10, 2017
When creating .kpatch.relocations, there's no reason to convert the
relocation destinations to symbols.  In fact, it's actively harmful
because it makes it harder for create-klp-module to deal with the GCC 6+
8-byte localentry gap.

This also fixes a regression which was introduced in 5888f31, which
broke ppc64le relocations.

Fixes dynup#754.

Fixes: 5888f31 ("create-klp-module: support unbundled symbols")
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
@kamalesh-babulal kamalesh-babulal deleted the ppc64le_fix branch December 4, 2017 03:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants