Skip to content

Conversation

@jpoimboe
Copy link
Member

@jpoimboe jpoimboe commented Jul 23, 2019

…ymbols

With the following patch:

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index b60c9c7498dd..39a39ca89230 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -1594,6 +1594,8 @@ static void xs_tcp_state_change(struct sock *sk)
 	struct rpc_xprt *xprt;
 	struct sock_xprt *transport;

+	asm("nop");
+
 	read_lock_bh(&sk->sk_callback_lock);
 	if (!(xprt = xprt_from_sock(sk)))
 		goto out;

I saw the following panic on a RHEL8 kernel:

  Unable to handle kernel paging request for data at address 0xcc0080040
  Faulting instruction address: 0xc000000000b1515c
  Oops: Kernel access of bad area, sig: 7 [#1]
  LE SMP NR_CPUS=2048 NUMA PowerNV
  Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache nfsd auth_rpcgss nfs_acl lockd grace kpatch_4_18_0_118_1_1(OEK) i2c_dev sunrpc ofpart powernv_flash at24 sg xts ipmi_powernv ipmi_devintf ipmi_msghandler uio_pdrv_genirq uio mtd vmx_crypto ibmpowernv opal_prd xfs libcrc32c sd_mod ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm ahci libahci libata tg3 drm_panel_orientation_quirks dm_mirror dm_region_hash dm_log dm_mod
  CPU: 15 PID: 7814 Comm: kworker/u260:0 Kdump: loaded Tainted: G           OE K  --------- -  - 4.18.0-118.el8.ppc64le #1
  Workqueue: xprtiod xs_tcp_setup_socket [sunrpc]
  NIP:  c000000000b1515c LR: c000000000ad9968 CTR: c000000000b15140
  REGS: c000001fab6ff6b0 TRAP: 0300   Tainted: G           OE K  --------- -  -  (4.18.0-118.el8.ppc64le)
  MSR:  9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 44002222  XER: 20040000
  CFAR: c000000000078c7c DAR: 0000000cc0080040 DSISR: 00080000 IRQMASK: 0
  GPR00: c000000000ad9968 c000001fab6ff930 c000000001662800 0000000cc0080000
  GPR04: c00800000f5cfaa4 c000001f998fd0a8 c000001ff67e8080 c0000000016f46f0
  GPR08: c000001fb4918f80 0000000000000000 0000000cc0080040 c0000000011b8980
  GPR12: 0000000000002000 c000001ffffee200 c00000000017c458 c000001fe8a23a40
  GPR16: c00000000150e480 c000001fd6e90090 0000000000000000 0000000000000000
  GPR20: c00000000150e498 fffffffffffffef7 0000000000000402 0000000000000000
  GPR24: c000001fd6e90380 0000000000000000 c00800000f5cfaa4 0000000000000000
  GPR28: 00000000000004c4 c000001f998fd0a8 c00800000f5cfaa4 0000000cc0080000
  NIP [c000000000b1515c] dst_release+0x2c/0x110
  LR [c000000000ad9968] skb_release_head_state+0x178/0x190
  Call Trace:
  [c000001fab6ff930] [c000000000b15140] dst_release+0x10/0x110 (unreliable)
  [c000001fab6ff9a0] [c000000000ad9968] skb_release_head_state+0x178/0x190
  [c000001fab6ff9d0] [c000000000adb058] __kfree_skb+0x28/0x120
  [c000001fab6ffa00] [c000000000be8d64] tcp_rcv_state_process+0xc24/0x1180
  [c000001fab6ffa90] [c000000000cd5478] tcp_v6_do_rcv+0x1a8/0x5e0
  [c000001fab6ffae0] [c000000000ad1724] __release_sock+0xc4/0x1a0
  [c000001fab6ffb40] [c000000000ad1850] release_sock+0x50/0xe0
  [c000001fab6ffb70] [c000000000c20018] inet_stream_connect+0x68/0x90
  [c000001fab6ffbb0] [c000000000ac0f50] kernel_connect+0x30/0x50
  [c000001fab6ffbd0] [c00800000f55dc34] xs_tcp_setup_socket+0xbc/0x650 [sunrpc]
  [c000001fab6ffc70] [c000000000172014] process_one_work+0x2f4/0x5c0
  [c000001fab6ffd10] [c000000000172adc] worker_thread+0xcc/0x760
  [c000001fab6ffdc0] [c00000000017c5fc] kthread+0x1ac/0x1c0
  [c000001fab6ffe30] [c00000000000b75c] ret_from_kernel_thread+0x5c/0x80
  Instruction dump:
  60000000 3c4c00b5 3842d6d0 7c0802a6 4b563b61 fbe1fff8 f821ff91 7c7f1b79
  4182003c fbc10060 7c0004ac 395f0040 <7d205028> 3129ffff 7d20512d 40c2fff4

The problem is that the function has a GCC switch jump table, and the
.toc had the wrong offset for the jump table.

This is the switch jump table code from xs_tcp_state_changed():

  70:   12 00 3d 89     lbz     r9,18(r29)
  74:   0b 00 89 2b     cmplwi  cr7,r9,11
  78:   f8 02 9d 41     bgt     cr7,370 <xs_tcp_state_change+0x368>
  7c:   00 00 42 3d     addis   r10,r2,0
                        7c: R_PPC64_TOC16_HA    .toc+0x188
  80:   00 00 4a e9     ld      r10,0(r10)
                        80: R_PPC64_TOC16_LO_DS .toc+0x188
  84:   64 17 29 79     rldicr  r9,r9,2,61
  88:   aa 4a 2a 7d     lwax    r9,r10,r9
  8c:   14 52 29 7d     add     r9,r9,r10
  90:   a6 03 29 7d     mtctr   r9
  94:   20 04 80 4e     bctr
  98:   d8 02 00 00     .long 0x2d8
  9c:   38 00 00 00     .long 0x38
  a0:   d8 02 00 00     .long 0x2d8
  a4:   d8 02 00 00     .long 0x2d8
  a8:   68 02 00 00     .long 0x268
  ac:   d8 02 00 00     .long 0x2d8
  b0:   d8 02 00 00     .long 0x2d8
  b4:   c8 01 00 00     .long 0x1c8
  b8:   38 01 00 00     .long 0x138
  bc:   88 01 00 00     .long 0x188
  c0:   d8 02 00 00     .long 0x2d8
  c4:   68 01 00 00     .long 0x168

The switch jump table address is at offset 0x98. The code reads this
offset from .toc+0x188:

Relocation section '.rela.toc' at offset 0x75320 contains 134 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000188  0000003f00000026 R_PPC64_ADDR64         0000000000000000 .text.xs_tcp_state_change + 98

After create-diff-object runs, the .toc entry now looks like this:

0000000000000188  0000000200000026 R_PPC64_ADDR64         0000000000000008 xs_tcp_state_change + 98

Notice the offset is the same, but it's now referring to the function
symbol instead of the text symbol. That's done by
kpatch_replace_sections_syms().

On x86, that's not a problem, because the function symbol is at offset 0
in the .text.function section. So the section symbol and the function
symbol are at the same location.

But on ppc64le, with -ffunction-sections, GCC 6+ somehow thinks it's a
good idea to associate the function symbol with the localentry point,
which is at an 8-byte offset from its corresponding section:

   Num:    Value          Size Type    Bind   Vis      Ndx Name
     2: 0000000000000008  1228 FUNC    LOCAL  DEFAULT    3 xs_tcp_state_change  [<localentry>: 8]

Notice the "Value" is 8 instead of 0.

That causes the .toc entry's jump table address to be wrongly offset by
8 bytes.

The fix is to adjust the rela addend accordingly in
kpatch_replace_sections_syms().

Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com

@jpoimboe
Copy link
Member Author

I'll try to add a unit test tomorrow.

jpoimboe added a commit to jpoimboe/kpatch-unit-test-objs that referenced this pull request Jul 23, 2019
This is a ppc64le test for dynup/kpatch#1007.  The following patch was
used:

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index b60c9c7498dd..39a39ca89230 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -1594,6 +1594,8 @@ static void xs_tcp_state_change(struct sock *sk)
 	struct rpc_xprt *xprt;
 	struct sock_xprt *transport;

+	asm("nop");
+
 	read_lock_bh(&sk->sk_callback_lock);
 	if (!(xprt = xprt_from_sock(sk)))
 		goto out;

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
jpoimboe added 2 commits July 23, 2019 18:11
…ymbols

With the following patch:

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index b60c9c7498dd..39a39ca89230 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -1594,6 +1594,8 @@ static void xs_tcp_state_change(struct sock *sk)
 	struct rpc_xprt *xprt;
 	struct sock_xprt *transport;

+	asm("nop");
+
 	read_lock_bh(&sk->sk_callback_lock);
 	if (!(xprt = xprt_from_sock(sk)))
 		goto out;

I saw the following panic on a RHEL8 kernel:

  Unable to handle kernel paging request for data at address 0xcc0080040
  Faulting instruction address: 0xc000000000b1515c
  Oops: Kernel access of bad area, sig: 7 [#1]
  LE SMP NR_CPUS=2048 NUMA PowerNV
  Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache nfsd auth_rpcgss nfs_acl lockd grace kpatch_4_18_0_118_1_1(OEK) i2c_dev sunrpc ofpart powernv_flash at24 sg xts ipmi_powernv ipmi_devintf ipmi_msghandler uio_pdrv_genirq uio mtd vmx_crypto ibmpowernv opal_prd xfs libcrc32c sd_mod ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm ahci libahci libata tg3 drm_panel_orientation_quirks dm_mirror dm_region_hash dm_log dm_mod
  CPU: 15 PID: 7814 Comm: kworker/u260:0 Kdump: loaded Tainted: G           OE K  --------- -  - 4.18.0-118.el8.ppc64le #1
  Workqueue: xprtiod xs_tcp_setup_socket [sunrpc]
  NIP:  c000000000b1515c LR: c000000000ad9968 CTR: c000000000b15140
  REGS: c000001fab6ff6b0 TRAP: 0300   Tainted: G           OE K  --------- -  -  (4.18.0-118.el8.ppc64le)
  MSR:  9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 44002222  XER: 20040000
  CFAR: c000000000078c7c DAR: 0000000cc0080040 DSISR: 00080000 IRQMASK: 0
  GPR00: c000000000ad9968 c000001fab6ff930 c000000001662800 0000000cc0080000
  GPR04: c00800000f5cfaa4 c000001f998fd0a8 c000001ff67e8080 c0000000016f46f0
  GPR08: c000001fb4918f80 0000000000000000 0000000cc0080040 c0000000011b8980
  GPR12: 0000000000002000 c000001ffffee200 c00000000017c458 c000001fe8a23a40
  GPR16: c00000000150e480 c000001fd6e90090 0000000000000000 0000000000000000
  GPR20: c00000000150e498 fffffffffffffef7 0000000000000402 0000000000000000
  GPR24: c000001fd6e90380 0000000000000000 c00800000f5cfaa4 0000000000000000
  GPR28: 00000000000004c4 c000001f998fd0a8 c00800000f5cfaa4 0000000cc0080000
  NIP [c000000000b1515c] dst_release+0x2c/0x110
  LR [c000000000ad9968] skb_release_head_state+0x178/0x190
  Call Trace:
  [c000001fab6ff930] [c000000000b15140] dst_release+0x10/0x110 (unreliable)
  [c000001fab6ff9a0] [c000000000ad9968] skb_release_head_state+0x178/0x190
  [c000001fab6ff9d0] [c000000000adb058] __kfree_skb+0x28/0x120
  [c000001fab6ffa00] [c000000000be8d64] tcp_rcv_state_process+0xc24/0x1180
  [c000001fab6ffa90] [c000000000cd5478] tcp_v6_do_rcv+0x1a8/0x5e0
  [c000001fab6ffae0] [c000000000ad1724] __release_sock+0xc4/0x1a0
  [c000001fab6ffb40] [c000000000ad1850] release_sock+0x50/0xe0
  [c000001fab6ffb70] [c000000000c20018] inet_stream_connect+0x68/0x90
  [c000001fab6ffbb0] [c000000000ac0f50] kernel_connect+0x30/0x50
  [c000001fab6ffbd0] [c00800000f55dc34] xs_tcp_setup_socket+0xbc/0x650 [sunrpc]
  [c000001fab6ffc70] [c000000000172014] process_one_work+0x2f4/0x5c0
  [c000001fab6ffd10] [c000000000172adc] worker_thread+0xcc/0x760
  [c000001fab6ffdc0] [c00000000017c5fc] kthread+0x1ac/0x1c0
  [c000001fab6ffe30] [c00000000000b75c] ret_from_kernel_thread+0x5c/0x80
  Instruction dump:
  60000000 3c4c00b5 3842d6d0 7c0802a6 4b563b61 fbe1fff8 f821ff91 7c7f1b79
  4182003c fbc10060 7c0004ac 395f0040 <7d205028> 3129ffff 7d20512d 40c2fff4

The problem is that the function has a GCC switch jump table, and the
.toc had the wrong offset for the jump table.

This is the switch jump table code from xs_tcp_state_changed():

  70:   12 00 3d 89     lbz     r9,18(r29)
  74:   0b 00 89 2b     cmplwi  cr7,r9,11
  78:   f8 02 9d 41     bgt     cr7,370 <xs_tcp_state_change+0x368>
  7c:   00 00 42 3d     addis   r10,r2,0
                        7c: R_PPC64_TOC16_HA    .toc+0x188
  80:   00 00 4a e9     ld      r10,0(r10)
                        80: R_PPC64_TOC16_LO_DS .toc+0x188
  84:   64 17 29 79     rldicr  r9,r9,2,61
  88:   aa 4a 2a 7d     lwax    r9,r10,r9
  8c:   14 52 29 7d     add     r9,r9,r10
  90:   a6 03 29 7d     mtctr   r9
  94:   20 04 80 4e     bctr
  98:   d8 02 00 00     .long 0x2d8
  9c:   38 00 00 00     .long 0x38
  a0:   d8 02 00 00     .long 0x2d8
  a4:   d8 02 00 00     .long 0x2d8
  a8:   68 02 00 00     .long 0x268
  ac:   d8 02 00 00     .long 0x2d8
  b0:   d8 02 00 00     .long 0x2d8
  b4:   c8 01 00 00     .long 0x1c8
  b8:   38 01 00 00     .long 0x138
  bc:   88 01 00 00     .long 0x188
  c0:   d8 02 00 00     .long 0x2d8
  c4:   68 01 00 00     .long 0x168

The switch jump table address is at offset 0x98.  The code reads this
offset from .toc+0x188:

Relocation section '.rela.toc' at offset 0x75320 contains 134 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000188  0000003f00000026 R_PPC64_ADDR64         0000000000000000 .text.xs_tcp_state_change + 98

After create-diff-object runs, the .toc entry now looks like this:

0000000000000188  0000000200000026 R_PPC64_ADDR64         0000000000000008 xs_tcp_state_change + 98

Notice the offset is the same, but it's now referring to the function
symbol instead of the text symbol.  That's done by
kpatch_replace_sections_syms().

On x86, that's not a problem, because the function symbol is at offset 0
in the .text.function section.  So the section symbol and the function
symbol are at the same location.

But on ppc64le, with -ffunction-sections, GCC 6+ somehow thinks it's a
good idea to associate the function symbol with the localentry point,
which is at an 8-byte offset from its corresponding section:

   Num:    Value          Size Type    Bind   Vis      Ndx Name
     2: 0000000000000008  1228 FUNC    LOCAL  DEFAULT    3 xs_tcp_state_change  [<localentry>: 8]

Notice the "Value" is 8 instead of 0.

That causes the .toc entry's jump table address to be wrongly offset by
8 bytes.

The fix is to adjust the rela addend accordingly in
kpatch_replace_sections_syms().

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
@jpoimboe jpoimboe force-pushed the ppc-replace-sections-syms branch from 20169f3 to cef3360 Compare July 23, 2019 23:12
@jpoimboe
Copy link
Member Author

Unit test added.

@jpoimboe jpoimboe merged commit 814fc06 into dynup:master Jul 24, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants