-
Notifications
You must be signed in to change notification settings - Fork 336
create-diff-object/ppc64le: Fix replace_sections_syms() for bundled s… #1007
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Member
Author
|
I'll try to add a unit test tomorrow. |
jpoimboe
added a commit
to jpoimboe/kpatch-unit-test-objs
that referenced
this pull request
Jul 23, 2019
This is a ppc64le test for dynup/kpatch#1007. The following patch was used: diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c index b60c9c7498dd..39a39ca89230 100644 --- a/net/sunrpc/xprtsock.c +++ b/net/sunrpc/xprtsock.c @@ -1594,6 +1594,8 @@ static void xs_tcp_state_change(struct sock *sk) struct rpc_xprt *xprt; struct sock_xprt *transport; + asm("nop"); + read_lock_bh(&sk->sk_callback_lock); if (!(xprt = xprt_from_sock(sk))) goto out; Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
…ymbols
With the following patch:
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index b60c9c7498dd..39a39ca89230 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -1594,6 +1594,8 @@ static void xs_tcp_state_change(struct sock *sk)
struct rpc_xprt *xprt;
struct sock_xprt *transport;
+ asm("nop");
+
read_lock_bh(&sk->sk_callback_lock);
if (!(xprt = xprt_from_sock(sk)))
goto out;
I saw the following panic on a RHEL8 kernel:
Unable to handle kernel paging request for data at address 0xcc0080040
Faulting instruction address: 0xc000000000b1515c
Oops: Kernel access of bad area, sig: 7 [#1]
LE SMP NR_CPUS=2048 NUMA PowerNV
Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache nfsd auth_rpcgss nfs_acl lockd grace kpatch_4_18_0_118_1_1(OEK) i2c_dev sunrpc ofpart powernv_flash at24 sg xts ipmi_powernv ipmi_devintf ipmi_msghandler uio_pdrv_genirq uio mtd vmx_crypto ibmpowernv opal_prd xfs libcrc32c sd_mod ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm ahci libahci libata tg3 drm_panel_orientation_quirks dm_mirror dm_region_hash dm_log dm_mod
CPU: 15 PID: 7814 Comm: kworker/u260:0 Kdump: loaded Tainted: G OE K --------- - - 4.18.0-118.el8.ppc64le #1
Workqueue: xprtiod xs_tcp_setup_socket [sunrpc]
NIP: c000000000b1515c LR: c000000000ad9968 CTR: c000000000b15140
REGS: c000001fab6ff6b0 TRAP: 0300 Tainted: G OE K --------- - - (4.18.0-118.el8.ppc64le)
MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 44002222 XER: 20040000
CFAR: c000000000078c7c DAR: 0000000cc0080040 DSISR: 00080000 IRQMASK: 0
GPR00: c000000000ad9968 c000001fab6ff930 c000000001662800 0000000cc0080000
GPR04: c00800000f5cfaa4 c000001f998fd0a8 c000001ff67e8080 c0000000016f46f0
GPR08: c000001fb4918f80 0000000000000000 0000000cc0080040 c0000000011b8980
GPR12: 0000000000002000 c000001ffffee200 c00000000017c458 c000001fe8a23a40
GPR16: c00000000150e480 c000001fd6e90090 0000000000000000 0000000000000000
GPR20: c00000000150e498 fffffffffffffef7 0000000000000402 0000000000000000
GPR24: c000001fd6e90380 0000000000000000 c00800000f5cfaa4 0000000000000000
GPR28: 00000000000004c4 c000001f998fd0a8 c00800000f5cfaa4 0000000cc0080000
NIP [c000000000b1515c] dst_release+0x2c/0x110
LR [c000000000ad9968] skb_release_head_state+0x178/0x190
Call Trace:
[c000001fab6ff930] [c000000000b15140] dst_release+0x10/0x110 (unreliable)
[c000001fab6ff9a0] [c000000000ad9968] skb_release_head_state+0x178/0x190
[c000001fab6ff9d0] [c000000000adb058] __kfree_skb+0x28/0x120
[c000001fab6ffa00] [c000000000be8d64] tcp_rcv_state_process+0xc24/0x1180
[c000001fab6ffa90] [c000000000cd5478] tcp_v6_do_rcv+0x1a8/0x5e0
[c000001fab6ffae0] [c000000000ad1724] __release_sock+0xc4/0x1a0
[c000001fab6ffb40] [c000000000ad1850] release_sock+0x50/0xe0
[c000001fab6ffb70] [c000000000c20018] inet_stream_connect+0x68/0x90
[c000001fab6ffbb0] [c000000000ac0f50] kernel_connect+0x30/0x50
[c000001fab6ffbd0] [c00800000f55dc34] xs_tcp_setup_socket+0xbc/0x650 [sunrpc]
[c000001fab6ffc70] [c000000000172014] process_one_work+0x2f4/0x5c0
[c000001fab6ffd10] [c000000000172adc] worker_thread+0xcc/0x760
[c000001fab6ffdc0] [c00000000017c5fc] kthread+0x1ac/0x1c0
[c000001fab6ffe30] [c00000000000b75c] ret_from_kernel_thread+0x5c/0x80
Instruction dump:
60000000 3c4c00b5 3842d6d0 7c0802a6 4b563b61 fbe1fff8 f821ff91 7c7f1b79
4182003c fbc10060 7c0004ac 395f0040 <7d205028> 3129ffff 7d20512d 40c2fff4
The problem is that the function has a GCC switch jump table, and the
.toc had the wrong offset for the jump table.
This is the switch jump table code from xs_tcp_state_changed():
70: 12 00 3d 89 lbz r9,18(r29)
74: 0b 00 89 2b cmplwi cr7,r9,11
78: f8 02 9d 41 bgt cr7,370 <xs_tcp_state_change+0x368>
7c: 00 00 42 3d addis r10,r2,0
7c: R_PPC64_TOC16_HA .toc+0x188
80: 00 00 4a e9 ld r10,0(r10)
80: R_PPC64_TOC16_LO_DS .toc+0x188
84: 64 17 29 79 rldicr r9,r9,2,61
88: aa 4a 2a 7d lwax r9,r10,r9
8c: 14 52 29 7d add r9,r9,r10
90: a6 03 29 7d mtctr r9
94: 20 04 80 4e bctr
98: d8 02 00 00 .long 0x2d8
9c: 38 00 00 00 .long 0x38
a0: d8 02 00 00 .long 0x2d8
a4: d8 02 00 00 .long 0x2d8
a8: 68 02 00 00 .long 0x268
ac: d8 02 00 00 .long 0x2d8
b0: d8 02 00 00 .long 0x2d8
b4: c8 01 00 00 .long 0x1c8
b8: 38 01 00 00 .long 0x138
bc: 88 01 00 00 .long 0x188
c0: d8 02 00 00 .long 0x2d8
c4: 68 01 00 00 .long 0x168
The switch jump table address is at offset 0x98. The code reads this
offset from .toc+0x188:
Relocation section '.rela.toc' at offset 0x75320 contains 134 entries:
Offset Info Type Symbol's Value Symbol's Name + Addend
0000000000000188 0000003f00000026 R_PPC64_ADDR64 0000000000000000 .text.xs_tcp_state_change + 98
After create-diff-object runs, the .toc entry now looks like this:
0000000000000188 0000000200000026 R_PPC64_ADDR64 0000000000000008 xs_tcp_state_change + 98
Notice the offset is the same, but it's now referring to the function
symbol instead of the text symbol. That's done by
kpatch_replace_sections_syms().
On x86, that's not a problem, because the function symbol is at offset 0
in the .text.function section. So the section symbol and the function
symbol are at the same location.
But on ppc64le, with -ffunction-sections, GCC 6+ somehow thinks it's a
good idea to associate the function symbol with the localentry point,
which is at an 8-byte offset from its corresponding section:
Num: Value Size Type Bind Vis Ndx Name
2: 0000000000000008 1228 FUNC LOCAL DEFAULT 3 xs_tcp_state_change [<localentry>: 8]
Notice the "Value" is 8 instead of 0.
That causes the .toc entry's jump table address to be wrongly offset by
8 bytes.
The fix is to adjust the rela addend accordingly in
kpatch_replace_sections_syms().
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
20169f3 to
cef3360
Compare
Member
Author
|
Unit test added. |
sm00th
approved these changes
Jul 24, 2019
kamalesh-babulal
approved these changes
Jul 24, 2019
joe-lawrence
approved these changes
Jul 24, 2019
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
…ymbols
With the following patch:
I saw the following panic on a RHEL8 kernel:
The problem is that the function has a GCC switch jump table, and the
.toc had the wrong offset for the jump table.
This is the switch jump table code from xs_tcp_state_changed():
The switch jump table address is at offset 0x98. The code reads this
offset from .toc+0x188:
After create-diff-object runs, the .toc entry now looks like this:
Notice the offset is the same, but it's now referring to the function
symbol instead of the text symbol. That's done by
kpatch_replace_sections_syms().
On x86, that's not a problem, because the function symbol is at offset 0
in the .text.function section. So the section symbol and the function
symbol are at the same location.
But on ppc64le, with -ffunction-sections, GCC 6+ somehow thinks it's a
good idea to associate the function symbol with the localentry point,
which is at an 8-byte offset from its corresponding section:
Notice the "Value" is 8 instead of 0.
That causes the .toc entry's jump table address to be wrongly offset by
8 bytes.
The fix is to adjust the rela addend accordingly in
kpatch_replace_sections_syms().
Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com