-
Notifications
You must be signed in to change notification settings - Fork 336
[RFC] PC64le - do not use the patched functions as callbacks directly #789
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The current patch applies on top of @euspectre patch. I have included the patch from @euspectre tree in this pull request for readability and testing purpose. I will redo the pull request, with my patch alone, in the final version. |
kpatch-build/create-diff-object.c
Outdated
| toc_dynrela_off = malloc(sizeof(int) * (nr/2)); | ||
| memset(toc_dynrela_off, 0, sizeof(*toc_dynrela_off)); | ||
| off_idx = 0; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should check for malloc failure here: if (!toc_dynrela_off) ERROR("malloc") ... also isn't that memset going to only clear the first offset value (ie, toc_dynrela_off[0] and not toc_dynrela_off[0..n]) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: also maybe the malloc should be sizeof(*toc_dynrela_off) * (nr/2) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review. Will include the suggestions in next iteration.
|
How much of the added code in |
It's a prototype for comments. Changes introduced to |
|
I'm confused about why we have to add a new .toc entry. Why can't we just convert those .toc.rela function ptr relas to dynrelas? |
I got little mixed with details and implemented it with duplicating |
464e608 to
d5eca81
Compare
|
With #791, Will wait for the comments on #791 and on this pull-request, before re-spinning v3. |
kpatch-build/create-diff-object.c
Outdated
| continue; | ||
| } | ||
| } | ||
| #else |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The control flow is way too complex and hard to follow.
What if we add a rela->needs_dynrela bool, which gets set in the first loop which increments the nr variable? Then we could move the may_need_dynrela() call there, and that would hopefully vastly simplify this code (e.g., no need for the toc_dynrela_off array or the spaghetti code flow).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clever :). Will re-do the patch based on your suggestion.
d5eca81 to
465ec27
Compare
|
v3 Changes:
|
jpoimboe
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking much better now. I added some more comments.
Also, instead of merging, can you just cherry pick the #755 patch before yours? That way we'll be able to easily merge x86 and ppc support with this PR. Thanks.
| } | ||
| if (may_need_dynrela(rela)) | ||
| toc_rela(rela)->need_dynrela = 1; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a lot better than before, but I'm still confused by this code. Why are all the toc special cases needed in function_ptr_rela(), may_need_rela(), and here? This code loops through all the sections, including .rela.toc., so can we just do something like
list_for_each_entry(rela, &sec->relas, list) {
nr++; /* upper bound on number of kpatch relas and symbols */
if (may_need_dynrela(rela))
rela->need_dynrela = 1;
}
along with removing all the toc_rela() calls in function_ptr_rela() and may_need_rela()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Relocation section '.rela.toc' at offset 0xcc6b0 contains 46 entries:
...
0000000000000128 0000353400000026 R_PPC64_ADDR64 0000000000000000 genl_sk_destructing_waitq + 0
0000000000000130 0000003e00000026 R_PPC64_ADDR64 0000000000000000 .text.netlink_release + 35c
0000000000000138 0000002a00000026 R_PPC64_ADDR64 0000000000000000 .text.deferred_put_nlk_sk + 8
Relocation section '.rela.text.netlink_release' at offset 0xcadf0 contains 44 entries:
...
0000000000000390 000035580000000a R_PPC64_REL24 0000000000000000 __local_bh_enable_ip + 0
0000000000000398 0000007300000032 R_PPC64_TOC16_HA 0000000000000000 .toc + 138
00000000000003a0 0000007300000040 R_PPC64_TOC16_LO_DS 0000000000000000 .toc + 138
When looping with:
list_for_each_entry(rela, &sec->relas, list) {
nr++; /* upper bound on number of kpatch relas and symbols */
if (may_need_dynrela(rela))
rela->need_dynrela = 1;
}
netlink_release + 0x3a0 is not a function symbol, but of relocation type (R_PPC64_TOC16_HA/R_PPC64_TOC16_LO_DS)
referring to symbol in .rela.toc + 0x138, We need to operate on both rela->type and rela->sym->sec->rela + rela->addend, where netlink_release + 0x3a0 is checked for rela-type and .text.deferred_put_nlk_sk + 8 for STT_FUNC/switch label, but the need_dynrela should be set only for the .rela.toc symbol.
Without toc_rela() netlink_release + 0x3a0 referring to .text.deferred_put_nlk_sk + 8 symbol's need_dynrela will not be set, failing the check for STT_FUNC in function_ptr_rela() and while parsing .rela.toc section it's not obvious, if .text.deferred_put_nlk_sk + 8 need_dynrela to be set, because we can not determine if it's function ptr.
Alternatively, the above loop can be re-written as:
list_for_each_entry(rela, &sec->relas, list) {
nr++; /* upper bound on number of kpatch relas and symbols */
if (may_need_dynrela(rela))
toc_rela(rela)->need_dynrela = 1;
}
With an additional check in may_need_dynrela():
static int may_need_dynrela(const struct rela *rela)
{
/* Don't over-ride need_dynrela flag if set, required for .toc sym*/
if (rela->need_dynrela || !rela->sym->sec)
return 1;
...
}
may_need_dynrela() and function_ptr_rela() operates on both rela and rela->sym->sec->rela. We might need to keep toc_rela() in both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation. Sorry, I keep forgetting that the R_PPC64_TOC16_HA relocation is for the original section, not the .toc. Can you add the above readelf snippet to a comment above the if (may_need_dynrela(rela)) line explaining why the toc_rela() is getting need_dynrela set? Otherwise I'll keep getting confused :-)
Can you also explain why you need to check rela->need_dynrela before setting toc_rela(rela)->need_dynrela? I'm trying to understand how it would be possible for a ".toc + xxx)" relocation to be converted to a dynrela.
Assuming we do really need that, how about doing the loop like this?
list_for_each_entry(rela, &sec->relas, list) {
nr++; /* upper bound on number of kpatch relas and symbols */
if (!rela->need_dynrela && may_need_dynrela(rela))
toc_rela(rela)->need_dynrela = 1;
}
Also a comment for the !strchr(toc_rela(rela)->sym->name, '.') check in may_need_dynrela() would also be helpful.
Another thing that would make it easier for me to follow would be to rename rela1 to rela_toc in function_ptr_rela().
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the if (may_need_dynrela(rela)) line explaining why the toc_rela() is getting need_dynrela set? Otherwise I'll keep getting confused :-)
Can you also explain why you need to check rela->need_dynrela before setting toc_rela(rela)->need_dynrela? I'm trying to understand how it would be possible for a ".toc + xxx)" relocation to be converted to a dynrela.
:) .rela.toc is superseded by .rela.text.* sections, those might have set the need_dynrela for entries in .rela.toc. Without the !rela->need_dynrela check, may_need_dynrela() for .rela.toc rela* will unset the need_dynrela flag, failing to rela->type check in function_ptr_rela().
.rela.toc rela type is always R_PPC64_ADDR64.
check in may_need_dynrela() would also be helpful.
function_ptr_rela(rela) will return 0, if the rela returned by toc_rela() is NULL. Duplicate check for NULL in may_need_dynrela(), can be avoided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without the !rela->need_dynrela check, may_need_dynrela() for .rela.toc rela* will unset the need_dynrela flag, failing to rela->type check in function_ptr_rela().
Hm, but if it's already set, how would it get unset? To clarify, I'm talking about your rewritten version of the loop:
list_for_each_entry(rela, &sec->relas, list) {
nr++; /* upper bound on number of kpatch relas and symbols */
if (may_need_dynrela(rela))
toc_rela(rela)->need_dynrela = 1;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry got little mixed up here, true it will not get unset. Having the check for need_dynrela,
if (!rela->need_dynrela && may_need_dynrela(rela)) will speed the handling of .rela.toc entries but not sure, if its worth the optimization. Arguably .rela.toc is only on PPC64le and an additional check for rela entries for non-toc sections might be an overkill. I will re-spin the patch with
list_for_each_entry(rela, &sec->relas, list) {
nr++; /* upper bound on number of kpatch relas and symbols */
if (may_need_dynrela(rela))
toc_rela(rela)->need_dynrela = 1;
}
Thanks.
kpatch-build/kpatch-elf.h
Outdated
| int addend; | ||
| int offset; | ||
| char *string; | ||
| int need_dynrela; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make this a bool?
kpatch-build/kpatch-elf.c
Outdated
| rela->type = GELF_R_TYPE(rela->rela.r_info); | ||
| rela->addend = rela->rela.r_addend; | ||
| rela->offset = rela->rela.r_offset; | ||
| rela->need_dynrela = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for this explicit initialization, we already memset the rela struct to zero in ALLOC_LINK.
465ec27 to
f52035c
Compare
|
v4 Changes:
|
| * if exist, to check for nested functions instead of rela->sym. | ||
| * Referenced sym holds the function name and rela->name holds | ||
| * reference section name. | ||
| */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this comment is really needed. What I really wanted was a comment describing why it's doing the strchr() call with a check for the '.' character in the symbol name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this comment is really needed. What I really wanted was a comment describing why it's doing the strchr() call with a check for the '.' character in the symbol name.
Sorry if I have got it wrong, last paragraph from Evgenii comments above function_ptr_rela() explains why symbol name strchr() is checked for '.', Should that paragraph be moved here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah. Yes, I believe that last paragraph belongs here.
…ctly
A kernel crash happened in __do_softirq() in very rare cases when the
binary patch created from mainline commit be82485fbcbb
("netlink: fix an use-after-free issue for nlk groups") was unloaded.
Investigation has shown that the kernel tried to execute an RCU
callback, deferred_put_nlk_sk(), defined in the patch module after the
module had been unloaded.
The callback was set by the patched variant of netlink_release() and
the address of the patched deferred_put_nlk_sk() was used, rather than
the address of the original function.
Similar problems occur with workqueue functions as well.
As suggested in
dynup#755 (comment),
create-diff-object was modified so that the addresses of the original
functions were used in such situations, at least for x86 systems. A
similar fix for PowerPC was added as well.
Changes in v4:
* '#ifdef __x86_64__' was removed. It is not actually needed right now
because the constants for relocation types are different on different
architectures.
Changes in v3:
* Minor refactoring and a comment explaining what this all is about.
Quite lengthy, but the dynrela-related code is really far from obvious.
Changes in v2:
* Handle the nested functions the same way as before, because they are
unlikely to be used as asynchronous callbacks. Example: cmp() in
bch_cache_show() from drivers/md/bcache/sysfs.c in the kernel 4.4.
As the nested functions are local to the functions they are defined in,
the compiler names them in a similar way to static locals: <name>.<number>.
Currently, we filter out all functions with '.' in their names. If there
are any asynchronous callbacks in the kernel that have a dot in their
names too, they could be handled in the future patches. It is unclear
though, if the callbacks with such names can appear in the kernel.
Signed-off-by: Evgenii Shatokhin <eshatokhin@virtuozzo.com>
It was observed by Evgenii Shatokhin in PR#755, that when the RCU callback was called on the patched function, from unloaded livepatch module triggered a kernel crash. This patch implements the approach on PowerPC outlined in PR#755. With -mcmodel=large, like any other data, function pointers are also loaded relative to the current TOC base and are populated as relocation entries in .toc section. Every function passing a function pointer as the argument need to load the function address through .toc section + offset. Convert such .toc + offset relocation into a dynamic rela, which resolves to original function address, during module load. Also move the comment related to nested function check, into may_need_dynrela(). Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Evgenii Shatokhin <eshatokhin@virtuozzo.com> Cc: Joe Lawrence <jdl1291@gmail.com> Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
f52035c to
19b0aba
Compare
|
v5 Changes:
|
jpoimboe
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this looks good. kpatch_create_intermediate_sections() could use some more cleanups, but I'll propose those in a separate PR.
|
👍 @kamalesh-babulal , I went reading back through @euspectre 's PR #755 , and remembered that I had posted this integration patch and test: function-pointer.patch.txt This should be applicable to PowerPC implementation as well, right? If so, I can add this test in a follow-up PR. |
|
I'm going to be out next week. Here's my (rough, untested) WIP for refactoring the dynrela code in case anybody's interested: https://github.com/jpoimboe/kpatch/tree/dynrela . I'll try to pick it up again in April. |
@joe-lawrence Yes, It will test the functionality on PowerPC too. Please add this test case to the integration tests. I have been testing this PR with the kernel commit be82485fbcb mentioned in the first comment of PR #755. Can you please merge this PR, so I can rebase and push the patch for PR #793 and other refactoring code too.
@jpoimboe I will try and review/test the patch next week. |
|
@kamalesh-babulal - okay, sounds good |
It was observed by Evgenii Shatokhin in #755, that when the RCU callback was called on the patched function, from unloaded livepatch module triggered a kernel crash.
This patch implements the approach on Powerpc outlined in #755. With -mcmodel=large, like any other data, function pointers are also loaded relative to the current TOC base and are populated as relocation entries in .toc section. Every function call/passing such function as pointers needs to access them through .toc section + offset.
The simplest approach would be to convert the .toc + offset relocation into a dynamic rela referring to the original function address but the same function is also a patched function. Thus, on PowerPC we need to append a new relocation entry into .rela.toc, duplicating the entry used as a function pointer and converting the newly added rela as dynamic rela.
Also, modify the function to load the new entry by modifying the addend value to newly created dynamic rela.