Skip to content

Conversation

@euspectre
Copy link
Contributor

@euspectre euspectre commented Nov 7, 2017

A kernel crash happened in __do_softirq() in very rare cases when the
binary patch created from mainline commit be82485fbcbb
("netlink: fix an use-after-free issue for nlk groups") was unloaded.

Kernel: 3.10.0-514.16.1.vz7.30.10, Virtuozzo 7, x86_64.

Investigation has shown that the kernel tried to execute an RCU callback,
deferred_put_nlk_sk(), defined in the patch module after the module had
been unloaded.

The callback was set by the patched variant of netlink_release(), that is
why the address of the patched deferred_put_nlk_sk() rather than of the
original one was used.

@euspectre
Copy link
Contributor Author

Not sure if livepatch-patch-hook.c needs a similar fix. It might but I am not very familiar with that code yet.

@jpoimboe
Copy link
Member

jpoimboe commented Nov 7, 2017

Thanks, this is an interesting issue. However I'm not convinced that this RCU synchronize really belongs in kpatch. Isn't the issue specific to the patch itself? I think a more appropriate fix would be to add a patch unload hook which calls rcu_barrier().

I think this is an example of a broader issue, which should be documented in the patch author guide: You have to be very careful if there are any function pointers to a patched function.

In fact, I wonder if we could detect such an issue (a pointer to a patched function exists) and warn about it in kpatch-build.

@euspectre
Copy link
Contributor Author

Well, if KPatch did not optimize the references to the patched functions when loading the patch module, only the addresses of the original functions were used and this problem would not happen.

Because of how KPatch core loads the patch module, it is perfectly OK to have the pointers to the patched functions. Such optimization of function references seems valuable.

I think, even with unload hooks available, having rcu_barrier() as a "safety net" in the exit function is worth it. It makes sure no RCU callbacks, not only the one from this particular patch, could harm the system when the patch module unloads.

Timer callbacks, workqueue functions and some other callbacks that might be set from the patched code could still be problematic - but unlike RCU, these should be waited for separately. There is no way to wait for all such pending callbacks at once, so one has to use hooks for these anyway.

@jpoimboe
Copy link
Member

jpoimboe commented Nov 8, 2017

Well, if KPatch did not optimize the references to the patched functions when loading the patch module, only the addresses of the original functions were used and this problem would not happen.

Hm. So maybe a better fix would be to ensure that function pointers always refer to the original function instead of the patched function?

@euspectre
Copy link
Contributor Author

euspectre commented Nov 9, 2017

Perhaps, it would be more reliable, yes.

However, I cannot see if it is doable for the callbacks only rather than for all functions. In an arch-independent way, I mean.

Perhaps, some performance measurements could show if using the pointers to the patched functions directly is really worth it. It looks so in the code, esp. for the "old" kpatch which does a table lookup in the common Ftrace filter each time the function is called. Livepatch does things differently, IIRC.

@jpoimboe
Copy link
Member

However, I cannot see if it is doable for the callbacks only rather than for all functions. In an arch-independent way, I mean.

Here's how I think it can be detected on x86:

    4154:       48 c7 c6 00 00 00 00    mov    $0x0,%rsi
                        4157: R_X86_64_32S      .text+0xb50
    415b:       e8 00 00 00 00          callq  4160 <netlink_release+0x3a0>
                        415c: R_X86_64_PC32     call_rcu-0x4

We can look for a R_X86_64_32S relocation which references a symbol to a replacement function. In such a case we can convert it to a klp rela for the original function.

Perhaps, some performance measurements could show if using the pointers to the patched functions directly is really worth it. It looks so in the code, esp. for the "old" kpatch which does a table lookup in the common Ftrace filter each time the function is called. Livepatch does things differently, IIRC.

Performance is important, but safety is more important. I would much rather be safe and take a small performance hit. And anyway, kpatch.ko does a hash lookup which should be pretty fast.

@euspectre
Copy link
Contributor Author

So, we can detect the assignment of callbacks by checking the relocation type and without decoding instructions and such, at least, on x86. This way we would not have to disable the pointer optimization everywhere but rather, for the callbacks only. Thanks for the explanation!

I'll experiment with that when I have time.

@euspectre
Copy link
Contributor Author

BTW, I have reproduced the issue using workqueue functions instead of RCU callbacks. I created a patch for create_user_ns() and free_user_ns() in a VZ7 kernel (based on version 3.10.0-693.1.1 from RHEL). free_user_ns() is a work function set by create_user_ns().

Loading/unloading the patch module in a loop in parallel with 'unshare --map-root-user --user sh -c whoami' triggered the crash in a couple of minutes.

So, yes, a more general solution rather than a fix for just RCU callbacks is needed.

I'll try to prepare a patch based on your suggestion.

@jpoimboe
Copy link
Member

Ok, thanks for looking at it! I haven't looked at what's needed, I'm guessing it's a change to kpatch_create_intermediate_sections(). Let me know if you need any help.

euspectre pushed a commit to euspectre/kpatch that referenced this pull request Nov 17, 2017
…ctly

A kernel crash happened in __do_softirq() in very rare cases when the
binary patch created from mainline commit be82485fbcbb
("netlink: fix an use-after-free issue for nlk groups") was unloaded.

Investigation has shown that the kernel tried to execute an RCU
callback, deferred_put_nlk_sk(), defined in the patch module after the
module had been unloaded.

The callback was set by the patched variant of netlink_release() and
the address of the patched deferred_put_nlk_sk() was used, rather than
the address of the original function.

Similar problems occur with workqueue functions as well.

As suggested in
dynup#755 (comment),
create-diff-object was modified so that the addresses of the original
functions were used in such situations, at least for x86 systems.

Signed-off-by: Evgenii Shatokhin <eshatokhin@virtuozzo.com>
@euspectre
Copy link
Contributor Author

Here it is, please take a look.

The patches created by kpatch-build with create-diff-object changed this way have passed my tests with RCU and workqueue so far.

@euspectre euspectre changed the title kmod/patch: make sure RCU callbacks finish before the patch module unloads kpatch-build, x86: do not use the patched functions as callbacks directly Nov 28, 2017
@euspectre
Copy link
Contributor Author

Comments?

@jpoimboe
Copy link
Member

jpoimboe commented Dec 2, 2017

Sorry, I've been swamped with other work and haven't had a chance to look at it. I'll try to review it soon...

@disaster123
Copy link

Just a note. This one gives me a segfault in create-diff-obj[2592]: segfault at 7f3fa3faea81 ip 00007f3fa364edcc sp 00007ffdf4c064a0 error 4 in libc-2.19.so[7f3fa3604000+1a1000]

@euspectre
Copy link
Contributor Author

@disaster123: Strange. Which source patch kpatch-build was processing when the segfault happened?

And - for which kernel?

@disaster123
Copy link

Patch is this one:
https://pastebin.com/zNUK5C2U

Kernel is an "older" OpenSuSE 42.3 Kernel - source / base:
https://github.com/openSUSE/kernel-source/tree/9c032968bf05aaff3eb55048440c07249e99db33

@euspectre
Copy link
Contributor Author

Thanks for the info. I'll try to reproduce the issue.

@disaster123
Copy link

Were you able to reproduce?

@euspectre
Copy link
Contributor Author

Unfortunately, I have my hands full right now. Perhaps, I'll find time for that later this week.

@euspectre
Copy link
Contributor Author

I have reproduced the problem, investigating...

@euspectre
Copy link
Contributor Author

euspectre commented Dec 10, 2017

Nested function cmp() in bch_cache_show(). No surprise. They already caused problems in the past.

The problem happens when create-diff-object processes drivers/md/bcache/sysfs.o.

(gdb) bt
<...>
#5  0x0000000000407aef in error (__format=0x410468 "ERROR: %s: %s: %d: lookup_local_symbol %s:%s needed for %s", __errnum=0, __status=1)
    at /usr/include/bits/error.h:42
#6  kpatch_create_intermediate_sections (kelf=0x623cc0, table=0x75f860, 
    hint=0x7ffff7fb4541 , objname=0x7fffffffe752 "bcache", 
    pmod_name=0x7fffffffe774 "test_patch") at create-diff-object.c:2426
#7  0x00000000004093f6 in main (argc=7, argv=0x7fffffffe318) at create-diff-object.c:2943

create-diff-object.c:

        if (rela->sym->bind == STB_LOCAL) {
	/* An unchanged local symbol */
	ret = lookup_local_symbol(table,
		rela->sym->name, &result);
	if (ret)
		ERROR("lookup_local_symbol %s:%s needed for %s",
		hint, rela->sym->name, sec->base->name); // <<< Here

create-diff-object was trying to process the reference to the function cmp.36204 in bch_cache_show(). From the source code of bch_cache_show() in drivers/md/bcache/sysfs.c:

        if (attr == &sysfs_priority_stats) {
		int cmp(const void *l, const void *r)
		{	return *((uint16_t *) r) - *((uint16_t *) l); }
	<...>
	sort(p, n, sizeof(uint16_t), cmp, NULL);

Technically, cmp() is a callback but I'd rather skip references to such nested functions in kpatch_create_intermediate_sections() somehow. I doubt they are used as asynchronous callbacks anywhere in the kernel. Therefore it should be OK to process the references to such callbacks the same way as before, without dynrelas.

The question is how to detect such functions in create-diff-object.

@euspectre
Copy link
Contributor Author

euspectre commented Dec 10, 2017

Assuming GCC creates the names for the nested functions using the same rules as for static locals (xxxx.nnnn), we could probably check if the name of the referred-to functions contains a dot.

EDIT: Might not work because optimized copies of the functions may also contain dots in the name. Need another way to detect nested functions, it seems.

@euspectre
Copy link
Contributor Author

By the way, the actual crash happens because the source file name ("sysfs.c", referred to by hint variable) has been freed by kpatch_elf_free(kelf_base) by that time.

This is a separate issue. The original issue with the nested function only helped reveal it.
Moving the call kpatch_elf_free(kelf_base) to the similar calls at the end of main() should fix it.

euspectre pushed a commit to euspectre/kpatch that referenced this pull request Dec 18, 2017
…ctly

A kernel crash happened in __do_softirq() in very rare cases when the
binary patch created from mainline commit be82485fbcbb
("netlink: fix an use-after-free issue for nlk groups") was unloaded.

Investigation has shown that the kernel tried to execute an RCU
callback, deferred_put_nlk_sk(), defined in the patch module after the
module had been unloaded.

The callback was set by the patched variant of netlink_release() and
the address of the patched deferred_put_nlk_sk() was used, rather than
the address of the original function.

Similar problems occur with workqueue functions as well.

As suggested in
dynup#755 (comment),
create-diff-object was modified so that the addresses of the original
functions were used in such situations, at least for x86 systems.

Changes in v2:

* Handle the nested functions the same way as before, because they are
unlikely to be used as asynchronous callbacks. Example: cmp() in
bch_cache_show() from drivers/md/bcache/sysfs.c in the kernel 4.4.

As the nested functions are local to the functions they are defined in,
the compiler names them in a similar way to static locals: <name>.<number>.
Currently, we filter out all functions with '.' in their names. If there
are any asynchronous callbacks in the kernel that have a dot in their
names too, they could be handled in the future patches. It is unclear
though, if the callbacks with such names can appear in the kernel.

Signed-off-by: Evgenii Shatokhin <eshatokhin@virtuozzo.com>
euspectre pushed a commit to euspectre/kpatch that referenced this pull request Dec 18, 2017
kpatch_create_patches_sections() and
kpatch_create_intermediate_sections() in create-diff-object.c expect
'hint' to be the name of the source file.

However, the string 'hint' refers to is owned by 'kelf_base' and is
freed before kpatch_create_*_sections() are called. As a result, if
these functions try to output errors and print 'hint',
create-diff-object will crash.

It seems, nothing else from 'kelf_base' is used at that stage, so let us
simply copy the file name and use that copy.

Found in the scope of dynup#755 but not
related to the main problem discussed there.

Signed-off-by: Evgenii Shatokhin <eshatokhin@virtuozzo.com>
@euspectre
Copy link
Contributor Author

I have updated the fixes, please review.

It seems, there is no way to recognize nested functions in the binary code except by their names. On the other hand, I checked a few kernels from different distros and all callbacks with '.' in their names were nested functions. So, my patch uses this to distinguish nested functions and the ordinary ones.

As for the crashes due to 'hint' pointing to nowhere, I think it is better to copy the relevant string rather than to defer freeing of kelf_base. This is what the second patch does.

hint = strdup(hint);
if (!hint)
ERROR("failed to copy file name.");

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really minor, but if we move these lines up into the search for STT_FILE, there would only be a single assignment to hint, like hint = strdup(sym->name);

euspectre pushed a commit to euspectre/kpatch that referenced this pull request Feb 15, 2018
…ctly

A kernel crash happened in __do_softirq() in very rare cases when the
binary patch created from mainline commit be82485fbcbb
("netlink: fix an use-after-free issue for nlk groups") was unloaded.

Investigation has shown that the kernel tried to execute an RCU
callback, deferred_put_nlk_sk(), defined in the patch module after the
module had been unloaded.

The callback was set by the patched variant of netlink_release() and
the address of the patched deferred_put_nlk_sk() was used, rather than
the address of the original function.

Similar problems occur with workqueue functions as well.

As suggested in
dynup#755 (comment),
create-diff-object was modified so that the addresses of the original
functions were used in such situations, at least for x86 systems. A
similar fix for PowerPC was added as well.

Changes in v4:

* '#ifdef __x86_64__' was removed. It is not actually needed right now
because the constants for relocation types are different on different
architectures.

Changes in v3:

* Minor refactoring and a comment explaining what this all is about.
Quite lengthy, but the dynrela-related code is really far from obvious.

Changes in v2:

* Handle the nested functions the same way as before, because they are
unlikely to be used as asynchronous callbacks. Example: cmp() in
bch_cache_show() from drivers/md/bcache/sysfs.c in the kernel 4.4.

As the nested functions are local to the functions they are defined in,
the compiler names them in a similar way to static locals: <name>.<number>.
Currently, we filter out all functions with '.' in their names. If there
are any asynchronous callbacks in the kernel that have a dot in their
names too, they could be handled in the future patches. It is unclear
though, if the callbacks with such names can appear in the kernel.

Signed-off-by: Evgenii Shatokhin <eshatokhin@virtuozzo.com>
euspectre pushed a commit to euspectre/kpatch that referenced this pull request Feb 15, 2018
Found in the scope of dynup#755 but not
related to the main problem discussed there.

kpatch_create_patches_sections() and kpatch_create_intermediate_sections()
used 'hint' in error messages.

However, the string 'hint' refers to is owned by 'kelf_base' and is
freed before kpatch_create_*_sections() are called. As a result, if
these functions try to output errors and print 'hint',
create-diff-object will crash.

As suggested in the mentioned PR, 'hint' is actually no longer needed at
that stage, so I have removed it from kpatch_create_*_sections().
@euspectre
Copy link
Contributor Author

euspectre commented Feb 15, 2018

However, we should still wait until they're both ready, and merge them at the same time.

@jpoimboe That is fine with me. We may otherwise forget the fix for PowerPC anyway. There are quite a few other bugs to fix and tasks to do, as usual. Hopefully, there will be no new meltdown/spectre-like bugs in the near future ;-)

I have updated the patch, so a fix for PowerPC could be done on top of it. I cannot be of much help there, unfortunately, becase I know only a little about that architecture. That is why it is better to do that in a separate patch/PR.

@kamalesh-babulal
Copy link
Contributor

kamalesh-babulal commented Feb 16, 2018

For powerpc -mcmodel=large, the formula to detect them should still be relatively simple:

Thanks @jpoimboe . If the PowerPC changes are be tracked in another PR. I can open one, once we have a patch. I will build upon this approach, I have an idea on how to implement other bits on PowerPC. Need to implement the idea and see what more is needed. Thank you @euspectre.

@jpoimboe
Copy link
Member

Sounds good @kamalesh-babulal , thanks!

@euspectre
Copy link
Contributor Author

@kamalesh-babulal

If the PowerPC changes are be tracked in another PR. I can open one, once we have a patch.

Thanks! Looking forward to it.

@joe-lawrence
Copy link
Contributor

If I read the latest commentary correctly, we're going to leave this PR open until all supported arches have equivalent behavior?

If so, can we spin "kpatch-build: 'hint' is not needed in kpatch_create_*_sections()" as an independent PR ... I just hit this bug doing while working on s390x, so it would be nice to get out of the way :)

euspectre pushed a commit to euspectre/kpatch that referenced this pull request Feb 22, 2018
Found in the scope of dynup#755 but not
related to the main problem discussed there.

kpatch_create_patches_sections() and kpatch_create_intermediate_sections()
used 'hint' in error messages.

However, the string 'hint' refers to is owned by 'kelf_base' and is
freed before kpatch_create_*_sections() are called. As a result, if
these functions try to output errors and print 'hint',
create-diff-object will crash.

As suggested in the mentioned PR, 'hint' is actually no longer needed at
that stage, so I have removed it from kpatch_create_*_sections().
…ctly

A kernel crash happened in __do_softirq() in very rare cases when the
binary patch created from mainline commit be82485fbcbb
("netlink: fix an use-after-free issue for nlk groups") was unloaded.

Investigation has shown that the kernel tried to execute an RCU
callback, deferred_put_nlk_sk(), defined in the patch module after the
module had been unloaded.

The callback was set by the patched variant of netlink_release() and
the address of the patched deferred_put_nlk_sk() was used, rather than
the address of the original function.

Similar problems occur with workqueue functions as well.

As suggested in
dynup#755 (comment),
create-diff-object was modified so that the addresses of the original
functions were used in such situations, at least for x86 systems. A
similar fix for PowerPC was added as well.

Changes in v4:

* '#ifdef __x86_64__' was removed. It is not actually needed right now
because the constants for relocation types are different on different
architectures.

Changes in v3:

* Minor refactoring and a comment explaining what this all is about.
Quite lengthy, but the dynrela-related code is really far from obvious.

Changes in v2:

* Handle the nested functions the same way as before, because they are
unlikely to be used as asynchronous callbacks. Example: cmp() in
bch_cache_show() from drivers/md/bcache/sysfs.c in the kernel 4.4.

As the nested functions are local to the functions they are defined in,
the compiler names them in a similar way to static locals: <name>.<number>.
Currently, we filter out all functions with '.' in their names. If there
are any asynchronous callbacks in the kernel that have a dot in their
names too, they could be handled in the future patches. It is unclear
though, if the callbacks with such names can appear in the kernel.

Signed-off-by: Evgenii Shatokhin <eshatokhin@virtuozzo.com>
@euspectre
Copy link
Contributor Author

If so, can we spin "kpatch-build: 'hint' is not needed in kpatch_create_*_sections()" as an independent > PR ... I just hit this bug doing while working on s390x, so it would be nice to get out of the way :)

@joe-lawrence Makes sense. I have moved that hint-related fix into #788 and updated this PR.

kamalesh-babulal pushed a commit to kamalesh-babulal/kpatch that referenced this pull request Mar 19, 2018
…ctly

A kernel crash happened in __do_softirq() in very rare cases when the
binary patch created from mainline commit be82485fbcbb
("netlink: fix an use-after-free issue for nlk groups") was unloaded.

Investigation has shown that the kernel tried to execute an RCU
callback, deferred_put_nlk_sk(), defined in the patch module after the
module had been unloaded.

The callback was set by the patched variant of netlink_release() and
the address of the patched deferred_put_nlk_sk() was used, rather than
the address of the original function.

Similar problems occur with workqueue functions as well.

As suggested in
dynup#755 (comment),
create-diff-object was modified so that the addresses of the original
functions were used in such situations, at least for x86 systems. A
similar fix for PowerPC was added as well.

Changes in v4:

* '#ifdef __x86_64__' was removed. It is not actually needed right now
because the constants for relocation types are different on different
architectures.

Changes in v3:

* Minor refactoring and a comment explaining what this all is about.
Quite lengthy, but the dynrela-related code is really far from obvious.

Changes in v2:

* Handle the nested functions the same way as before, because they are
unlikely to be used as asynchronous callbacks. Example: cmp() in
bch_cache_show() from drivers/md/bcache/sysfs.c in the kernel 4.4.

As the nested functions are local to the functions they are defined in,
the compiler names them in a similar way to static locals: <name>.<number>.
Currently, we filter out all functions with '.' in their names. If there
are any asynchronous callbacks in the kernel that have a dot in their
names too, they could be handled in the future patches. It is unclear
though, if the callbacks with such names can appear in the kernel.

Signed-off-by: Evgenii Shatokhin <eshatokhin@virtuozzo.com>
@jpoimboe
Copy link
Member

This is superseded by #789 which has both x86 and ppc versions of the change.

@jpoimboe jpoimboe closed this Mar 20, 2018
kamalesh-babulal pushed a commit to kamalesh-babulal/kpatch that referenced this pull request Mar 21, 2018
…ctly

A kernel crash happened in __do_softirq() in very rare cases when the
binary patch created from mainline commit be82485fbcbb
("netlink: fix an use-after-free issue for nlk groups") was unloaded.

Investigation has shown that the kernel tried to execute an RCU
callback, deferred_put_nlk_sk(), defined in the patch module after the
module had been unloaded.

The callback was set by the patched variant of netlink_release() and
the address of the patched deferred_put_nlk_sk() was used, rather than
the address of the original function.

Similar problems occur with workqueue functions as well.

As suggested in
dynup#755 (comment),
create-diff-object was modified so that the addresses of the original
functions were used in such situations, at least for x86 systems. A
similar fix for PowerPC was added as well.

Changes in v4:

* '#ifdef __x86_64__' was removed. It is not actually needed right now
because the constants for relocation types are different on different
architectures.

Changes in v3:

* Minor refactoring and a comment explaining what this all is about.
Quite lengthy, but the dynrela-related code is really far from obvious.

Changes in v2:

* Handle the nested functions the same way as before, because they are
unlikely to be used as asynchronous callbacks. Example: cmp() in
bch_cache_show() from drivers/md/bcache/sysfs.c in the kernel 4.4.

As the nested functions are local to the functions they are defined in,
the compiler names them in a similar way to static locals: <name>.<number>.
Currently, we filter out all functions with '.' in their names. If there
are any asynchronous callbacks in the kernel that have a dot in their
names too, they could be handled in the future patches. It is unclear
though, if the callbacks with such names can appear in the kernel.

Signed-off-by: Evgenii Shatokhin <eshatokhin@virtuozzo.com>
jpoimboe added a commit to jpoimboe/kpatch that referenced this pull request Apr 19, 2018
With dynup#755, we started using dynrelas for function pointers.  However,
this behavior only makes sense for function pointers to existing
functions.  For function pointers to *new* functions, just use a normal
rela.

The 'function-ptr-new' unit test is from the following patch:

  https://github.com/dynup/kpatch/files/1927198/new-static-callback.patch.txt

Fixes dynup#834.

Fixes: 495e619 ("kpatch-build, x86: do not use the patched functions as callbacks directly")
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
jpoimboe added a commit to jpoimboe/kpatch that referenced this pull request Apr 19, 2018
With dynup#755, we started using dynrelas for function pointers.  However,
this behavior only makes sense for function pointers to existing
functions.  For function pointers to *new* functions, just use a normal
rela.

The 'function-ptr-new' unit test is from the following patch:

  https://github.com/dynup/kpatch/files/1927198/new-static-callback.patch.txt

Fixes dynup#834.
Fixes: 495e619 ("kpatch-build, x86: do not use the patched functions as callbacks directly")
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
euspectre pushed a commit to euspectre/kpatch that referenced this pull request Apr 29, 2019
…ctly

A kernel crash happened in __do_softirq() in very rare cases when the
binary patch created from mainline commit be82485fbcbb
("netlink: fix an use-after-free issue for nlk groups") was unloaded.

Investigation has shown that the kernel tried to execute an RCU
callback, deferred_put_nlk_sk(), defined in the patch module after the
module had been unloaded.

The callback was set by the patched variant of netlink_release() and
the address of the patched deferred_put_nlk_sk() was used, rather than
the address of the original function.

Similar problems occur with workqueue functions as well.

As suggested in
dynup#755 (comment),
create-diff-object was modified so that the addresses of the original
functions were used in such situations, at least for x86 systems. A
similar fix for PowerPC was added as well.

Changes in v4:

* '#ifdef __x86_64__' was removed. It is not actually needed right now
because the constants for relocation types are different on different
architectures.

Changes in v3:

* Minor refactoring and a comment explaining what this all is about.
Quite lengthy, but the dynrela-related code is really far from obvious.

Changes in v2:

* Handle the nested functions the same way as before, because they are
unlikely to be used as asynchronous callbacks. Example: cmp() in
bch_cache_show() from drivers/md/bcache/sysfs.c in the kernel 4.4.

As the nested functions are local to the functions they are defined in,
the compiler names them in a similar way to static locals: <name>.<number>.
Currently, we filter out all functions with '.' in their names. If there
are any asynchronous callbacks in the kernel that have a dot in their
names too, they could be handled in the future patches. It is unclear
though, if the callbacks with such names can appear in the kernel.

Signed-off-by: Evgenii Shatokhin <eshatokhin@virtuozzo.com>
euspectre pushed a commit to euspectre/kpatch that referenced this pull request Apr 29, 2019
With dynup#755, we started using dynrelas for function pointers.  However,
this behavior only makes sense for function pointers to existing
functions.  For function pointers to *new* functions, just use a normal
rela.

The 'function-ptr-new' unit test is from the following patch:

  https://github.com/dynup/kpatch/files/1927198/new-static-callback.patch.txt

Fixes dynup#834.

Fixes: 495e619 ("kpatch-build, x86: do not use the patched functions as callbacks directly")
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>

Added in the scope of https://jira.sw.ru/browse/PSBM-87678 but it fixes
a different issue in the same code of create-diff-object.

dynup#834

From the description of the issue:
--------------
If a source patch adds a global function foo() that takes a callback as
an argument, and a static function bar() that is used as such callback,
create-diff-object may fail to find bar() in some cases. Namely, - if foo()
is defined in one source file, but called in another source file where bar()
is defined.

create-diff-object complains as follows:

    base.o: new function: foo
    meminfo.o: changed function: meminfo_proc_open
    meminfo.o: new function: bar
    /usr/libexec/kpatch/create-diff-object: ERROR: meminfo.o
      kpatch_create_intermediate_sections: 2485: lookup_local_symbol bar needed
      for .text.meminfo_proc_open
    ERROR: 1 error(s) encountered.
--------------

Signed-off-by: Evgenii Shatokhin <eshatokhin@virtuozzo.com>
@euspectre euspectre deleted the rcu_barrier_on_exit branch July 21, 2020 07:10
sumanthkorikkar added a commit to sumanthkorikkar/kpatch that referenced this pull request Mar 7, 2025
Typically, symbols within the same object file can be referenced
directly for better performance. However, when the patched code stores a
function pointer (R_390_GOTENT and rela symbol type is STT_FUNC), using
a dynrela is a safer approach. This ensures that if the function is used
as a asynchronous callback, the kernel does not attempt to execute an
invalid pointer after the module is unloaded. Instead, with a dynrela,
the kernel will invoke the original function, preventing potential
crashes.

Test program: Test if the original function ptr address is printed
during patch load / unload.
iii-i/linux@0f400db

Patched code (function ptr rela):
void *patchme(void) { printk(KERN_NOTICE "patched\n"); return patchme; }
   0:   c0 04 00 00 00 00       jgnop   0 <patchme>
   6:   eb ef f0 88 00 24       stmg    %r14,%r15,136(%r15)
   c:   c4 28 00 00 00 00       lgrl    %r2,c <patchme+0xc>
                        e: R_390_GOTENT .LC0+0x2
  12:   b9 04 00 ef             lgr     %r14,%r15
  16:   e3 f0 ff e8 ff 71       lay     %r15,-24(%r15)
  1c:   e3 e0 f0 98 00 24       stg     %r14,152(%r15)
  22:   c0 e5 00 00 00 00       brasl   %r14,22 <patchme+0x22>
                        24: R_390_PLT32DBL      _printk+0x2
  28:   c4 28 00 00 00 00       lgrl    %r2,28 <patchme+0x28>
                        2a: R_390_GOTENT        patchme+0x2  <- func ptr
  2e:   eb ef f0 a0 00 04       lmg     %r14,%r15,160(%r15)
  34:   c0 f4 00 00 00 00       jg      34 <patchme+0x34>
                        36: R_390_PC32DBL       __s390_indirect_jump_r14+0x2
  3a:   07 07                   nopr    %r7
  3c:   07 07                   nopr    %r7
  3e:   07 07                   nopr    %r7

Reference:
1. dynup#1437
2. dynup#755

Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
sumanthkorikkar added a commit to sumanthkorikkar/kpatch that referenced this pull request Mar 7, 2025
Typically, symbols within the same object file can be referenced
directly for better performance. However, when the patched code stores a
function pointer (R_390_GOTENT and rela symbol type is STT_FUNC), using
a dynrela is a safer approach. This ensures that if the function is used
as a asynchronous callback, the kernel does not attempt to execute an
invalid pointer after the module is unloaded. Instead, with a dynrela,
the kernel will invoke the original function, preventing potential
crashes.

Test program: Test if the original function ptr address is printed
during patch load / unload.
iii-i/linux@0f400db

Patched code (function ptr rela):
void *patchme(void) { printk(KERN_NOTICE "patched\n"); return patchme; }
   0:   c0 04 00 00 00 00       jgnop   0 <patchme>
   6:   eb ef f0 88 00 24       stmg    %r14,%r15,136(%r15)
   c:   c4 28 00 00 00 00       lgrl    %r2,c <patchme+0xc>
                        e: R_390_GOTENT .LC0+0x2
  12:   b9 04 00 ef             lgr     %r14,%r15
  16:   e3 f0 ff e8 ff 71       lay     %r15,-24(%r15)
  1c:   e3 e0 f0 98 00 24       stg     %r14,152(%r15)
  22:   c0 e5 00 00 00 00       brasl   %r14,22 <patchme+0x22>
                        24: R_390_PLT32DBL      _printk+0x2
  28:   c4 28 00 00 00 00       lgrl    %r2,28 <patchme+0x28>
                        2a: R_390_GOTENT        patchme+0x2  <- func ptr
  2e:   eb ef f0 a0 00 04       lmg     %r14,%r15,160(%r15)
  34:   c0 f4 00 00 00 00       jg      34 <patchme+0x34>
                        36: R_390_PC32DBL       __s390_indirect_jump_r14+0x2
  3a:   07 07                   nopr    %r7
  3c:   07 07                   nopr    %r7
  3e:   07 07                   nopr    %r7

Reference:
1. dynup#1437
2. dynup#755

Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants