FSM Client and Server #331
Closed
rogerq
pushed a commit
to rogerq/linux
that referenced
this pull request
Jan 9, 2017
… recovery
The keystone remoteproc driver performs an error recovery by scheduling
a workqueue from the keystone_rproc_exception_interrupt() handler
when using the in-kernel remoteproc core loader/boot mechanism. This interrupt
is registered with IRQF_ONESHOT at the moment, and it results in a
"scheduling while atomic" BUG when running on RT-Linux. Oneshot interrupts
keep the irq line masked until the threaded handler has finished, and
the workqueue scheduling uses spinlocks for synchronization which get
transformed to rt_mutexes on RT. So, fix this by not using IRQF_ONESHOT
while requesting the interrupt. This interrupt is processed by the UIO
framework when using the userspace-based load/boot mechanism, and
doesn't need any changes in that path.
remoteproc0: crash detected in 10800000.dsp0: type device exception
BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:931
in_atomic(): 1, irqs_disabled(): 128, pid: 53, name: irq/66-soc:keys
1 lock held by irq/66-soc:keys/53:
#0: (&kirq->wa_lock){......}, at: [<c031773c>] keystone_irq_handler+0xec/0x240
irq event stamp: 170018
hardirqs last enabled at (170017): [<c06b2920>] _raw_spin_unlock_irqrestore+0x78/0x80
hardirqs last disabled at (170018): [<c06b2734>] _raw_spin_lock_irqsave+0x1c/0x58
softirqs last enabled at (0): [<c0023664>] copy_process+0x2bc/0x1678
softirqs last disabled at (0): [< (null)>] (null)
Preemption disabled at:[< (null)>] (null)
CPU: 0 PID: 53 Comm: irq/66-soc:keys Tainted: G W 4.4.36-rt43-03400-gba94d7c1a7fa torvalds#331
Hardware name: Keystone
[<c0017568>] (unwind_backtrace) from [<c00139e0>] (show_stack+0x10/0x14)
[<c00139e0>] (show_stack) from [<c02e4c00>] (dump_stack+0x98/0xc4)
[<c02e4c00>] (dump_stack) from [<c06b2ca0>] (rt_spin_lock+0x24/0x5c)
[<c06b2ca0>] (rt_spin_lock) from [<c003b254>] (queue_work_on+0x60/0x194)
[<c003b254>] (queue_work_on) from [<bf04e488>] (keystone_rproc_exception_interrupt+0x10/0x18 [keystone_remoteproc])
[<bf04e488>] (keystone_rproc_exception_interrupt [keystone_remoteproc]) from [<c0081ff8>] (handle_irq_event_percpu+0x8c/0x178)
[<c0081ff8>] (handle_irq_event_percpu) from [<c008211c>] (handle_irq_event+0x38/0x5c)
[<c008211c>] (handle_irq_event) from [<c0085384>] (handle_level_irq+0xc4/0x168)
[<c0085384>] (handle_level_irq) from [<c0081654>] (generic_handle_irq+0x24/0x34)
[<c0081654>] (generic_handle_irq) from [<c0317748>] (keystone_irq_handler+0xf8/0x240)
[<c0317748>] (keystone_irq_handler) from [<c00830f8>] (irq_forced_thread_fn+0x20/0x74)
[<c00830f8>] (irq_forced_thread_fn) from [<c0083470>] (irq_thread+0x15c/0x230)
[<c0083470>] (irq_thread) from [<c0044898>] (kthread+0xf0/0x108)
[<c0044898>] (kthread) from [<c00102d0>] (ret_from_fork+0x14/0x24)
remoteproc0: handling crash #1 in 10800000.dsp0!!
remoteproc0: recovering 10800000.dsp0
remoteproc0: stopped remote processor 10800000.dsp0
remoteproc0: powering up 10800000.dsp0
remoteproc0: Booting fw image keystone-dsp0-fw, size 3704928
remoteproc0: remote processor 10800000.dsp0 is now up
virtio_rpmsg_bus virtio0: rpmsg host is online
virtio_rpmsg_bus virtio0: creating channel rpmsg-proto addr 0x3d
remoteproc0: registered virtio0 (type 7)
Signed-off-by: Suman Anna <s-anna@ti.com>
laijs
added a commit
to laijs/linux
that referenced
this pull request
Feb 16, 2017
Fix torvalds#331
test_getdents64() doesn't test the return value of snprintf(), so it ends up with a stack overflow when it continues to call snprintf().
Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
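The commit itself is not shown on this page, so the following is only a minimal userspace sketch of the kind of check the message describes. It assumes the goal is to stop as soon as snprintf() reports an error or truncation. The helper name `build_path` and the "dir/name" format are hypothetical and are not taken from the referenced test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the referenced commit): format "dir/name"
 * into buf and report failure instead of letting the caller carry on with
 * a truncated path.
 *
 * Returns 0 on success, -1 on an output error or truncation.
 */
int build_path(char *buf, size_t size, const char *dir, const char *name)
{
	int n;

	assert(buf != NULL && size > 0);

	n = snprintf(buf, size, "%s/%s", dir, name);
	if (n < 0)
		return -1;	/* snprintf() reported an output error */
	if ((size_t)n >= size)
		return -1;	/* result did not fit; buf holds a truncated string */
	return 0;
}
```

snprintf() returns the length the full string would have had, so a return value greater than or equal to the buffer size means the output was truncated. A caller that loops over directory entries can break out on the first -1 instead of continuing to format into the same buffer.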
tobetter
pushed a commit
to tobetter/linux
that referenced
this pull request
Dec 23, 2017
Netconsole support for XU4's network card
fengguang
pushed a commit
to 0day-ci/linux
that referenced
this pull request
Nov 10, 2019
Inside print_request(), we query the context/timeline name. Nothing immediately protects the context from being freed if the request is complete -- we rely on serialisation by the caller to keep the name valid until they finish using it.
Inside intel_engine_dump(), we generally only print the requests in the execution queue protected by the engine->active.lock, but we also show the pending execlists ports which are not protected and so require an rcu_read_lock to keep the pointer valid.
[ 1695.700883] BUG: KASAN: use-after-free in i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.700981] Read of size 8 at addr ffff8887344f4d50 by task gem_ctx_persist/2968
[ 1695.701068]
[ 1695.701156] CPU: 1 PID: 2968 Comm: gem_ctx_persist Tainted: G U 5.4.0-rc6+ torvalds#331
[ 1695.701246] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
[ 1695.701334] Call Trace:
[ 1695.701424] dump_stack+0x5b/0x90
[ 1695.701870] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.701964] print_address_description.constprop.7+0x36/0x50
[ 1695.702408] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.702856] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.702947] __kasan_report.cold.10+0x1a/0x3a
[ 1695.703390] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.703836] i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.704241] print_request+0x82/0x2e0 [i915]
[ 1695.704638] ? fwtable_read32+0x133/0x360 [i915]
[ 1695.705042] ? write_timestamp+0x110/0x110 [i915]
[ 1695.705133] ? _raw_spin_lock_irqsave+0x79/0xc0
[ 1695.705221] ? refcount_inc_not_zero_checked+0x91/0x110
[ 1695.705306] ? refcount_dec_and_mutex_lock+0x50/0x50
[ 1695.705709] ? intel_engine_find_active_request+0x202/0x230 [i915]
[ 1695.706115] intel_engine_dump+0x2c9/0x900 [i915]
Fixes: c36eebd ("drm/i915/gt: execlists->active is serialised by the tasklet")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
fengguang
pushed a commit
to 0day-ci/linux
that referenced
this pull request
Nov 12, 2019
Inside print_request(), we query the context/timeline name. Nothing immediately protects the context from being freed if the request is complete -- we rely on serialisation by the caller to keep the name valid until they finish using it.
Inside intel_engine_dump(), we generally only print the requests in the execution queue protected by the engine->active.lock, but we also show the pending execlists ports which are not protected and so require an rcu_read_lock to keep the pointer valid.
[ 1695.700883] BUG: KASAN: use-after-free in i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.700981] Read of size 8 at addr ffff8887344f4d50 by task gem_ctx_persist/2968
[ 1695.701068]
[ 1695.701156] CPU: 1 PID: 2968 Comm: gem_ctx_persist Tainted: G U 5.4.0-rc6+ torvalds#331
[ 1695.701246] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
[ 1695.701334] Call Trace:
[ 1695.701424] dump_stack+0x5b/0x90
[ 1695.701870] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.701964] print_address_description.constprop.7+0x36/0x50
[ 1695.702408] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.702856] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.702947] __kasan_report.cold.10+0x1a/0x3a
[ 1695.703390] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.703836] i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.704241] print_request+0x82/0x2e0 [i915]
[ 1695.704638] ? fwtable_read32+0x133/0x360 [i915]
[ 1695.705042] ? write_timestamp+0x110/0x110 [i915]
[ 1695.705133] ? _raw_spin_lock_irqsave+0x79/0xc0
[ 1695.705221] ? refcount_inc_not_zero_checked+0x91/0x110
[ 1695.705306] ? refcount_dec_and_mutex_lock+0x50/0x50
[ 1695.705709] ? intel_engine_find_active_request+0x202/0x230 [i915]
[ 1695.706115] intel_engine_dump+0x2c9/0x900 [i915]
Fixes: c36eebd ("drm/i915/gt: execlists->active is serialised by the tasklet")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111114323.5833-1-chris@chris-wilson.co.uk
(cherry picked from commit fecffa4)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
intel-lab-lkp
pushed a commit
to intel-lab-lkp/linux
that referenced
this pull request
Dec 24, 2025
I've been chasing down the following flaky splat, introduced by recent changes in BTF generation [1]:
------------[ cut here ]------------
BUG: unable to handle page fault for address: ffa000000233d828
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 100000067 P4D 100253067 PUD 100258067 PMD 0
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 1 UID: 0 PID: 390 Comm: test_progs Tainted: G W OE 6.19.0-rc1-gf785a31395d9 torvalds#331 PREEMPT(full)
Tainted: [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-4.el9 04/01/2014
RIP: 0010:simplify_symbols+0x2b2/0x480
[ 9.737179] Code: 85 f6 4d 89 f7 b8 01 00 00 00 4c 0f 44 f8 49 83 fd f0 4d 0f 44 fe 75 5b 4d 85 ff 0f 85 76 ff ff ff eb 50 49 8b 4e 20 c1 e0 06 <48> 8b 44 01 10 9 cf fd ff ff 49 89 c5 eb 36 49 c7 c5
RSP: 0018:ffa00000017afc40 EFLAGS: 00010216
RAX: 00000000003fffc0 RBX: 0000000000000002 RCX: ffa0000001f3d858
RDX: ffffffffc0218038 RSI: ffffffffc0218008 RDI: aaaaaaaaaaaaaaab
RBP: ffa00000017afd18 R08: 0000000000000072 R09: 0000000000000069
R10: ffffffff8160d6ca R11: 0000000000000000 R12: ffa0000001f3d577
R13: ffffffffc0214058 R14: ffa00000017afdc0 R15: ffa0000001f3e518
FS: 00007f1c638654c0(0000) GS:ff1100089b7bc000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffa000000233d828 CR3: 000000010ba1f001 CR4: 0000000000771ef0
PKRU: 55555554
Call Trace:
<TASK>
? __kmalloc_node_track_caller_noprof+0x37f/0x740
? __pfx_setup_modinfo_srcversion+0x10/0x10
? srso_alias_return_thunk+0x5/0xfbef5
? kstrdup+0x4a/0x70
? srso_alias_return_thunk+0x5/0xfbef5
? setup_modinfo_srcversion+0x1a/0x30
? srso_alias_return_thunk+0x5/0xfbef5
? setup_modinfo+0x12b/0x1e0
load_module+0x133a/0x1610
__x64_sys_finit_module+0x31b/0x450
? entry_SYSCALL_64_after_hwframe+0x76/0x7e
do_syscall_64+0x80/0x2d0
? srso_alias_return_thunk+0x5/0xfbef5
? exc_page_fault+0x95/0xc0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f1c63a2582d
[ 9.794028] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff 8 8b 0d bb 15 0f 00 f7 d8 64 89 01 48
RSP: 002b:00007ffe513df128 EFLAGS: 00000206 ORIG_RAX: 0000000000000139
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1c63a2582d
RDX: 0000000000000000 RSI: 0000000000ee83f9 RDI: 0000000000000016
RBP: 00007ffe513df150 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000206 R12: 00007ffe513e3588
R13: 000000000088fad0 R14: 00000000014bddb0 R15: 00007f1c63ba7000
</TASK>
Modules linked in: bpf_testmod(OE)
CR2: ffa000000233d828
---[ end trace 0000000000000000 ]---
RIP: 0010:simplify_symbols+0x2b2/0x480
[ 9.821595] Code: 85 f6 4d 89 f7 b8 01 00 00 00 4c 0f 44 f8 49 83 fd f0 4d 0f 44 fe 75 5b 4d 85 ff 0f 85 76 ff ff ff eb 50 49 8b 4e 20 c1 e0 06 <48> 8b 44 01 10 9 cf fd ff ff 49 89 c5 eb 36 49 c7 c5
RSP: 0018:ffa00000017afc40 EFLAGS: 00010216
RAX: 00000000003fffc0 RBX: 0000000000000002 RCX: ffa0000001f3d858
RDX: ffffffffc0218038 RSI: ffffffffc0218008 RDI: aaaaaaaaaaaaaaab
RBP: ffa00000017afd18 R08: 0000000000000072 R09: 0000000000000069
R10: ffffffff8160d6ca R11: 0000000000000000 R12: ffa0000001f3d577
R13: ffffffffc0214058 R14: ffa00000017afdc0 R15: ffa0000001f3e518
FS: 00007f1c638654c0(0000) GS:ff1100089b7bc000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffa000000233d828 CR3: 000000010ba1f001 CR4: 0000000000771ef0
PKRU: 55555554
Kernel panic - not syncing: Fatal exception
Kernel Offset: disabled
This hasn't happened on BPF CI so far, for example; however, I was able to reproduce it on a particular x64 machine using a kernel built with LLVM 20.
The crash happens on an attempt to load one of the BPF selftest modules (tools/testing/selftests/bpf/test_kmods/bpf_test_modorder_x.ko), which is used by the kfunc_module_order test.
The reason for the crash is that simplify_symbols() doesn't check for bounds of the ELF section index:
    for (i = 1; i < symsec->sh_size / sizeof(Elf_Sym); i++) {
        const char *name = info->strtab + sym[i].st_name;
        switch (sym[i].st_shndx) {
        case SHN_COMMON:
        [...]
        default:
            /* Divert to percpu allocation if a percpu var. */
            if (sym[i].st_shndx == info->index.pcpu)
                secbase = (unsigned long)mod_percpu(mod);
            else
                /** HERE --> **/ secbase = info->sechdrs[sym[i].st_shndx].sh_addr;
            sym[i].st_value += secbase;
            break;
        }
    }
And in the case I was able to reproduce, the value 0xffff (SHN_HIRESERVE aka SHN_XINDEX [2]) fell through here.
Now this code fragment is between 15 and 20 years old, so obviously it's not expected for a kmodule symbol to have such an st_shndx value. Even so, the kernel probably should fail loading the module instead of crashing, which is what this patch attempts to fix.
Investigating further, I discovered that the module binary became corrupted by the `${OBJCOPY} --update-section` operation updating .BTF_ids section data in scripts/gen-btf.sh. This explains how the bug has surfaced after gen-btf.sh was introduced:
    $ llvm-readelf -s --wide bpf_test_modorder_x.ko | grep 'BTF_ID'
    llvm-readelf: warning: 'bpf_test_modorder_x.ko': found an extended symbol index (2), but unable to locate the extended symbol index table
    llvm-readelf: warning: 'bpf_test_modorder_x.ko': found an extended symbol index (3), but unable to locate the extended symbol index table
    llvm-readelf: warning: 'bpf_test_modorder_x.ko': found an extended symbol index (4), but unable to locate the extended symbol index table
    3: 0000000000000000 16 NOTYPE LOCAL DEFAULT RSV[0xffff] __BTF_ID__set8__bpf_test_modorder_kfunc_x_ids
    llvm-readelf: warning: 'bpf_test_modorder_x.ko': found an extended symbol index (16), but unable to locate the extended symbol index table
    4: 0000000000000008 4 OBJECT LOCAL DEFAULT RSV[0xffff] __BTF_ID__func__bpf_test_modorder_retx__44417
vs expected
    $ llvm-readelf -s --wide bpf_test_modorder_x.ko | grep 'BTF_ID'
    3: 0000000000000000 16 NOTYPE LOCAL DEFAULT 6 __BTF_ID__set8__bpf_test_modorder_kfunc_x_ids
    4: 0000000000000008 4 OBJECT LOCAL DEFAULT 6 __BTF_ID__func__bpf_test_modorder_retx__44417
But why? Updating section data without changing its size is not supposed to affect section indices, right?
With a bit more testing I confirmed that this is an LLVM-specific issue (doesn't reproduce with GCC kbuild), and it's not stable, because in link-vmlinux.sh we also do:
    ${OBJCOPY} --update-section .BTF_ids=${btfids_vmlinux} ${VMLINUX}
However:
    $ llvm-readelf -s --wide ~/workspace/prog-aux/linux/vmlinux | grep 0xffff
    # no output, which is good
So the suspect is the implementation of llvm-objcopy. As it turns out, there is a relevant known bug that explains the flakiness and isn't fixed yet [3].
[1] https://lore.kernel.org/bpf/20251219181825.1289460-3-ihor.solodrai@linux.dev/
[2] https://man7.org/linux/man-pages/man5/elf.5.html
[3] llvm/llvm-project#168060 (comment)
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
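The patch body is not shown on this page, so the following is only a userspace sketch of the kind of bounds check the message argues for in the `default:` branch. The names `struct sec_hdr`, `SKETCH_SHN_LORESERVE`, and `lookup_secbase` are hypothetical stand-ins, and returning -ENOEXEC is an assumption about how a loader might reject such a module.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-in for an ELF section header; only the field used here. */
struct sec_hdr {
	unsigned long sh_addr;
};

/* Start of the reserved section-index range (SHN_LORESERVE in the ELF spec). */
#define SKETCH_SHN_LORESERVE 0xff00u

/*
 * Hypothetical helper: look up the base address for a symbol's section
 * index, rejecting indices that are reserved or past the end of the table.
 *
 * Returns 0 and stores the base in *secbase, or -ENOEXEC on a bad index,
 * in which case *secbase is left untouched.
 */
int lookup_secbase(const struct sec_hdr *sechdrs, size_t num_sections,
		   uint16_t shndx, unsigned long *secbase)
{
	assert(sechdrs != NULL && secbase != NULL);

	if (shndx >= SKETCH_SHN_LORESERVE || shndx >= num_sections)
		return -ENOEXEC;

	*secbase = sechdrs[shndx].sh_addr;
	return 0;
}
```

With a check like this, the 0xffff index from the corrupted module would produce an error that the caller can propagate, rather than an out-of-bounds read of the section header array.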
FSM subsystem for the Linux kernel.
http://fsmos.ru/en