-
Notifications
You must be signed in to change notification settings - Fork 34
Test : Update README #10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
sgaud-quic
wants to merge
0
commits into
qualcomm-linux:qcom-next-staging
from
sgaud-quic:qcom-next-staging
Closed
Test : Update README #10
sgaud-quic
wants to merge
0
commits into
qualcomm-linux:qcom-next-staging
from
sgaud-quic:qcom-next-staging
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
18d8cf2 to
e615cd2
Compare
e615cd2 to
8e30800
Compare
shashim-quic
pushed a commit
that referenced
this pull request
Aug 25, 2025
Both jbd2_log_do_checkpoint() and jbd2_journal_shrink_checkpoint_list() periodically release j_list_lock after processing a batch of buffers to avoid long hold times on the j_list_lock. However, since both functions contend for j_list_lock, the combined time spent waiting and processing can be significant. jbd2_journal_shrink_checkpoint_list() explicitly calls cond_resched() when need_resched() is true to avoid softlockups during prolonged operations. But jbd2_log_do_checkpoint() only exits its loop when need_resched() is true, relying on potentially sleeping functions like __flush_batch() or wait_on_buffer() to trigger rescheduling. If those functions do not sleep, the kernel may hit a softlockup. watchdog: BUG: soft lockup - CPU#3 stuck for 156s! [kworker/u129:2:373] CPU: 3 PID: 373 Comm: kworker/u129:2 Kdump: loaded Not tainted 6.6.0+ #10 Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.27 06/13/2017 Workqueue: writeback wb_workfn (flush-7:2) pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : native_queued_spin_lock_slowpath+0x358/0x418 lr : jbd2_log_do_checkpoint+0x31c/0x438 [jbd2] Call trace: native_queued_spin_lock_slowpath+0x358/0x418 jbd2_log_do_checkpoint+0x31c/0x438 [jbd2] __jbd2_log_wait_for_space+0xfc/0x2f8 [jbd2] add_transaction_credits+0x3bc/0x418 [jbd2] start_this_handle+0xf8/0x560 [jbd2] jbd2__journal_start+0x118/0x228 [jbd2] __ext4_journal_start_sb+0x110/0x188 [ext4] ext4_do_writepages+0x3dc/0x740 [ext4] ext4_writepages+0xa4/0x190 [ext4] do_writepages+0x94/0x228 __writeback_single_inode+0x48/0x318 writeback_sb_inodes+0x204/0x590 __writeback_inodes_wb+0x54/0xf8 wb_writeback+0x2cc/0x3d8 wb_do_writeback+0x2e0/0x2f8 wb_workfn+0x80/0x2a8 process_one_work+0x178/0x3e8 worker_thread+0x234/0x3b8 kthread+0xf0/0x108 ret_from_fork+0x10/0x20 So explicitly call cond_resched() in jbd2_log_do_checkpoint() to avoid softlockup. Cc: stable@kernel.org Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://patch.msgid.link/20250812063752.912130-1-libaokun@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
shashim-quic
pushed a commit
that referenced
this pull request
Aug 25, 2025
…dlock When a user creates a dualpi2 qdisc it automatically sets a timer. This timer will run constantly and update the qdisc's probability field. The issue is that the timer acquires the qdisc root lock and runs in hardirq. The qdisc root lock is also acquired in dev.c whenever a packet arrives for this qdisc. Since the dualpi2 timer callback runs in hardirq, it may interrupt the packet processing running in softirq. If that happens and it runs on the same CPU, it will acquire the same lock and cause a deadlock. The following splat shows up when running a kernel compiled with lock debugging: [ +0.000224] WARNING: inconsistent lock state [ +0.000224] 6.16.0+ #10 Not tainted [ +0.000169] -------------------------------- [ +0.000029] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. [ +0.000000] ping/156 [HC0[0]:SC0[2]:HE1:SE0] takes: [ +0.000000] ffff897841242110 (&sch->root_lock_key){?.-.}-{3:3}, at: __dev_queue_xmit+0x86d/0x1140 [ +0.000000] {IN-HARDIRQ-W} state was registered at: [ +0.000000] lock_acquire.part.0+0xb6/0x220 [ +0.000000] _raw_spin_lock+0x31/0x80 [ +0.000000] dualpi2_timer+0x6f/0x270 [ +0.000000] __hrtimer_run_queues+0x1c5/0x360 [ +0.000000] hrtimer_interrupt+0x115/0x260 [ +0.000000] __sysvec_apic_timer_interrupt+0x6d/0x1a0 [ +0.000000] sysvec_apic_timer_interrupt+0x6e/0x80 [ +0.000000] asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ +0.000000] pv_native_safe_halt+0xf/0x20 [ +0.000000] default_idle+0x9/0x10 [ +0.000000] default_idle_call+0x7e/0x1e0 [ +0.000000] do_idle+0x1e8/0x250 [ +0.000000] cpu_startup_entry+0x29/0x30 [ +0.000000] rest_init+0x151/0x160 [ +0.000000] start_kernel+0x6f3/0x700 [ +0.000000] x86_64_start_reservations+0x24/0x30 [ +0.000000] x86_64_start_kernel+0xc8/0xd0 [ +0.000000] common_startup_64+0x13e/0x148 [ +0.000000] irq event stamp: 6884 [ +0.000000] hardirqs last enabled at (6883): [<ffffffffa75700b3>] neigh_resolve_output+0x223/0x270 [ +0.000000] hardirqs last disabled at (6882): [<ffffffffa7570078>] neigh_resolve_output+0x1e8/0x270 [ +0.000000] softirqs last enabled at (6880): [<ffffffffa757006b>] neigh_resolve_output+0x1db/0x270 [ +0.000000] softirqs last disabled at (6884): [<ffffffffa755b533>] __dev_queue_xmit+0x73/0x1140 [ +0.000000] other info that might help us debug this: [ +0.000000] Possible unsafe locking scenario: [ +0.000000] CPU0 [ +0.000000] ---- [ +0.000000] lock(&sch->root_lock_key); [ +0.000000] <Interrupt> [ +0.000000] lock(&sch->root_lock_key); [ +0.000000] *** DEADLOCK *** [ +0.000000] 4 locks held by ping/156: [ +0.000000] #0: ffff897842332e08 (sk_lock-AF_INET){+.+.}-{0:0}, at: raw_sendmsg+0x41e/0xf40 [ +0.000000] #1: ffffffffa816f880 (rcu_read_lock){....}-{1:3}, at: ip_output+0x2c/0x190 [ +0.000000] #2: ffffffffa816f880 (rcu_read_lock){....}-{1:3}, at: ip_finish_output2+0xad/0x950 [ +0.000000] #3: ffffffffa816f840 (rcu_read_lock_bh){....}-{1:3}, at: __dev_queue_xmit+0x73/0x1140 I am able to reproduce it consistently when running the following: tc qdisc add dev lo handle 1: root dualpi2 ping -f 127.0.0.1 To fix it, make the timer run in softirq. Fixes: 320d031 ("sched: Struct definition and parsing of dualpi2 qdisc") Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Victor Nogueira <victor@mojatatu.com> Link: https://patch.msgid.link/20250815135317.664993-1-victor@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
sgaud-quic
pushed a commit
that referenced
this pull request
Sep 29, 2025
smc_lo_register_dmb() allocates DMB buffers with kzalloc(), which are later passed to get_page() in smc_rx_splice(). Since kmalloc memory is not page-backed, this triggers WARN_ON_ONCE() in get_page() and prevents holding a refcount on the buffer. This can lead to use-after-free if the memory is released before splice_to_pipe() completes. Use folio_alloc() instead, ensuring DMBs are page-backed and safe for get_page(). WARNING: CPU: 18 PID: 12152 at ./include/linux/mm.h:1330 smc_rx_splice+0xaf8/0xe20 [smc] CPU: 18 UID: 0 PID: 12152 Comm: smcapp Kdump: loaded Not tainted 6.17.0-rc3-11705-g9cf4672ecfee #10 NONE Hardware name: IBM 3931 A01 704 (z/VM 7.4.0) Krnl PSW : 0704e00180000000 000793161032696c (smc_rx_splice+0xafc/0xe20 [smc]) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3 Krnl GPRS: 0000000000000000 001cee80007d3001 00077400000000f8 0000000000000005 0000000000000001 001cee80007d3006 0007740000001000 001c000000000000 000000009b0c99e0 0000000000001000 001c0000000000f8 001c000000000000 000003ffcc6f7c88 0007740003e98000 0007931600000005 000792969b2ff7b8 Krnl Code: 0007931610326960: af000000 mc 0,0 0007931610326964: a7f4ff43 brc 15,00079316103267ea #0007931610326968: af000000 mc 0,0 >000793161032696c: a7f4ff3f brc 15,00079316103267ea 0007931610326970: e320f1000004 lg %r2,256(%r15) 0007931610326976: c0e53fd1b5f5 brasl %r14,000793168fd5d560 000793161032697c: a7f4fbb5 brc 15,00079316103260e6 0007931610326980: b904002b lgr %r2,%r11 Call Trace: smc_rx_splice+0xafc/0xe20 [smc] smc_rx_splice+0x756/0xe20 [smc]) smc_rx_recvmsg+0xa74/0xe00 [smc] smc_splice_read+0x1ce/0x3b0 [smc] sock_splice_read+0xa2/0xf0 do_splice_read+0x198/0x240 splice_file_to_pipe+0x7e/0x110 do_splice+0x59e/0xde0 __do_splice+0x11a/0x2d0 __s390x_sys_splice+0x140/0x1f0 __do_syscall+0x122/0x280 system_call+0x6e/0x90 Last Breaking-Event-Address: smc_rx_splice+0x960/0xe20 [smc] ---[ end trace 0000000000000000 ]--- Fixes: f7a2207 ("net/smc: implement DMB-related operations of loopback-ism") Reviewed-by: Mahanta Jambigi <mjambigi@linux.ibm.com> Signed-off-by: Sidraya Jayagond <sidraya@linux.ibm.com> Link: https://patch.msgid.link/20250917184220.801066-1-sidraya@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
sgaud-quic
pushed a commit
that referenced
this pull request
Sep 29, 2025
Running sha224_kunit on a KMSAN-enabled kernel results in a crash in
kmsan_internal_set_shadow_origin():
BUG: unable to handle page fault for address: ffffbc3840291000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 1810067 P4D 1810067 PUD 192d067 PMD 3c17067 PTE 0
Oops: 0000 [#1] SMP NOPTI
CPU: 0 UID: 0 PID: 81 Comm: kunit_try_catch Tainted: G N 6.17.0-rc3 #10 PREEMPT(voluntary)
Tainted: [N]=TEST
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
RIP: 0010:kmsan_internal_set_shadow_origin+0x91/0x100
[...]
Call Trace:
<TASK>
__msan_memset+0xee/0x1a0
sha224_final+0x9e/0x350
test_hash_buffer_overruns+0x46f/0x5f0
? kmsan_get_shadow_origin_ptr+0x46/0xa0
? __pfx_test_hash_buffer_overruns+0x10/0x10
kunit_try_run_case+0x198/0xa00
This occurs when memset() is called on a buffer that is not 4-byte aligned
and extends to the end of a guard page, i.e. the next page is unmapped.
The bug is that the loop at the end of kmsan_internal_set_shadow_origin()
accesses the wrong shadow memory bytes when the address is not 4-byte
aligned. Since each 4 bytes are associated with an origin, it rounds the
address and size so that it can access all the origins that contain the
buffer. However, when it checks the corresponding shadow bytes for a
particular origin, it incorrectly uses the original unrounded shadow
address. This results in reads from shadow memory beyond the end of the
buffer's shadow memory, which crashes when that memory is not mapped.
To fix this, correctly align the shadow address before accessing the 4
shadow bytes corresponding to each origin.
Link: https://lkml.kernel.org/r/20250911195858.394235-1-ebiggers@kernel.org
Fixes: 2ef3cec ("kmsan: do not wipe out origin when doing partial unpoisoning")
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Tested-by: Alexander Potapenko <glider@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
sgaud-quic
pushed a commit
that referenced
this pull request
Dec 22, 2025
When interrupting perf stat in repeat mode with a signal the signal is passed to the child process but the repeat doesn't terminate: ``` $ perf stat -v --null --repeat 10 sleep 1 Control descriptor is not initialized [ perf stat: executing run #1 ... ] [ perf stat: executing run #2 ... ] ^Csleep: Interrupt [ perf stat: executing run #3 ... ] [ perf stat: executing run #4 ... ] [ perf stat: executing run #5 ... ] [ perf stat: executing run #6 ... ] [ perf stat: executing run #7 ... ] [ perf stat: executing run #8 ... ] [ perf stat: executing run #9 ... ] [ perf stat: executing run #10 ... ] Performance counter stats for 'sleep 1' (10 runs): 0.9500 +- 0.0512 seconds time elapsed ( +- 5.39% ) 0.01user 0.02system 0:09.53elapsed 0%CPU (0avgtext+0avgdata 18940maxresident)k 29944inputs+0outputs (0major+2629minor)pagefaults 0swaps ``` Terminate the repeated run and give a reasonable exit value: ``` $ perf stat -v --null --repeat 10 sleep 1 Control descriptor is not initialized [ perf stat: executing run #1 ... ] [ perf stat: executing run #2 ... ] [ perf stat: executing run #3 ... ] ^Csleep: Interrupt Performance counter stats for 'sleep 1' (10 runs): 0.680 +- 0.321 seconds time elapsed ( +- 47.16% ) Command exited with non-zero status 130 0.00user 0.01system 0:02.05elapsed 0%CPU (0avgtext+0avgdata 70688maxresident)k 0inputs+0outputs (0major+5002minor)pagefaults 0swaps ``` Note, this also changes the exit value for non-repeat runs when interrupted by a signal. Reported-by: Ingo Molnar <mingo@kernel.org> Closes: https://lore.kernel.org/lkml/aS5wjmbAM9ka3M2g@gmail.com/ Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Test commit for workflow testing.