forked from torvalds/linux
-
Notifications
You must be signed in to change notification settings - Fork 11
Feature/revpi 719 #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
iluminat23
merged 2 commits into
RevolutionPi:revpi-4.19
from
iluminat23:feature/REVPI-719
Aug 10, 2020
Merged
Feature/revpi 719 #1
iluminat23
merged 2 commits into
RevolutionPi:revpi-4.19
from
iluminat23:feature/REVPI-719
Aug 10, 2020
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
… improvements Use device managed functions to simplify error handling, reduce source code size, improve readability, and reduce the likelyhood of bugs. Other improvements as listed below. The conversion was done automatically with coccinelle using the following semantic patches. The semantic patches and the scripts used to generate this commit log are available at https://github.com/groeck/coccinelle-patches - Drop assignments to otherwise unused variables - Drop empty remove function - Use local variable 'struct device *dev' consistently - Use devm_watchdog_register_driver() to register watchdog device Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org> (cherry picked from commit 3564fbc) Signed-off-by: Philipp Rosenberger <p.rosenberger@kunbus.com>
Add support for the nowayout option in the gpio watchdog driver. Signed-off-by: Mans Rullgard <mans@mansr.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org> (cherry picked from commit 1a4aaf9) Signed-off-by: Philipp Rosenberger <p.rosenberger@kunbus.com>
linosanfilippo-kunbus
approved these changes
Aug 10, 2020
linosanfilippo-kunbus
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
linosanfilippo-kunbus
pushed a commit
that referenced
this pull request
Aug 11, 2020
[ Upstream commit 863844e ] With commit 216b440 ("brcmfmac: Fix use after free in brcmf_sdio_readframes()") applied, we see locking timeouts in brcmf_sdio_watchdog_thread(). brcmfmac: brcmf_escan_timeout: timer expired INFO: task brcmf_wdog/mmc1:621 blocked for more than 120 seconds. Not tainted 4.19.94-07984-g24ff99a0f713 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. brcmf_wdog/mmc1 D 0 621 2 0x00000000 last_sleep: 2440793077. last_runnable: 2440766827 [<c0aa1e60>] (__schedule) from [<c0aa2100>] (schedule+0x98/0xc4) [<c0aa2100>] (schedule) from [<c0853830>] (__mmc_claim_host+0x154/0x274) [<c0853830>] (__mmc_claim_host) from [<bf10c5b8>] (brcmf_sdio_watchdog_thread+0x1b0/0x1f8 [brcmfmac]) [<bf10c5b8>] (brcmf_sdio_watchdog_thread [brcmfmac]) from [<c02570b8>] (kthread+0x178/0x180) In addition to restarting or exiting the loop, it is also necessary to abort the command and to release the host. Fixes: 216b440 ("brcmfmac: Fix use after free in brcmf_sdio_readframes()") Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Matthias Kaehlcke <mka@chromium.org> Cc: Brian Norris <briannorris@chromium.org> Cc: Douglas Anderson <dianders@chromium.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Douglas Anderson <dianders@chromium.org> Acked-by: franky.lin@broadcom.com Acked-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Philipp Rosenberger <p.rosenberger@kunbus.com>
linosanfilippo-kunbus
added a commit
to linosanfilippo-kunbus/linux
that referenced
this pull request
Jan 8, 2021
The tpm core module creates two device files (/dev/tpm* and /dev/tpmrm*) as soon as a tpm chip registers. However it is possible that the tpm chip unregisters while one or both of these files are still opened for read/write operations. In case that only the tpmrm file is still open after chip registration and the file is accessed for a write operation the following warning is triggered: WARNING: CPU: 2 PID: 8800 at lib/refcount.c:153 refcount_inc_checked+0x4c/0x50 refcount_t: increment on 0; use-after-free. Modules linked in: tpm_tis_spi tpm_tis_core tpm xt_nat veth ipt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat_ipv4 nf_nat br_netfilter bridge stp llc overlay cmac bnep hci_uart btbcm serdev bluetooth ecdh_generic brcmfmac evdev brcmutil sha256_generic ftdi_sio usbserial cfg80211 bcm2835_codec(C) raspberrypi_hwmon bcm2835_v4l2(C) hwmon v4l2_mem2mem bcm2835_mmal_vchiq(C) v4l2_common videobuf2_dma_contig videobuf2_vmalloc rfkill videobuf2_memops videobuf2_v4l2 videobuf2_common videodev ip6t_REJECT vc_sm_cma(C) nf_reject_ipv6 media ip6t_rt ip6table_filter ip6_tables gpio_wdt gpio_keys fixed mcp320x ad5446 industrialio uio_pdrv_genirq spidev uio ipt_REJECT nf_reject_ipv4 xt_recent xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_limit iptable_filter i2c_dev ip_tables x_tables ipv6 [last unloaded: spi_bcm2835] CPU: 2 PID: 8800 Comm: iotedged Tainted: G C 4.19.95-rt38-LS-v7+ RevolutionPi#1 Hardware name: BCM2835 [<80112a08>] (unwind_backtrace) from [<8010d5fc>] (show_stack+0x20/0x24) [<8010d5fc>] (show_stack) from [<808b3b50>] (dump_stack+0xd4/0x118) [<808b3b50>] (dump_stack) from [<80121edc>] (__warn.part.0+0xcc/0xe8) [<80121edc>] (__warn.part.0) from [<80121f74>] (warn_slowpath_fmt+0x7c/0xa4) [<80121f74>] (warn_slowpath_fmt) from [<805670f8>] (refcount_inc_checked+0x4c/0x50) [<805670f8>] (refcount_inc_checked) from [<808b8924>] (kobject_get+0x2c/0x5c) [<808b8924>] (kobject_get) from [<8060f57c>] (get_device+0x24/0x2c) [<8060f57c>] (get_device) from [<7f1b4384>] (tpm_try_get_ops+0x24/0x58 [tpm]) [<7f1b4384>] (tpm_try_get_ops [tpm]) from [<7f1b6dbc>] (tpm_common_write+0x17c/0x1fc [tpm]) [<7f1b6dbc>] (tpm_common_write [tpm]) from [<7f1b89a4>] (tpmrm_write+0x58/0x6b4 [tpm]) [<7f1b89a4>] (tpmrm_write [tpm]) from [<802ee854>] (__vfs_write+0x4c/0x17c) [<802ee854>] (__vfs_write) from [<802eeb6c>] (vfs_write+0xb4/0x1c8) [<802eeb6c>] (vfs_write) from [<802eee5c>] (ksys_write+0x70/0xfc) [<802eee5c>] (ksys_write) from [<802eef00>] (sys_write+0x18/0x1c) [<802eef00>] (sys_write) from [<80101000>] (ret_fast_syscall+0x0/0x28) Exception stack(0x9ac97fa8 to 0x9ac97ff0) 7fa0: 0000000 76ef506c 00000009 76ef506c 0000000 00000000 7fc0: 0000000 76ef506c 00000009 00000004 76ef506c 0000000 00000000 00000001 7fe0: 00000004 7e93e110 76c54371 76c56526 ---[ end trace 0000000000000002 ]--- ------------[ cut here ]------------ WARNING: CPU: 2 PID: 8800 at lib/refcount.c:187 refcount_sub_and_test_checked+0xb0/0xb8 refcount_t: underflow; use-after-free. Modules linked in: tpm_tis_spi tpm_tis_core tpm xt_nat veth ipt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat_ipv4 nf_nat br_netfilter bridge stp llc overlay cmac bnep hci_uart btbcm serdev bluetooth ecdh_generic brcmfmac evdev brcmutil sha256_generic ftdi_sio usbserial cfg80211 bcm2835_codec(C) raspberrypi_hwmon bcm2835_v4l2(C) hwmon v4l2_mem2mem bcm2835_mmal_vchiq(C) v4l2_common videobuf2_dma_contig videobuf2_vmalloc rfkill videobuf2_memops videobuf2_v4l2 videobuf2_common videodev ip6t_REJECT vc_sm_cma(C) nf_reject_ipv6 media ip6t_rt ip6table_filter ip6_tables gpio_wdt gpio_keys fixed mcp320x ad5446 industrialio uio_pdrv_genirq spidev uio ipt_REJECT nf_reject_ipv4 xt_recent xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_limit iptable_filter i2c_dev ip_tables x_tables ipv6 [last unloaded: spi_bcm2835] CPU: 2 PID: 8800 Comm: iotedged Tainted: G WC 4.19.95-rt38-LS-v7+ RevolutionPi#1 Hardware name: BCM2835 [<80112a08>] (unwind_backtrace) from [<8010d5fc>] (show_stack+0x20/0x24) [<8010d5fc>] (show_stack) from [<808b3b50>] (dump_stack+0xd4/0x118) [<808b3b50>] (dump_stack) from [<80121edc>] (__warn.part.0+0xcc/0xe8) [<80121edc>] (__warn.part.0) from [<80121f74>] (warn_slowpath_fmt+0x7c/0xa4) [<80121f74>] (warn_slowpath_fmt) from [<805671ac>] (refcount_sub_and_test_checked+0xb0/0xb8) [<805671ac>] (refcount_sub_and_test_checked) from [<805671cc>] (refcount_dec_and_test_checked+0x18/0x1c) [<805671cc>] (refcount_dec_and_test_checked) from [<808b8aa8>] (kobject_put+0x2c/0xe0) [<808b8aa8>] (kobject_put) from [<8060f788>] (put_device+0x24/0x28) [<8060f788>] (put_device) from [<7f1b43b0>] (tpm_try_get_ops+0x50/0x58 [tpm]) [<7f1b43b0>] (tpm_try_get_ops [tpm]) from [<7f1b6dbc>] (tpm_common_write+0x17c/0x1fc [tpm]) [<7f1b6dbc>] (tpm_common_write [tpm]) from [<7f1b89a4>] (tpmrm_write+0x58/0x6b4 [tpm]) [<7f1b89a4>] (tpmrm_write [tpm]) from [<802ee854>] (__vfs_write+0x4c/0x17c) [<802ee854>] (__vfs_write) from [<802eeb6c>] (vfs_write+0xb4/0x1c8) [<802eeb6c>] (vfs_write) from [<802eee5c>] (ksys_write+0x70/0xfc) [<802eee5c>] (ksys_write) from [<802eef00>] (sys_write+0x18/0x1c) [<802eef00>] (sys_write) from [<80101000>] (ret_fast_syscall+0x0/0x28) Exception stack(0x9ac97fa8 to 0x9ac97ff0) 7fa0: 0000000 76ef506c 00000009 76ef506c 0000000 00000000 7fc0: 0000000 76ef506c 00000009 00000004 76ef506c 0000000 00000000 00000001 7fe0: 00000004 7e93e110 76c54371 76c56526 ---[ end trace 0000000000000003 ]--- The reason is an errorneous reference counting for the tpm dev: In tpm_chip_alloc() an extra reference ought to be retrieved in case that the tpmrm file is still open while the tpm file is closed. However getting the reference depends on the flag TPM_CHIP_FLAG_TPM2, which is never set in the source code (the original intention was obviously to set it in case that tpm2 and thus the tpmrm file should be used). Missing to get this extra reference causes the warning in the error case described above. Fix this by getting an extra reference to the tmp dev in tpm_common_open() which is called for both the tpm and the tpmrm device. This makes sure that the tpmrm device still holds a valid reference even if the tpm device is already closed. Put the respective reference in tpm_common_release() which are also called for both devices and thus ensures that all references are released eventually. Signed-off-by: Lino Sanfilippo <l.sanfilippo@kunbus.com>
iluminat23
pushed a commit
that referenced
this pull request
Jan 20, 2022
…rylock() commit 38fa320 upstream. While reboot the system by sysrq, the following bug will be occur. BUG: sleeping function called from invalid context at kernel/locking/semaphore.c:90 in_atomic(): 0, irqs_disabled(): 128, non_block: 0, pid: 10052, name: rc.shutdown CPU: 3 PID: 10052 Comm: rc.shutdown Tainted: G W O 5.10.0 #1 Call trace: dump_backtrace+0x0/0x1c8 show_stack+0x18/0x28 dump_stack+0xd0/0x110 ___might_sleep+0x14c/0x160 __might_sleep+0x74/0x88 down_interruptible+0x40/0x118 virt_efi_reset_system+0x3c/0xd0 efi_reboot+0xd4/0x11c machine_restart+0x60/0x9c emergency_restart+0x1c/0x2c sysrq_handle_reboot+0x1c/0x2c __handle_sysrq+0xd0/0x194 write_sysrq_trigger+0xbc/0xe4 proc_reg_write+0xd4/0xf0 vfs_write+0xa8/0x148 ksys_write+0x6c/0xd8 __arm64_sys_write+0x18/0x28 el0_svc_common.constprop.3+0xe4/0x16c do_el0_svc+0x1c/0x2c el0_svc+0x20/0x30 el0_sync_handler+0x80/0x17c el0_sync+0x158/0x180 The reason for this problem is that irq has been disabled in machine_restart() and then it calls down_interruptible() in virt_efi_reset_system(), which would occur sleep in irq context, it is dangerous! Commit 99409b9("locking/semaphore: Add might_sleep() to down_*() family") add might_sleep() in down_interruptible(), so the bug info is here. down_trylock() can solve this problem, cause there is no might_sleep. -------- Cc: <stable@vger.kernel.org> Signed-off-by: Zhang Jianhua <chris.zjh@huawei.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
iluminat23
pushed a commit
that referenced
this pull request
Jan 20, 2022
…tart commit 60d950f upstream. In commit 74fc4f8 ("net: Fix offloading indirect devices dependency on qdisc order creation"), it adds a process to trigger the callback to setup the bo callback when the driver regists a callback. In our current implement, we are not ready to run the callback when nfp call the function flow_indr_dev_register, then there will be error message as: kernel: Oops: 0000 [#1] SMP PTI kernel: CPU: 0 PID: 14119 Comm: kworker/0:0 Tainted: G kernel: Workqueue: events work_for_cpu_fn kernel: RIP: 0010:nfp_flower_indr_setup_tc_cb+0x258/0x410 kernel: RSP: 0018:ffffbc1e02c57bf8 EFLAGS: 00010286 kernel: RAX: 0000000000000000 RBX: ffff9c761fabc000 RCX: 0000000000000001 kernel: RDX: 0000000000000001 RSI: fffffffffffffff0 RDI: ffffffffc0be9ef1 kernel: RBP: ffffbc1e02c57c58 R08: ffffffffc08f33aa R09: ffff9c6db7478800 kernel: R10: 0000009c003f6e00 R11: ffffbc1e02800000 R12: ffffbc1e000d9000 kernel: R13: ffffbc1e000db428 R14: ffff9c6db7478800 R15: ffff9c761e884e80 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 kernel: CR2: fffffffffffffff0 CR3: 00000009e260a004 CR4: 00000000007706f0 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 kernel: PKRU: 55555554 kernel: Call Trace: kernel: ? flow_indr_dev_register+0xab/0x210 kernel: ? __cond_resched+0x15/0x30 kernel: ? kmem_cache_alloc_trace+0x44/0x4b0 kernel: ? nfp_flower_setup_tc+0x1d0/0x1d0 [nfp] kernel: flow_indr_dev_register+0x158/0x210 kernel: ? tcf_block_unbind+0xe0/0xe0 kernel: nfp_flower_init+0x40b/0x650 [nfp] kernel: nfp_net_pci_probe+0x25f/0x960 [nfp] kernel: ? nfp_rtsym_read_le+0x76/0x130 [nfp] kernel: nfp_pci_probe+0x6a9/0x820 [nfp] kernel: local_pci_probe+0x45/0x80 So we need to call flow_indr_dev_register in app start process instead of init stage. Fixes: 74fc4f8 ("net: Fix offloading indirect devices dependency on qdisc order creation") Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Link: https://lore.kernel.org/r/20211012124850.13025-1-louis.peens@corigine.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
iluminat23
pushed a commit
that referenced
this pull request
Jan 20, 2022
commit b15fa92 upstream. Starting with kernel 5.11 built with CONFIG_FORTIFY_SOURCE mouting an ocfs2 filesystem with either o2cb or pcmk cluster stack fails with the trace below. Problem seems to be that strings for cluster stack and cluster name are not guaranteed to be null terminated in the disk representation, while strlcpy assumes that the source string is always null terminated. This causes a read outside of the source string triggering the buffer overflow detection. detected buffer overflow in strlen ------------[ cut here ]------------ kernel BUG at lib/string.c:1149! invalid opcode: 0000 [#1] SMP PTI CPU: 1 PID: 910 Comm: mount.ocfs2 Not tainted 5.14.0-1-amd64 #1 Debian 5.14.6-2 RIP: 0010:fortify_panic+0xf/0x11 ... Call Trace: ocfs2_initialize_super.isra.0.cold+0xc/0x18 [ocfs2] ocfs2_fill_super+0x359/0x19b0 [ocfs2] mount_bdev+0x185/0x1b0 legacy_get_tree+0x27/0x40 vfs_get_tree+0x25/0xb0 path_mount+0x454/0xa20 __x64_sys_mount+0x103/0x140 do_syscall_64+0x3b/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xae Link: https://lkml.kernel.org/r/20210929180654.32460-1-vvidic@valentin-vidic.from.hr Signed-off-by: Valentin Vidic <vvidic@valentin-vidic.from.hr> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
iluminat23
pushed a commit
that referenced
this pull request
Jan 20, 2022
commit 74c42e1 upstream. Currently collapse_file does not explicitly check PG_writeback, instead, page_has_private and try_to_release_page are used to filter writeback pages. This does not work for xfs with blocksize equal to or larger than pagesize, because in such case xfs has no page->private. This makes collapse_file bail out early for writeback page. Otherwise, xfs end_page_writeback will panic as follows. page:fffffe00201bcc80 refcount:0 mapcount:0 mapping:ffff0003f88c86a8 index:0x0 pfn:0x84ef32 aops:xfs_address_space_operations [xfs] ino:30000b7 dentry name:"libtest.so" flags: 0x57fffe0000008027(locked|referenced|uptodate|active|writeback) raw: 57fffe0000008027 ffff80001b48bc28 ffff80001b48bc28 ffff0003f88c86a8 raw: 0000000000000000 0000000000000000 00000000ffffffff ffff0000c3e9a000 page dumped because: VM_BUG_ON_PAGE(((unsigned int) page_ref_count(page) + 127u <= 127u)) page->mem_cgroup:ffff0000c3e9a000 ------------[ cut here ]------------ kernel BUG at include/linux/mm.h:1212! Internal error: Oops - BUG: 0 [#1] SMP Modules linked in: BUG: Bad page state in process khugepaged pfn:84ef32 xfs(E) page:fffffe00201bcc80 refcount:0 mapcount:0 mapping:0 index:0x0 pfn:0x84ef32 libcrc32c(E) rfkill(E) aes_ce_blk(E) crypto_simd(E) ... CPU: 25 PID: 0 Comm: swapper/25 Kdump: loaded Tainted: ... pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--) Call trace: end_page_writeback+0x1c0/0x214 iomap_finish_page_writeback+0x13c/0x204 iomap_finish_ioend+0xe8/0x19c iomap_writepage_end_bio+0x38/0x50 bio_endio+0x168/0x1ec blk_update_request+0x278/0x3f0 blk_mq_end_request+0x34/0x15c virtblk_request_done+0x38/0x74 [virtio_blk] blk_done_softirq+0xc4/0x110 __do_softirq+0x128/0x38c __irq_exit_rcu+0x118/0x150 irq_exit+0x1c/0x30 __handle_domain_irq+0x8c/0xf0 gic_handle_irq+0x84/0x108 el1_irq+0xcc/0x180 arch_cpu_idle+0x18/0x40 default_idle_call+0x4c/0x1a0 cpuidle_idle_call+0x168/0x1e0 do_idle+0xb4/0x104 cpu_startup_entry+0x30/0x9c secondary_start_kernel+0x104/0x180 Code: d4210000 b0006161 910c8021 94013f4d (d4210000) ---[ end trace 4a88c6a074082f8c ]--- Kernel panic - not syncing: Oops - BUG: Fatal exception in interrupt Link: https://lkml.kernel.org/r/20211022023052.33114-1-rongwei.wang@linux.alibaba.com Fixes: 99cb0db ("mm,thp: add read-only THP support for (non-shmem) FS") Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com> Signed-off-by: Xu Yu <xuyu@linux.alibaba.com> Suggested-by: Yang Shi <shy828301@gmail.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Yang Shi <shy828301@gmail.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Song Liu <song@kernel.org> Cc: William Kucharski <william.kucharski@oracle.com> Cc: Hugh Dickins <hughd@google.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
iluminat23
pushed a commit
that referenced
this pull request
Jan 20, 2022
commit 6473395 upstream. When copying the device name, the length of the data memcpy copied exceeds the length of the source buffer, which cause the KASAN issue below. Use strscpy_pad() instead. BUG: KASAN: slab-out-of-bounds in ib_nl_set_path_rec_attrs+0x136/0x320 [ib_core] Read of size 64 at addr ffff88811a10f5e0 by task rping/140263 CPU: 3 PID: 140263 Comm: rping Not tainted 5.15.0-rc1+ #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack_lvl+0x57/0x7d print_address_description.constprop.0+0x1d/0xa0 kasan_report+0xcb/0x110 kasan_check_range+0x13d/0x180 memcpy+0x20/0x60 ib_nl_set_path_rec_attrs+0x136/0x320 [ib_core] ib_nl_make_request+0x1c6/0x380 [ib_core] send_mad+0x20a/0x220 [ib_core] ib_sa_path_rec_get+0x3e3/0x800 [ib_core] cma_query_ib_route+0x29b/0x390 [rdma_cm] rdma_resolve_route+0x308/0x3e0 [rdma_cm] ucma_resolve_route+0xe1/0x150 [rdma_ucm] ucma_write+0x17b/0x1f0 [rdma_ucm] vfs_write+0x142/0x4d0 ksys_write+0x133/0x160 do_syscall_64+0x43/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f26499aa90f Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 29 fd ff ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 5c fd ff ff 48 RSP: 002b:00007f26495f2dc0 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00000000000007d0 RCX: 00007f26499aa90f RDX: 0000000000000010 RSI: 00007f26495f2e00 RDI: 0000000000000003 RBP: 00005632a8315440 R08: 0000000000000000 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000293 R12: 00007f26495f2e00 R13: 00005632a83154e0 R14: 00005632a8315440 R15: 00005632a830a810 Allocated by task 131419: kasan_save_stack+0x1b/0x40 __kasan_kmalloc+0x7c/0x90 proc_self_get_link+0x8b/0x100 pick_link+0x4f1/0x5c0 step_into+0x2eb/0x3d0 walk_component+0xc8/0x2c0 link_path_walk+0x3b8/0x580 path_openat+0x101/0x230 do_filp_open+0x12e/0x240 do_sys_openat2+0x115/0x280 __x64_sys_openat+0xce/0x140 do_syscall_64+0x43/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: 2ca546b ("IB/sa: Route SA pathrecord query through netlink") Link: https://lore.kernel.org/r/72ede0f6dab61f7f23df9ac7a70666e07ef314b0.1635055496.git.leonro@nvidia.com Signed-off-by: Mark Zhang <markzhang@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
iluminat23
pushed a commit
that referenced
this pull request
Jan 20, 2022
This reverts commit 88dbd08. Causes the following Syzkaller reported issue: BUG: kernel NULL pointer dereference, address: 0000000000000010 PGD 0 P4D 0 Oops: 0002 [#1] PREEMPT SMP KASAN CPU: 1 PID: 546 Comm: syz-executor631 Tainted: G B 5.10.76-syzkaller-01178-g4944ec82ebb9 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:arch_atomic_try_cmpxchg syzkaller/managers/android-5-10/kernel/./arch/x86/include/asm/atomic.h:202 [inline] RIP: 0010:atomic_try_cmpxchg_acquire syzkaller/managers/android-5-10/kernel/./include/asm-generic/atomic-instrumented.h:707 [inline] RIP: 0010:queued_spin_lock syzkaller/managers/android-5-10/kernel/./include/asm-generic/qspinlock.h:82 [inline] RIP: 0010:do_raw_spin_lock_flags syzkaller/managers/android-5-10/kernel/./include/linux/spinlock.h:195 [inline] RIP: 0010:__raw_spin_lock_irqsave syzkaller/managers/android-5-10/kernel/./include/linux/spinlock_api_smp.h:119 [inline] RIP: 0010:_raw_spin_lock_irqsave+0x10d/0x210 syzkaller/managers/android-5-10/kernel/kernel/locking/spinlock.c:159 Code: 00 00 00 e8 d5 29 09 fd 4c 89 e7 be 04 00 00 00 e8 c8 29 09 fd 42 8a 04 3b 84 c0 0f 85 be 00 00 00 8b 44 24 40 b9 01 00 00 00 <f0> 41 0f b1 4d 00 75 45 48 c7 44 24 20 0e 36 e0 45 4b c7 04 37 00 RSP: 0018:ffffc90000f174e0 EFLAGS: 00010097 RAX: 0000000000000000 RBX: 1ffff920001e2ea4 RCX: 0000000000000001 RDX: 0000000000000001 RSI: 0000000000000004 RDI: ffffc90000f17520 RBP: ffffc90000f175b0 R08: dffffc0000000000 R09: 0000000000000003 R10: fffff520001e2ea5 R11: 0000000000000004 R12: ffffc90000f17520 R13: 0000000000000010 R14: 1ffff920001e2ea0 R15: dffffc0000000000 FS: 0000000000000000(0000) GS:ffff8881f7100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000010 CR3: 000000000640f000 CR4: 00000000003506a0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: prepare_to_wait+0x9c/0x290 syzkaller/managers/android-5-10/kernel/kernel/sched/wait.c:248 io_uring_cancel_files syzkaller/managers/android-5-10/kernel/fs/io_uring.c:8690 [inline] io_uring_cancel_task_requests+0x16a9/0x1ed0 syzkaller/managers/android-5-10/kernel/fs/io_uring.c:8760 io_uring_flush+0x170/0x6d0 syzkaller/managers/android-5-10/kernel/fs/io_uring.c:8923 filp_close+0xb0/0x150 syzkaller/managers/android-5-10/kernel/fs/open.c:1319 close_files syzkaller/managers/android-5-10/kernel/fs/file.c:401 [inline] put_files_struct+0x1d4/0x350 syzkaller/managers/android-5-10/kernel/fs/file.c:429 exit_files+0x80/0xa0 syzkaller/managers/android-5-10/kernel/fs/file.c:458 do_exit+0x6d9/0x23a0 syzkaller/managers/android-5-10/kernel/kernel/exit.c:808 do_group_exit+0x16a/0x2d0 syzkaller/managers/android-5-10/kernel/kernel/exit.c:910 get_signal+0x133e/0x1f80 syzkaller/managers/android-5-10/kernel/kernel/signal.c:2790 arch_do_signal+0x8d/0x620 syzkaller/managers/android-5-10/kernel/arch/x86/kernel/signal.c:805 exit_to_user_mode_loop syzkaller/managers/android-5-10/kernel/kernel/entry/common.c:161 [inline] exit_to_user_mode_prepare+0xaa/0xe0 syzkaller/managers/android-5-10/kernel/kernel/entry/common.c:191 syscall_exit_to_user_mode+0x24/0x40 syzkaller/managers/android-5-10/kernel/kernel/entry/common.c:266 do_syscall_64+0x3d/0x70 syzkaller/managers/android-5-10/kernel/arch/x86/entry/common.c:56 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7fc6d1589a89 Code: Unable to access opcode bytes at RIP 0x7fc6d1589a5f. RSP: 002b:00007ffd2b5da728 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca RAX: fffffffffffffdfc RBX: 0000000000005193 RCX: 00007fc6d1589a89 RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007fc6d161142c RBP: 0000000000000032 R08: 00007ffd2b5eb0b8 R09: 0000000000000000 R10: 00007ffd2b5da750 R11: 0000000000000246 R12: 00007fc6d161142c R13: 00007ffd2b5da750 R14: 00007ffd2b5da770 R15: 0000000000000000 Modules linked in: CR2: 0000000000000010 ---[ end trace fe8044f7dc4d8d65 ]--- RIP: 0010:arch_atomic_try_cmpxchg syzkaller/managers/android-5-10/kernel/./arch/x86/include/asm/atomic.h:202 [inline] RIP: 0010:atomic_try_cmpxchg_acquire syzkaller/managers/android-5-10/kernel/./include/asm-generic/atomic-instrumented.h:707 [inline] RIP: 0010:queued_spin_lock syzkaller/managers/android-5-10/kernel/./include/asm-generic/qspinlock.h:82 [inline] RIP: 0010:do_raw_spin_lock_flags syzkaller/managers/android-5-10/kernel/./include/linux/spinlock.h:195 [inline] RIP: 0010:__raw_spin_lock_irqsave syzkaller/managers/android-5-10/kernel/./include/linux/spinlock_api_smp.h:119 [inline] RIP: 0010:_raw_spin_lock_irqsave+0x10d/0x210 syzkaller/managers/android-5-10/kernel/kernel/locking/spinlock.c:159 Code: 00 00 00 e8 d5 29 09 fd 4c 89 e7 be 04 00 00 00 e8 c8 29 09 fd 42 8a 04 3b 84 c0 0f 85 be 00 00 00 8b 44 24 40 b9 01 00 00 00 <f0> 41 0f b1 4d 00 75 45 48 c7 44 24 20 0e 36 e0 45 4b c7 04 37 00 RSP: 0018:ffffc90000f174e0 EFLAGS: 00010097 RAX: 0000000000000000 RBX: 1ffff920001e2ea4 RCX: 0000000000000001 RDX: 0000000000000001 RSI: 0000000000000004 RDI: ffffc90000f17520 RBP: ffffc90000f175b0 R08: dffffc0000000000 R09: 0000000000000003 R10: fffff520001e2ea5 R11: 0000000000000004 R12: ffffc90000f17520 R13: 0000000000000010 R14: 1ffff920001e2ea0 R15: dffffc0000000000 FS: 0000000000000000(0000) GS:ffff8881f7100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000010 CR3: 000000000640f000 CR4: 00000000003506a0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 ---------------- Code disassembly (best guess), 1 bytes skipped: 0: 00 00 add %al,(%rax) 2: e8 d5 29 09 fd callq 0xfd0929dc 7: 4c 89 e7 mov %r12,%rdi a: be 04 00 00 00 mov $0x4,%esi f: e8 c8 29 09 fd callq 0xfd0929dc 14: 42 8a 04 3b mov (%rbx,%r15,1),%al 18: 84 c0 test %al,%al 1a: 0f 85 be 00 00 00 jne 0xde 20: 8b 44 24 40 mov 0x40(%rsp),%eax 24: b9 01 00 00 00 mov $0x1,%ecx * 29: f0 41 0f b1 4d 00 lock cmpxchg %ecx,0x0(%r13) <-- trapping instruction 2f: 75 45 jne 0x76 31: 48 c7 44 24 20 0e 36 movq $0x45e0360e,0x20(%rsp) 38: e0 45 3a: 4b rex.WXB 3b: c7 .byte 0xc7 3c: 04 37 add $0x37,%al Link: https://syzkaller.appspot.com/bug?extid=b0003676644cf0d6acc4 Reported-by: syzbot+b0003676644cf0d6acc4@syzkaller.appspotmail.com Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
iluminat23
pushed a commit
that referenced
this pull request
Jan 20, 2022
commit 703535e upstream. No need to deduce command size in scsi_setup_scsi_cmnd() anymore as appropriate checks have been added to scsi_fill_sghdr_rq() function and the cmd_len should never be zero here. The code to do that wasn't correct anyway, as it used uninitialized cmd->cmnd, which caused a null-ptr-deref if the command size was zero as in the trace below. Fix this by removing the unneeded code. KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] CPU: 0 PID: 1822 Comm: repro Not tainted 5.15.0 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-4.fc34 04/01/2014 Call Trace: blk_mq_dispatch_rq_list+0x7c7/0x12d0 __blk_mq_sched_dispatch_requests+0x244/0x380 blk_mq_sched_dispatch_requests+0xf0/0x160 __blk_mq_run_hw_queue+0xe8/0x160 __blk_mq_delay_run_hw_queue+0x252/0x5d0 blk_mq_run_hw_queue+0x1dd/0x3b0 blk_mq_sched_insert_request+0x1ff/0x3e0 blk_execute_rq_nowait+0x173/0x1e0 blk_execute_rq+0x15c/0x540 sg_io+0x97c/0x1370 scsi_ioctl+0xe16/0x28e0 sd_ioctl+0x134/0x170 blkdev_ioctl+0x362/0x6e0 block_ioctl+0xb0/0xf0 vfs_ioctl+0xa7/0xf0 do_syscall_64+0x3d/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae ---[ end trace 8b086e334adef6d2 ]--- Kernel panic - not syncing: Fatal exception Link: https://lore.kernel.org/r/20211103170659.22151-2-tadeusz.struk@linaro.org Fixes: 2ceda20 ("scsi: core: Move command size detection out of the fast path") Cc: Bart Van Assche <bvanassche@acm.org> Cc: Christoph Hellwig <hch@lst.de> Cc: James E.J. Bottomley <jejb@linux.ibm.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: <linux-scsi@vger.kernel.org> Cc: <linux-kernel@vger.kernel.org> Cc: <stable@vger.kernel.org> # 5.15, 5.14, 5.10 Reported-by: syzbot+5516b30f5401d4dcbcae@syzkaller.appspotmail.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
iluminat23
pushed a commit
that referenced
this pull request
Jan 20, 2022
commit 3ef68d4 upstream. Kernel crashes when accessing port_speed sysfs file. The issue happens on a CNA when the local array was accessed beyond bounds. Fix this by changing the lookup. BUG: unable to handle kernel paging request at 0000000000004000 PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 15 PID: 455213 Comm: sosreport Kdump: loaded Not tainted 4.18.0-305.7.1.el8_4.x86_64 #1 RIP: 0010:string_nocheck+0x12/0x70 Code: 00 00 4c 89 e2 be 20 00 00 00 48 89 ef e8 86 9a 00 00 4c 01 e3 eb 81 90 49 89 f2 48 89 ce 48 89 f8 48 c1 fe 30 66 85 f6 74 4f <44> 0f b6 0a 45 84 c9 74 46 83 ee 01 41 b8 01 00 00 00 48 8d 7c 37 RSP: 0018:ffffb5141c1afcf0 EFLAGS: 00010286 RAX: ffff8bf4009f8000 RBX: ffff8bf4009f9000 RCX: ffff0a00ffffff04 RDX: 0000000000004000 RSI: ffffffffffffffff RDI: ffff8bf4009f8000 RBP: 0000000000004000 R08: 0000000000000001 R09: ffffb5141c1afb84 R10: ffff8bf4009f9000 R11: ffffb5141c1afce6 R12: ffff0a00ffffff04 R13: ffffffffc08e21aa R14: 0000000000001000 R15: ffffffffc08e21aa FS: 00007fc4ebfff700(0000) GS:ffff8c717f7c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000004000 CR3: 000000edfdee6006 CR4: 00000000001706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: string+0x40/0x50 vsnprintf+0x33c/0x520 scnprintf+0x4d/0x90 qla2x00_port_speed_show+0xb5/0x100 [qla2xxx] dev_attr_show+0x1c/0x40 sysfs_kf_seq_show+0x9b/0x100 seq_read+0x153/0x410 vfs_read+0x91/0x140 ksys_read+0x4f/0xb0 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x65/0xca Link: https://lore.kernel.org/r/20210908164622.19240-7-njavali@marvell.com Fixes: 4910b52 ("scsi: qla2xxx: Add support for setting port speed") Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
iluminat23
pushed a commit
that referenced
this pull request
Jan 20, 2022
commit e775eb9 upstream. When enable debug kernel configs,there will be calltrace as below: BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1 caller is debug_smp_processor_id+0x20/0x30 CPU: 6 PID: 1 Comm: swapper/0 Not tainted 5.10.63-yocto-standard #1 Hardware name: NXP Layerscape LX2160ARDB (DT) Call trace: dump_backtrace+0x0/0x1a0 show_stack+0x24/0x30 dump_stack+0xf0/0x13c check_preemption_disabled+0x100/0x110 debug_smp_processor_id+0x20/0x30 dpaa2_io_query_fq_count+0xdc/0x154 dpaa2_eth_stop+0x144/0x314 __dev_close_many+0xdc/0x160 __dev_change_flags+0xe8/0x220 dev_change_flags+0x30/0x70 ic_close_devs+0x50/0x78 ip_auto_config+0xed0/0xf10 do_one_initcall+0xac/0x460 kernel_init_freeable+0x30c/0x378 kernel_init+0x20/0x128 ret_from_fork+0x10/0x38 Based on comment in the context, it doesn't matter whether preemption is disable or not. So, replace smp_processor_id() with raw_smp_processor_id() to avoid above call trace. Fixes: c89105c ("staging: fsl-mc: Move DPIO from staging to drivers/soc/fsl") Cc: stable@vger.kernel.org Signed-off-by: Meng Li <Meng.Li@windriver.com> Signed-off-by: Li Yang <leoyang.li@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
iluminat23
pushed a commit
that referenced
this pull request
Jan 20, 2022
commit dc7e594 upstream. In orininal code, use 2 function spin_lock() and local_irq_save() to protect the critical zone. But when enable the kernel debug config, there are below inconsistent lock state detected. ================================ WARNING: inconsistent lock state 5.10.63-yocto-standard #1 Not tainted -------------------------------- inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. lock_torture_wr/226 [HC0[0]:SC1[5]:HE1:SE0] takes: ffff002005b2dd80 (&p->access_spinlock){+.?.}-{3:3}, at: qbman_swp_enqueue_multiple_mem_back+0x44/0x270 {SOFTIRQ-ON-W} state was registered at: lock_acquire.part.0+0xf8/0x250 lock_acquire+0x68/0x84 _raw_spin_lock+0x68/0x90 qbman_swp_enqueue_multiple_mem_back+0x44/0x270 ...... cryptomgr_test+0x38/0x60 kthread+0x158/0x164 ret_from_fork+0x10/0x38 irq event stamp: 4498 hardirqs last enabled at (4498): [<ffff800010fcf980>] _raw_spin_unlock_irqrestore+0x90/0xb0 hardirqs last disabled at (4497): [<ffff800010fcffc4>] _raw_spin_lock_irqsave+0xd4/0xe0 softirqs last enabled at (4458): [<ffff8000100108c4>] __do_softirq+0x674/0x724 softirqs last disabled at (4465): [<ffff80001005b2a4>] __irq_exit_rcu+0x190/0x19c other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&p->access_spinlock); <Interrupt> lock(&p->access_spinlock); *** DEADLOCK *** So, in order to avoid deadlock, use the combined functions spin_lock_irqsave/spin_unlock_irqrestore() to protect critical zone. Fixes: 3b2abda ("soc: fsl: dpio: Replace QMAN array mode with ring mode enqueue") Cc: stable@vger.kernel.org Signed-off-by: Meng Li <Meng.Li@windriver.com> Signed-off-by: Li Yang <leoyang.li@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
iluminat23
pushed a commit
that referenced
this pull request
Jan 20, 2022
[ Upstream commit dbb4cfe ] The interrupt handling should be related to the firmware version. If the driver matches an old firmware, then the driver should not handle interrupt such as i2c or dma, otherwise it will cause some errors. This log reveals it: [ 27.708641] INFO: trying to register non-static key. [ 27.710851] The code is fine but needs lockdep annotation, or maybe [ 27.712010] you didn't initialize this object before use? [ 27.712396] turning off the locking correctness validator. [ 27.712787] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.12.4-g70e7f0549188-dirty #169 [ 27.713349] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 [ 27.714149] Call Trace: [ 27.714329] <IRQ> [ 27.714480] dump_stack+0xba/0xf5 [ 27.714737] register_lock_class+0x873/0x8f0 [ 27.715052] ? __lock_acquire+0x323/0x1930 [ 27.715353] __lock_acquire+0x75/0x1930 [ 27.715636] lock_acquire+0x1dd/0x3e0 [ 27.715905] ? netup_i2c_interrupt+0x19/0x310 [ 27.716226] _raw_spin_lock_irqsave+0x4b/0x60 [ 27.716544] ? netup_i2c_interrupt+0x19/0x310 [ 27.716863] netup_i2c_interrupt+0x19/0x310 [ 27.717178] netup_unidvb_isr+0xd3/0x160 [ 27.717467] __handle_irq_event_percpu+0x53/0x3e0 [ 27.717808] handle_irq_event_percpu+0x35/0x90 [ 27.718129] handle_irq_event+0x39/0x60 [ 27.718409] handle_fasteoi_irq+0xc2/0x1d0 [ 27.718707] __common_interrupt+0x7f/0x150 [ 27.719008] common_interrupt+0xb4/0xd0 [ 27.719289] </IRQ> [ 27.719446] asm_common_interrupt+0x1e/0x40 [ 27.719747] RIP: 0010:native_safe_halt+0x17/0x20 [ 27.720084] Code: 07 0f 00 2d 8b ee 4c 00 f4 5d c3 0f 1f 84 00 00 00 00 00 8b 05 72 95 17 02 55 48 89 e5 85 c0 7e 07 0f 00 2d 6b ee 4c 00 fb f4 <5d> c3 cc cc cc cc cc cc cc 55 48 89 e5 e8 67 53 ff ff 8b 0d 29 f6 [ 27.721386] RSP: 0018:ffffc9000008fe90 EFLAGS: 00000246 [ 27.721758] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000 [ 27.722262] RDX: 0000000000000000 RSI: ffffffff85f7c054 RDI: ffffffff85ded4e6 [ 27.722770] RBP: ffffc9000008fe90 R08: 0000000000000001 R09: 0000000000000001 [ 27.723277] R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff86a75408 [ 27.723781] R13: 0000000000000000 R14: 0000000000000000 R15: ffff888100260000 [ 27.724289] default_idle+0x9/0x10 [ 27.724537] arch_cpu_idle+0xa/0x10 [ 27.724791] default_idle_call+0x6e/0x250 [ 27.725082] do_idle+0x1f0/0x2d0 [ 27.725326] cpu_startup_entry+0x18/0x20 [ 27.725613] start_secondary+0x11f/0x160 [ 27.725902] secondary_startup_64_no_verify+0xb0/0xbb [ 27.726272] BUG: kernel NULL pointer dereference, address: 0000000000000002 [ 27.726768] #PF: supervisor read access in kernel mode [ 27.727138] #PF: error_code(0x0000) - not-present page [ 27.727507] PGD 8000000118688067 P4D 8000000118688067 PUD 10feab067 PMD 0 [ 27.727999] Oops: 0000 [#1] PREEMPT SMP PTI [ 27.728302] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.12.4-g70e7f0549188-dirty #169 [ 27.728861] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 [ 27.729660] RIP: 0010:netup_i2c_interrupt+0x23/0x310 [ 27.730019] Code: 0f 1f 80 00 00 00 00 55 48 89 e5 41 55 41 54 53 48 89 fb e8 af 6e 95 fd 48 89 df e8 e7 9f 1c 01 49 89 c5 48 8b 83 48 08 00 00 <66> 44 8b 60 02 44 89 e0 48 8b 93 48 08 00 00 83 e0 f8 66 89 42 02 [ 27.731339] RSP: 0018:ffffc90000118e90 EFLAGS: 00010046 [ 27.731716] RAX: 0000000000000000 RBX: ffff88810803c4d8 RCX: 0000000000000000 [ 27.732223] RDX: 0000000000000001 RSI: ffffffff85d37b94 RDI: ffff88810803c4d8 [ 27.732727] RBP: ffffc90000118ea8 R08: 0000000000000000 R09: 0000000000000001 [ 27.733239] R10: ffff88810803c4f0 R11: 61646e6f63657320 R12: 0000000000000000 [ 27.733745] R13: 0000000000000046 R14: ffff888101041000 R15: ffff8881081b2400 [ 27.734251] FS: 0000000000000000(0000) GS:ffff88817bc80000(0000) knlGS:0000000000000000 [ 27.734821] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 27.735228] CR2: 0000000000000002 CR3: 0000000108194000 CR4: 00000000000006e0 [ 27.735735] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 27.736241] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 27.736744] Call Trace: [ 27.736924] <IRQ> [ 27.737074] netup_unidvb_isr+0xd3/0x160 [ 27.737363] __handle_irq_event_percpu+0x53/0x3e0 [ 27.737706] handle_irq_event_percpu+0x35/0x90 [ 27.738028] handle_irq_event+0x39/0x60 [ 27.738306] handle_fasteoi_irq+0xc2/0x1d0 [ 27.738602] __common_interrupt+0x7f/0x150 [ 27.738899] common_interrupt+0xb4/0xd0 [ 27.739176] </IRQ> [ 27.739331] asm_common_interrupt+0x1e/0x40 [ 27.739633] RIP: 0010:native_safe_halt+0x17/0x20 [ 27.739967] Code: 07 0f 00 2d 8b ee 4c 00 f4 5d c3 0f 1f 84 00 00 00 00 00 8b 05 72 95 17 02 55 48 89 e5 85 c0 7e 07 0f 00 2d 6b ee 4c 00 fb f4 <5d> c3 cc cc cc cc cc cc cc 55 48 89 e5 e8 67 53 ff ff 8b 0d 29 f6 [ 27.741275] RSP: 0018:ffffc9000008fe90 EFLAGS: 00000246 [ 27.741647] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000 [ 27.742148] RDX: 0000000000000000 RSI: ffffffff85f7c054 RDI: ffffffff85ded4e6 [ 27.742652] RBP: ffffc9000008fe90 R08: 0000000000000001 R09: 0000000000000001 [ 27.743154] R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff86a75408 [ 27.743652] R13: 0000000000000000 R14: 0000000000000000 R15: ffff888100260000 [ 27.744157] default_idle+0x9/0x10 [ 27.744405] arch_cpu_idle+0xa/0x10 [ 27.744658] default_idle_call+0x6e/0x250 [ 27.744948] do_idle+0x1f0/0x2d0 [ 27.745190] cpu_startup_entry+0x18/0x20 [ 27.745475] start_secondary+0x11f/0x160 [ 27.745761] secondary_startup_64_no_verify+0xb0/0xbb [ 27.746123] Modules linked in: [ 27.746348] Dumping ftrace buffer: [ 27.746596] (ftrace buffer empty) [ 27.746852] CR2: 0000000000000002 [ 27.747094] ---[ end trace ebafd46f83ab946d ]--- [ 27.747424] RIP: 0010:netup_i2c_interrupt+0x23/0x310 [ 27.747778] Code: 0f 1f 80 00 00 00 00 55 48 89 e5 41 55 41 54 53 48 89 fb e8 af 6e 95 fd 48 89 df e8 e7 9f 1c 01 49 89 c5 48 8b 83 48 08 00 00 <66> 44 8b 60 02 44 89 e0 48 8b 93 48 08 00 00 83 e0 f8 66 89 42 02 [ 27.749082] RSP: 0018:ffffc90000118e90 EFLAGS: 00010046 [ 27.749461] RAX: 0000000000000000 RBX: ffff88810803c4d8 RCX: 0000000000000000 [ 27.749966] RDX: 0000000000000001 RSI: ffffffff85d37b94 RDI: ffff88810803c4d8 [ 27.750471] RBP: ffffc90000118ea8 R08: 0000000000000000 R09: 0000000000000001 [ 27.750976] R10: ffff88810803c4f0 R11: 61646e6f63657320 R12: 0000000000000000 [ 27.751480] R13: 0000000000000046 R14: ffff888101041000 R15: ffff8881081b2400 [ 27.751986] FS: 0000000000000000(0000) GS:ffff88817bc80000(0000) knlGS:0000000000000000 [ 27.752560] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 27.752970] CR2: 0000000000000002 CR3: 0000000108194000 CR4: 00000000000006e0 [ 27.753481] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 27.753984] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 27.754487] Kernel panic - not syncing: Fatal exception in interrupt [ 27.755033] Dumping ftrace buffer: [ 27.755279] (ftrace buffer empty) [ 27.755534] Kernel Offset: disabled [ 27.755785] Rebooting in 1 seconds.. Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
iluminat23
pushed a commit
that referenced
this pull request
Jan 20, 2022
[ Upstream commit b220c15 ] Coverity complains of a possible NULL dereference: CID 120718 (#1 of 1): Dereference null return value (NULL_RETURNS) 23. dereference: Dereferencing a pointer that might be NULL state->bos when calling msm_gpu_crashstate_get_bo. [show details] 301 msm_gpu_crashstate_get_bo(state, submit->bos[i].obj, 302 submit->bos[i].iova, submit->bos[i].flags); Fix this by employing the same state->bos NULL check as is used in the next for loop. Cc: Rob Clark <robdclark@gmail.com> Cc: Sean Paul <sean@poorly.run> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: linux-arm-msm@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: freedreno@lists.freedesktop.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20210929162554.14295-1-tim.gardner@canonical.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
iluminat23
pushed a commit
that referenced
this pull request
Jan 20, 2022
[ Upstream commit 39fbef4 ] The following kernel crash can be triggered: [ 89.266592] ------------[ cut here ]------------ [ 89.267427] kernel BUG at fs/buffer.c:3020! [ 89.268264] invalid opcode: 0000 [#1] SMP KASAN PTI [ 89.269116] CPU: 7 PID: 1750 Comm: kmmpd-loop0 Not tainted 5.10.0-862.14.0.6.x86_64-08610-gc932cda3cef4-dirty #20 [ 89.273169] RIP: 0010:submit_bh_wbc.isra.0+0x538/0x6d0 [ 89.277157] RSP: 0018:ffff888105ddfd08 EFLAGS: 00010246 [ 89.278093] RAX: 0000000000000005 RBX: ffff888124231498 RCX: ffffffffb2772612 [ 89.279332] RDX: 1ffff11024846293 RSI: 0000000000000008 RDI: ffff888124231498 [ 89.280591] RBP: ffff8881248cc000 R08: 0000000000000001 R09: ffffed1024846294 [ 89.281851] R10: ffff88812423149f R11: ffffed1024846293 R12: 0000000000003800 [ 89.283095] R13: 0000000000000001 R14: 0000000000000000 R15: ffff8881161f7000 [ 89.284342] FS: 0000000000000000(0000) GS:ffff88839b5c0000(0000) knlGS:0000000000000000 [ 89.285711] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 89.286701] CR2: 00007f166ebc01a0 CR3: 0000000435c0e000 CR4: 00000000000006e0 [ 89.287919] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 89.289138] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 89.290368] Call Trace: [ 89.290842] write_mmp_block+0x2ca/0x510 [ 89.292218] kmmpd+0x433/0x9a0 [ 89.294902] kthread+0x2dd/0x3e0 [ 89.296268] ret_from_fork+0x22/0x30 [ 89.296906] Modules linked in: by running the following commands: 1. mkfs.ext4 -O mmp /dev/sda -b 1024 2. mount /dev/sda /home/test 3. echo "/dev/sda" > /sys/power/resume That happens because swsusp_check() calls set_blocksize() on the target partition which confuses the file system: Thread1 Thread2 mount /dev/sda /home/test get s_mmp_bh --> has mapped flag start kmmpd thread echo "/dev/sda" > /sys/power/resume resume_store software_resume swsusp_check set_blocksize truncate_inode_pages_range truncate_cleanup_page block_invalidatepage discard_buffer --> clean mapped flag write_mmp_block submit_bh submit_bh_wbc BUG_ON(!buffer_mapped(bh)) To address this issue, modify swsusp_check() to open the target block device with exclusive access. Signed-off-by: Ye Bin <yebin10@huawei.com> [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
iluminat23
pushed a commit
that referenced
this pull request
Jan 20, 2022
[ Upstream commit 8ef9dc0 ] We got the following lockdep splat while running fstests (specifically btrfs/003 and btrfs/020 in a row) with the new rc. This was uncovered by 87579e9 ("loop: use worker per cgroup instead of kworker") which converted loop to using workqueues, which comes with lockdep annotations that don't exist with kworkers. The lockdep splat is as follows: WARNING: possible circular locking dependency detected 5.14.0-rc2-custom+ #34 Not tainted ------------------------------------------------------ losetup/156417 is trying to acquire lock: ffff9c7645b02d38 ((wq_completion)loop0){+.+.}-{0:0}, at: flush_workqueue+0x84/0x600 but task is already holding lock: ffff9c7647395468 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x650 [loop] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #5 (&lo->lo_mutex){+.+.}-{3:3}: __mutex_lock+0xba/0x7c0 lo_open+0x28/0x60 [loop] blkdev_get_whole+0x28/0xf0 blkdev_get_by_dev.part.0+0x168/0x3c0 blkdev_open+0xd2/0xe0 do_dentry_open+0x163/0x3a0 path_openat+0x74d/0xa40 do_filp_open+0x9c/0x140 do_sys_openat2+0xb1/0x170 __x64_sys_openat+0x54/0x90 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae -> #4 (&disk->open_mutex){+.+.}-{3:3}: __mutex_lock+0xba/0x7c0 blkdev_get_by_dev.part.0+0xd1/0x3c0 blkdev_get_by_path+0xc0/0xd0 btrfs_scan_one_device+0x52/0x1f0 [btrfs] btrfs_control_ioctl+0xac/0x170 [btrfs] __x64_sys_ioctl+0x83/0xb0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae -> #3 (uuid_mutex){+.+.}-{3:3}: __mutex_lock+0xba/0x7c0 btrfs_rm_device+0x48/0x6a0 [btrfs] btrfs_ioctl+0x2d1c/0x3110 [btrfs] __x64_sys_ioctl+0x83/0xb0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae -> #2 (sb_writers#11){.+.+}-{0:0}: lo_write_bvec+0x112/0x290 [loop] loop_process_work+0x25f/0xcb0 [loop] process_one_work+0x28f/0x5d0 worker_thread+0x55/0x3c0 kthread+0x140/0x170 ret_from_fork+0x22/0x30 -> #1 ((work_completion)(&lo->rootcg_work)){+.+.}-{0:0}: process_one_work+0x266/0x5d0 worker_thread+0x55/0x3c0 kthread+0x140/0x170 ret_from_fork+0x22/0x30 -> #0 ((wq_completion)loop0){+.+.}-{0:0}: __lock_acquire+0x1130/0x1dc0 lock_acquire+0xf5/0x320 flush_workqueue+0xae/0x600 drain_workqueue+0xa0/0x110 destroy_workqueue+0x36/0x250 __loop_clr_fd+0x9a/0x650 [loop] lo_ioctl+0x29d/0x780 [loop] block_ioctl+0x3f/0x50 __x64_sys_ioctl+0x83/0xb0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae other info that might help us debug this: Chain exists of: (wq_completion)loop0 --> &disk->open_mutex --> &lo->lo_mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&lo->lo_mutex); lock(&disk->open_mutex); lock(&lo->lo_mutex); lock((wq_completion)loop0); *** DEADLOCK *** 1 lock held by losetup/156417: #0: ffff9c7647395468 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x650 [loop] stack backtrace: CPU: 8 PID: 156417 Comm: losetup Not tainted 5.14.0-rc2-custom+ #34 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 Call Trace: dump_stack_lvl+0x57/0x72 check_noncircular+0x10a/0x120 __lock_acquire+0x1130/0x1dc0 lock_acquire+0xf5/0x320 ? flush_workqueue+0x84/0x600 flush_workqueue+0xae/0x600 ? flush_workqueue+0x84/0x600 drain_workqueue+0xa0/0x110 destroy_workqueue+0x36/0x250 __loop_clr_fd+0x9a/0x650 [loop] lo_ioctl+0x29d/0x780 [loop] ? __lock_acquire+0x3a0/0x1dc0 ? update_dl_rq_load_avg+0x152/0x360 ? lock_is_held_type+0xa5/0x120 ? find_held_lock.constprop.0+0x2b/0x80 block_ioctl+0x3f/0x50 __x64_sys_ioctl+0x83/0xb0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f645884de6b Usually the uuid_mutex exists to protect the fs_devices that map together all of the devices that match a specific uuid. In rm_device we're messing with the uuid of a device, so it makes sense to protect that here. However in doing that it pulls in a whole host of lockdep dependencies, as we call mnt_may_write() on the sb before we grab the uuid_mutex, thus we end up with the dependency chain under the uuid_mutex being added under the normal sb write dependency chain, which causes problems with loop devices. We don't need the uuid mutex here however. If we call btrfs_scan_one_device() before we scratch the super block we will find the fs_devices and not find the device itself and return EBUSY because the fs_devices is open. If we call it after the scratch happens it will not appear to be a valid btrfs file system. We do not need to worry about other fs_devices modifying operations here because we're protected by the exclusive operations locking. So drop the uuid_mutex here in order to fix the lockdep splat. A more detailed explanation from the discussion: We are worried about rm and scan racing with each other, before this change we'll zero the device out under the UUID mutex so when scan does run it'll make sure that it can go through the whole device scan thing without rm messing with us. We aren't worried if the scratch happens first, because the result is we don't think this is a btrfs device and we bail out. The only case we are concerned with is we scratch _after_ scan is able to read the superblock and gets a seemingly valid super block, so lets consider this case. Scan will call device_list_add() with the device we're removing. We'll call find_fsid_with_metadata_uuid() and get our fs_devices for this UUID. At this point we lock the fs_devices->device_list_mutex. This is what protects us in this case, but we have two cases here. 1. We aren't to the device removal part of the RM. We found our device, and device name matches our path, we go down and we set total_devices to our super number of devices, which doesn't affect anything because we haven't done the remove yet. 2. We are past the device removal part, which is protected by the device_list_mutex. Scan doesn't find the device, it goes down and does the if (fs_devices->opened) return -EBUSY; check and we bail out. Nothing about this situation is ideal, but the lockdep splat is real, and the fix is safe, tho admittedly a bit scary looking. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> [ copy more from the discussion ] Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
iluminat23
pushed a commit
that referenced
this pull request
Jan 20, 2022
…eam state [ Upstream commit b7b1d02 ] The internal stream state sets the timeout to 120 seconds 2 seconds after the creation of the flow, attach this internal stream state to the IPS_ASSURED flag for consistent event reporting. Before this patch: [NEW] udp 17 30 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 [UNREPLIED] src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282 [UPDATE] udp 17 30 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282 [UPDATE] udp 17 30 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282 [ASSURED] [DESTROY] udp 17 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282 [ASSURED] Note IPS_ASSURED for the flow not yet in the internal stream state. after this update: [NEW] udp 17 30 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 [UNREPLIED] src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282 [UPDATE] udp 17 30 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282 [UPDATE] udp 17 120 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282 [ASSURED] [DESTROY] udp 17 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282 [ASSURED] Before this patch, short-lived UDP flows never entered IPS_ASSURED, so they were already candidate flow to be deleted by early_drop under stress. Before this patch, IPS_ASSURED is set on regardless the internal stream state, attach this internal stream state to IPS_ASSURED. packet #1 (original direction) enters NEW state packet #2 (reply direction) enters ESTABLISHED state, sets on IPS_SEEN_REPLY paclet #3 (any direction) sets on IPS_ASSURED (if 2 seconds since the creation has passed by). Reported-by: Maciej Żenczykowski <zenczykowski@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
iluminat23
pushed a commit
that referenced
this pull request
Jan 20, 2022
[ Upstream commit 4ef0c5c ] There is a small race between copy_process() and sched_fork() where child->sched_task_group point to an already freed pointer. parent doing fork() | someone moving the parent | to another cgroup -------------------------------+------------------------------- copy_process() + dup_task_struct()<1> parent move to another cgroup, and free the old cgroup. <2> + sched_fork() + __set_task_cpu()<3> + task_fork_fair() + sched_slice()<4> In the worst case, this bug can lead to "use-after-free" and cause panic as shown above: (1) parent copy its sched_task_group to child at <1>; (2) someone move the parent to another cgroup and free the old cgroup at <2>; (3) the sched_task_group and cfs_rq that belong to the old cgroup will be accessed at <3> and <4>, which cause a panic: [] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 [] PGD 8000001fa0a86067 P4D 8000001fa0a86067 PUD 2029955067 PMD 0 [] Oops: 0000 [#1] SMP PTI [] CPU: 7 PID: 648398 Comm: ebizzy Kdump: loaded Tainted: G OE --------- - - 4.18.0.x86_64+ #1 [] RIP: 0010:sched_slice+0x84/0xc0 [] Call Trace: [] task_fork_fair+0x81/0x120 [] sched_fork+0x132/0x240 [] copy_process.part.5+0x675/0x20e0 [] ? __handle_mm_fault+0x63f/0x690 [] _do_fork+0xcd/0x3b0 [] do_syscall_64+0x5d/0x1d0 [] entry_SYSCALL_64_after_hwframe+0x65/0xca [] RIP: 0033:0x7f04418cd7e1 Between cgroup_can_fork() and cgroup_post_fork(), the cgroup membership and thus sched_task_group can't change. So update child's sched_task_group at sched_post_fork() and move task_fork() and __set_task_cpu() (where accees the sched_task_group) from sched_fork() to sched_post_fork(). Fixes: 8323f26 ("sched: Fix race in task_group") Signed-off-by: Zhang Qiao <zhangqiao22@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lkml.kernel.org/r/20210915064030.2231-1-zhangqiao22@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
iluminat23
pushed a commit
that referenced
this pull request
Jan 20, 2022
[ Upstream commit 6d0d1b5 ] If the device used as a serial console gets detached/attached at runtime, register_console() will try to call imx_uart_setup_console(), but this is not possible since it is marked as __init. For instance # cat /sys/devices/virtual/tty/console/active tty1 ttymxc0 # echo -n N > /sys/devices/virtual/tty/console/subsystem/ttymxc0/console # echo -n Y > /sys/devices/virtual/tty/console/subsystem/ttymxc0/console [ 73.166649] 8<--- cut here --- [ 73.167005] Unable to handle kernel paging request at virtual address c154d928 [ 73.167601] pgd = 55433e84 [ 73.167875] [c154d928] *pgd=8141941e(bad) [ 73.168304] Internal error: Oops: 8000000d [#1] SMP ARM [ 73.168429] Modules linked in: [ 73.168522] CPU: 0 PID: 536 Comm: sh Not tainted 5.15.0-rc6-00056-g3968ddcf05fb #3 [ 73.168675] Hardware name: Freescale i.MX6 Ultralite (Device Tree) [ 73.168791] PC is at imx_uart_console_setup+0x0/0x238 [ 73.168927] LR is at try_enable_new_console+0x98/0x124 [ 73.169056] pc : [<c154d928>] lr : [<c0196f44>] psr: a0000013 [ 73.169178] sp : c2ef5e70 ip : 00000000 fp : 00000000 [ 73.169281] r10: 00000000 r9 : c02cf970 r8 : 00000000 [ 73.169389] r7 : 00000001 r6 : 00000001 r5 : c1760164 r4 : c1e0fb08 [ 73.169512] r3 : c154d928 r2 : 00000000 r1 : efffcbd r0 : c1760164 [ 73.169641] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none [ 73.169782] Control: 10c5387d Table: 8345406a DAC: 00000051 [ 73.169895] Register r0 information: non-slab/vmalloc memory [ 73.170032] Register r1 information: non-slab/vmalloc memory [ 73.170158] Register r2 information: NULL pointer [ 73.170273] Register r3 information: non-slab/vmalloc memory [ 73.170397] Register r4 information: non-slab/vmalloc memory [ 73.170521] Register r5 information: non-slab/vmalloc memory [ 73.170647] Register r6 information: non-paged memory [ 73.170771] Register r7 information: non-paged memory [ 73.170892] Register r8 information: NULL pointer [ 73.171009] Register r9 information: non-slab/vmalloc memory [ 73.171142] Register r10 information: NULL pointer [ 73.171259] Register r11 information: NULL pointer [ 73.171375] Register r12 information: NULL pointer [ 73.171494] Process sh (pid: 536, stack limit = 0xcd1ba82f) [ 73.171621] Stack: (0xc2ef5e70 to 0xc2ef6000) [ 73.171731] 5e60: ???????? ???????? ???????? ???????? [ 73.171899] 5e80: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ???????? [ 73.172059] 5ea0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ???????? [ 73.172217] 5ec0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ???????? [ 73.172377] 5ee0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ???????? [ 73.172537] 5f00: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ???????? [ 73.172698] 5f20: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ???????? [ 73.172856] 5f40: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ???????? [ 73.173016] 5f60: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ???????? [ 73.173177] 5f80: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ???????? [ 73.173336] 5fa0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ???????? [ 73.173496] 5fc0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ???????? [ 73.173654] 5fe0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ???????? [ 73.173826] [<c0196f44>] (try_enable_new_console) from [<c01984a8>] (register_console+0x10c/0x2ec) [ 73.174053] [<c01984a8>] (register_console) from [<c06e2c90>] (console_store+0x14c/0x168) [ 73.174262] [<c06e2c90>] (console_store) from [<c0383718>] (kernfs_fop_write_iter+0x110/0x1cc) [ 73.174470] [<c0383718>] (kernfs_fop_write_iter) from [<c02cf5f4>] (vfs_write+0x31c/0x548) [ 73.174679] [<c02cf5f4>] (vfs_write) from [<c02cf970>] (ksys_write+0x60/0xec) [ 73.174863] [<c02cf970>] (ksys_write) from [<c0100080>] (ret_fast_syscall+0x0/0x1c) [ 73.175052] Exception stack(0xc2ef5fa8 to 0xc2ef5ff0) [ 73.175167] 5fa0: ???????? ???????? ???????? ???????? ???????? ???????? [ 73.175327] 5fc0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ???????? [ 73.175486] 5fe0: ???????? ???????? ???????? ???????? [ 73.175608] Code: 00000000 00000000 00000000 00000000 (00000000) [ 73.175744] ---[ end trace 9b75121265109bf1 ]--- A similar issue could be triggered by unbinding/binding the serial console device [*]. Drop __init so that imx_uart_setup_console() can be safely called at runtime. [*] https://lore.kernel.org/all/20181114174940.7865-3-stefan@agner.ch/ Fixes: a3cb39d ("serial: core: Allow detach and attach serial device for console") Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Stefan Agner <stefan@agner.ch> Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Link: https://lore.kernel.org/r/20211020192643.476895-2-francesco.dolcini@toradex.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
iluminat23
pushed a commit
that referenced
this pull request
Jan 20, 2022
[ Upstream commit f0caea8 ] Olga reports seeing the following Oops when doing O_DIRECT writes to a pNFS flexfiles server: Oops: 0000 [#1] SMP PTI CPU: 1 PID: 234186 Comm: kworker/u8:1 Not tainted 5.15.0-rc4+ #4 Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.13.0-2.module+el8.3.0+7353+9de0a3cc 04/01/2014 Workqueue: nfsiod rpc_async_release [sunrpc] RIP: 0010:nfs_mark_request_commit+0x12/0x30 [nfs] Code: ff ff be 03 00 00 00 e8 ac 34 83 eb e9 29 ff ff ff e8 22 bc d7 eb 66 90 0f 1f 44 00 00 48 85 f6 74 16 48 8b 42 10 48 8b 40 18 <48> 8b 40 18 48 85 c0 74 05 e9 70 fc 15 ec 48 89 d6 e9 68 ed ff ff RSP: 0018:ffffa82f0159fe00 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff8f3393141880 RCX: 0000000000000000 RDX: ffffa82f0159fe08 RSI: ffff8f3381252500 RDI: ffff8f3393141880 RBP: ffff8f33ac317c00 R08: 0000000000000000 R09: ffff8f3487724cb0 R10: 0000000000000008 R11: 0000000000000001 R12: 0000000000000001 R13: ffff8f3485bccee0 R14: ffff8f33ac317c10 R15: ffff8f33ac317cd8 FS: 0000000000000000(0000) GS:ffff8f34fbc80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000018 CR3: 0000000122120006 CR4: 0000000000770ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: nfs_direct_write_completion+0x13b/0x250 [nfs] rpc_free_task+0x39/0x60 [sunrpc] rpc_async_release+0x29/0x40 [sunrpc] process_one_work+0x1ce/0x370 worker_thread+0x30/0x380 ? process_one_work+0x370/0x370 kthread+0x11a/0x140 ? set_kthread_struct+0x40/0x40 ret_from_fork+0x22/0x30 Reported-by: Olga Kornievskaia <aglo@umich.edu> Fixes: 9c455a8 ("NFS/pNFS: Clean up pNFS commit operations") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
iluminat23
pushed a commit
that referenced
this pull request
Jan 20, 2022
[ Upstream commit c98c5da ] Current code does list element deletion and addition in and out of lock protection. This patch moves deletion behind lock. list_add double add: new=ffff9130b5eb89f8, prev=ffff9130b5eb89f8, next=ffff9130c6a715f0. ------------[ cut here ]------------ kernel BUG at lib/list_debug.c:31! invalid opcode: 0000 [#1] SMP PTI CPU: 1 PID: 182395 Comm: kworker/1:37 Kdump: loaded Tainted: G W OE --------- - - 4.18.0-193.el8.x86_64 #1 Hardware name: HP ProLiant DL160 Gen8, BIOS J03 02/10/2014 Workqueue: qla2xxx_wq qla2x00_iocb_work_fn [qla2xxx] RIP: 0010:__list_add_valid+0x41/0x50 Code: 85 94 00 00 00 48 39 c7 74 0b 48 39 d7 74 06 b8 01 00 00 00 c3 48 89 f2 4c 89 c1 48 89 fe 48 c7 c7 60 83 ad 97 e8 4d bd ce ff <0f> 0b 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 48 8b 07 48 8b 57 08 RSP: 0018:ffffaba306f47d68 EFLAGS: 00010046 RAX: 0000000000000058 RBX: ffff9130b5eb8800 RCX: 0000000000000006 RDX: 0000000000000000 RSI: 0000000000000096 RDI: ffff9130b7456a00 RBP: ffff9130c6a70a58 R08: 000000000008d7be R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000001 R12: ffff9130c6a715f0 R13: ffff9130b5eb8824 R14: ffff9130b5eb89f8 R15: ffff9130b5eb89f8 FS: 0000000000000000(0000) GS:ffff9130b7440000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007efcaaef11a0 CR3: 000000005200a002 CR4: 00000000000606e0 Call Trace: qla24xx_async_gnl+0x113/0x3c0 [qla2xxx] ? qla2x00_iocb_work_fn+0x53/0x80 [qla2xxx] ? process_one_work+0x1a7/0x3b0 ? worker_thread+0x30/0x390 ? create_worker+0x1a0/0x1a0 ? kthread+0x112/0x130 Link: https://lore.kernel.org/r/20211026115412.27691-3-njavali@marvell.com Fixes: 726b854 ("qla2xxx: Add framework for async fabric discovery") Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
iluminat23
pushed a commit
that referenced
this pull request
Jan 20, 2022
[ Upstream commit e140c79 ] When fully configure VLANs for a VF, then unload the VF while triggering a reset to PF, will cause a kernel crash because the irq is already uninit. [ 293.177579] ------------[ cut here ]------------ [ 293.183502] kernel BUG at drivers/pci/msi.c:352! [ 293.189547] Internal error: Oops - BUG: 0 [#1] SMP ...... [ 293.390124] Workqueue: hclgevf hclgevf_service_task [hclgevf] [ 293.402627] pstate: 80c00009 (Nzcv daif +PAN +UAO) [ 293.414324] pc : free_msi_irqs+0x19c/0x1b8 [ 293.425429] lr : free_msi_irqs+0x18c/0x1b8 [ 293.436545] sp : ffff00002716fbb0 [ 293.446950] x29: ffff00002716fbb0 x28: 0000000000000000 [ 293.459519] x27: 0000000000000000 x26: ffff45b91ea16b00 [ 293.472183] x25: 0000000000000000 x24: ffffa587b08f4700 [ 293.484717] x23: ffffc591ac30e000 x22: ffffa587b08f8428 [ 293.497190] x21: ffffc591ac30e300 x20: 0000000000000000 [ 293.509594] x19: ffffa58a062a8300 x18: 0000000000000000 [ 293.521949] x17: 0000000000000000 x16: ffff45b91dcc3f48 [ 293.534013] x15: 0000000000000000 x14: 0000000000000000 [ 293.545883] x13: 0000000000000040 x12: 0000000000000228 [ 293.557508] x11: 0000000000000020 x10: 0000000000000040 [ 293.568889] x9 : ffff45b91ea1e190 x8 : ffffc591802d0000 [ 293.580123] x7 : ffffc591802d0148 x6 : 0000000000000120 [ 293.591190] x5 : ffffc591802d0000 x4 : 0000000000000000 [ 293.602015] x3 : 0000000000000000 x2 : 0000000000000000 [ 293.612624] x1 : 00000000000004a4 x0 : ffffa58a1e0c6b80 [ 293.623028] Call trace: [ 293.630340] free_msi_irqs+0x19c/0x1b8 [ 293.638849] pci_disable_msix+0x118/0x140 [ 293.647452] pci_free_irq_vectors+0x20/0x38 [ 293.656081] hclgevf_uninit_msi+0x44/0x58 [hclgevf] [ 293.665309] hclgevf_reset_rebuild+0x1ac/0x2e0 [hclgevf] [ 293.674866] hclgevf_reset+0x358/0x400 [hclgevf] [ 293.683545] hclgevf_reset_service_task+0xd0/0x1b0 [hclgevf] [ 293.693325] hclgevf_service_task+0x4c/0x2e8 [hclgevf] [ 293.702307] process_one_work+0x1b0/0x448 [ 293.710034] worker_thread+0x54/0x468 [ 293.717331] kthread+0x134/0x138 [ 293.724114] ret_from_fork+0x10/0x18 [ 293.731324] Code: f940b000 b4ffff00 a903e7b8 f90017b6 (d4210000) This patch fixes the problem by waiting for the VF reset done while unloading the VF. Fixes: e2cb1de ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support") Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
tboehler1
pushed a commit
that referenced
this pull request
Sep 1, 2025
commit 54d5cd4 upstream. Usage of the intel_pmt_read() for binary sysfs, requires a pcidev. The current use of the endpoint value is only valid for telemetry endpoint usage. Without the ep, the crashlog usage causes the following NULL pointer exception: BUG: kernel NULL pointer dereference, address: 0000000000000000 Oops: Oops: 0000 [#1] SMP NOPTI RIP: 0010:intel_pmt_read+0x3b/0x70 [pmt_class] Code: Call Trace: <TASK> ? sysfs_kf_bin_read+0xc0/0xe0 kernfs_fop_read_iter+0xac/0x1a0 vfs_read+0x26d/0x350 ksys_read+0x6b/0xe0 __x64_sys_read+0x1d/0x30 x64_sys_call+0x1bc8/0x1d70 do_syscall_64+0x6d/0x110 Augment struct intel_pmt_entry with a pointer to the pcidev to avoid the NULL pointer exception. Fixes: 045a513 ("platform/x86/intel/pmt: Use PMT callbacks") Cc: stable@vger.kernel.org Reviewed-by: David E. Box <david.e.box@linux.intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Link: https://lore.kernel.org/r/20250713172943.7335-2-michael.j.ruhl@intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tboehler1
pushed a commit
that referenced
this pull request
Sep 1, 2025
commit ae42c6f upstream. If ti_csi2rx_start_dma() fails in ti_csi2rx_dma_callback(), the buffer is marked done with VB2_BUF_STATE_ERROR but is not removed from the DMA queue. This causes the same buffer to be retried in the next iteration, resulting in a double list_del() and eventual list corruption. Fix this by removing the buffer from the queue before calling vb2_buffer_done() on error. This resolves a crash due to list_del corruption: [ 37.811243] j721e-csi2rx 30102000.ticsi2rx: Failed to queue the next buffer for DMA [ 37.832187] slab kmalloc-2k start ffff00000255b000 pointer offset 1064 size 2048 [ 37.839761] list_del corruption. next->prev should be ffff00000255bc28, but was ffff00000255d428. (next=ffff00000255b428) [ 37.850799] ------------[ cut here ]------------ [ 37.855424] kernel BUG at lib/list_debug.c:65! [ 37.859876] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP [ 37.866061] Modules linked in: i2c_dev usb_f_rndis u_ether libcomposite dwc3 udc_core usb_common aes_ce_blk aes_ce_cipher ghash_ce gf128mul sha1_ce cpufreq_dt dwc3_am62 phy_gmii_sel sa2ul [ 37.882830] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.16.0-rc3+ #28 VOLUNTARY [ 37.890851] Hardware name: Bosch STLA-GSRV2-B0 (DT) [ 37.895737] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 37.902703] pc : __list_del_entry_valid_or_report+0xdc/0x114 [ 37.908390] lr : __list_del_entry_valid_or_report+0xdc/0x114 [ 37.914059] sp : ffff800080003db0 [ 37.917375] x29: ffff800080003db0 x28: 0000000000000007 x27: ffff800080e50000 [ 37.924521] x26: 0000000000000000 x25: ffff0000016abb50 x24: dead000000000122 [ 37.931666] x23: ffff0000016abb78 x22: ffff0000016ab080 x21: ffff800080003de0 [ 37.938810] x20: ffff00000255bc00 x19: ffff00000255b800 x18: 000000000000000a [ 37.945956] x17: 20747562202c3832 x16: 6362353532303030 x15: 0720072007200720 [ 37.953101] x14: 0720072007200720 x13: 0720072007200720 x12: 00000000ffffffea [ 37.960248] x11: ffff800080003b18 x10: 00000000ffffefff x9 : ffff800080f5b568 [ 37.967396] x8 : ffff800080f5b5c0 x7 : 0000000000017fe8 x6 : c0000000ffffefff [ 37.974542] x5 : ffff00000fea6688 x4 : 0000000000000000 x3 : 0000000000000000 [ 37.981686] x2 : 0000000000000000 x1 : ffff800080ef2b40 x0 : 000000000000006d [ 37.988832] Call trace: [ 37.991281] __list_del_entry_valid_or_report+0xdc/0x114 (P) [ 37.996959] ti_csi2rx_dma_callback+0x84/0x1c4 [ 38.001419] udma_vchan_complete+0x1e0/0x344 [ 38.005705] tasklet_action_common+0x118/0x310 [ 38.010163] tasklet_action+0x30/0x3c [ 38.013832] handle_softirqs+0x10c/0x2e0 [ 38.017761] __do_softirq+0x14/0x20 [ 38.021256] ____do_softirq+0x10/0x20 [ 38.024931] call_on_irq_stack+0x24/0x60 [ 38.028873] do_softirq_own_stack+0x1c/0x40 [ 38.033064] __irq_exit_rcu+0x130/0x15c [ 38.036909] irq_exit_rcu+0x10/0x20 [ 38.040403] el1_interrupt+0x38/0x60 [ 38.043987] el1h_64_irq_handler+0x18/0x24 [ 38.048091] el1h_64_irq+0x6c/0x70 [ 38.051501] default_idle_call+0x34/0xe0 (P) [ 38.055783] do_idle+0x1f8/0x250 [ 38.059021] cpu_startup_entry+0x34/0x3c [ 38.062951] rest_init+0xb4/0xc0 [ 38.066186] console_on_rootfs+0x0/0x6c [ 38.070031] __primary_switched+0x88/0x90 [ 38.074059] Code: b00037e0 91378000 f9400462 97e9bf49 (d4210000) [ 38.080168] ---[ end trace 0000000000000000 ]--- [ 38.084795] Kernel panic - not syncing: Oops - BUG: Fatal exception in interrupt [ 38.092197] SMP: stopping secondary CPUs [ 38.096139] Kernel Offset: disabled [ 38.099631] CPU features: 0x0000,00002000,02000801,0400420b [ 38.105202] Memory Limit: none [ 38.108260] ---[ end Kernel panic - not syncing: Oops - BUG: Fatal exception in interrupt ]--- Fixes: b4a3d87 ("media: ti: Add CSI2RX support for J721E") Cc: stable@vger.kernel.org Suggested-by: Sjoerd Simons <sjoerd@collabora.com> Signed-off-by: Sjoerd Simons <sjoerd@collabora.com> Signed-off-by: Julien Massot <julien.massot@collabora.com> Reviewed-by: Jai Luthra <jai.luthra@linux.dev> Tested-by: Dirk Behme <dirk.behme@de.bosch.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tboehler1
pushed a commit
that referenced
this pull request
Sep 1, 2025
…er dereference commit 1bb3363 upstream. A malicious HID device with quirk APPLE_MAGIC_BACKLIGHT can trigger a NULL pointer dereference whilst the power feature-report is toggled and sent to the device in apple_magic_backlight_report_set(). The power feature-report is expected to have two data fields, but if the descriptor declares one field then accessing field[1] and dereferencing it in apple_magic_backlight_report_set() becomes invalid since field[1] will be NULL. An example of a minimal descriptor which can cause the crash is something like the following where the report with ID 3 (power report) only references a single 1-byte field. When hid core parses the descriptor it will encounter the final feature tag, allocate a hid_report (all members of field[] will be zeroed out), create field structure and populate it, increasing the maxfield to 1. The subsequent field[1] access and dereference causes the crash. Usage Page (Vendor Defined 0xFF00) Usage (0x0F) Collection (Application) Report ID (1) Usage (0x01) Logical Minimum (0) Logical Maximum (255) Report Size (8) Report Count (1) Feature (Data,Var,Abs) Usage (0x02) Logical Maximum (32767) Report Size (16) Report Count (1) Feature (Data,Var,Abs) Report ID (3) Usage (0x03) Logical Minimum (0) Logical Maximum (1) Report Size (8) Report Count (1) Feature (Data,Var,Abs) End Collection Here we see the KASAN splat when the kernel dereferences the NULL pointer and crashes: [ 15.164723] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1] SMP KASAN NOPTI [ 15.165691] KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037] [ 15.165691] CPU: 0 UID: 0 PID: 10 Comm: kworker/0:1 Not tainted 6.15.0 #31 PREEMPT(voluntary) [ 15.165691] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 15.165691] RIP: 0010:apple_magic_backlight_report_set+0xbf/0x210 [ 15.165691] Call Trace: [ 15.165691] <TASK> [ 15.165691] apple_probe+0x571/0xa20 [ 15.165691] hid_device_probe+0x2e2/0x6f0 [ 15.165691] really_probe+0x1ca/0x5c0 [ 15.165691] __driver_probe_device+0x24f/0x310 [ 15.165691] driver_probe_device+0x4a/0xd0 [ 15.165691] __device_attach_driver+0x169/0x220 [ 15.165691] bus_for_each_drv+0x118/0x1b0 [ 15.165691] __device_attach+0x1d5/0x380 [ 15.165691] device_initial_probe+0x12/0x20 [ 15.165691] bus_probe_device+0x13d/0x180 [ 15.165691] device_add+0xd87/0x1510 [...] To fix this issue we should validate the number of fields that the backlight and power reports have and if they do not have the required number of fields then bail. Fixes: 394ba61 ("HID: apple: Add support for magic keyboard backlight on T2 Macs") Cc: stable@vger.kernel.org Signed-off-by: Qasim Ijaz <qasdev00@gmail.com> Reviewed-by: Orlando Chamberlain <orlandoch.dev@gmail.com> Tested-by: Aditya Garg <gargaditya08@live.com> Link: https://patch.msgid.link/20250713233008.15131-1-qasdev00@gmail.com Signed-off-by: Benjamin Tissoires <bentiss@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tboehler1
pushed a commit
that referenced
this pull request
Sep 1, 2025
commit 151c0aa upstream. 1. In func configfs_composite_bind() -> composite_os_desc_req_prepare(): if kmalloc fails, the pointer cdev->os_desc_req will be freed but not set to NULL. Then it will return a failure to the upper-level function. 2. in func configfs_composite_bind() -> composite_dev_cleanup(): it will checks whether cdev->os_desc_req is NULL. If it is not NULL, it will attempt to use it.This will lead to a use-after-free issue. BUG: KASAN: use-after-free in composite_dev_cleanup+0xf4/0x2c0 Read of size 8 at addr 0000004827837a00 by task init/1 CPU: 10 PID: 1 Comm: init Tainted: G O 5.10.97-oh #1 kasan_report+0x188/0x1cc __asan_load8+0xb4/0xbc composite_dev_cleanup+0xf4/0x2c0 configfs_composite_bind+0x210/0x7ac udc_bind_to_driver+0xb4/0x1ec usb_gadget_probe_driver+0xec/0x21c gadget_dev_desc_UDC_store+0x264/0x27c Fixes: 37a3a53 ("usb: gadget: OS Feature Descriptors support") Cc: stable <stable@kernel.org> Signed-off-by: Tao Xue <xuetao09@huawei.com> Link: https://lore.kernel.org/r/20250721093908.14967-1-xuetao09@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tboehler1
pushed a commit
that referenced
this pull request
Sep 1, 2025
[ Upstream commit 736a051 ] The hfs_find_init() method can trigger the crash if tree pointer is NULL: [ 45.746290][ T9787] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000008: 0000 [#1] SMP KAI [ 45.747287][ T9787] KASAN: null-ptr-deref in range [0x0000000000000040-0x0000000000000047] [ 45.748716][ T9787] CPU: 2 UID: 0 PID: 9787 Comm: repro Not tainted 6.16.0-rc3 #10 PREEMPT(full) [ 45.750250][ T9787] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 45.751983][ T9787] RIP: 0010:hfs_find_init+0x86/0x230 [ 45.752834][ T9787] Code: c1 ea 03 80 3c 02 00 0f 85 9a 01 00 00 4c 8d 6b 40 48 c7 45 18 00 00 00 00 48 b8 00 00 00 00 00 fc [ 45.755574][ T9787] RSP: 0018:ffffc90015157668 EFLAGS: 00010202 [ 45.756432][ T9787] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff819a4d09 [ 45.757457][ T9787] RDX: 0000000000000008 RSI: ffffffff819acd3a RDI: ffffc900151576e8 [ 45.758282][ T9787] RBP: ffffc900151576d0 R08: 0000000000000005 R09: 0000000000000000 [ 45.758943][ T9787] R10: 0000000080000000 R11: 0000000000000001 R12: 0000000000000004 [ 45.759619][ T9787] R13: 0000000000000040 R14: ffff88802c50814a R15: 0000000000000000 [ 45.760293][ T9787] FS: 00007ffb72734540(0000) GS:ffff8880cec64000(0000) knlGS:0000000000000000 [ 45.761050][ T9787] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 45.761606][ T9787] CR2: 00007f9bd8225000 CR3: 000000010979a000 CR4: 00000000000006f0 [ 45.762286][ T9787] Call Trace: [ 45.762570][ T9787] <TASK> [ 45.762824][ T9787] hfs_ext_read_extent+0x190/0x9d0 [ 45.763269][ T9787] ? submit_bio_noacct_nocheck+0x2dd/0xce0 [ 45.763766][ T9787] ? __pfx_hfs_ext_read_extent+0x10/0x10 [ 45.764250][ T9787] hfs_get_block+0x55f/0x830 [ 45.764646][ T9787] block_read_full_folio+0x36d/0x850 [ 45.765105][ T9787] ? __pfx_hfs_get_block+0x10/0x10 [ 45.765541][ T9787] ? const_folio_flags+0x5b/0x100 [ 45.765972][ T9787] ? __pfx_hfs_read_folio+0x10/0x10 [ 45.766415][ T9787] filemap_read_folio+0xbe/0x290 [ 45.766840][ T9787] ? __pfx_filemap_read_folio+0x10/0x10 [ 45.767325][ T9787] ? __filemap_get_folio+0x32b/0xbf0 [ 45.767780][ T9787] do_read_cache_folio+0x263/0x5c0 [ 45.768223][ T9787] ? __pfx_hfs_read_folio+0x10/0x10 [ 45.768666][ T9787] read_cache_page+0x5b/0x160 [ 45.769070][ T9787] hfs_btree_open+0x491/0x1740 [ 45.769481][ T9787] hfs_mdb_get+0x15e2/0x1fb0 [ 45.769877][ T9787] ? __pfx_hfs_mdb_get+0x10/0x10 [ 45.770316][ T9787] ? find_held_lock+0x2b/0x80 [ 45.770731][ T9787] ? lockdep_init_map_type+0x5c/0x280 [ 45.771200][ T9787] ? lockdep_init_map_type+0x5c/0x280 [ 45.771674][ T9787] hfs_fill_super+0x38e/0x720 [ 45.772092][ T9787] ? __pfx_hfs_fill_super+0x10/0x10 [ 45.772549][ T9787] ? snprintf+0xbe/0x100 [ 45.772931][ T9787] ? __pfx_snprintf+0x10/0x10 [ 45.773350][ T9787] ? do_raw_spin_lock+0x129/0x2b0 [ 45.773796][ T9787] ? find_held_lock+0x2b/0x80 [ 45.774215][ T9787] ? set_blocksize+0x40a/0x510 [ 45.774636][ T9787] ? sb_set_blocksize+0x176/0x1d0 [ 45.775087][ T9787] ? setup_bdev_super+0x369/0x730 [ 45.775533][ T9787] get_tree_bdev_flags+0x384/0x620 [ 45.775985][ T9787] ? __pfx_hfs_fill_super+0x10/0x10 [ 45.776453][ T9787] ? __pfx_get_tree_bdev_flags+0x10/0x10 [ 45.776950][ T9787] ? bpf_lsm_capable+0x9/0x10 [ 45.777365][ T9787] ? security_capable+0x80/0x260 [ 45.777803][ T9787] vfs_get_tree+0x8e/0x340 [ 45.778203][ T9787] path_mount+0x13de/0x2010 [ 45.778604][ T9787] ? kmem_cache_free+0x2b0/0x4c0 [ 45.779052][ T9787] ? __pfx_path_mount+0x10/0x10 [ 45.779480][ T9787] ? getname_flags.part.0+0x1c5/0x550 [ 45.779954][ T9787] ? putname+0x154/0x1a0 [ 45.780335][ T9787] __x64_sys_mount+0x27b/0x300 [ 45.780758][ T9787] ? __pfx___x64_sys_mount+0x10/0x10 [ 45.781232][ T9787] do_syscall_64+0xc9/0x480 [ 45.781631][ T9787] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 45.782149][ T9787] RIP: 0033:0x7ffb7265b6ca [ 45.782539][ T9787] Code: 48 8b 0d c9 17 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 [ 45.784212][ T9787] RSP: 002b:00007ffc0c10cfb8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5 [ 45.784935][ T9787] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ffb7265b6ca [ 45.785626][ T9787] RDX: 0000200000000240 RSI: 0000200000000280 RDI: 00007ffc0c10d100 [ 45.786316][ T9787] RBP: 00007ffc0c10d190 R08: 00007ffc0c10d000 R09: 0000000000000000 [ 45.787011][ T9787] R10: 0000000000000048 R11: 0000000000000206 R12: 0000560246733250 [ 45.787697][ T9787] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 45.788393][ T9787] </TASK> [ 45.788665][ T9787] Modules linked in: [ 45.789058][ T9787] ---[ end trace 0000000000000000 ]--- [ 45.789554][ T9787] RIP: 0010:hfs_find_init+0x86/0x230 [ 45.790028][ T9787] Code: c1 ea 03 80 3c 02 00 0f 85 9a 01 00 00 4c 8d 6b 40 48 c7 45 18 00 00 00 00 48 b8 00 00 00 00 00 fc [ 45.792364][ T9787] RSP: 0018:ffffc90015157668 EFLAGS: 00010202 [ 45.793155][ T9787] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff819a4d09 [ 45.794123][ T9787] RDX: 0000000000000008 RSI: ffffffff819acd3a RDI: ffffc900151576e8 [ 45.795105][ T9787] RBP: ffffc900151576d0 R08: 0000000000000005 R09: 0000000000000000 [ 45.796135][ T9787] R10: 0000000080000000 R11: 0000000000000001 R12: 0000000000000004 [ 45.797114][ T9787] R13: 0000000000000040 R14: ffff88802c50814a R15: 0000000000000000 [ 45.798024][ T9787] FS: 00007ffb72734540(0000) GS:ffff8880cec64000(0000) knlGS:0000000000000000 [ 45.799019][ T9787] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 45.799822][ T9787] CR2: 00007f9bd8225000 CR3: 000000010979a000 CR4: 00000000000006f0 [ 45.800747][ T9787] Kernel panic - not syncing: Fatal exception The hfs_fill_super() calls hfs_mdb_get() method that tries to construct Extents Tree and Catalog Tree: HFS_SB(sb)->ext_tree = hfs_btree_open(sb, HFS_EXT_CNID, hfs_ext_keycmp); if (!HFS_SB(sb)->ext_tree) { pr_err("unable to open extent tree\n"); goto out; } HFS_SB(sb)->cat_tree = hfs_btree_open(sb, HFS_CAT_CNID, hfs_cat_keycmp); if (!HFS_SB(sb)->cat_tree) { pr_err("unable to open catalog tree\n"); goto out; } However, hfs_btree_open() calls read_mapping_page() that calls hfs_get_block(). And this method calls hfs_ext_read_extent(): static int hfs_ext_read_extent(struct inode *inode, u16 block) { struct hfs_find_data fd; int res; if (block >= HFS_I(inode)->cached_start && block < HFS_I(inode)->cached_start + HFS_I(inode)->cached_blocks) return 0; res = hfs_find_init(HFS_SB(inode->i_sb)->ext_tree, &fd); if (!res) { res = __hfs_ext_cache_extent(&fd, inode, block); hfs_find_exit(&fd); } return res; } The problem here that hfs_find_init() is trying to use HFS_SB(inode->i_sb)->ext_tree that is not initialized yet. It will be initailized when hfs_btree_open() finishes the execution. The patch adds checking of tree pointer in hfs_find_init() and it reworks the logic of hfs_btree_open() by reading the b-tree's header directly from the volume. The read_mapping_page() is exchanged on filemap_grab_folio() that grab the folio from mapping. Then, sb_bread() extracts the b-tree's header content and copy it into the folio. Reported-by: Wenzhi Wang <wenzhi.wang@uwaterloo.ca> Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com> cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> cc: Yangtao Li <frank.li@vivo.com> cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20250710213657.108285-1-slava@dubeyko.com Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
tboehler1
pushed a commit
that referenced
this pull request
Sep 1, 2025
[ Upstream commit 9445878 ] The hfsplus_readdir() method is capable to crash by calling hfsplus_uni2asc(): [ 667.121659][ T9805] ================================================================== [ 667.122651][ T9805] BUG: KASAN: slab-out-of-bounds in hfsplus_uni2asc+0x902/0xa10 [ 667.123627][ T9805] Read of size 2 at addr ffff88802592f40c by task repro/9805 [ 667.124578][ T9805] [ 667.124876][ T9805] CPU: 3 UID: 0 PID: 9805 Comm: repro Not tainted 6.16.0-rc3 #1 PREEMPT(full) [ 667.124886][ T9805] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 667.124890][ T9805] Call Trace: [ 667.124893][ T9805] <TASK> [ 667.124896][ T9805] dump_stack_lvl+0x10e/0x1f0 [ 667.124911][ T9805] print_report+0xd0/0x660 [ 667.124920][ T9805] ? __virt_addr_valid+0x81/0x610 [ 667.124928][ T9805] ? __phys_addr+0xe8/0x180 [ 667.124934][ T9805] ? hfsplus_uni2asc+0x902/0xa10 [ 667.124942][ T9805] kasan_report+0xc6/0x100 [ 667.124950][ T9805] ? hfsplus_uni2asc+0x902/0xa10 [ 667.124959][ T9805] hfsplus_uni2asc+0x902/0xa10 [ 667.124966][ T9805] ? hfsplus_bnode_read+0x14b/0x360 [ 667.124974][ T9805] hfsplus_readdir+0x845/0xfc0 [ 667.124984][ T9805] ? __pfx_hfsplus_readdir+0x10/0x10 [ 667.124994][ T9805] ? stack_trace_save+0x8e/0xc0 [ 667.125008][ T9805] ? iterate_dir+0x18b/0xb20 [ 667.125015][ T9805] ? trace_lock_acquire+0x85/0xd0 [ 667.125022][ T9805] ? lock_acquire+0x30/0x80 [ 667.125029][ T9805] ? iterate_dir+0x18b/0xb20 [ 667.125037][ T9805] ? down_read_killable+0x1ed/0x4c0 [ 667.125044][ T9805] ? putname+0x154/0x1a0 [ 667.125051][ T9805] ? __pfx_down_read_killable+0x10/0x10 [ 667.125058][ T9805] ? apparmor_file_permission+0x239/0x3e0 [ 667.125069][ T9805] iterate_dir+0x296/0xb20 [ 667.125076][ T9805] __x64_sys_getdents64+0x13c/0x2c0 [ 667.125084][ T9805] ? __pfx___x64_sys_getdents64+0x10/0x10 [ 667.125091][ T9805] ? __x64_sys_openat+0x141/0x200 [ 667.125126][ T9805] ? __pfx_filldir64+0x10/0x10 [ 667.125134][ T9805] ? do_user_addr_fault+0x7fe/0x12f0 [ 667.125143][ T9805] do_syscall_64+0xc9/0x480 [ 667.125151][ T9805] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 667.125158][ T9805] RIP: 0033:0x7fa8753b2fc9 [ 667.125164][ T9805] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 48 [ 667.125172][ T9805] RSP: 002b:00007ffe96f8e0f8 EFLAGS: 00000217 ORIG_RAX: 00000000000000d9 [ 667.125181][ T9805] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fa8753b2fc9 [ 667.125185][ T9805] RDX: 0000000000000400 RSI: 00002000000063c0 RDI: 0000000000000004 [ 667.125190][ T9805] RBP: 00007ffe96f8e110 R08: 00007ffe96f8e110 R09: 00007ffe96f8e110 [ 667.125195][ T9805] R10: 0000000000000000 R11: 0000000000000217 R12: 0000556b1e3b4260 [ 667.125199][ T9805] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 667.125207][ T9805] </TASK> [ 667.125210][ T9805] [ 667.145632][ T9805] Allocated by task 9805: [ 667.145991][ T9805] kasan_save_stack+0x20/0x40 [ 667.146352][ T9805] kasan_save_track+0x14/0x30 [ 667.146717][ T9805] __kasan_kmalloc+0xaa/0xb0 [ 667.147065][ T9805] __kmalloc_noprof+0x205/0x550 [ 667.147448][ T9805] hfsplus_find_init+0x95/0x1f0 [ 667.147813][ T9805] hfsplus_readdir+0x220/0xfc0 [ 667.148174][ T9805] iterate_dir+0x296/0xb20 [ 667.148549][ T9805] __x64_sys_getdents64+0x13c/0x2c0 [ 667.148937][ T9805] do_syscall_64+0xc9/0x480 [ 667.149291][ T9805] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 667.149809][ T9805] [ 667.150030][ T9805] The buggy address belongs to the object at ffff88802592f000 [ 667.150030][ T9805] which belongs to the cache kmalloc-2k of size 2048 [ 667.151282][ T9805] The buggy address is located 0 bytes to the right of [ 667.151282][ T9805] allocated 1036-byte region [ffff88802592f000, ffff88802592f40c) [ 667.152580][ T9805] [ 667.152798][ T9805] The buggy address belongs to the physical page: [ 667.153373][ T9805] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x25928 [ 667.154157][ T9805] head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 [ 667.154916][ T9805] anon flags: 0xfff00000000040(head|node=0|zone=1|lastcpupid=0x7ff) [ 667.155631][ T9805] page_type: f5(slab) [ 667.155997][ T9805] raw: 00fff00000000040 ffff88801b442f00 0000000000000000 dead000000000001 [ 667.156770][ T9805] raw: 0000000000000000 0000000080080008 00000000f5000000 0000000000000000 [ 667.157536][ T9805] head: 00fff00000000040 ffff88801b442f00 0000000000000000 dead000000000001 [ 667.158317][ T9805] head: 0000000000000000 0000000080080008 00000000f5000000 0000000000000000 [ 667.159088][ T9805] head: 00fff00000000003 ffffea0000964a01 00000000ffffffff 00000000ffffffff [ 667.159865][ T9805] head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000008 [ 667.160643][ T9805] page dumped because: kasan: bad access detected [ 667.161216][ T9805] page_owner tracks the page as allocated [ 667.161732][ T9805] page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN9 [ 667.163566][ T9805] post_alloc_hook+0x1c0/0x230 [ 667.164003][ T9805] get_page_from_freelist+0xdeb/0x3b30 [ 667.164503][ T9805] __alloc_frozen_pages_noprof+0x25c/0x2460 [ 667.165040][ T9805] alloc_pages_mpol+0x1fb/0x550 [ 667.165489][ T9805] new_slab+0x23b/0x340 [ 667.165872][ T9805] ___slab_alloc+0xd81/0x1960 [ 667.166313][ T9805] __slab_alloc.isra.0+0x56/0xb0 [ 667.166767][ T9805] __kmalloc_cache_noprof+0x255/0x3e0 [ 667.167255][ T9805] psi_cgroup_alloc+0x52/0x2d0 [ 667.167693][ T9805] cgroup_mkdir+0x694/0x1210 [ 667.168118][ T9805] kernfs_iop_mkdir+0x111/0x190 [ 667.168568][ T9805] vfs_mkdir+0x59b/0x8d0 [ 667.168956][ T9805] do_mkdirat+0x2ed/0x3d0 [ 667.169353][ T9805] __x64_sys_mkdir+0xef/0x140 [ 667.169784][ T9805] do_syscall_64+0xc9/0x480 [ 667.170195][ T9805] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 667.170730][ T9805] page last free pid 1257 tgid 1257 stack trace: [ 667.171304][ T9805] __free_frozen_pages+0x80c/0x1250 [ 667.171770][ T9805] vfree.part.0+0x12b/0xab0 [ 667.172182][ T9805] delayed_vfree_work+0x93/0xd0 [ 667.172612][ T9805] process_one_work+0x9b5/0x1b80 [ 667.173067][ T9805] worker_thread+0x630/0xe60 [ 667.173486][ T9805] kthread+0x3a8/0x770 [ 667.173857][ T9805] ret_from_fork+0x517/0x6e0 [ 667.174278][ T9805] ret_from_fork_asm+0x1a/0x30 [ 667.174703][ T9805] [ 667.174917][ T9805] Memory state around the buggy address: [ 667.175411][ T9805] ffff88802592f300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 667.176114][ T9805] ffff88802592f380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 667.176830][ T9805] >ffff88802592f400: 00 04 fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 667.177547][ T9805] ^ [ 667.177933][ T9805] ffff88802592f480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 667.178640][ T9805] ffff88802592f500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 667.179350][ T9805] ================================================================== The hfsplus_uni2asc() method operates by struct hfsplus_unistr: struct hfsplus_unistr { __be16 length; hfsplus_unichr unicode[HFSPLUS_MAX_STRLEN]; } __packed; where HFSPLUS_MAX_STRLEN is 255 bytes. The issue happens if length of the structure instance has value bigger than 255 (for example, 65283). In such case, pointer on unicode buffer is going beyond of the allocated memory. The patch fixes the issue by checking the length value of hfsplus_unistr instance and using 255 value in the case if length value is bigger than HFSPLUS_MAX_STRLEN. Potential reason of such situation could be a corruption of Catalog File b-tree's node. Reported-by: Wenzhi Wang <wenzhi.wang@uwaterloo.ca> Signed-off-by: Liu Shixin <liushixin2@huawei.com> Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com> cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> cc: Yangtao Li <frank.li@vivo.com> cc: linux-fsdevel@vger.kernel.org Reviewed-by: Yangtao Li <frank.li@vivo.com> Link: https://lore.kernel.org/r/20250710230830.110500-1-slava@dubeyko.com Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
tboehler1
pushed a commit
that referenced
this pull request
Sep 1, 2025
commit 87c6efc upstream. Shuang reported sch_ets test-case [1] crashing in ets_class_qlen_notify() after recent changes from Lion [2]. The problem is: in ets_qdisc_change() we purge unused DWRR queues; the value of 'q->nbands' is the new one, and the cleanup should be done with the old one. The problem is here since my first attempts to fix ets_qdisc_change(), but it surfaced again after the recent qdisc len accounting fixes. Fix it purging idle DWRR queues before assigning a new value of 'q->nbands', so that all purge operations find a consistent configuration: - old 'q->nbands' because it's needed by ets_class_find() - old 'q->nstrict' because it's needed by ets_class_is_strict() BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP NOPTI CPU: 62 UID: 0 PID: 39457 Comm: tc Kdump: loaded Not tainted 6.12.0-116.el10.x86_64 #1 PREEMPT(voluntary) Hardware name: Dell Inc. PowerEdge R640/06DKY5, BIOS 2.12.2 07/09/2021 RIP: 0010:__list_del_entry_valid_or_report+0x4/0x80 Code: ff 4c 39 c7 0f 84 39 19 8e ff b8 01 00 00 00 c3 cc cc cc cc 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa <48> 8b 17 48 8b 4f 08 48 85 d2 0f 84 56 19 8e ff 48 85 c9 0f 84 ab RSP: 0018:ffffba186009f400 EFLAGS: 00010202 RAX: 00000000000000d6 RBX: 0000000000000000 RCX: 0000000000000004 RDX: ffff9f0fa29b69c0 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffffffffc12c2400 R08: 0000000000000008 R09: 0000000000000004 R10: ffffffffffffffff R11: 0000000000000004 R12: 0000000000000000 R13: ffff9f0f8cfe0000 R14: 0000000000100005 R15: 0000000000000000 FS: 00007f2154f37480(0000) GS:ffff9f269c1c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000001530be001 CR4: 00000000007726f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> ets_class_qlen_notify+0x65/0x90 [sch_ets] qdisc_tree_reduce_backlog+0x74/0x110 ets_qdisc_change+0x630/0xa40 [sch_ets] __tc_modify_qdisc.constprop.0+0x216/0x7f0 tc_modify_qdisc+0x7c/0x120 rtnetlink_rcv_msg+0x145/0x3f0 netlink_rcv_skb+0x53/0x100 netlink_unicast+0x245/0x390 netlink_sendmsg+0x21b/0x470 ____sys_sendmsg+0x39d/0x3d0 ___sys_sendmsg+0x9a/0xe0 __sys_sendmsg+0x7a/0xd0 do_syscall_64+0x7d/0x160 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7f2155114084 Code: 89 02 b8 ff ff ff ff eb bb 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 80 3d 25 f0 0c 00 00 74 13 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 89 54 24 1c 48 89 RSP: 002b:00007fff1fd7a988 EFLAGS: 00000202 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000560ec063e5e0 RCX: 00007f2155114084 RDX: 0000000000000000 RSI: 00007fff1fd7a9f0 RDI: 0000000000000003 RBP: 00007fff1fd7aa60 R08: 0000000000000010 R09: 000000000000003f R10: 0000560ee9b3a010 R11: 0000000000000202 R12: 00007fff1fd7aae0 R13: 000000006891ccde R14: 0000560ec063e5e0 R15: 00007fff1fd7aad0 </TASK> [1] https://lore.kernel.org/netdev/e08c7f4a6882f260011909a868311c6e9b54f3e4.1639153474.git.dcaratti@redhat.com/ [2] https://lore.kernel.org/netdev/d912cbd7-193b-4269-9857-525bee8bbb6a@gmail.com/ Cc: stable@vger.kernel.org Fixes: 103406b ("net/sched: Always pass notifications when child class becomes empty") Fixes: c062f2a ("net/sched: sch_ets: don't remove idle classes from the round-robin list") Fixes: dcc68b4 ("net: sch_ets: Add a new Qdisc") Reported-by: Li Shuang <shuali@redhat.com> Closes: https://issues.redhat.com/browse/RHEL-108026 Reviewed-by: Petr Machata <petrm@nvidia.com> Co-developed-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Link: https://patch.msgid.link/7928ff6d17db47a2ae7cc205c44777b1f1950545.1755016081.git.dcaratti@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tboehler1
pushed a commit
that referenced
this pull request
Sep 1, 2025
commit 33caa20 upstream. The existing code move the VF NIC to new namespace when NETDEV_REGISTER is received on netvsc NIC. During deletion of the namespace, default_device_exit_batch() >> default_device_exit_net() is called. When netvsc NIC is moved back and registered to the default namespace, it automatically brings VF NIC back to the default namespace. This will cause the default_device_exit_net() >> for_each_netdev_safe loop unable to detect the list end, and hit NULL ptr: [ 231.449420] mana 7870:00:00.0 enP30832s1: Moved VF to namespace with: eth0 [ 231.449656] BUG: kernel NULL pointer dereference, address: 0000000000000010 [ 231.450246] #PF: supervisor read access in kernel mode [ 231.450579] #PF: error_code(0x0000) - not-present page [ 231.450916] PGD 17b8a8067 P4D 0 [ 231.451163] Oops: Oops: 0000 [#1] SMP NOPTI [ 231.451450] CPU: 82 UID: 0 PID: 1394 Comm: kworker/u768:1 Not tainted 6.16.0-rc4+ #3 VOLUNTARY [ 231.452042] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 11/21/2024 [ 231.452692] Workqueue: netns cleanup_net [ 231.452947] RIP: 0010:default_device_exit_batch+0x16c/0x3f0 [ 231.453326] Code: c0 0c f5 b3 e8 d5 db fe ff 48 85 c0 74 15 48 c7 c2 f8 fd ca b2 be 10 00 00 00 48 8d 7d c0 e8 7b 77 25 00 49 8b 86 28 01 00 00 <48> 8b 50 10 4c 8b 2a 4c 8d 62 f0 49 83 ed 10 4c 39 e0 0f 84 d6 00 [ 231.454294] RSP: 0018:ff75fc7c9bf9fd00 EFLAGS: 00010246 [ 231.454610] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 61c8864680b583eb [ 231.455094] RDX: ff1fa9f71462d800 RSI: ff75fc7c9bf9fd38 RDI: 0000000030766564 [ 231.455686] RBP: ff75fc7c9bf9fd78 R08: 0000000000000000 R09: 0000000000000000 [ 231.456126] R10: 0000000000000001 R11: 0000000000000004 R12: ff1fa9f70088e340 [ 231.456621] R13: ff1fa9f70088e340 R14: ffffffffb3f50c20 R15: ff1fa9f7103e6340 [ 231.457161] FS: 0000000000000000(0000) GS:ff1faa6783a08000(0000) knlGS:0000000000000000 [ 231.457707] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 231.458031] CR2: 0000000000000010 CR3: 0000000179ab2006 CR4: 0000000000b73ef0 [ 231.458434] Call Trace: [ 231.458600] <TASK> [ 231.458777] ops_undo_list+0x100/0x220 [ 231.459015] cleanup_net+0x1b8/0x300 [ 231.459285] process_one_work+0x184/0x340 To fix it, move the ns change to a workqueue, and take rtnl_lock to avoid changing the netdev list when default_device_exit_net() is using it. Cc: stable@vger.kernel.org Fixes: 4c26280 ("hv_netvsc: Fix VF namespace also in synthetic NIC NETDEV_REGISTER event") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Link: https://patch.msgid.link/1754511711-11188-1-git-send-email-haiyangz@linux.microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kunbus-gitlab-sync
pushed a commit
that referenced
this pull request
Dec 11, 2025
[ Upstream commit 0fd20f6 ] Do not block PCI config accesses through pci_cfg_access_lock() when executing the s390 variant of PCI error recovery: Acquire just device_lock() instead of pci_dev_lock() as powerpc's EEH and generig PCI AER processing do. During error recovery testing a pair of tasks was reported to be hung: mlx5_core 0000:00:00.1: mlx5_health_try_recover:338:(pid 5553): health recovery flow aborted, PCI reads still not working INFO: task kmcheck:72 blocked for more than 122 seconds. Not tainted 5.14.0-570.12.1.bringup7.el9.s390x #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kmcheck state:D stack:0 pid:72 tgid:72 ppid:2 flags:0x00000000 Call Trace: [<000000065256f030>] __schedule+0x2a0/0x590 [<000000065256f356>] schedule+0x36/0xe0 [<000000065256f572>] schedule_preempt_disabled+0x22/0x30 [<0000000652570a94>] __mutex_lock.constprop.0+0x484/0x8a8 [<000003ff800673a4>] mlx5_unload_one+0x34/0x58 [mlx5_core] [<000003ff8006745c>] mlx5_pci_err_detected+0x94/0x140 [mlx5_core] [<0000000652556c5a>] zpci_event_attempt_error_recovery+0xf2/0x398 [<0000000651b9184a>] __zpci_event_error+0x23a/0x2c0 INFO: task kworker/u1664:6:1514 blocked for more than 122 seconds. Not tainted 5.14.0-570.12.1.bringup7.el9.s390x #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kworker/u1664:6 state:D stack:0 pid:1514 tgid:1514 ppid:2 flags:0x00000000 Workqueue: mlx5_health0000:00:00.0 mlx5_fw_fatal_reporter_err_work [mlx5_core] Call Trace: [<000000065256f030>] __schedule+0x2a0/0x590 [<000000065256f356>] schedule+0x36/0xe0 [<0000000652172e28>] pci_wait_cfg+0x80/0xe8 [<0000000652172f94>] pci_cfg_access_lock+0x74/0x88 [<000003ff800916b6>] mlx5_vsc_gw_lock+0x36/0x178 [mlx5_core] [<000003ff80098824>] mlx5_crdump_collect+0x34/0x1c8 [mlx5_core] [<000003ff80074b62>] mlx5_fw_fatal_reporter_dump+0x6a/0xe8 [mlx5_core] [<0000000652512242>] devlink_health_do_dump.part.0+0x82/0x168 [<0000000652513212>] devlink_health_report+0x19a/0x230 [<000003ff80075a12>] mlx5_fw_fatal_reporter_err_work+0xba/0x1b0 [mlx5_core] No kernel log of the exact same error with an upstream kernel is available - but the very same deadlock situation can be constructed there, too: - task: kmcheck mlx5_unload_one() tries to acquire devlink lock while the PCI error recovery code has set pdev->block_cfg_access by way of pci_cfg_access_lock() - task: kworker mlx5_crdump_collect() tries to set block_cfg_access through pci_cfg_access_lock() while devlink_health_report() had acquired the devlink lock. A similar deadlock situation can be reproduced by requesting a crdump with > devlink health dump show pci/<BDF> reporter fw_fatal while PCI error recovery is executed on the same <BDF> physical function by mlx5_core's pci_error_handlers. On s390 this can be injected with > zpcictl --reset-fw <BDF> Tests with this patch failed to reproduce that second deadlock situation, the devlink command is rejected with "kernel answers: Permission denied" - and we get a kernel log message of: mlx5_core 1ed0:00:00.1: mlx5_crdump_collect:50:(pid 254382): crdump: failed to lock vsc gw err -5 because the config read of VSC_SEMAPHORE is rejected by the underlying hardware. Two prior attempts to address this issue have been discussed and ultimately rejected [see link], with the primary argument that s390's implementation of PCI error recovery is imposing restrictions that neither powerpc's EEH nor PCI AER handling need. Tests show that PCI error recovery on s390 is running to completion even without blocking access to PCI config space. Link: https://lore.kernel.org/all/20251007144826.2825134-1-gbayer@linux.ibm.com/ Cc: stable@vger.kernel.org Fixes: 4cdf2f4 ("s390/pci: implement minimal PCI error recovery") Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> [ Adjust context ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kunbus-gitlab-sync
pushed a commit
that referenced
this pull request
Dec 11, 2025
[ Upstream commit 5d726c4 ] Following deadlock can be triggered easily by lockdep: WARNING: possible circular locking dependency detected 6.17.0-rc3-00124-ga12c2658ced0 #1665 Not tainted ------------------------------------------------------ check/1334 is trying to acquire lock: ff1100011d9d0678 (&q->sysfs_lock){+.+.}-{4:4}, at: blk_unregister_queue+0x53/0x180 but task is already holding lock: ff1100011d9d00e0 (&q->q_usage_counter(queue)#3){++++}-{0:0}, at: del_gendisk+0xba/0x110 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&q->q_usage_counter(queue)#3){++++}-{0:0}: blk_queue_enter+0x40b/0x470 blkg_conf_prep+0x7b/0x3c0 tg_set_limit+0x10a/0x3e0 cgroup_file_write+0xc6/0x420 kernfs_fop_write_iter+0x189/0x280 vfs_write+0x256/0x490 ksys_write+0x83/0x190 __x64_sys_write+0x21/0x30 x64_sys_call+0x4608/0x4630 do_syscall_64+0xdb/0x6b0 entry_SYSCALL_64_after_hwframe+0x76/0x7e -> #1 (&q->rq_qos_mutex){+.+.}-{4:4}: __mutex_lock+0xd8/0xf50 mutex_lock_nested+0x2b/0x40 wbt_init+0x17e/0x280 wbt_enable_default+0xe9/0x140 blk_register_queue+0x1da/0x2e0 __add_disk+0x38c/0x5d0 add_disk_fwnode+0x89/0x250 device_add_disk+0x18/0x30 virtblk_probe+0x13a3/0x1800 virtio_dev_probe+0x389/0x610 really_probe+0x136/0x620 __driver_probe_device+0xb3/0x230 driver_probe_device+0x2f/0xe0 __driver_attach+0x158/0x250 bus_for_each_dev+0xa9/0x130 driver_attach+0x26/0x40 bus_add_driver+0x178/0x3d0 driver_register+0x7d/0x1c0 __register_virtio_driver+0x2c/0x60 virtio_blk_init+0x6f/0xe0 do_one_initcall+0x94/0x540 kernel_init_freeable+0x56a/0x7b0 kernel_init+0x2b/0x270 ret_from_fork+0x268/0x4c0 ret_from_fork_asm+0x1a/0x30 -> #0 (&q->sysfs_lock){+.+.}-{4:4}: __lock_acquire+0x1835/0x2940 lock_acquire+0xf9/0x450 __mutex_lock+0xd8/0xf50 mutex_lock_nested+0x2b/0x40 blk_unregister_queue+0x53/0x180 __del_gendisk+0x226/0x690 del_gendisk+0xba/0x110 sd_remove+0x49/0xb0 [sd_mod] device_remove+0x87/0xb0 device_release_driver_internal+0x11e/0x230 device_release_driver+0x1a/0x30 bus_remove_device+0x14d/0x220 device_del+0x1e1/0x5a0 __scsi_remove_device+0x1ff/0x2f0 scsi_remove_device+0x37/0x60 sdev_store_delete+0x77/0x100 dev_attr_store+0x1f/0x40 sysfs_kf_write+0x65/0x90 kernfs_fop_write_iter+0x189/0x280 vfs_write+0x256/0x490 ksys_write+0x83/0x190 __x64_sys_write+0x21/0x30 x64_sys_call+0x4608/0x4630 do_syscall_64+0xdb/0x6b0 entry_SYSCALL_64_after_hwframe+0x76/0x7e other info that might help us debug this: Chain exists of: &q->sysfs_lock --> &q->rq_qos_mutex --> &q->q_usage_counter(queue)#3 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&q->q_usage_counter(queue)#3); lock(&q->rq_qos_mutex); lock(&q->q_usage_counter(queue)#3); lock(&q->sysfs_lock); Root cause is that queue_usage_counter is grabbed with rq_qos_mutex held in blkg_conf_prep(), while queue should be freezed before rq_qos_mutex from other context. The blk_queue_enter() from blkg_conf_prep() is used to protect against policy deactivation, which is already protected with blkcg_mutex, hence convert blk_queue_enter() to blkcg_mutex to fix this problem. Meanwhile, consider that blkcg_mutex is held after queue is freezed from policy deactivation, also convert blkg_alloc() to use GFP_NOIO. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
kunbus-gitlab-sync
pushed a commit
that referenced
this pull request
Dec 11, 2025
[ Upstream commit 4c634b6 ] As noted in the kernel documentation [1], open-coded multiplication in allocator arguments is discouraged because it can lead to integer overflow. Use kcalloc() to gain built-in overflow protection, making memory allocation safer when calculating allocation size compared to explicit multiplication. Similarly, use size_add() instead of explicit addition for 'uobj_chunk_num + sobj_chunk_num'. Link: https://www.kernel.org/doc/html/next/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments #1 Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
kunbus-gitlab-sync
pushed a commit
that referenced
this pull request
Dec 11, 2025
[ Upstream commit 99d7181 ] There is race in amdgpu_amdkfd_device_fini_sw and interrupt. if amdgpu_amdkfd_device_fini_sw run in b/w kfd_cleanup_nodes and kfree(kfd), and KGD interrupt generated. kernel panic log: BUG: kernel NULL pointer dereference, address: 0000000000000098 amdgpu 0000:c8:00.0: amdgpu: Requesting 4 partitions through PSP PGD d78c68067 P4D d78c68067 kfd kfd: amdgpu: Allocated 3969056 bytes on gart PUD 1465b8067 PMD @ Oops: @002 [#1] SMP NOPTI kfd kfd: amdgpu: Total number of KFD nodes to be created: 4 CPU: 115 PID: @ Comm: swapper/115 Kdump: loaded Tainted: G S W OE K RIP: 0010:_raw_spin_lock_irqsave+0x12/0x40 Code: 89 e@ 41 5c c3 cc cc cc cc 66 66 2e Of 1f 84 00 00 00 00 00 OF 1f 40 00 Of 1f 44% 00 00 41 54 9c 41 5c fa 31 cO ba 01 00 00 00 <fO> OF b1 17 75 Ba 4c 89 e@ 41 Sc 89 c6 e8 07 38 5d RSP: 0018: ffffc90@1a6b0e28 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000018 0000000000000001 RSI: ffff8883bb623e00 RDI: 0000000000000098 ffff8883bb000000 RO8: ffff888100055020 ROO: ffff888100055020 0000000000000000 R11: 0000000000000000 R12: 0900000000000002 ffff888F2b97da0@ R14: @000000000000098 R15: ffff8883babdfo00 CS: 010 DS: 0000 ES: 0000 CRO: 0000000080050033 CR2: 0000000000000098 CR3: 0000000e7cae2006 CR4: 0000000002770ce0 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 0000000000000000 DR6: 00000000fffeO7FO DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> kgd2kfd_interrupt+@x6b/0x1f@ [amdgpu] ? amdgpu_fence_process+0xa4/0x150 [amdgpu] kfd kfd: amdgpu: Node: 0, interrupt_bitmap: 3 YcpxFl Rant tErace amdgpu_irq_dispatch+0x165/0x210 [amdgpu] amdgpu_ih_process+0x80/0x100 [amdgpu] amdgpu: Virtual CRAT table created for GPU amdgpu_irq_handler+0x1f/@x60 [amdgpu] __handle_irq_event_percpu+0x3d/0x170 amdgpu: Topology: Add dGPU node [0x74a2:0x1002] handle_irq_event+0x5a/@xco handle_edge_irq+0x93/0x240 kfd kfd: amdgpu: KFD node 1 partition @ size 49148M asm_call_irq_on_stack+0xf/@X20 </IRQ> common_interrupt+0xb3/0x130 asm_common_interrupt+0x1le/0x40 5.10.134-010.a1i5000.a18.x86_64 #1 Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Philip Yang<Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
kunbus-gitlab-sync
pushed a commit
that referenced
this pull request
Dec 11, 2025
[ Upstream commit 38f5024 ] With CONFIG_PROVE_RCU_LIST=y and by executing $ netcat -l --sctp & $ netcat --sctp localhost & $ ss --sctp one can trigger the following Lockdep-RCU splat(s): WARNING: suspicious RCU usage 6.18.0-rc1-00093-g7f864458e9a6 #5 Not tainted ----------------------------- net/sctp/diag.c:76 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 2 locks held by ss/215: #0: ffff9c740828bec0 (nlk_cb_mutex-SOCK_DIAG){+.+.}-{4:4}, at: __netlink_dump_start+0x84/0x2b0 #1: ffff9c7401d72cd0 (sk_lock-AF_INET6){+.+.}-{0:0}, at: sctp_sock_dump+0x38/0x200 stack backtrace: CPU: 0 UID: 0 PID: 215 Comm: ss Not tainted 6.18.0-rc1-00093-g7f864458e9a6 #5 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x5d/0x90 lockdep_rcu_suspicious.cold+0x4e/0xa3 inet_sctp_diag_fill.isra.0+0x4b1/0x5d0 sctp_sock_dump+0x131/0x200 sctp_transport_traverse_process+0x170/0x1b0 ? __pfx_sctp_sock_filter+0x10/0x10 ? __pfx_sctp_sock_dump+0x10/0x10 sctp_diag_dump+0x103/0x140 __inet_diag_dump+0x70/0xb0 netlink_dump+0x148/0x490 __netlink_dump_start+0x1f3/0x2b0 inet_diag_handler_cmd+0xcd/0x100 ? __pfx_inet_diag_dump_start+0x10/0x10 ? __pfx_inet_diag_dump+0x10/0x10 ? __pfx_inet_diag_dump_done+0x10/0x10 sock_diag_rcv_msg+0x18e/0x320 ? __pfx_sock_diag_rcv_msg+0x10/0x10 netlink_rcv_skb+0x4d/0x100 netlink_unicast+0x1d7/0x2b0 netlink_sendmsg+0x203/0x450 ____sys_sendmsg+0x30c/0x340 ___sys_sendmsg+0x94/0xf0 __sys_sendmsg+0x83/0xf0 do_syscall_64+0xbb/0x390 entry_SYSCALL_64_after_hwframe+0x77/0x7f ... </TASK> Fixes: 8f840e4 ("sctp: add the sctp_diag.c file") Signed-off-by: Stefan Wiehler <stefan.wiehler@nokia.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Acked-by: Xin Long <lucien.xin@gmail.com> Link: https://patch.msgid.link/20251028161506.3294376-2-stefan.wiehler@nokia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
kunbus-gitlab-sync
pushed a commit
that referenced
this pull request
Dec 11, 2025
[ Upstream commit e120f46 ] Raw IP packets have no MAC header, leaving skb->mac_header uninitialized. This can trigger kernel panics on ARM64 when xfrm or other subsystems access the offset due to strict alignment checks. Initialize the MAC header to prevent such crashes. This can trigger kernel panics on ARM when running IPsec over the qmimux0 interface. Example trace: Internal error: Oops: 000000009600004f [#1] SMP CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.12.34-gbe78e49cb433 #1 Hardware name: LS1028A RDB Board (DT) pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : xfrm_input+0xde8/0x1318 lr : xfrm_input+0x61c/0x1318 sp : ffff800080003b20 Call trace: xfrm_input+0xde8/0x1318 xfrm6_rcv+0x38/0x44 xfrm6_esp_rcv+0x48/0xa8 ip6_protocol_deliver_rcu+0x94/0x4b0 ip6_input_finish+0x44/0x70 ip6_input+0x44/0xc0 ipv6_rcv+0x6c/0x114 __netif_receive_skb_one_core+0x5c/0x8c __netif_receive_skb+0x18/0x60 process_backlog+0x78/0x17c __napi_poll+0x38/0x180 net_rx_action+0x168/0x2f0 Fixes: c6adf77 ("net: usb: qmi_wwan: add qmap mux protocol support") Signed-off-by: Qendrim Maxhuni <qendrim.maxhuni@garderos.com> Link: https://patch.msgid.link/20251029075744.105113-1-qendrim.maxhuni@garderos.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
kunbus-gitlab-sync
pushed a commit
that referenced
this pull request
Dec 11, 2025
commit 6dd97ce upstream. When a connector is connected but inactive (e.g., disabled by desktop environments), pipe_ctx->stream_res.tg will be destroyed. Then, reading odm_combine_segments causes kernel NULL pointer dereference. BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP NOPTI CPU: 16 UID: 0 PID: 26474 Comm: cat Not tainted 6.17.0+ #2 PREEMPT(lazy) e6a17af9ee6db7c63e9d90dbe5b28ccab67520c6 Hardware name: LENOVO 21Q4/LNVNB161216, BIOS PXCN25WW 03/27/2025 RIP: 0010:odm_combine_segments_show+0x93/0xf0 [amdgpu] Code: 41 83 b8 b0 00 00 00 01 75 6e 48 98 ba a1 ff ff ff 48 c1 e0 0c 48 8d 8c 07 d8 02 00 00 48 85 c9 74 2d 48 8b bc 07 f0 08 00 00 <48> 8b 07 48 8b 80 08 02 00> RSP: 0018:ffffd1bf4b953c58 EFLAGS: 00010286 RAX: 0000000000005000 RBX: ffff8e35976b02d0 RCX: ffff8e3aeed052d8 RDX: 00000000ffffffa1 RSI: ffff8e35a3120800 RDI: 0000000000000000 RBP: 0000000000000000 R08: ffff8e3580eb0000 R09: ffff8e35976b02d0 R10: ffffd1bf4b953c78 R11: 0000000000000000 R12: ffffd1bf4b953d08 R13: 0000000000040000 R14: 0000000000000001 R15: 0000000000000001 FS: 00007f44d3f9f740(0000) GS:ffff8e3caa47f000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000006485c2000 CR4: 0000000000f50ef0 PKRU: 55555554 Call Trace: <TASK> seq_read_iter+0x125/0x490 ? __alloc_frozen_pages_noprof+0x18f/0x350 seq_read+0x12c/0x170 full_proxy_read+0x51/0x80 vfs_read+0xbc/0x390 ? __handle_mm_fault+0xa46/0xef0 ? do_syscall_64+0x71/0x900 ksys_read+0x73/0xf0 do_syscall_64+0x71/0x900 ? count_memcg_events+0xc2/0x190 ? handle_mm_fault+0x1d7/0x2d0 ? do_user_addr_fault+0x21a/0x690 ? exc_page_fault+0x7e/0x1a0 entry_SYSCALL_64_after_hwframe+0x6c/0x74 RIP: 0033:0x7f44d4031687 Code: 48 89 fa 4c 89 df e8 58 b3 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00> RSP: 002b:00007ffdb4b5f0b0 EFLAGS: 00000202 ORIG_RAX: 0000000000000000 RAX: ffffffffffffffda RBX: 00007f44d3f9f740 RCX: 00007f44d4031687 RDX: 0000000000040000 RSI: 00007f44d3f5e000 RDI: 0000000000000003 RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000202 R12: 00007f44d3f5e000 R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000040000 </TASK> Modules linked in: tls tcp_diag inet_diag xt_mark ccm snd_hrtimer snd_seq_dummy snd_seq_midi snd_seq_oss snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device x> snd_hda_codec_atihdmi snd_hda_codec_realtek_lib lenovo_wmi_helpers think_lmi snd_hda_codec_generic snd_hda_codec_hdmi snd_soc_core kvm snd_compress uvcvideo sn> platform_profile joydev amd_pmc mousedev mac_hid sch_fq_codel uinput i2c_dev parport_pc ppdev lp parport nvme_fabrics loop nfnetlink ip_tables x_tables dm_cryp> CR2: 0000000000000000 ---[ end trace 0000000000000000 ]--- RIP: 0010:odm_combine_segments_show+0x93/0xf0 [amdgpu] Code: 41 83 b8 b0 00 00 00 01 75 6e 48 98 ba a1 ff ff ff 48 c1 e0 0c 48 8d 8c 07 d8 02 00 00 48 85 c9 74 2d 48 8b bc 07 f0 08 00 00 <48> 8b 07 48 8b 80 08 02 00> RSP: 0018:ffffd1bf4b953c58 EFLAGS: 00010286 RAX: 0000000000005000 RBX: ffff8e35976b02d0 RCX: ffff8e3aeed052d8 RDX: 00000000ffffffa1 RSI: ffff8e35a3120800 RDI: 0000000000000000 RBP: 0000000000000000 R08: ffff8e3580eb0000 R09: ffff8e35976b02d0 R10: ffffd1bf4b953c78 R11: 0000000000000000 R12: ffffd1bf4b953d08 R13: 0000000000040000 R14: 0000000000000001 R15: 0000000000000001 FS: 00007f44d3f9f740(0000) GS:ffff8e3caa47f000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000006485c2000 CR4: 0000000000f50ef0 PKRU: 55555554 Fix this by checking pipe_ctx->stream_res.tg before dereferencing. Fixes: 07926ba ("drm/amd/display: Add debugfs interface for ODM combine info") Signed-off-by: Rong Zhang <i@rong.moe> Reviewed-by: Mario Limoncello <mario.limonciello@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit f19bbec) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kunbus-gitlab-sync
pushed a commit
that referenced
this pull request
Dec 11, 2025
[ Upstream commit 84bbe32 ] On completion of i915_vma_pin_ww(), a synchronous variant of dma_fence_work_commit() is called. When pinning a VMA to GGTT address space on a Cherry View family processor, or on a Broxton generation SoC with VTD enabled, i.e., when stop_machine() is then called from intel_ggtt_bind_vma(), that can potentially lead to lock inversion among reservation_ww and cpu_hotplug locks. [86.861179] ====================================================== [86.861193] WARNING: possible circular locking dependency detected [86.861209] 6.15.0-rc5-CI_DRM_16515-gca0305cadc2d+ #1 Tainted: G U [86.861226] ------------------------------------------------------ [86.861238] i915_module_loa/1432 is trying to acquire lock: [86.861252] ffffffff83489090 (cpu_hotplug_lock){++++}-{0:0}, at: stop_machine+0x1c/0x50 [86.861290] but task is already holding lock: [86.861303] ffffc90002e0b4c8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: i915_vma_pin.constprop.0+0x39/0x1d0 [i915] [86.862233] which lock already depends on the new lock. [86.862251] the existing dependency chain (in reverse order) is: [86.862265] -> #5 (reservation_ww_class_mutex){+.+.}-{3:3}: [86.862292] dma_resv_lockdep+0x19a/0x390 [86.862315] do_one_initcall+0x60/0x3f0 [86.862334] kernel_init_freeable+0x3cd/0x680 [86.862353] kernel_init+0x1b/0x200 [86.862369] ret_from_fork+0x47/0x70 [86.862383] ret_from_fork_asm+0x1a/0x30 [86.862399] -> #4 (reservation_ww_class_acquire){+.+.}-{0:0}: [86.862425] dma_resv_lockdep+0x178/0x390 [86.862440] do_one_initcall+0x60/0x3f0 [86.862454] kernel_init_freeable+0x3cd/0x680 [86.862470] kernel_init+0x1b/0x200 [86.862482] ret_from_fork+0x47/0x70 [86.862495] ret_from_fork_asm+0x1a/0x30 [86.862509] -> #3 (&mm->mmap_lock){++++}-{3:3}: [86.862531] down_read_killable+0x46/0x1e0 [86.862546] lock_mm_and_find_vma+0xa2/0x280 [86.862561] do_user_addr_fault+0x266/0x8e0 [86.862578] exc_page_fault+0x8a/0x2f0 [86.862593] asm_exc_page_fault+0x27/0x30 [86.862607] filldir64+0xeb/0x180 [86.862620] kernfs_fop_readdir+0x118/0x480 [86.862635] iterate_dir+0xcf/0x2b0 [86.862648] __x64_sys_getdents64+0x84/0x140 [86.862661] x64_sys_call+0x1058/0x2660 [86.862675] do_syscall_64+0x91/0xe90 [86.862689] entry_SYSCALL_64_after_hwframe+0x76/0x7e [86.862703] -> #2 (&root->kernfs_rwsem){++++}-{3:3}: [86.862725] down_write+0x3e/0xf0 [86.862738] kernfs_add_one+0x30/0x3c0 [86.862751] kernfs_create_dir_ns+0x53/0xb0 [86.862765] internal_create_group+0x134/0x4c0 [86.862779] sysfs_create_group+0x13/0x20 [86.862792] topology_add_dev+0x1d/0x30 [86.862806] cpuhp_invoke_callback+0x4b5/0x850 [86.862822] cpuhp_issue_call+0xbf/0x1f0 [86.862836] __cpuhp_setup_state_cpuslocked+0x111/0x320 [86.862852] __cpuhp_setup_state+0xb0/0x220 [86.862866] topology_sysfs_init+0x30/0x50 [86.862879] do_one_initcall+0x60/0x3f0 [86.862893] kernel_init_freeable+0x3cd/0x680 [86.862908] kernel_init+0x1b/0x200 [86.862921] ret_from_fork+0x47/0x70 [86.862934] ret_from_fork_asm+0x1a/0x30 [86.862947] -> #1 (cpuhp_state_mutex){+.+.}-{3:3}: [86.862969] __mutex_lock+0xaa/0xed0 [86.862982] mutex_lock_nested+0x1b/0x30 [86.862995] __cpuhp_setup_state_cpuslocked+0x67/0x320 [86.863012] __cpuhp_setup_state+0xb0/0x220 [86.863026] page_alloc_init_cpuhp+0x2d/0x60 [86.863041] mm_core_init+0x22/0x2d0 [86.863054] start_kernel+0x576/0xbd0 [86.863068] x86_64_start_reservations+0x18/0x30 [86.863084] x86_64_start_kernel+0xbf/0x110 [86.863098] common_startup_64+0x13e/0x141 [86.863114] -> #0 (cpu_hotplug_lock){++++}-{0:0}: [86.863135] __lock_acquire+0x1635/0x2810 [86.863152] lock_acquire+0xc4/0x2f0 [86.863166] cpus_read_lock+0x41/0x100 [86.863180] stop_machine+0x1c/0x50 [86.863194] bxt_vtd_ggtt_insert_entries__BKL+0x3b/0x60 [i915] [86.863987] intel_ggtt_bind_vma+0x43/0x70 [i915] [86.864735] __vma_bind+0x55/0x70 [i915] [86.865510] fence_work+0x26/0xa0 [i915] [86.866248] fence_notify+0xa1/0x140 [i915] [86.866983] __i915_sw_fence_complete+0x8f/0x270 [i915] [86.867719] i915_sw_fence_commit+0x39/0x60 [i915] [86.868453] i915_vma_pin_ww+0x462/0x1360 [i915] [86.869228] i915_vma_pin.constprop.0+0x133/0x1d0 [i915] [86.870001] initial_plane_vma+0x307/0x840 [i915] [86.870774] intel_initial_plane_config+0x33f/0x670 [i915] [86.871546] intel_display_driver_probe_nogem+0x1c6/0x260 [i915] [86.872330] i915_driver_probe+0x7fa/0xe80 [i915] [86.873057] i915_pci_probe+0xe6/0x220 [i915] [86.873782] local_pci_probe+0x47/0xb0 [86.873802] pci_device_probe+0xf3/0x260 [86.873817] really_probe+0xf1/0x3c0 [86.873833] __driver_probe_device+0x8c/0x180 [86.873848] driver_probe_device+0x24/0xd0 [86.873862] __driver_attach+0x10f/0x220 [86.873876] bus_for_each_dev+0x7f/0xe0 [86.873892] driver_attach+0x1e/0x30 [86.873904] bus_add_driver+0x151/0x290 [86.873917] driver_register+0x5e/0x130 [86.873931] __pci_register_driver+0x7d/0x90 [86.873945] i915_pci_register_driver+0x23/0x30 [i915] [86.874678] i915_init+0x37/0x120 [i915] [86.875347] do_one_initcall+0x60/0x3f0 [86.875369] do_init_module+0x97/0x2a0 [86.875385] load_module+0x2c54/0x2d80 [86.875398] init_module_from_file+0x96/0xe0 [86.875413] idempotent_init_module+0x117/0x330 [86.875426] __x64_sys_finit_module+0x77/0x100 [86.875440] x64_sys_call+0x24de/0x2660 [86.875454] do_syscall_64+0x91/0xe90 [86.875470] entry_SYSCALL_64_after_hwframe+0x76/0x7e [86.875486] other info that might help us debug this: [86.875502] Chain exists of: cpu_hotplug_lock --> reservation_ww_class_acquire --> reservation_ww_class_mutex [86.875539] Possible unsafe locking scenario: [86.875552] CPU0 CPU1 [86.875563] ---- ---- [86.875573] lock(reservation_ww_class_mutex); [86.875588] lock(reservation_ww_class_acquire); [86.875606] lock(reservation_ww_class_mutex); [86.875624] rlock(cpu_hotplug_lock); [86.875637] *** DEADLOCK *** [86.875650] 3 locks held by i915_module_loa/1432: [86.875663] #0: ffff888101f5c1b0 (&dev->mutex){....}-{3:3}, at: __driver_attach+0x104/0x220 [86.875699] #1: ffffc90002e0b4a0 (reservation_ww_class_acquire){+.+.}-{0:0}, at: i915_vma_pin.constprop.0+0x39/0x1d0 [i915] [86.876512] #2: ffffc90002e0b4c8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: i915_vma_pin.constprop.0+0x39/0x1d0 [i915] [86.877305] stack backtrace: [86.877326] CPU: 0 UID: 0 PID: 1432 Comm: i915_module_loa Tainted: G U 6.15.0-rc5-CI_DRM_16515-gca0305cadc2d+ #1 PREEMPT(voluntary) [86.877334] Tainted: [U]=USER [86.877336] Hardware name: /NUC5CPYB, BIOS PYBSWCEL.86A.0079.2020.0420.1316 04/20/2020 [86.877339] Call Trace: [86.877344] <TASK> [86.877353] dump_stack_lvl+0x91/0xf0 [86.877364] dump_stack+0x10/0x20 [86.877369] print_circular_bug+0x285/0x360 [86.877379] check_noncircular+0x135/0x150 [86.877390] __lock_acquire+0x1635/0x2810 [86.877403] lock_acquire+0xc4/0x2f0 [86.877408] ? stop_machine+0x1c/0x50 [86.877422] ? __pfx_bxt_vtd_ggtt_insert_entries__cb+0x10/0x10 [i915] [86.878173] cpus_read_lock+0x41/0x100 [86.878182] ? stop_machine+0x1c/0x50 [86.878191] ? __pfx_bxt_vtd_ggtt_insert_entries__cb+0x10/0x10 [i915] [86.878916] stop_machine+0x1c/0x50 [86.878927] bxt_vtd_ggtt_insert_entries__BKL+0x3b/0x60 [i915] [86.879652] intel_ggtt_bind_vma+0x43/0x70 [i915] [86.880375] __vma_bind+0x55/0x70 [i915] [86.881133] fence_work+0x26/0xa0 [i915] [86.881851] fence_notify+0xa1/0x140 [i915] [86.882566] __i915_sw_fence_complete+0x8f/0x270 [i915] [86.883286] i915_sw_fence_commit+0x39/0x60 [i915] [86.884003] i915_vma_pin_ww+0x462/0x1360 [i915] [86.884756] ? i915_vma_pin.constprop.0+0x6c/0x1d0 [i915] [86.885513] i915_vma_pin.constprop.0+0x133/0x1d0 [i915] [86.886281] initial_plane_vma+0x307/0x840 [i915] [86.887049] intel_initial_plane_config+0x33f/0x670 [i915] [86.887819] intel_display_driver_probe_nogem+0x1c6/0x260 [i915] [86.888587] i915_driver_probe+0x7fa/0xe80 [i915] [86.889293] ? mutex_unlock+0x12/0x20 [86.889301] ? drm_privacy_screen_get+0x171/0x190 [86.889308] ? acpi_dev_found+0x66/0x80 [86.889321] i915_pci_probe+0xe6/0x220 [i915] [86.890038] local_pci_probe+0x47/0xb0 [86.890049] pci_device_probe+0xf3/0x260 [86.890058] really_probe+0xf1/0x3c0 [86.890067] __driver_probe_device+0x8c/0x180 [86.890072] driver_probe_device+0x24/0xd0 [86.890078] __driver_attach+0x10f/0x220 [86.890083] ? __pfx___driver_attach+0x10/0x10 [86.890088] bus_for_each_dev+0x7f/0xe0 [86.890097] driver_attach+0x1e/0x30 [86.890101] bus_add_driver+0x151/0x290 [86.890107] driver_register+0x5e/0x130 [86.890113] __pci_register_driver+0x7d/0x90 [86.890119] i915_pci_register_driver+0x23/0x30 [i915] [86.890833] i915_init+0x37/0x120 [i915] [86.891482] ? __pfx_i915_init+0x10/0x10 [i915] [86.892135] do_one_initcall+0x60/0x3f0 [86.892145] ? __kmalloc_cache_noprof+0x33f/0x470 [86.892157] do_init_module+0x97/0x2a0 [86.892164] load_module+0x2c54/0x2d80 [86.892168] ? __kernel_read+0x15c/0x300 [86.892185] ? kernel_read_file+0x2b1/0x320 [86.892195] init_module_from_file+0x96/0xe0 [86.892199] ? init_module_from_file+0x96/0xe0 [86.892211] idempotent_init_module+0x117/0x330 [86.892224] __x64_sys_finit_module+0x77/0x100 [86.892230] x64_sys_call+0x24de/0x2660 [86.892236] do_syscall_64+0x91/0xe90 [86.892243] ? irqentry_exit+0x77/0xb0 [86.892249] ? sysvec_apic_timer_interrupt+0x57/0xc0 [86.892256] entry_SYSCALL_64_after_hwframe+0x76/0x7e [86.892261] RIP: 0033:0x7303e1b2725d [86.892271] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [86.892276] RSP: 002b:00007ffddd1fdb38 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [86.892281] RAX: ffffffffffffffda RBX: 00005d771d88fd90 RCX: 00007303e1b2725d [86.892285] RDX: 0000000000000000 RSI: 00005d771d893aa0 RDI: 000000000000000c [86.892287] RBP: 00007ffddd1fdbf0 R08: 0000000000000040 R09: 00007ffddd1fdb80 [86.892289] R10: 00007303e1c03b20 R11: 0000000000000246 R12: 00005d771d893aa0 [86.892292] R13: 0000000000000000 R14: 00005d771d88f0d0 R15: 00005d771d895710 [86.892304] </TASK> Call asynchronous variant of dma_fence_work_commit() in that case. v3: Provide more verbose in-line comment (Andi), - mention target environments in commit message. Fixes: 7d1c261 ("drm/i915: Take reservation lock around i915_vma_pin.") Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14985 Cc: Andi Shyti <andi.shyti@kernel.org> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Reviewed-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com> Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com> Acked-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://lore.kernel.org/r/20251023082925.351307-6-janusz.krzysztofik@linux.intel.com (cherry picked from commit 648ef13) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
kunbus-gitlab-sync
pushed a commit
that referenced
this pull request
Dec 11, 2025
commit 4aa1714 upstream. Typically copynotify stateid is freed either when parent's stateid is being close/freed or in nfsd4_laundromat if the stateid hasn't been used in a lease period. However, in case when the server got an OPEN (which created a parent stateid), followed by a COPY_NOTIFY using that stateid, followed by a client reboot. New client instance while doing CREATE_SESSION would force expire previous state of this client. It leads to the open state being freed thru release_openowner-> nfs4_free_ol_stateid() and it finds that it still has copynotify stateid associated with it. We currently print a warning and is triggerred WARNING: CPU: 1 PID: 8858 at fs/nfsd/nfs4state.c:1550 nfs4_free_ol_stateid+0xb0/0x100 [nfsd] This patch, instead, frees the associated copynotify stateid here. If the parent stateid is freed (without freeing the copynotify stateids associated with it), it leads to the list corruption when laundromat ends up freeing the copynotify state later. [ 1626.839430] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP [ 1626.842828] Modules linked in: nfnetlink_queue nfnetlink_log bluetooth cfg80211 rpcrdma rdma_cm iw_cm ib_cm ib_core nfsd nfs_acl lockd grace nfs_localio ext4 crc16 mbcache jbd2 overlay uinput snd_seq_dummy snd_hrtimer qrtr rfkill vfat fat uvcvideo snd_hda_codec_generic videobuf2_vmalloc videobuf2_memops snd_hda_intel uvc snd_intel_dspcfg videobuf2_v4l2 videobuf2_common snd_hda_codec snd_hda_core videodev snd_hwdep snd_seq mc snd_seq_device snd_pcm snd_timer snd soundcore sg loop auth_rpcgss vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vmw_vmci vsock xfs 8021q garp stp llc mrp nvme ghash_ce e1000e nvme_core sr_mod nvme_keyring nvme_auth cdrom vmwgfx drm_ttm_helper ttm sunrpc dm_mirror dm_region_hash dm_log iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi fuse dm_multipath dm_mod nfnetlink [ 1626.855594] CPU: 2 UID: 0 PID: 199 Comm: kworker/u24:33 Kdump: loaded Tainted: G B W 6.17.0-rc7+ #22 PREEMPT(voluntary) [ 1626.857075] Tainted: [B]=BAD_PAGE, [W]=WARN [ 1626.857573] Hardware name: VMware, Inc. VMware20,1/VBSA, BIOS VMW201.00V.24006586.BA64.2406042154 06/04/2024 [ 1626.858724] Workqueue: nfsd4 laundromat_main [nfsd] [ 1626.859304] pstate: 6140000 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) [ 1626.860010] pc : __list_del_entry_valid_or_report+0x148/0x200 [ 1626.860601] lr : __list_del_entry_valid_or_report+0x148/0x200 [ 1626.861182] sp : ffff8000881d7a40 [ 1626.861521] x29: ffff8000881d7a40 x28: 0000000000000018 x27: ffff0000c2a98200 [ 1626.862260] x26: 0000000000000600 x25: 0000000000000000 x24: ffff8000881d7b20 [ 1626.862986] x23: ffff0000c2a981e8 x22: 1fffe00012410e7d x21: ffff0000920873e8 [ 1626.863701] x20: ffff0000920873e8 x19: ffff000086f22998 x18: 0000000000000000 [ 1626.864421] x17: 20747562202c3839 x16: 3932326636383030 x15: 3030666666662065 [ 1626.865092] x14: 6220646c756f6873 x13: 0000000000000001 x12: ffff60004fd9e4a3 [ 1626.865713] x11: 1fffe0004fd9e4a2 x10: ffff60004fd9e4a2 x9 : dfff800000000000 [ 1626.866320] x8 : 00009fffb0261b5e x7 : ffff00027ecf2513 x6 : 0000000000000001 [ 1626.866938] x5 : ffff00027ecf2510 x4 : ffff60004fd9e4a3 x3 : 0000000000000000 [ 1626.867553] x2 : 0000000000000000 x1 : ffff000096069640 x0 : 000000000000006d [ 1626.868167] Call trace: [ 1626.868382] __list_del_entry_valid_or_report+0x148/0x200 (P) [ 1626.868876] _free_cpntf_state_locked+0xd0/0x268 [nfsd] [ 1626.869368] nfs4_laundromat+0x6f8/0x1058 [nfsd] [ 1626.869813] laundromat_main+0x24/0x60 [nfsd] [ 1626.870231] process_one_work+0x584/0x1050 [ 1626.870595] worker_thread+0x4c4/0xc60 [ 1626.870893] kthread+0x2f8/0x398 [ 1626.871146] ret_from_fork+0x10/0x20 [ 1626.871422] Code: aa1303e1 aa1403e3 910e8000 97bc55d7 (d4210000) [ 1626.871892] SMP: stopping secondary CPUs Reported-by: rtm@csail.mit.edu Closes: https://lore.kernel.org/linux-nfs/d8f064c1-a26f-4eed-b4f0-1f7f608f415f@oracle.com/T/#t Fixes: 624322f ("NFSD add COPY_NOTIFY operation") Cc: stable@vger.kernel.org Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kunbus-gitlab-sync
pushed a commit
that referenced
this pull request
Dec 11, 2025
…or slabobj_ext commit 1abbdf3 upstream. When alloc_slab_obj_exts() fails and then later succeeds in allocating a slab extension vector, it calls handle_failed_objexts_alloc() to mark all objects in the vector as empty. As a result all objects in this slab (slabA) will have their extensions set to CODETAG_EMPTY. Later on if this slabA is used to allocate a slabobj_ext vector for another slab (slabB), we end up with the slabB->obj_exts pointing to a slabobj_ext vector that itself has a non-NULL slabobj_ext equal to CODETAG_EMPTY. When slabB gets freed, free_slab_obj_exts() is called to free slabB->obj_exts vector. free_slab_obj_exts() calls mark_objexts_empty(slabB->obj_exts) which will generate a warning because it expects slabobj_ext vectors to have a NULL obj_ext, not CODETAG_EMPTY. Modify mark_objexts_empty() to skip the warning and setting the obj_ext value if it's already set to CODETAG_EMPTY. To quickly detect this WARN, I modified the code from WARN_ON(slab_exts[offs].ref.ct) to BUG_ON(slab_exts[offs].ref.ct == 1); We then obtained this message: [21630.898561] ------------[ cut here ]------------ [21630.898596] kernel BUG at mm/slub.c:2050! [21630.898611] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP [21630.900372] Modules linked in: squashfs isofs vfio_iommu_type1 vhost_vsock vfio vhost_net vmw_vsock_virtio_transport_common vhost tap vhost_iotlb iommufd vsock binfmt_misc nfsv3 nfs_acl nfs lockd grace netfs tls rds dns_resolver tun brd overlay ntfs3 exfat btrfs blake2b_generic xor xor_neon raid6_pq loop sctp ip6_udp_tunnel udp_tunnel nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables rfkill ip_set sunrpc vfat fat joydev sg sch_fq_codel nfnetlink virtio_gpu sr_mod cdrom drm_client_lib virtio_dma_buf drm_shmem_helper drm_kms_helper drm ghash_ce backlight virtio_net virtio_blk virtio_scsi net_failover virtio_console failover virtio_mmio dm_mirror dm_region_hash dm_log dm_multipath dm_mod fuse i2c_dev virtio_pci virtio_pci_legacy_dev virtio_pci_modern_dev virtio virtio_ring autofs4 aes_neon_bs aes_ce_blk [last unloaded: hwpoison_inject] [21630.909177] CPU: 3 UID: 0 PID: 3787 Comm: kylin-process-m Kdump: loaded Tainted: G W 6.18.0-rc1+ #74 PREEMPT(voluntary) [21630.910495] Tainted: [W]=WARN [21630.910867] Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022 [21630.911625] pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [21630.912392] pc : __free_slab+0x228/0x250 [21630.912868] lr : __free_slab+0x18c/0x250[21630.913334] sp : ffff8000a02f73e0 [21630.913830] x29: ffff8000a02f73e0 x28: fffffdffc43fc800 x27: ffff0000c0011c40 [21630.914677] x26: ffff0000c000cac0 x25: ffff00010fe5e5f0 x24: ffff000102199b40 [21630.915469] x23: 0000000000000003 x22: 0000000000000003 x21: ffff0000c0011c40 [21630.916259] x20: fffffdffc4086600 x19: fffffdffc43fc800 x18: 0000000000000000 [21630.917048] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 [21630.917837] x14: 0000000000000000 x13: 0000000000000000 x12: ffff70001405ee66 [21630.918640] x11: 1ffff0001405ee65 x10: ffff70001405ee65 x9 : ffff800080a295dc [21630.919442] x8 : ffff8000a02f7330 x7 : 0000000000000000 x6 : 0000000000003000 [21630.920232] x5 : 0000000024924925 x4 : 0000000000000001 x3 : 0000000000000007 [21630.921021] x2 : 0000000000001b40 x1 : 000000000000001f x0 : 0000000000000001 [21630.921810] Call trace: [21630.922130] __free_slab+0x228/0x250 (P) [21630.922669] free_slab+0x38/0x118 [21630.923079] free_to_partial_list+0x1d4/0x340 [21630.923591] __slab_free+0x24c/0x348 [21630.924024] ___cache_free+0xf0/0x110 [21630.924468] qlist_free_all+0x78/0x130 [21630.924922] kasan_quarantine_reduce+0x114/0x148 [21630.925525] __kasan_slab_alloc+0x7c/0xb0 [21630.926006] kmem_cache_alloc_noprof+0x164/0x5c8 [21630.926699] __alloc_object+0x44/0x1f8 [21630.927153] __create_object+0x34/0xc8 [21630.927604] kmemleak_alloc+0xb8/0xd8 [21630.928052] kmem_cache_alloc_noprof+0x368/0x5c8 [21630.928606] getname_flags.part.0+0xa4/0x610 [21630.929112] getname_flags+0x80/0xd8 [21630.929557] vfs_fstatat+0xc8/0xe0 [21630.929975] __do_sys_newfstatat+0xa0/0x100 [21630.930469] __arm64_sys_newfstatat+0x90/0xd8 [21630.931046] invoke_syscall+0xd4/0x258 [21630.931685] el0_svc_common.constprop.0+0xb4/0x240 [21630.932467] do_el0_svc+0x48/0x68 [21630.932972] el0_svc+0x40/0xe0 [21630.933472] el0t_64_sync_handler+0xa0/0xe8 [21630.934151] el0t_64_sync+0x1ac/0x1b0 [21630.934923] Code: aa1803e0 97ffef2b a9446bf9 17ffff9c (d4210000) [21630.936461] SMP: stopping secondary CPUs [21630.939550] Starting crashdump kernel... [21630.940108] Bye! Link: https://lkml.kernel.org/r/20251029014317.1533488-1-hao.ge@linux.dev Fixes: 09c4656 ("codetag: debug: introduce OBJEXTS_ALLOC_FAIL to mark failed slab_ext allocations") Signed-off-by: Hao Ge <gehao@kylinos.cn> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Christoph Lameter (Ampere) <cl@gentwo.org> Cc: David Rientjes <rientjes@google.com> Cc: gehao <gehao@kylinos.cn> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kunbus-gitlab-sync
pushed a commit
that referenced
this pull request
Dec 11, 2025
commit 00fbff7 upstream. When crashkernel is configured with a high reservation, shrinking its value below the low crashkernel reservation causes two issues: 1. Invalid crashkernel resource objects 2. Kernel crash if crashkernel shrinking is done twice For example, with crashkernel=200M,high, the kernel reserves 200MB of high memory and some default low memory (say 256MB). The reservation appears as: cat /proc/iomem | grep -i crash af000000-beffffff : Crash kernel 433000000-43f7fffff : Crash kernel If crashkernel is then shrunk to 50MB (echo 52428800 > /sys/kernel/kexec_crash_size), /proc/iomem still shows 256MB reserved: af000000-beffffff : Crash kernel Instead, it should show 50MB: af000000-b21fffff : Crash kernel Further shrinking crashkernel to 40MB causes a kernel crash with the following trace (x86): BUG: kernel NULL pointer dereference, address: 0000000000000038 PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI <snip...> Call Trace: <TASK> ? __die_body.cold+0x19/0x27 ? page_fault_oops+0x15a/0x2f0 ? search_module_extables+0x19/0x60 ? search_bpf_extables+0x5f/0x80 ? exc_page_fault+0x7e/0x180 ? asm_exc_page_fault+0x26/0x30 ? __release_resource+0xd/0xb0 release_resource+0x26/0x40 __crash_shrink_memory+0xe5/0x110 crash_shrink_memory+0x12a/0x190 kexec_crash_size_store+0x41/0x80 kernfs_fop_write_iter+0x141/0x1f0 vfs_write+0x294/0x460 ksys_write+0x6d/0xf0 <snip...> This happens because __crash_shrink_memory()/kernel/crash_core.c incorrectly updates the crashk_res resource object even when crashk_low_res should be updated. Fix this by ensuring the correct crashkernel resource object is updated when shrinking crashkernel memory. Link: https://lkml.kernel.org/r/20251101193741.289252-1-sourabhjain@linux.ibm.com Fixes: 16c6006 ("kexec: enable kexec_crash_size to support two crash kernel regions") Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Acked-by: Baoquan He <bhe@redhat.com> Cc: Zhen Lei <thunder.leizhen@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kunbus-gitlab-sync
pushed a commit
that referenced
this pull request
Dec 11, 2025
commit ec33b59 upstream. The kernel test has reported: BUG: unable to handle page fault for address: fffba000 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page *pde = 03171067 *pte = 00000000 Oops: Oops: 0002 [#1] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Tainted: G T 6.18.0-rc2-00031-gec7f31b2a2d3 #1 NONE a1d066dfe789f54bc7645c7989957d2bdee593ca Tainted: [T]=RANDSTRUCT Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 EIP: memset (arch/x86/include/asm/string_32.h:168 arch/x86/lib/memcpy_32.c:17) Code: a5 8b 4d f4 83 e1 03 74 02 f3 a4 83 c4 04 5e 5f 5d 2e e9 73 41 01 00 90 90 90 3e 8d 74 26 00 55 89 e5 57 56 89 c6 89 d0 89 f7 <f3> aa 89 f0 5e 5f 5d 2e e9 53 41 01 00 cc cc cc 55 89 e5 53 57 56 EAX: 0000006b EBX: 00000015 ECX: 001fefff EDX: 0000006b ESI: fffb9000 EDI: fffba000 EBP: c611fbf0 ESP: c611fbe8 DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 EFLAGS: 00010287 CR0: 80050033 CR2: fffba000 CR3: 0316e000 CR4: 00040690 Call Trace: poison_element (mm/mempool.c:83 mm/mempool.c:102) mempool_init_node (mm/mempool.c:142 mm/mempool.c:226) mempool_init_noprof (mm/mempool.c:250 (discriminator 1)) ? mempool_alloc_pages (mm/mempool.c:640) bio_integrity_initfn (block/bio-integrity.c:483 (discriminator 8)) ? mempool_alloc_pages (mm/mempool.c:640) do_one_initcall (init/main.c:1283) Christoph found out this is due to the poisoning code not dealing properly with CONFIG_HIGHMEM because only the first page is mapped but then the whole potentially high-order page is accessed. We could give up on HIGHMEM here, but it's straightforward to fix this with a loop that's mapping, poisoning or checking and unmapping individual pages. Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202511111411.9ebfa1ba-lkp@intel.com Analyzed-by: Christoph Hellwig <hch@lst.de> Fixes: bdfedb7 ("mm, mempool: poison elements backed by slab allocator") Cc: stable@vger.kernel.org Tested-by: kernel test robot <oliver.sang@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://patch.msgid.link/20251113-mempool-poison-v1-1-233b3ef984c3@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kunbus-gitlab-sync
pushed a commit
that referenced
this pull request
Dec 11, 2025
commit 0a2c549 upstream. nvme_fc_delete_assocation() waits for pending I/O to complete before returning, and an error can cause ->ioerr_work to be queued after cancel_work_sync() had been called. Move the call to cancel_work_sync() to be after nvme_fc_delete_association() to ensure ->ioerr_work is not running when the nvme_fc_ctrl object is freed. Otherwise the following can occur: [ 1135.911754] list_del corruption, ff2d24c8093f31f8->next is NULL [ 1135.917705] ------------[ cut here ]------------ [ 1135.922336] kernel BUG at lib/list_debug.c:52! [ 1135.926784] Oops: invalid opcode: 0000 [#1] SMP NOPTI [ 1135.931851] CPU: 48 UID: 0 PID: 726 Comm: kworker/u449:23 Kdump: loaded Not tainted 6.12.0 #1 PREEMPT(voluntary) [ 1135.943490] Hardware name: Dell Inc. PowerEdge R660/0HGTK9, BIOS 2.5.4 01/16/2025 [ 1135.950969] Workqueue: 0x0 (nvme-wq) [ 1135.954673] RIP: 0010:__list_del_entry_valid_or_report.cold+0xf/0x6f [ 1135.961041] Code: c7 c7 98 68 72 94 e8 26 45 fe ff 0f 0b 48 c7 c7 70 68 72 94 e8 18 45 fe ff 0f 0b 48 89 fe 48 c7 c7 80 69 72 94 e8 07 45 fe ff <0f> 0b 48 89 d1 48 c7 c7 a0 6a 72 94 48 89 c2 e8 f3 44 fe ff 0f 0b [ 1135.979788] RSP: 0018:ff579b19482d3e50 EFLAGS: 00010046 [ 1135.985015] RAX: 0000000000000033 RBX: ff2d24c8093f31f0 RCX: 0000000000000000 [ 1135.992148] RDX: 0000000000000000 RSI: ff2d24d6bfa1d0c0 RDI: ff2d24d6bfa1d0c0 [ 1135.999278] RBP: ff2d24c8093f31f8 R08: 0000000000000000 R09: ffffffff951e2b08 [ 1136.006413] R10: ffffffff95122ac8 R11: 0000000000000003 R12: ff2d24c78697c100 [ 1136.013546] R13: fffffffffffffff8 R14: 0000000000000000 R15: ff2d24c78697c0c0 [ 1136.020677] FS: 0000000000000000(0000) GS:ff2d24d6bfa00000(0000) knlGS:0000000000000000 [ 1136.028765] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1136.034510] CR2: 00007fd207f90b80 CR3: 000000163ea22003 CR4: 0000000000f73ef0 [ 1136.041641] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1136.048776] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 [ 1136.055910] PKRU: 55555554 [ 1136.058623] Call Trace: [ 1136.061074] <TASK> [ 1136.063179] ? show_trace_log_lvl+0x1b0/0x2f0 [ 1136.067540] ? show_trace_log_lvl+0x1b0/0x2f0 [ 1136.071898] ? move_linked_works+0x4a/0xa0 [ 1136.075998] ? __list_del_entry_valid_or_report.cold+0xf/0x6f [ 1136.081744] ? __die_body.cold+0x8/0x12 [ 1136.085584] ? die+0x2e/0x50 [ 1136.088469] ? do_trap+0xca/0x110 [ 1136.091789] ? do_error_trap+0x65/0x80 [ 1136.095543] ? __list_del_entry_valid_or_report.cold+0xf/0x6f [ 1136.101289] ? exc_invalid_op+0x50/0x70 [ 1136.105127] ? __list_del_entry_valid_or_report.cold+0xf/0x6f [ 1136.110874] ? asm_exc_invalid_op+0x1a/0x20 [ 1136.115059] ? __list_del_entry_valid_or_report.cold+0xf/0x6f [ 1136.120806] move_linked_works+0x4a/0xa0 [ 1136.124733] worker_thread+0x216/0x3a0 [ 1136.128485] ? __pfx_worker_thread+0x10/0x10 [ 1136.132758] kthread+0xfa/0x240 [ 1136.135904] ? __pfx_kthread+0x10/0x10 [ 1136.139657] ret_from_fork+0x31/0x50 [ 1136.143236] ? __pfx_kthread+0x10/0x10 [ 1136.146988] ret_from_fork_asm+0x1a/0x30 [ 1136.150915] </TASK> Fixes: 19fce04 ("nvme-fc: avoid calling _nvme_fc_abort_outstanding_ios from interrupt context") Cc: stable@vger.kernel.org Tested-by: Marco Patalano <mpatalan@redhat.com> Reviewed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kunbus-gitlab-sync
pushed a commit
that referenced
this pull request
Dec 11, 2025
commit e696518 upstream. If the allocation of tl_hba->sh fails in tcm_loop_driver_probe() and we attempt to dereference it in tcm_loop_tpg_address_show() we will get a segfault, see below for an example. So, check tl_hba->sh before dereferencing it. Unable to allocate struct scsi_host BUG: kernel NULL pointer dereference, address: 0000000000000194 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 1 PID: 8356 Comm: tokio-runtime-w Not tainted 6.6.104.2-4.azl3 #1 Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 09/28/2024 RIP: 0010:tcm_loop_tpg_address_show+0x2e/0x50 [tcm_loop] ... Call Trace: <TASK> configfs_read_iter+0x12d/0x1d0 [configfs] vfs_read+0x1b5/0x300 ksys_read+0x6f/0xf0 ... Cc: stable@vger.kernel.org Fixes: 2628b35 ("tcm_loop: Show address of tpg in configfs") Signed-off-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Allen Pais <apais@linux.microsoft.com> Link: https://patch.msgid.link/1762370746-6304-1-git-send-email-hamzamahfooz@linux.microsoft.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kunbus-gitlab-sync
pushed a commit
that referenced
this pull request
Dec 11, 2025
[ Upstream commit dfe28c4 ] The validation of the set(nsh(...)) action is completely wrong. It runs through the nsh_key_put_from_nlattr() function that is the same function that validates NSH keys for the flow match and the push_nsh() action. However, the set(nsh(...)) has a very different memory layout. Nested attributes in there are doubled in size in case of the masked set(). That makes proper validation impossible. There is also confusion in the code between the 'masked' flag, that says that the nested attributes are doubled in size containing both the value and the mask, and the 'is_mask' that says that the value we're parsing is the mask. This is causing kernel crash on trying to write into mask part of the match with SW_FLOW_KEY_PUT() during validation, while validate_nsh() doesn't allocate any memory for it: BUG: kernel NULL pointer dereference, address: 0000000000000018 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 1c2383067 P4D 1c2383067 PUD 20b703067 PMD 0 Oops: Oops: 0000 [#1] SMP NOPTI CPU: 8 UID: 0 Kdump: loaded Not tainted 6.17.0-rc4+ #107 PREEMPT(voluntary) RIP: 0010:nsh_key_put_from_nlattr+0x19d/0x610 [openvswitch] Call Trace: <TASK> validate_nsh+0x60/0x90 [openvswitch] validate_set.constprop.0+0x270/0x3c0 [openvswitch] __ovs_nla_copy_actions+0x477/0x860 [openvswitch] ovs_nla_copy_actions+0x8d/0x100 [openvswitch] ovs_packet_cmd_execute+0x1cc/0x310 [openvswitch] genl_family_rcv_msg_doit+0xdb/0x130 genl_family_rcv_msg+0x14b/0x220 genl_rcv_msg+0x47/0xa0 netlink_rcv_skb+0x53/0x100 genl_rcv+0x24/0x40 netlink_unicast+0x280/0x3b0 netlink_sendmsg+0x1f7/0x430 ____sys_sendmsg+0x36b/0x3a0 ___sys_sendmsg+0x87/0xd0 __sys_sendmsg+0x6d/0xd0 do_syscall_64+0x7b/0x2c0 entry_SYSCALL_64_after_hwframe+0x76/0x7e The third issue with this process is that while trying to convert the non-masked set into masked one, validate_set() copies and doubles the size of the OVS_KEY_ATTR_NSH as if it didn't have any nested attributes. It should be copying each nested attribute and doubling them in size independently. And the process must be properly reversed during the conversion back from masked to a non-masked variant during the flow dump. In the end, the only two outcomes of trying to use this action are either validation failure or a kernel crash. And if somehow someone manages to install a flow with such an action, it will most definitely not do what it is supposed to, since all the keys and the masks are mixed up. Fixing all the issues is a complex task as it requires re-writing most of the validation code. Given that and the fact that this functionality never worked since introduction, let's just remove it altogether. It's better to re-introduce it later with a proper implementation instead of trying to fix it in stable releases. Fixes: b2d0f5d ("openvswitch: enable NSH support") Reported-by: Junvy Yang <zhuque@tencent.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Eelco Chaudron <echaudro@redhat.com> Reviewed-by: Aaron Conole <aconole@redhat.com> Link: https://patch.msgid.link/20251112112246.95064-1-i.maximets@ovn.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
kunbus-gitlab-sync
pushed a commit
that referenced
this pull request
Dec 11, 2025
[ Upstream commit f94c1a1 ] The function devl_rate_nodes_destroy is documented to "Unset parent for all rate objects". However, it was only calling the driver-specific `rate_leaf_parent_set` or `rate_node_parent_set` ops and decrementing the parent's refcount, without actually setting the `devlink_rate->parent` pointer to NULL. This leaves a dangling pointer in the `devlink_rate` struct, which cause refcount error in netdevsim[1] and mlx5[2]. In addition, this is inconsistent with the behavior of `devlink_nl_rate_parent_node_set`, where the parent pointer is correctly cleared. This patch fixes the issue by explicitly setting `devlink_rate->parent` to NULL after notifying the driver, thus fulfilling the function's documented behavior for all rate objects. [1] repro steps: echo 1 > /sys/bus/netdevsim/new_device devlink dev eswitch set netdevsim/netdevsim1 mode switchdev echo 1 > /sys/bus/netdevsim/devices/netdevsim1/sriov_numvfs devlink port function rate add netdevsim/netdevsim1/test_node devlink port function rate set netdevsim/netdevsim1/128 parent test_node echo 1 > /sys/bus/netdevsim/del_device dmesg: refcount_t: decrement hit 0; leaking memory. WARNING: CPU: 8 PID: 1530 at lib/refcount.c:31 refcount_warn_saturate+0x42/0xe0 CPU: 8 UID: 0 PID: 1530 Comm: bash Not tainted 6.18.0-rc4+ #1 NONE Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:refcount_warn_saturate+0x42/0xe0 Call Trace: <TASK> devl_rate_leaf_destroy+0x8d/0x90 __nsim_dev_port_del+0x6c/0x70 [netdevsim] nsim_dev_reload_destroy+0x11c/0x140 [netdevsim] nsim_drv_remove+0x2b/0xb0 [netdevsim] device_release_driver_internal+0x194/0x1f0 bus_remove_device+0xc6/0x130 device_del+0x159/0x3c0 device_unregister+0x1a/0x60 del_device_store+0x111/0x170 [netdevsim] kernfs_fop_write_iter+0x12e/0x1e0 vfs_write+0x215/0x3d0 ksys_write+0x5f/0xd0 do_syscall_64+0x55/0x10f0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 [2] devlink dev eswitch set pci/0000:08:00.0 mode switchdev devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 1000 devlink port function rate add pci/0000:08:00.0/group1 devlink port function rate set pci/0000:08:00.0/32768 parent group1 modprobe -r mlx5_ib mlx5_fwctl mlx5_core dmesg: refcount_t: decrement hit 0; leaking memory. WARNING: CPU: 7 PID: 16151 at lib/refcount.c:31 refcount_warn_saturate+0x42/0xe0 CPU: 7 UID: 0 PID: 16151 Comm: bash Not tainted 6.17.0-rc7_for_upstream_min_debug_2025_10_02_12_44 #1 NONE Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 RIP: 0010:refcount_warn_saturate+0x42/0xe0 Call Trace: <TASK> devl_rate_leaf_destroy+0x8d/0x90 mlx5_esw_offloads_devlink_port_unregister+0x33/0x60 [mlx5_core] mlx5_esw_offloads_unload_rep+0x3f/0x50 [mlx5_core] mlx5_eswitch_unload_sf_vport+0x40/0x90 [mlx5_core] mlx5_sf_esw_event+0xc4/0x120 [mlx5_core] notifier_call_chain+0x33/0xa0 blocking_notifier_call_chain+0x3b/0x50 mlx5_eswitch_disable_locked+0x50/0x110 [mlx5_core] mlx5_eswitch_disable+0x63/0x90 [mlx5_core] mlx5_unload+0x1d/0x170 [mlx5_core] mlx5_uninit_one+0xa2/0x130 [mlx5_core] remove_one+0x78/0xd0 [mlx5_core] pci_device_remove+0x39/0xa0 device_release_driver_internal+0x194/0x1f0 unbind_store+0x99/0xa0 kernfs_fop_write_iter+0x12e/0x1e0 vfs_write+0x215/0x3d0 ksys_write+0x5f/0xd0 do_syscall_64+0x53/0x1f0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Fixes: d755598 ("devlink: Allow setting parent node of rate objects") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1763381149-1234377-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
kunbus-gitlab-sync
pushed a commit
that referenced
this pull request
Dec 11, 2025
[ Upstream commit d47515a ] The mlx5_irq_alloc() function can inadvertently free the entire rmap and end up in a crash[1] when the other threads tries to access this, when request_irq() fails due to exhausted IRQ vectors. This commit modifies the cleanup to remove only the specific IRQ mapping that was just added. This prevents removal of other valid mappings and ensures precise cleanup of the failed IRQ allocation's associated glue object. Note: This error is observed when both fwctl and rds configs are enabled. [1] mlx5_core 0000:05:00.0: Successfully registered panic handler for port 1 mlx5_core 0000:05:00.0: mlx5_irq_alloc:293:(pid 66740): Failed to request irq. err = -28 infiniband mlx5_0: mlx5_ib_test_wc:290:(pid 66740): Error -28 while trying to test write-combining support mlx5_core 0000:05:00.0: Successfully unregistered panic handler for port 1 mlx5_core 0000:06:00.0: Successfully registered panic handler for port 1 mlx5_core 0000:06:00.0: mlx5_irq_alloc:293:(pid 66740): Failed to request irq. err = -28 infiniband mlx5_0: mlx5_ib_test_wc:290:(pid 66740): Error -28 while trying to test write-combining support mlx5_core 0000:06:00.0: Successfully unregistered panic handler for port 1 mlx5_core 0000:03:00.0: mlx5_irq_alloc:293:(pid 28895): Failed to request irq. err = -28 mlx5_core 0000:05:00.0: mlx5_irq_alloc:293:(pid 28895): Failed to request irq. err = -28 general protection fault, probably for non-canonical address 0xe277a58fde16f291: 0000 [#1] SMP NOPTI RIP: 0010:free_irq_cpu_rmap+0x23/0x7d Call Trace: <TASK> ? show_trace_log_lvl+0x1d6/0x2f9 ? show_trace_log_lvl+0x1d6/0x2f9 ? mlx5_irq_alloc.cold+0x5d/0xf3 [mlx5_core] ? __die_body.cold+0x8/0xa ? die_addr+0x39/0x53 ? exc_general_protection+0x1c4/0x3e9 ? dev_vprintk_emit+0x5f/0x90 ? asm_exc_general_protection+0x22/0x27 ? free_irq_cpu_rmap+0x23/0x7d mlx5_irq_alloc.cold+0x5d/0xf3 [mlx5_core] irq_pool_request_vector+0x7d/0x90 [mlx5_core] mlx5_irq_request+0x2e/0xe0 [mlx5_core] mlx5_irq_request_vector+0xad/0xf7 [mlx5_core] comp_irq_request_pci+0x64/0xf0 [mlx5_core] create_comp_eq+0x71/0x385 [mlx5_core] ? mlx5e_open_xdpsq+0x11c/0x230 [mlx5_core] mlx5_comp_eqn_get+0x72/0x90 [mlx5_core] ? xas_load+0x8/0x91 mlx5_comp_irqn_get+0x40/0x90 [mlx5_core] mlx5e_open_channel+0x7d/0x3c7 [mlx5_core] mlx5e_open_channels+0xad/0x250 [mlx5_core] mlx5e_open_locked+0x3e/0x110 [mlx5_core] mlx5e_open+0x23/0x70 [mlx5_core] __dev_open+0xf1/0x1a5 __dev_change_flags+0x1e1/0x249 dev_change_flags+0x21/0x5c do_setlink+0x28b/0xcc4 ? __nla_parse+0x22/0x3d ? inet6_validate_link_af+0x6b/0x108 ? cpumask_next+0x1f/0x35 ? __snmp6_fill_stats64.constprop.0+0x66/0x107 ? __nla_validate_parse+0x48/0x1e6 __rtnl_newlink+0x5ff/0xa57 ? kmem_cache_alloc_trace+0x164/0x2ce rtnl_newlink+0x44/0x6e rtnetlink_rcv_msg+0x2bb/0x362 ? __netlink_sendskb+0x4c/0x6c ? netlink_unicast+0x28f/0x2ce ? rtnl_calcit.isra.0+0x150/0x146 netlink_rcv_skb+0x5f/0x112 netlink_unicast+0x213/0x2ce netlink_sendmsg+0x24f/0x4d9 __sock_sendmsg+0x65/0x6a ____sys_sendmsg+0x28f/0x2c9 ? import_iovec+0x17/0x2b ___sys_sendmsg+0x97/0xe0 __sys_sendmsg+0x81/0xd8 do_syscall_64+0x35/0x87 entry_SYSCALL_64_after_hwframe+0x6e/0x0 RIP: 0033:0x7fc328603727 Code: c3 66 90 41 54 41 89 d4 55 48 89 f5 53 89 fb 48 83 ec 10 e8 0b ed ff ff 44 89 e2 48 89 ee 89 df 41 89 c0 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 48 89 44 24 08 e8 44 ed ff ff 48 RSP: 002b:00007ffe8eb3f1a0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007fc328603727 RDX: 0000000000000000 RSI: 00007ffe8eb3f1f0 RDI: 000000000000000d RBP: 00007ffe8eb3f1f0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000 R13: 0000000000000000 R14: 00007ffe8eb3f3c8 R15: 00007ffe8eb3f3bc </TASK> ---[ end trace f43ce73c3c2b13a2 ]--- RIP: 0010:free_irq_cpu_rmap+0x23/0x7d Code: 0f 1f 80 00 00 00 00 48 85 ff 74 6b 55 48 89 fd 53 66 83 7f 06 00 74 24 31 db 48 8b 55 08 0f b7 c3 48 8b 04 c2 48 85 c0 74 09 <8b> 38 31 f6 e8 c4 0a b8 ff 83 c3 01 66 3b 5d 06 72 de b8 ff ff ff RSP: 0018:ff384881640eaca0 EFLAGS: 00010282 RAX: e277a58fde16f291 RBX: 0000000000000000 RCX: 0000000000000000 RDX: ff2335e2e20b3600 RSI: 0000000000000000 RDI: ff2335e2e20b3400 RBP: ff2335e2e20b3400 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 00000000ffffffe4 R12: ff384881640ead88 R13: ff2335c3760751e0 R14: ff2335e2e1672200 R15: ff2335c3760751f8 FS: 00007fc32ac22480(0000) GS:ff2335e2d6e00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f651ab54000 CR3: 00000029f1206003 CR4: 0000000000771ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Kernel panic - not syncing: Fatal exception Kernel Offset: 0x1dc00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) kvm-guest: disable async PF for cpu 0 Fixes: 3354822 ("net/mlx5: Use dynamic msix vectors allocation") Signed-off-by: Mohith Kumar Thummaluru<mohith.k.kumar.thummaluru@oracle.com> Tested-by: Mohith Kumar Thummaluru<mohith.k.kumar.thummaluru@oracle.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Shay Drori <shayd@nvidia.com> Signed-off-by: Pradyumn Rahar <pradyumn.rahar@oracle.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1763381768-1234998-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
kunbus-gitlab-sync
pushed a commit
that referenced
this pull request
Dec 11, 2025
… NULL on error [ Upstream commit 90a8830 ] Make knav_dma_open_channel consistently return NULL on error instead of ERR_PTR. Currently the header include/linux/soc/ti/knav_dma.h returns NULL when the driver is disabled, but the driver implementation does not even return NULL or ERR_PTR on failure, causing inconsistency in the users. This results in a crash in netcp_free_navigator_resources as followed (trimmed): Unhandled fault: alignment exception (0x221) at 0xfffffff2 [fffffff2] *pgd=80000800207003, *pmd=82ffda003, *pte=00000000 Internal error: : 221 [#1] SMP ARM Modules linked in: CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.17.0-rc7 #1 NONE Hardware name: Keystone PC is at knav_dma_close_channel+0x30/0x19c LR is at netcp_free_navigator_resources+0x2c/0x28c [... TRIM...] Call trace: knav_dma_close_channel from netcp_free_navigator_resources+0x2c/0x28c netcp_free_navigator_resources from netcp_ndo_open+0x430/0x46c netcp_ndo_open from __dev_open+0x114/0x29c __dev_open from __dev_change_flags+0x190/0x208 __dev_change_flags from netif_change_flags+0x1c/0x58 netif_change_flags from dev_change_flags+0x38/0xa0 dev_change_flags from ip_auto_config+0x2c4/0x11f0 ip_auto_config from do_one_initcall+0x58/0x200 do_one_initcall from kernel_init_freeable+0x1cc/0x238 kernel_init_freeable from kernel_init+0x1c/0x12c kernel_init from ret_from_fork+0x14/0x38 [... TRIM...] Standardize the error handling by making the function return NULL on all error conditions. The API is used in just the netcp_core.c so the impact is limited. Note, this change, in effect reverts commit 5b6cb43 ("net: ethernet: ti: netcp_core: return error while dma channel open issue"), but provides a less error prone implementation. Suggested-by: Simon Horman <horms@kernel.org> Suggested-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Nishanth Menon <nm@ti.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20251103162811.3730055-1-nm@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
kunbus-gitlab-sync
pushed a commit
that referenced
this pull request
Dec 11, 2025
commit 43962db upstream. The crash in process_v2_sparse_read() for fscrypt-encrypted directories has been reported. Issue takes place for Ceph msgr2 protocol in secure mode. It can be reproduced by the steps: sudo mount -t ceph :/ /mnt/cephfs/ -o name=admin,fs=cephfs,ms_mode=secure (1) mkdir /mnt/cephfs/fscrypt-test-3 (2) cp area_decrypted.tar /mnt/cephfs/fscrypt-test-3 (3) fscrypt encrypt --source=raw_key --key=./my.key /mnt/cephfs/fscrypt-test-3 (4) fscrypt lock /mnt/cephfs/fscrypt-test-3 (5) fscrypt unlock --key=my.key /mnt/cephfs/fscrypt-test-3 (6) cat /mnt/cephfs/fscrypt-test-3/area_decrypted.tar (7) Issue has been triggered [ 408.072247] ------------[ cut here ]------------ [ 408.072251] WARNING: CPU: 1 PID: 392 at net/ceph/messenger_v2.c:865 ceph_con_v2_try_read+0x4b39/0x72f0 [ 408.072267] Modules linked in: intel_rapl_msr intel_rapl_common intel_uncore_frequency_common intel_pmc_core pmt_telemetry pmt_discovery pmt_class intel_pmc_ssram_telemetry intel_vsec kvm_intel joydev kvm irqbypass polyval_clmulni ghash_clmulni_intel aesni_intel rapl input_leds psmouse serio_raw i2c_piix4 vga16fb bochs vgastate i2c_smbus floppy mac_hid qemu_fw_cfg pata_acpi sch_fq_codel rbd msr parport_pc ppdev lp parport efi_pstore [ 408.072304] CPU: 1 UID: 0 PID: 392 Comm: kworker/1:3 Not tainted 6.17.0-rc7+ [ 408.072307] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-5.fc42 04/01/2014 [ 408.072310] Workqueue: ceph-msgr ceph_con_workfn [ 408.072314] RIP: 0010:ceph_con_v2_try_read+0x4b39/0x72f0 [ 408.072317] Code: c7 c1 20 f0 d4 ae 50 31 d2 48 c7 c6 60 27 d5 ae 48 c7 c7 f8 8e 6f b0 68 60 38 d5 ae e8 00 47 61 fe 48 83 c4 18 e9 ac fc ff ff <0f> 0b e9 06 fe ff ff 4c 8b 9d 98 fd ff ff 0f 84 64 e7 ff ff 89 85 [ 408.072319] RSP: 0018:ffff88811c3e7a30 EFLAGS: 00010246 [ 408.072322] RAX: ffffed1024874c6f RBX: ffffea00042c2b40 RCX: 0000000000000f38 [ 408.072324] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 408.072325] RBP: ffff88811c3e7ca8 R08: 0000000000000000 R09: 00000000000000c8 [ 408.072326] R10: 00000000000000c8 R11: 0000000000000000 R12: 00000000000000c8 [ 408.072327] R13: dffffc0000000000 R14: ffff8881243a6030 R15: 0000000000003000 [ 408.072329] FS: 0000000000000000(0000) GS:ffff88823eadf000(0000) knlGS:0000000000000000 [ 408.072331] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 408.072332] CR2: 000000c0003c6000 CR3: 000000010c106005 CR4: 0000000000772ef0 [ 408.072336] PKRU: 55555554 [ 408.072337] Call Trace: [ 408.072338] <TASK> [ 408.072340] ? sched_clock_noinstr+0x9/0x10 [ 408.072344] ? __pfx_ceph_con_v2_try_read+0x10/0x10 [ 408.072347] ? _raw_spin_unlock+0xe/0x40 [ 408.072349] ? finish_task_switch.isra.0+0x15d/0x830 [ 408.072353] ? __kasan_check_write+0x14/0x30 [ 408.072357] ? mutex_lock+0x84/0xe0 [ 408.072359] ? __pfx_mutex_lock+0x10/0x10 [ 408.072361] ceph_con_workfn+0x27e/0x10e0 [ 408.072364] ? metric_delayed_work+0x311/0x2c50 [ 408.072367] process_one_work+0x611/0xe20 [ 408.072371] ? __kasan_check_write+0x14/0x30 [ 408.072373] worker_thread+0x7e3/0x1580 [ 408.072375] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 [ 408.072378] ? __pfx_worker_thread+0x10/0x10 [ 408.072381] kthread+0x381/0x7a0 [ 408.072383] ? __pfx__raw_spin_lock_irq+0x10/0x10 [ 408.072385] ? __pfx_kthread+0x10/0x10 [ 408.072387] ? __kasan_check_write+0x14/0x30 [ 408.072389] ? recalc_sigpending+0x160/0x220 [ 408.072392] ? _raw_spin_unlock_irq+0xe/0x50 [ 408.072394] ? calculate_sigpending+0x78/0xb0 [ 408.072395] ? __pfx_kthread+0x10/0x10 [ 408.072397] ret_from_fork+0x2b6/0x380 [ 408.072400] ? __pfx_kthread+0x10/0x10 [ 408.072402] ret_from_fork_asm+0x1a/0x30 [ 408.072406] </TASK> [ 408.072407] ---[ end trace 0000000000000000 ]--- [ 408.072418] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN NOPTI [ 408.072984] KASAN: null-ptr-deref in range [0x0000000000000000- 0x0000000000000007] [ 408.073350] CPU: 1 UID: 0 PID: 392 Comm: kworker/1:3 Tainted: G W 6.17.0-rc7+ #1 PREEMPT(voluntary) [ 408.073886] Tainted: [W]=WARN [ 408.074042] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-5.fc42 04/01/2014 [ 408.074468] Workqueue: ceph-msgr ceph_con_workfn [ 408.074694] RIP: 0010:ceph_msg_data_advance+0x79/0x1a80 [ 408.074976] Code: fc ff df 49 8d 77 08 48 c1 ee 03 80 3c 16 00 0f 85 07 11 00 00 48 ba 00 00 00 00 00 fc ff df 49 8b 5f 08 48 89 de 48 c1 ee 03 <0f> b6 14 16 84 d2 74 09 80 fa 03 0f 8e 0f 0e 00 00 8b 13 83 fa 03 [ 408.075884] RSP: 0018:ffff88811c3e7990 EFLAGS: 00010246 [ 408.076305] RAX: ffff8881243a6388 RBX: 0000000000000000 RCX: 0000000000000000 [ 408.076909] RDX: dffffc0000000000 RSI: 0000000000000000 RDI: ffff8881243a6378 [ 408.077466] RBP: ffff88811c3e7a20 R08: 0000000000000000 R09: 00000000000000c8 [ 408.078034] R10: ffff8881243a6388 R11: 0000000000000000 R12: ffffed1024874c71 [ 408.078575] R13: dffffc0000000000 R14: ffff8881243a6030 R15: ffff8881243a6378 [ 408.079159] FS: 0000000000000000(0000) GS:ffff88823eadf000(0000) knlGS:0000000000000000 [ 408.079736] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 408.080039] CR2: 000000c0003c6000 CR3: 000000010c106005 CR4: 0000000000772ef0 [ 408.080376] PKRU: 55555554 [ 408.080513] Call Trace: [ 408.080630] <TASK> [ 408.080729] ceph_con_v2_try_read+0x49b9/0x72f0 [ 408.081115] ? __pfx_ceph_con_v2_try_read+0x10/0x10 [ 408.081348] ? _raw_spin_unlock+0xe/0x40 [ 408.081538] ? finish_task_switch.isra.0+0x15d/0x830 [ 408.081768] ? __kasan_check_write+0x14/0x30 [ 408.081986] ? mutex_lock+0x84/0xe0 [ 408.082160] ? __pfx_mutex_lock+0x10/0x10 [ 408.082343] ceph_con_workfn+0x27e/0x10e0 [ 408.082529] ? metric_delayed_work+0x311/0x2c50 [ 408.082737] process_one_work+0x611/0xe20 [ 408.082948] ? __kasan_check_write+0x14/0x30 [ 408.083156] worker_thread+0x7e3/0x1580 [ 408.083331] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 [ 408.083557] ? __pfx_worker_thread+0x10/0x10 [ 408.083751] kthread+0x381/0x7a0 [ 408.083922] ? __pfx__raw_spin_lock_irq+0x10/0x10 [ 408.084139] ? __pfx_kthread+0x10/0x10 [ 408.084310] ? __kasan_check_write+0x14/0x30 [ 408.084510] ? recalc_sigpending+0x160/0x220 [ 408.084708] ? _raw_spin_unlock_irq+0xe/0x50 [ 408.084917] ? calculate_sigpending+0x78/0xb0 [ 408.085138] ? __pfx_kthread+0x10/0x10 [ 408.085335] ret_from_fork+0x2b6/0x380 [ 408.085525] ? __pfx_kthread+0x10/0x10 [ 408.085720] ret_from_fork_asm+0x1a/0x30 [ 408.085922] </TASK> [ 408.086036] Modules linked in: intel_rapl_msr intel_rapl_common intel_uncore_frequency_common intel_pmc_core pmt_telemetry pmt_discovery pmt_class intel_pmc_ssram_telemetry intel_vsec kvm_intel joydev kvm irqbypass polyval_clmulni ghash_clmulni_intel aesni_intel rapl input_leds psmouse serio_raw i2c_piix4 vga16fb bochs vgastate i2c_smbus floppy mac_hid qemu_fw_cfg pata_acpi sch_fq_codel rbd msr parport_pc ppdev lp parport efi_pstore [ 408.087778] ---[ end trace 0000000000000000 ]--- [ 408.088007] RIP: 0010:ceph_msg_data_advance+0x79/0x1a80 [ 408.088260] Code: fc ff df 49 8d 77 08 48 c1 ee 03 80 3c 16 00 0f 85 07 11 00 00 48 ba 00 00 00 00 00 fc ff df 49 8b 5f 08 48 89 de 48 c1 ee 03 <0f> b6 14 16 84 d2 74 09 80 fa 03 0f 8e 0f 0e 00 00 8b 13 83 fa 03 [ 408.089118] RSP: 0018:ffff88811c3e7990 EFLAGS: 00010246 [ 408.089357] RAX: ffff8881243a6388 RBX: 0000000000000000 RCX: 0000000000000000 [ 408.089678] RDX: dffffc0000000000 RSI: 0000000000000000 RDI: ffff8881243a6378 [ 408.090020] RBP: ffff88811c3e7a20 R08: 0000000000000000 R09: 00000000000000c8 [ 408.090360] R10: ffff8881243a6388 R11: 0000000000000000 R12: ffffed1024874c71 [ 408.090687] R13: dffffc0000000000 R14: ffff8881243a6030 R15: ffff8881243a6378 [ 408.091035] FS: 0000000000000000(0000) GS:ffff88823eadf000(0000) knlGS:0000000000000000 [ 408.091452] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 408.092015] CR2: 000000c0003c6000 CR3: 000000010c106005 CR4: 0000000000772ef0 [ 408.092530] PKRU: 55555554 [ 417.112915] ================================================================== [ 417.113491] BUG: KASAN: slab-use-after-free in __mutex_lock.constprop.0+0x1522/0x1610 [ 417.114014] Read of size 4 at addr ffff888124870034 by task kworker/2:0/4951 [ 417.114587] CPU: 2 UID: 0 PID: 4951 Comm: kworker/2:0 Tainted: G D W 6.17.0-rc7+ #1 PREEMPT(voluntary) [ 417.114592] Tainted: [D]=DIE, [W]=WARN [ 417.114593] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-5.fc42 04/01/2014 [ 417.114596] Workqueue: events handle_timeout [ 417.114601] Call Trace: [ 417.114602] <TASK> [ 417.114604] dump_stack_lvl+0x5c/0x90 [ 417.114610] print_report+0x171/0x4dc [ 417.114613] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 [ 417.114617] ? kasan_complete_mode_report_info+0x80/0x220 [ 417.114621] kasan_report+0xbd/0x100 [ 417.114625] ? __mutex_lock.constprop.0+0x1522/0x1610 [ 417.114628] ? __mutex_lock.constprop.0+0x1522/0x1610 [ 417.114630] __asan_report_load4_noabort+0x14/0x30 [ 417.114633] __mutex_lock.constprop.0+0x1522/0x1610 [ 417.114635] ? queue_con_delay+0x8d/0x200 [ 417.114638] ? __pfx___mutex_lock.constprop.0+0x10/0x10 [ 417.114641] ? __send_subscribe+0x529/0xb20 [ 417.114644] __mutex_lock_slowpath+0x13/0x20 [ 417.114646] mutex_lock+0xd4/0xe0 [ 417.114649] ? __pfx_mutex_lock+0x10/0x10 [ 417.114652] ? ceph_monc_renew_subs+0x2a/0x40 [ 417.114654] ceph_con_keepalive+0x22/0x110 [ 417.114656] handle_timeout+0x6b3/0x11d0 [ 417.114659] ? _raw_spin_unlock_irq+0xe/0x50 [ 417.114662] ? __pfx_handle_timeout+0x10/0x10 [ 417.114664] ? queue_delayed_work_on+0x8e/0xa0 [ 417.114669] process_one_work+0x611/0xe20 [ 417.114672] ? __kasan_check_write+0x14/0x30 [ 417.114676] worker_thread+0x7e3/0x1580 [ 417.114678] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 [ 417.114682] ? __pfx_sched_setscheduler_nocheck+0x10/0x10 [ 417.114687] ? __pfx_worker_thread+0x10/0x10 [ 417.114689] kthread+0x381/0x7a0 [ 417.114692] ? __pfx__raw_spin_lock_irq+0x10/0x10 [ 417.114694] ? __pfx_kthread+0x10/0x10 [ 417.114697] ? __kasan_check_write+0x14/0x30 [ 417.114699] ? recalc_sigpending+0x160/0x220 [ 417.114703] ? _raw_spin_unlock_irq+0xe/0x50 [ 417.114705] ? calculate_sigpending+0x78/0xb0 [ 417.114707] ? __pfx_kthread+0x10/0x10 [ 417.114710] ret_from_fork+0x2b6/0x380 [ 417.114713] ? __pfx_kthread+0x10/0x10 [ 417.114715] ret_from_fork_asm+0x1a/0x30 [ 417.114720] </TASK> [ 417.125171] Allocated by task 2: [ 417.125333] kasan_save_stack+0x26/0x60 [ 417.125522] kasan_save_track+0x14/0x40 [ 417.125742] kasan_save_alloc_info+0x39/0x60 [ 417.125945] __kasan_slab_alloc+0x8b/0xb0 [ 417.126133] kmem_cache_alloc_node_noprof+0x13b/0x460 [ 417.126381] copy_process+0x320/0x6250 [ 417.126595] kernel_clone+0xb7/0x840 [ 417.126792] kernel_thread+0xd6/0x120 [ 417.126995] kthreadd+0x85c/0xbe0 [ 417.127176] ret_from_fork+0x2b6/0x380 [ 417.127378] ret_from_fork_asm+0x1a/0x30 [ 417.127692] Freed by task 0: [ 417.127851] kasan_save_stack+0x26/0x60 [ 417.128057] kasan_save_track+0x14/0x40 [ 417.128267] kasan_save_free_info+0x3b/0x60 [ 417.128491] __kasan_slab_free+0x6c/0xa0 [ 417.128708] kmem_cache_free+0x182/0x550 [ 417.128906] free_task+0xeb/0x140 [ 417.129070] __put_task_struct+0x1d2/0x4f0 [ 417.129259] __put_task_struct_rcu_cb+0x15/0x20 [ 417.129480] rcu_do_batch+0x3d3/0xe70 [ 417.129681] rcu_core+0x549/0xb30 [ 417.129839] rcu_core_si+0xe/0x20 [ 417.130005] handle_softirqs+0x160/0x570 [ 417.130190] __irq_exit_rcu+0x189/0x1e0 [ 417.130369] irq_exit_rcu+0xe/0x20 [ 417.130531] sysvec_apic_timer_interrupt+0x9f/0xd0 [ 417.130768] asm_sysvec_apic_timer_interrupt+0x1b/0x20 [ 417.131082] Last potentially related work creation: [ 417.131305] kasan_save_stack+0x26/0x60 [ 417.131484] kasan_record_aux_stack+0xae/0xd0 [ 417.131695] __call_rcu_common+0xcd/0x14b0 [ 417.131909] call_rcu+0x31/0x50 [ 417.132071] delayed_put_task_struct+0x128/0x190 [ 417.132295] rcu_do_batch+0x3d3/0xe70 [ 417.132478] rcu_core+0x549/0xb30 [ 417.132658] rcu_core_si+0xe/0x20 [ 417.132808] handle_softirqs+0x160/0x570 [ 417.132993] __irq_exit_rcu+0x189/0x1e0 [ 417.133181] irq_exit_rcu+0xe/0x20 [ 417.133353] sysvec_apic_timer_interrupt+0x9f/0xd0 [ 417.133584] asm_sysvec_apic_timer_interrupt+0x1b/0x20 [ 417.133921] Second to last potentially related work creation: [ 417.134183] kasan_save_stack+0x26/0x60 [ 417.134362] kasan_record_aux_stack+0xae/0xd0 [ 417.134566] __call_rcu_common+0xcd/0x14b0 [ 417.134782] call_rcu+0x31/0x50 [ 417.134929] put_task_struct_rcu_user+0x58/0xb0 [ 417.135143] finish_task_switch.isra.0+0x5d3/0x830 [ 417.135366] __schedule+0xd30/0x5100 [ 417.135534] schedule_idle+0x5a/0x90 [ 417.135712] do_idle+0x25f/0x410 [ 417.135871] cpu_startup_entry+0x53/0x70 [ 417.136053] start_secondary+0x216/0x2c0 [ 417.136233] common_startup_64+0x13e/0x141 [ 417.136894] The buggy address belongs to the object at ffff888124870000 which belongs to the cache task_struct of size 10504 [ 417.138122] The buggy address is located 52 bytes inside of freed 10504-byte region [ffff888124870000, ffff888124872908) [ 417.139465] The buggy address belongs to the physical page: [ 417.140016] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x124870 [ 417.140789] head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 [ 417.141519] memcg:ffff88811aa20e01 [ 417.141874] anon flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) [ 417.142600] page_type: f5(slab) [ 417.142922] raw: 0017ffffc0000040 ffff88810094f040 0000000000000000 dead000000000001 [ 417.143554] raw: 0000000000000000 0000000000030003 00000000f5000000 ffff88811aa20e01 [ 417.143954] head: 0017ffffc0000040 ffff88810094f040 0000000000000000 dead000000000001 [ 417.144329] head: 0000000000000000 0000000000030003 00000000f5000000 ffff88811aa20e01 [ 417.144710] head: 0017ffffc0000003 ffffea0004921c01 00000000ffffffff 00000000ffffffff [ 417.145106] head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000008 [ 417.145485] page dumped because: kasan: bad access detected [ 417.145859] Memory state around the buggy address: [ 417.146094] ffff88812486ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 417.146439] ffff88812486ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 417.146791] >ffff888124870000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 417.147145] ^ [ 417.147387] ffff888124870080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 417.147751] ffff888124870100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 417.148123] ================================================================== First of all, we have warning in get_bvec_at() because cursor->total_resid contains zero value. And, finally, we have crash in ceph_msg_data_advance() because cursor->data is NULL. It means that get_bvec_at() receives not initialized ceph_msg_data_cursor structure because data is NULL and total_resid contains zero. Moreover, we don't have likewise issue for the case of Ceph msgr1 protocol because ceph_msg_data_cursor_init() has been called before reading sparse data. This patch adds calling of ceph_msg_data_cursor_init() in the beginning of process_v2_sparse_read() with the goal to guarantee that logic of reading sparse data works correctly for the case of Ceph msgr2 protocol. Cc: stable@vger.kernel.org Link: https://tracker.ceph.com/issues/73152 Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kunbus-gitlab-sync
pushed a commit
that referenced
this pull request
Dec 11, 2025
…ptcp_do_fastclose(). commit f07f4ea upstream. syzbot reported divide-by-zero in __tcp_select_window() by MPTCP socket. [0] We had a similar issue for the bare TCP and fixed in commit 499350a ("tcp: initialize rcv_mss to TCP_MIN_MSS instead of 0"). Let's apply the same fix to mptcp_do_fastclose(). [0]: Oops: divide error: 0000 [#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 6068 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025 RIP: 0010:__tcp_select_window+0x824/0x1320 net/ipv4/tcp_output.c:3336 Code: ff ff ff 44 89 f1 d3 e0 89 c1 f7 d1 41 01 cc 41 21 c4 e9 a9 00 00 00 e8 ca 49 01 f8 e9 9c 00 00 00 e8 c0 49 01 f8 44 89 e0 99 <f7> 7c 24 1c 41 29 d4 48 bb 00 00 00 00 00 fc ff df e9 80 00 00 00 RSP: 0018:ffffc90003017640 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88807b469e40 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffffc90003017730 R08: ffff888033268143 R09: 1ffff1100664d028 R10: dffffc0000000000 R11: ffffed100664d029 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 FS: 000055557faa0500(0000) GS:ffff888126135000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f64a1912ff8 CR3: 0000000072122000 CR4: 00000000003526f0 Call Trace: <TASK> tcp_select_window net/ipv4/tcp_output.c:281 [inline] __tcp_transmit_skb+0xbc7/0x3aa0 net/ipv4/tcp_output.c:1568 tcp_transmit_skb net/ipv4/tcp_output.c:1649 [inline] tcp_send_active_reset+0x2d1/0x5b0 net/ipv4/tcp_output.c:3836 mptcp_do_fastclose+0x27e/0x380 net/mptcp/protocol.c:2793 mptcp_disconnect+0x238/0x710 net/mptcp/protocol.c:3253 mptcp_sendmsg_fastopen+0x2f8/0x580 net/mptcp/protocol.c:1776 mptcp_sendmsg+0x1774/0x1980 net/mptcp/protocol.c:1855 sock_sendmsg_nosec net/socket.c:727 [inline] __sock_sendmsg+0xe5/0x270 net/socket.c:742 __sys_sendto+0x3bd/0x520 net/socket.c:2244 __do_sys_sendto net/socket.c:2251 [inline] __se_sys_sendto net/socket.c:2247 [inline] __x64_sys_sendto+0xde/0x100 net/socket.c:2247 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f66e998f749 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffff9acedb8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00007f66e9be5fa0 RCX: 00007f66e998f749 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003 RBP: 00007ffff9acee10 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001 R13: 00007f66e9be5fa0 R14: 00007f66e9be5fa0 R15: 0000000000000006 </TASK> Fixes: ae15506 ("mptcp: fix duplicate reset on fastclose") Reported-by: syzbot+3a92d359bc2ec6255a33@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/69260882.a70a0220.d98e3.00b4.GAE@google.com/ Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251125195331.309558-1-kuniyu@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kunbus-gitlab-sync
pushed a commit
that referenced
this pull request
Dec 11, 2025
commit eb9ac77 upstream. A synchronous external abort occurs on the Renesas RZ/G3S SoC if unbind is executed after the configuration sequence described above: modprobe usb_f_ecm modprobe libcomposite modprobe configfs cd /sys/kernel/config/usb_gadget mkdir -p g1 cd g1 echo "0x1d6b" > idVendor echo "0x0104" > idProduct mkdir -p strings/0x409 echo "0123456789" > strings/0x409/serialnumber echo "Renesas." > strings/0x409/manufacturer echo "Ethernet Gadget" > strings/0x409/product mkdir -p functions/ecm.usb0 mkdir -p configs/c.1 mkdir -p configs/c.1/strings/0x409 echo "ECM" > configs/c.1/strings/0x409/configuration if [ ! -L configs/c.1/ecm.usb0 ]; then ln -s functions/ecm.usb0 configs/c.1 fi echo 11e20000.usb > UDC echo 11e20000.usb > /sys/bus/platform/drivers/renesas_usbhs/unbind The displayed trace is as follows: Internal error: synchronous external abort: 0000000096000010 [#1] SMP CPU: 0 UID: 0 PID: 188 Comm: sh Tainted: G M 6.17.0-rc7-next-20250922-00010-g41050493b2bd #55 PREEMPT Tainted: [M]=MACHINE_CHECK Hardware name: Renesas SMARC EVK version 2 based on r9a08g045s33 (DT) pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : usbhs_sys_function_pullup+0x10/0x40 [renesas_usbhs] lr : usbhsg_update_pullup+0x3c/0x68 [renesas_usbhs] sp : ffff8000838b3920 x29: ffff8000838b3920 x28: ffff00000d585780 x27: 0000000000000000 x26: 0000000000000000 x25: 0000000000000000 x24: ffff00000c3e3810 x23: ffff00000d5e5c80 x22: ffff00000d5e5d40 x21: 0000000000000000 x20: 0000000000000000 x19: ffff00000d5e5c80 x18: 0000000000000020 x17: 2e30303230316531 x16: 312d7968703a7968 x15: 3d454d414e5f4344 x14: 000000000000002c x13: 0000000000000000 x12: 0000000000000000 x11: ffff00000f358f38 x10: ffff00000f358db0 x9 : ffff00000b41f418 x8 : 0101010101010101 x7 : 7f7f7f7f7f7f7f7f x6 : fefefeff6364626d x5 : 8080808000000000 x4 : 000000004b5ccb9d x3 : 0000000000000000 x2 : 0000000000000000 x1 : ffff800083790000 x0 : ffff00000d5e5c80 Call trace: usbhs_sys_function_pullup+0x10/0x40 [renesas_usbhs] (P) usbhsg_pullup+0x4c/0x7c [renesas_usbhs] usb_gadget_disconnect_locked+0x48/0xd4 gadget_unbind_driver+0x44/0x114 device_remove+0x4c/0x80 device_release_driver_internal+0x1c8/0x224 device_release_driver+0x18/0x24 bus_remove_device+0xcc/0x10c device_del+0x14c/0x404 usb_del_gadget+0x88/0xc0 usb_del_gadget_udc+0x18/0x30 usbhs_mod_gadget_remove+0x24/0x44 [renesas_usbhs] usbhs_mod_remove+0x20/0x30 [renesas_usbhs] usbhs_remove+0x98/0xdc [renesas_usbhs] platform_remove+0x20/0x30 device_remove+0x4c/0x80 device_release_driver_internal+0x1c8/0x224 device_driver_detach+0x18/0x24 unbind_store+0xb4/0xb8 drv_attr_store+0x24/0x38 sysfs_kf_write+0x7c/0x94 kernfs_fop_write_iter+0x128/0x1b8 vfs_write+0x2ac/0x350 ksys_write+0x68/0xfc __arm64_sys_write+0x1c/0x28 invoke_syscall+0x48/0x110 el0_svc_common.constprop.0+0xc0/0xe0 do_el0_svc+0x1c/0x28 el0_svc+0x34/0xf0 el0t_64_sync_handler+0xa0/0xe4 el0t_64_sync+0x198/0x19c Code: 7100003f 1a9f07e1 531c6c22 f9400001 (79400021) ---[ end trace 0000000000000000 ]--- note: sh[188] exited with irqs disabled note: sh[188] exited with preempt_count 1 The issue occurs because usbhs_sys_function_pullup(), which accesses the IP registers, is executed after the USBHS clocks have been disabled. The problem is reproducible on the Renesas RZ/G3S SoC starting with the addition of module stop in the clock enable/disable APIs. With module stop functionality enabled, a bus error is expected if a master accesses a module whose clock has been stopped and module stop activated. Disable the IP clocks at the end of remove. Cc: stable <stable@kernel.org> Fixes: f1407d5 ("usb: renesas_usbhs: Add Renesas USBHS common code") Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Link: https://patch.msgid.link/20251027140741.557198-1-claudiu.beznea.uj@bp.renesas.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kunbus-gitlab-sync
pushed a commit
that referenced
this pull request
Dec 11, 2025
commit 3ce62c1 upstream. [WHAT] IGT kms_cursor_legacy's long-nonblocking-modeset-vs-cursor-atomic fails with NULL pointer dereference. This can be reproduced with both an eDP panel and a DP monitors connected. BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP NOPTI CPU: 13 UID: 0 PID: 2960 Comm: kms_cursor_lega Not tainted 6.16.0-99-custom #8 PREEMPT(voluntary) Hardware name: AMD ........ RIP: 0010:dc_stream_get_scanoutpos+0x34/0x130 [amdgpu] Code: 57 4d 89 c7 41 56 49 89 ce 41 55 49 89 d5 41 54 49 89 fc 53 48 83 ec 18 48 8b 87 a0 64 00 00 48 89 75 d0 48 c7 c6 e0 41 30 c2 <48> 8b 38 48 8b 9f 68 06 00 00 e8 8d d7 fd ff 31 c0 48 81 c3 e0 02 RSP: 0018:ffffd0f3c2bd7608 EFLAGS: 00010292 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffd0f3c2bd7668 RDX: ffffd0f3c2bd7664 RSI: ffffffffc23041e0 RDI: ffff8b32494b8000 RBP: ffffd0f3c2bd7648 R08: ffffd0f3c2bd766c R09: ffffd0f3c2bd7760 R10: ffffd0f3c2bd7820 R11: 0000000000000000 R12: ffff8b32494b8000 R13: ffffd0f3c2bd7664 R14: ffffd0f3c2bd7668 R15: ffffd0f3c2bd766c FS: 000071f631b68700(0000) GS:ffff8b399f114000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000001b8105000 CR4: 0000000000f50ef0 PKRU: 55555554 Call Trace: <TASK> dm_crtc_get_scanoutpos+0xd7/0x180 [amdgpu] amdgpu_display_get_crtc_scanoutpos+0x86/0x1c0 [amdgpu] ? __pfx_amdgpu_crtc_get_scanout_position+0x10/0x10[amdgpu] amdgpu_crtc_get_scanout_position+0x27/0x50 [amdgpu] drm_crtc_vblank_helper_get_vblank_timestamp_internal+0xf7/0x400 drm_crtc_vblank_helper_get_vblank_timestamp+0x1c/0x30 drm_crtc_get_last_vbltimestamp+0x55/0x90 drm_crtc_next_vblank_start+0x45/0xa0 drm_atomic_helper_wait_for_fences+0x81/0x1f0 ... Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 621e55f) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Add nowayout support for gpi-wdt.