-
Notifications
You must be signed in to change notification settings - Fork 0
x86/module: use large ROX pages for text allocations #3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Upstream branch: 9852d85 |
|
Upstream branch: 9852d85 |
c456fb4 to
3fef166
Compare
|
Upstream branch: 9852d85 |
2 similar comments
|
Upstream branch: 9852d85 |
|
Upstream branch: 9852d85 |
On the node of an NFS client, some files saved in the mountpoint of the NFS server were copied to another location of the same NFS server. Accidentally, the nfs42_complete_copies() got a NULL-pointer dereference crash with the following syslog: [232064.838881] NFSv4: state recovery failed for open file nfs/pvc-12b5200d-cd0f-46a3-b9f0-af8f4fe0ef64.qcow2, error = -116 [232064.839360] NFSv4: state recovery failed for open file nfs/pvc-12b5200d-cd0f-46a3-b9f0-af8f4fe0ef64.qcow2, error = -116 [232066.588183] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000058 [232066.588586] Mem abort info: [232066.588701] ESR = 0x0000000096000007 [232066.588862] EC = 0x25: DABT (current EL), IL = 32 bits [232066.589084] SET = 0, FnV = 0 [232066.589216] EA = 0, S1PTW = 0 [232066.589340] FSC = 0x07: level 3 translation fault [232066.589559] Data abort info: [232066.589683] ISV = 0, ISS = 0x00000007 [232066.589842] CM = 0, WnR = 0 [232066.589967] user pgtable: 64k pages, 48-bit VAs, pgdp=00002000956ff400 [232066.590231] [0000000000000058] pgd=08001100ae100003, p4d=08001100ae100003, pud=08001100ae100003, pmd=08001100b3c00003, pte=0000000000000000 [232066.590757] Internal error: Oops: 96000007 [#1] SMP [232066.590958] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm vhost_net vhost vhost_iotlb tap tun ipt_rpfilter xt_multiport ip_set_hash_ip ip_set_hash_net xfrm_interface xfrm6_tunnel tunnel4 tunnel6 esp4 ah4 wireguard libcurve25519_generic veth xt_addrtype xt_set nf_conntrack_netlink ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_bitmap_port ip_set_hash_ipport dummy ip_set ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs iptable_filter sch_ingress nfnetlink_cttimeout vport_gre ip_gre ip_tunnel gre vport_geneve geneve vport_vxlan vxlan ip6_udp_tunnel udp_tunnel openvswitch nf_conncount dm_round_robin dm_service_time dm_multipath xt_nat xt_MASQUERADE nft_chain_nat nf_nat xt_mark xt_conntrack xt_comment nft_compat nft_counter nf_tables nfnetlink ocfs2 ocfs2_nodemanager ocfs2_stackglue iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ipmi_ssif nbd overlay 8021q garp mrp bonding tls rfkill sunrpc ext4 mbcache jbd2 [232066.591052] vfat fat cas_cache cas_disk ses enclosure scsi_transport_sas sg acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler ip_tables vfio_pci vfio_pci_core vfio_virqfd vfio_iommu_type1 vfio dm_mirror dm_region_hash dm_log dm_mod nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter bridge stp llc fuse xfs libcrc32c ast drm_vram_helper qla2xxx drm_kms_helper syscopyarea crct10dif_ce sysfillrect ghash_ce sysimgblt sha2_ce fb_sys_fops cec sha256_arm64 sha1_ce drm_ttm_helper ttm nvme_fc igb sbsa_gwdt nvme_fabrics drm nvme_core i2c_algo_bit i40e scsi_transport_fc megaraid_sas aes_neon_bs [232066.596953] CPU: 6 PID: 4124696 Comm: 10.253.166.125- Kdump: loaded Not tainted 5.15.131-9.cl9_ocfs2.aarch64 #1 [232066.597356] Hardware name: Great Wall .\x93\x8e...RF6260 V5/GWMSSE2GL1T, BIOS T656FBE_V3.0.18 2024-01-06 [232066.597721] pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [232066.598034] pc : nfs4_reclaim_open_state+0x220/0x800 [nfsv4] [232066.598327] lr : nfs4_reclaim_open_state+0x12c/0x800 [nfsv4] [232066.598595] sp : ffff8000f568fc70 [232066.598731] x29: ffff8000f568fc70 x28: 0000000000001000 x27: ffff21003db33000 [232066.599030] x26: ffff800005521ae0 x25: ffff0100f98fa3f0 x24: 0000000000000001 [232066.599319] x23: ffff800009920008 x22: ffff21003db33040 x21: ffff21003db33050 [232066.599628] x20: ffff410172fe9e40 x19: ffff410172fe9e00 x18: 0000000000000000 [232066.599914] x17: 0000000000000000 x16: 0000000000000004 x15: 0000000000000000 [232066.600195] x14: 0000000000000000 x13: ffff800008e685a8 x12: 00000000eac0c6e6 [232066.600498] x11: 0000000000000000 x10: 0000000000000008 x9 : ffff8000054e5828 [232066.600784] x8 : 00000000ffffffbf x7 : 0000000000000001 x6 : 000000000a9eb14a [232066.601062] x5 : 0000000000000000 x4 : ffff70ff8a14a800 x3 : 0000000000000058 [232066.601348] x2 : 0000000000000001 x1 : 54dce46366daa6c6 x0 : 0000000000000000 [232066.601636] Call trace: [232066.601749] nfs4_reclaim_open_state+0x220/0x800 [nfsv4] [232066.601998] nfs4_do_reclaim+0x1b8/0x28c [nfsv4] [232066.602218] nfs4_state_manager+0x928/0x10f0 [nfsv4] [232066.602455] nfs4_run_state_manager+0x78/0x1b0 [nfsv4] [232066.602690] kthread+0x110/0x114 [232066.602830] ret_from_fork+0x10/0x20 [232066.602985] Code: 1400000d f9403f20 f9402e61 91016003 (f9402c00) [232066.603284] SMP: stopping secondary CPUs [232066.606936] Starting crashdump kernel... [232066.607146] Bye! Analysing the vmcore, we know that nfs4_copy_state listed by destination nfs_server->ss_copies was added by the field copies in handle_async_copy(), and we found a waiting copy process with the stack as: PID: 3511963 TASK: ffff710028b47e00 CPU: 0 COMMAND: "cp" #0 [ffff8001116ef740] __switch_to at ffff8000081b92f4 #1 [ffff8001116ef760] __schedule at ffff800008dd0650 #2 [ffff8001116ef7c0] schedule at ffff800008dd0a00 #3 [ffff8001116ef7e0] schedule_timeout at ffff800008dd6aa0 #4 [ffff8001116ef860] __wait_for_common at ffff800008dd166c #5 [ffff8001116ef8e0] wait_for_completion_interruptible at ffff800008dd1898 #6 [ffff8001116ef8f0] handle_async_copy at ffff8000055142f4 [nfsv4] #7 [ffff8001116ef970] _nfs42_proc_copy at ffff8000055147c8 [nfsv4] #8 [ffff8001116efa80] nfs42_proc_copy at ffff800005514cf0 [nfsv4] #9 [ffff8001116efc50] __nfs4_copy_file_range.constprop.0 at ffff8000054ed694 [nfsv4] The NULL-pointer dereference was due to nfs42_complete_copies() listed the nfs_server->ss_copies by the field ss_copies of nfs4_copy_state. So the nfs4_copy_state address ffff0100f98fa3f0 was offset by 0x10 and the data accessed through this pointer was also incorrect. Generally, the ordered list nfs4_state_owner->so_states indicate open(O_RDWR) or open(O_WRITE) states are reclaimed firstly by nfs4_reclaim_open_state(). When destination state reclaim is failed with NFS_STATE_RECOVERY_FAILED and copies are not deleted in nfs_server->ss_copies, the source state may be passed to the nfs42_complete_copies() process earlier, resulting in this crash scene finally. To solve this issue, we add a list_head nfs_server->ss_src_copies for a server-to-server copy specially. Fixes: 0e65a32 ("NFS: handle source server reboot") Signed-off-by: Yanjun Zhang <zhangyanjun@cestc.cn> Reviewed-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
Upstream branch: 9852d85 |
|
Upstream branch: 9852d85 |
16 similar comments
|
Upstream branch: 9852d85 |
|
Upstream branch: 9852d85 |
|
Upstream branch: 9852d85 |
|
Upstream branch: 9852d85 |
|
Upstream branch: 9852d85 |
|
Upstream branch: 9852d85 |
|
Upstream branch: 9852d85 |
|
Upstream branch: 9852d85 |
|
Upstream branch: 9852d85 |
|
Upstream branch: 9852d85 |
|
Upstream branch: 9852d85 |
|
Upstream branch: 9852d85 |
|
Upstream branch: 9852d85 |
|
Upstream branch: 9852d85 |
|
Upstream branch: 9852d85 |
|
Upstream branch: 9852d85 |
Problem description
===================
Lockdep reports a possible circular locking dependency (AB/BA) between
&pl->state_mutex and &phy->lock, as follows.
phylink_resolve() // acquires &pl->state_mutex
-> phylink_major_config()
-> phy_config_inband() // acquires &pl->phydev->lock
whereas all the other call sites where &pl->state_mutex and
&pl->phydev->lock have the locking scheme reversed. Everywhere else,
&pl->phydev->lock is acquired at the top level, and &pl->state_mutex at
the lower level. A clear example is phylink_bringup_phy().
The outlier is the newly introduced phy_config_inband() and the existing
lock order is the correct one. To understand why it cannot be the other
way around, it is sufficient to consider phylink_phy_change(), phylink's
callback from the PHY device's phy->phy_link_change() virtual method,
invoked by the PHY state machine.
phy_link_up() and phy_link_down(), the (indirect) callers of
phylink_phy_change(), are called with &phydev->lock acquired.
Then phylink_phy_change() acquires its own &pl->state_mutex, to
serialize changes made to its pl->phy_state and pl->link_config.
So all other instances of &pl->state_mutex and &phydev->lock must be
consistent with this order.
Problem impact
==============
I think the kernel runs a serious deadlock risk if an existing
phylink_resolve() thread, which results in a phy_config_inband() call,
is concurrent with a phy_link_up() or phy_link_down() call, which will
deadlock on &pl->state_mutex in phylink_phy_change(). Practically
speaking, the impact may be limited by the slow speed of the medium
auto-negotiation protocol, which makes it unlikely for the current state
to still be unresolved when a new one is detected, but I think the
problem is there. Nonetheless, the problem was discovered using lockdep.
Proposed solution
=================
Practically speaking, the phy_config_inband() requirement of having
phydev->lock acquired must transfer to the caller (phylink is the only
caller). There, it must bubble up until immediately before
&pl->state_mutex is acquired, for the cases where that takes place.
Solution details, considerations, notes
=======================================
This is the phy_config_inband() call graph:
sfp_upstream_ops :: connect_phy()
|
v
phylink_sfp_connect_phy()
|
v
phylink_sfp_config_phy()
|
| sfp_upstream_ops :: module_insert()
| |
| v
| phylink_sfp_module_insert()
| |
| | sfp_upstream_ops :: module_start()
| | |
| | v
| | phylink_sfp_module_start()
| | |
| v v
| phylink_sfp_config_optical()
phylink_start() | |
| phylink_resume() v v
| | phylink_sfp_set_config()
| | |
v v v
phylink_mac_initial_config()
| phylink_resolve()
| | phylink_ethtool_ksettings_set()
v v v
phylink_major_config()
|
v
phy_config_inband()
phylink_major_config() caller #1, phylink_mac_initial_config(), does not
acquire &pl->state_mutex nor do its callers. It must acquire
&pl->phydev->lock prior to calling phylink_major_config().
phylink_major_config() caller #2, phylink_resolve() acquires
&pl->state_mutex, thus also needs to acquire &pl->phydev->lock.
phylink_major_config() caller #3, phylink_ethtool_ksettings_set(), is
completely uninteresting, because it only calls phylink_major_config()
if pl->phydev is NULL (otherwise it calls phy_ethtool_ksettings_set()).
We need to change nothing there.
Other solutions
===============
The lock inversion between &pl->state_mutex and &pl->phydev->lock has
occurred at least once before, as seen in commit c718af2 ("net:
phylink: fix ethtool -A with attached PHYs"). The solution there was to
simply not call phy_set_asym_pause() under the &pl->state_mutex. That
cannot be extended to our case though, where the phy_config_inband()
call is much deeper inside the &pl->state_mutex section.
Fixes: 5fd0f1a ("net: phylink: add negotiation of in-band capabilities")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/20250904125238.193990-2-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
From v6.13 -> v6.14 the selftests kmod takes around 10s longer to run. Use KMOD_TIMEOUT Makefile cli variable to increase the default 165s timeout to 175s. Output: ok 1 selftests: kmod: kmod.sh real 2m47.077s user 0m0.062s sys 0m0.235s kdevops@dani-kmod:/data/selftests/kselftest/kselftest_install$ uname -a Linux dani-kmod 6.14.0 #3 SMP PREEMPT_DYNAMIC Fri Jun 27 18:28:16 UTC 2025 x86_64 GNU/Linux ok 1 selftests: kmod: kmod.sh real 2m41.456s user 0m0.068s sys 0m0.241s kdevops@dani-kmod:/data/selftests/kselftest/kselftest_install$ uname -a Linux dani-kmod 6.13.0 #2 SMP PREEMPT_DYNAMIC Fri Jun 27 17:13:42 UTC 2025 x86_64 GNU/Linux Signed-off-by: Daniel Gomez <da.gomez@samsung.com>
…ux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 changes for 6.17, round #3 - Invalidate nested MMUs upon freeing the PGD to avoid WARNs when visiting from an MMU notifier - Fixes to the TLB match process and TLB invalidation range for managing the VCNR pseudo-TLB - Prevent SPE from erroneously profiling guests due to UNKNOWN reset values in PMSCR_EL1 - Fix save/restore of host MDCR_EL2 to account for eagerly programming at vcpu_load() on VHE systems - Correct lock ordering when dealing with VGIC LPIs, avoiding scenarios where an xarray's spinlock was nested with a *raw* spinlock - Permit stage-2 read permission aborts which are possible in the case of NV depending on the guest hypervisor's stage-2 translation - Call raw_spin_unlock() instead of the internal spinlock API - Fix parameter ordering when assigning VBAR_EL1
From v6.13 -> v6.14 the selftests kmod takes around 10s longer to run. Use KMOD_TIMEOUT Makefile cli variable to increase the default 165s timeout to 175s. Output: ok 1 selftests: kmod: kmod.sh real 2m47.077s user 0m0.062s sys 0m0.235s kdevops@dani-kmod:/data/selftests/kselftest/kselftest_install$ uname -a Linux dani-kmod 6.14.0 #3 SMP PREEMPT_DYNAMIC Fri Jun 27 18:28:16 UTC 2025 x86_64 GNU/Linux ok 1 selftests: kmod: kmod.sh real 2m41.456s user 0m0.068s sys 0m0.241s kdevops@dani-kmod:/data/selftests/kselftest/kselftest_install$ uname -a Linux dani-kmod 6.13.0 #2 SMP PREEMPT_DYNAMIC Fri Jun 27 17:13:42 UTC 2025 x86_64 GNU/Linux Signed-off-by: Daniel Gomez <da.gomez@samsung.com>
This fixes the following UAF caused by not properly locking hdev when processing HCI_EV_NUM_COMP_PKTS: BUG: KASAN: slab-use-after-free in hci_conn_tx_dequeue+0x1be/0x220 net/bluetooth/hci_conn.c:3036 Read of size 4 at addr ffff8880740f0940 by task kworker/u11:0/54 CPU: 1 UID: 0 PID: 54 Comm: kworker/u11:0 Not tainted 6.16.0-rc7 #3 PREEMPT(full) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 Workqueue: hci1 hci_rx_work Call Trace: <TASK> dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0xca/0x230 mm/kasan/report.c:480 kasan_report+0x118/0x150 mm/kasan/report.c:593 hci_conn_tx_dequeue+0x1be/0x220 net/bluetooth/hci_conn.c:3036 hci_num_comp_pkts_evt+0x1c8/0xa50 net/bluetooth/hci_event.c:4404 hci_event_func net/bluetooth/hci_event.c:7477 [inline] hci_event_packet+0x7e0/0x1200 net/bluetooth/hci_event.c:7531 hci_rx_work+0x46a/0xe80 net/bluetooth/hci_core.c:4070 process_one_work kernel/workqueue.c:3238 [inline] process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3321 worker_thread+0x8a0/0xda0 kernel/workqueue.c:3402 kthread+0x70e/0x8a0 kernel/kthread.c:464 ret_from_fork+0x3fc/0x770 arch/x86/kernel/process.c:148 ret_from_fork_asm+0x1a/0x30 home/kwqcheii/source/fuzzing/kernel/kasan/linux-6.16-rc7/arch/x86/entry/entry_64.S:245 </TASK> Allocated by task 54: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3e/0x80 mm/kasan/common.c:68 poison_kmalloc_redzone mm/kasan/common.c:377 [inline] __kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:394 kasan_kmalloc include/linux/kasan.h:260 [inline] __kmalloc_cache_noprof+0x230/0x3d0 mm/slub.c:4359 kmalloc_noprof include/linux/slab.h:905 [inline] kzalloc_noprof include/linux/slab.h:1039 [inline] __hci_conn_add+0x233/0x1b30 net/bluetooth/hci_conn.c:939 le_conn_complete_evt+0x3d6/0x1220 net/bluetooth/hci_event.c:5628 hci_le_enh_conn_complete_evt+0x189/0x470 net/bluetooth/hci_event.c:5794 hci_event_func net/bluetooth/hci_event.c:7474 [inline] hci_event_packet+0x78c/0x1200 net/bluetooth/hci_event.c:7531 hci_rx_work+0x46a/0xe80 net/bluetooth/hci_core.c:4070 process_one_work kernel/workqueue.c:3238 [inline] process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3321 worker_thread+0x8a0/0xda0 kernel/workqueue.c:3402 kthread+0x70e/0x8a0 kernel/kthread.c:464 ret_from_fork+0x3fc/0x770 arch/x86/kernel/process.c:148 ret_from_fork_asm+0x1a/0x30 home/kwqcheii/source/fuzzing/kernel/kasan/linux-6.16-rc7/arch/x86/entry/entry_64.S:245 Freed by task 9572: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3e/0x80 mm/kasan/common.c:68 kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:576 poison_slab_object mm/kasan/common.c:247 [inline] __kasan_slab_free+0x62/0x70 mm/kasan/common.c:264 kasan_slab_free include/linux/kasan.h:233 [inline] slab_free_hook mm/slub.c:2381 [inline] slab_free mm/slub.c:4643 [inline] kfree+0x18e/0x440 mm/slub.c:4842 device_release+0x9c/0x1c0 kobject_cleanup lib/kobject.c:689 [inline] kobject_release lib/kobject.c:720 [inline] kref_put include/linux/kref.h:65 [inline] kobject_put+0x22b/0x480 lib/kobject.c:737 hci_conn_cleanup net/bluetooth/hci_conn.c:175 [inline] hci_conn_del+0x8ff/0xcb0 net/bluetooth/hci_conn.c:1173 hci_abort_conn_sync+0x5d1/0xdf0 net/bluetooth/hci_sync.c:5689 hci_cmd_sync_work+0x210/0x3a0 net/bluetooth/hci_sync.c:332 process_one_work kernel/workqueue.c:3238 [inline] process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3321 worker_thread+0x8a0/0xda0 kernel/workqueue.c:3402 kthread+0x70e/0x8a0 kernel/kthread.c:464 ret_from_fork+0x3fc/0x770 arch/x86/kernel/process.c:148 ret_from_fork_asm+0x1a/0x30 home/kwqcheii/source/fuzzing/kernel/kasan/linux-6.16-rc7/arch/x86/entry/entry_64.S:245 Fixes: 134f4b3 ("Bluetooth: add support for skb TX SND/COMPLETION timestamping") Reported-by: Junvyyang, Tencent Zhuque Lab <zhuque@tencent.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
This fixes the following UFA in hci_acl_create_conn_sync where a connection still pending is command submission (conn->state == BT_OPEN) maybe freed, also since this also can happen with the likes of hci_le_create_conn_sync fix it as well: BUG: KASAN: slab-use-after-free in hci_acl_create_conn_sync+0x5ef/0x790 net/bluetooth/hci_sync.c:6861 Write of size 2 at addr ffff88805ffcc038 by task kworker/u11:2/9541 CPU: 1 UID: 0 PID: 9541 Comm: kworker/u11:2 Not tainted 6.16.0-rc7 #3 PREEMPT(full) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 Workqueue: hci3 hci_cmd_sync_work Call Trace: <TASK> dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0xca/0x230 mm/kasan/report.c:480 kasan_report+0x118/0x150 mm/kasan/report.c:593 hci_acl_create_conn_sync+0x5ef/0x790 net/bluetooth/hci_sync.c:6861 hci_cmd_sync_work+0x210/0x3a0 net/bluetooth/hci_sync.c:332 process_one_work kernel/workqueue.c:3238 [inline] process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3321 worker_thread+0x8a0/0xda0 kernel/workqueue.c:3402 kthread+0x70e/0x8a0 kernel/kthread.c:464 ret_from_fork+0x3fc/0x770 arch/x86/kernel/process.c:148 ret_from_fork_asm+0x1a/0x30 home/kwqcheii/source/fuzzing/kernel/kasan/linux-6.16-rc7/arch/x86/entry/entry_64.S:245 </TASK> Allocated by task 123736: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3e/0x80 mm/kasan/common.c:68 poison_kmalloc_redzone mm/kasan/common.c:377 [inline] __kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:394 kasan_kmalloc include/linux/kasan.h:260 [inline] __kmalloc_cache_noprof+0x230/0x3d0 mm/slub.c:4359 kmalloc_noprof include/linux/slab.h:905 [inline] kzalloc_noprof include/linux/slab.h:1039 [inline] __hci_conn_add+0x233/0x1b30 net/bluetooth/hci_conn.c:939 hci_conn_add_unset net/bluetooth/hci_conn.c:1051 [inline] hci_connect_acl+0x16c/0x4e0 net/bluetooth/hci_conn.c:1634 pair_device+0x418/0xa70 net/bluetooth/mgmt.c:3556 hci_mgmt_cmd+0x9c9/0xef0 net/bluetooth/hci_sock.c:1719 hci_sock_sendmsg+0x6ca/0xef0 net/bluetooth/hci_sock.c:1839 sock_sendmsg_nosec net/socket.c:712 [inline] __sock_sendmsg+0x219/0x270 net/socket.c:727 sock_write_iter+0x258/0x330 net/socket.c:1131 new_sync_write fs/read_write.c:593 [inline] vfs_write+0x54b/0xa90 fs/read_write.c:686 ksys_write+0x145/0x250 fs/read_write.c:738 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f Freed by task 103680: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3e/0x80 mm/kasan/common.c:68 kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:576 poison_slab_object mm/kasan/common.c:247 [inline] __kasan_slab_free+0x62/0x70 mm/kasan/common.c:264 kasan_slab_free include/linux/kasan.h:233 [inline] slab_free_hook mm/slub.c:2381 [inline] slab_free mm/slub.c:4643 [inline] kfree+0x18e/0x440 mm/slub.c:4842 device_release+0x9c/0x1c0 kobject_cleanup lib/kobject.c:689 [inline] kobject_release lib/kobject.c:720 [inline] kref_put include/linux/kref.h:65 [inline] kobject_put+0x22b/0x480 lib/kobject.c:737 hci_conn_cleanup net/bluetooth/hci_conn.c:175 [inline] hci_conn_del+0x8ff/0xcb0 net/bluetooth/hci_conn.c:1173 hci_conn_complete_evt+0x3c7/0x1040 net/bluetooth/hci_event.c:3199 hci_event_func net/bluetooth/hci_event.c:7477 [inline] hci_event_packet+0x7e0/0x1200 net/bluetooth/hci_event.c:7531 hci_rx_work+0x46a/0xe80 net/bluetooth/hci_core.c:4070 process_one_work kernel/workqueue.c:3238 [inline] process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3321 worker_thread+0x8a0/0xda0 kernel/workqueue.c:3402 kthread+0x70e/0x8a0 kernel/kthread.c:464 ret_from_fork+0x3fc/0x770 arch/x86/kernel/process.c:148 ret_from_fork_asm+0x1a/0x30 home/kwqcheii/source/fuzzing/kernel/kasan/linux-6.16-rc7/arch/x86/entry/entry_64.S:245 Last potentially related work creation: kasan_save_stack+0x3e/0x60 mm/kasan/common.c:47 kasan_record_aux_stack+0xbd/0xd0 mm/kasan/generic.c:548 insert_work+0x3d/0x330 kernel/workqueue.c:2183 __queue_work+0xbd9/0xfe0 kernel/workqueue.c:2345 queue_delayed_work_on+0x18b/0x280 kernel/workqueue.c:2561 pairing_complete+0x1e7/0x2b0 net/bluetooth/mgmt.c:3451 pairing_complete_cb+0x1ac/0x230 net/bluetooth/mgmt.c:3487 hci_connect_cfm include/net/bluetooth/hci_core.h:2064 [inline] hci_conn_failed+0x24d/0x310 net/bluetooth/hci_conn.c:1275 hci_conn_complete_evt+0x3c7/0x1040 net/bluetooth/hci_event.c:3199 hci_event_func net/bluetooth/hci_event.c:7477 [inline] hci_event_packet+0x7e0/0x1200 net/bluetooth/hci_event.c:7531 hci_rx_work+0x46a/0xe80 net/bluetooth/hci_core.c:4070 process_one_work kernel/workqueue.c:3238 [inline] process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3321 worker_thread+0x8a0/0xda0 kernel/workqueue.c:3402 kthread+0x70e/0x8a0 kernel/kthread.c:464 ret_from_fork+0x3fc/0x770 arch/x86/kernel/process.c:148 ret_from_fork_asm+0x1a/0x30 home/kwqcheii/source/fuzzing/kernel/kasan/linux-6.16-rc7/arch/x86/entry/entry_64.S:245 Fixes: aef2aa4 ("Bluetooth: hci_event: Fix creating hci_conn object on error status") Reported-by: Junvyyang, Tencent Zhuque Lab <zhuque@tencent.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Ido Schimmel says: ==================== nexthop: Various fixes Patch #1 fixes a NPD that was recently reported by syzbot. Patch #2 fixes an issue in the existing FIB nexthop selftest. Patch #3 extends the selftest with test cases for the bug that was fixed in the first patch. ==================== Link: https://patch.msgid.link/20250921150824.149157-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
From v6.13 -> v6.14 the selftests kmod takes around 10s longer to run. Use KMOD_TIMEOUT Makefile cli variable to increase the default 165s timeout to 175s. Output: ok 1 selftests: kmod: kmod.sh real 2m47.077s user 0m0.062s sys 0m0.235s kdevops@dani-kmod:/data/selftests/kselftest/kselftest_install$ uname -a Linux dani-kmod 6.14.0 #3 SMP PREEMPT_DYNAMIC Fri Jun 27 18:28:16 UTC 2025 x86_64 GNU/Linux ok 1 selftests: kmod: kmod.sh real 2m41.456s user 0m0.068s sys 0m0.241s kdevops@dani-kmod:/data/selftests/kselftest/kselftest_install$ uname -a Linux dani-kmod 6.13.0 #2 SMP PREEMPT_DYNAMIC Fri Jun 27 17:13:42 UTC 2025 x86_64 GNU/Linux Signed-off-by: Daniel Gomez <da.gomez@samsung.com>
Since blamed commit, unregister_netdevice_many_notify() takes the netdev
mutex if the device needs it.
If the device list is too long, this will lock more device mutexes than
lockdep can handle:
unshare -n \
bash -c 'for i in $(seq 1 100);do ip link add foo$i type dummy;done'
BUG: MAX_LOCK_DEPTH too low!
turning off the locking correctness validator.
depth: 48 max: 48!
48 locks held by kworker/u16:1/69:
#0: ..148 ((wq_completion)netns){+.+.}-{0:0}, at: process_one_work
#1: ..d40 (net_cleanup_work){+.+.}-{0:0}, at: process_one_work
#2: ..bd0 (pernet_ops_rwsem){++++}-{4:4}, at: cleanup_net
#3: ..aa8 (rtnl_mutex){+.+.}-{4:4}, at: default_device_exit_batch
#4: ..cb0 (&dev_instance_lock_key#3){+.+.}-{4:4}, at: unregister_netdevice_many_notify
[..]
Add a helper to close and then unlock a list of net_devices.
Devices that are not up have to be skipped - netif_close_many always
removes them from the list without any other actions taken, so they'd
remain in locked state.
Close devices whenever we've used up half of the tracking slots or we
processed entire list without hitting the limit.
Fixes: 7e4d784 ("net: hold netdev instance lock during rtnetlink operations")
Signed-off-by: Florian Westphal <fw@strlen.de>
Link: https://patch.msgid.link/20251013185052.14021-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Expand the prefault memory selftest to add a regression test for a KVM bug where KVM's retry logic would result in (breakable) deadlock due to the memslot deletion waiting on prefaulting to release SRCU, and prefaulting waiting on the memslot to fully disappear (KVM uses a two-step process to delete memslots, and KVM x86 retries page faults if a to-be-deleted, a.k.a. INVALID, memslot is encountered). To exercise concurrent memslot remove, spawn a second thread to initiate memslot removal at roughly the same time as prefaulting. Test memslot removal for all testcases, i.e. don't limit concurrent removal to only the success case. There are essentially three prefault scenarios (so far) that are of interest: 1. Success 2. ENOENT due to no memslot 3. EAGAIN due to INVALID memslot For all intents and purposes, #1 and #2 are mutually exclusive, or rather, easier to test via separate testcases since writing to non-existent memory is trivial. But for #3, making it mutually exclusive with #1 _or_ #2 is actually more complex than testing memslot removal for all scenarios. The only requirement to let memslot removal coexist with other scenarios is a way to guarantee a stable result, e.g. that the "no memslot" test observes ENOENT, not EAGAIN, for the final checks. So, rather than make memslot removal mutually exclusive with the ENOENT scenario, simply restore the memslot and retry prefaulting. For the "no memslot" case, KVM_PRE_FAULT_MEMORY should be idempotent, i.e. should always fail with ENOENT regardless of how many times userspace attempts prefaulting. Pass in both the base GPA and the offset (instead of the "full" GPA) so that the worker can recreate the memslot. Signed-off-by: Yan Zhao <yan.y.zhao@intel.com> Co-developed-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20250924174255.2141847-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
From v6.13 -> v6.14 the selftests kmod takes around 10s longer to run. Use KMOD_TIMEOUT Makefile cli variable to increase the default 165s timeout to 175s. Output: ok 1 selftests: kmod: kmod.sh real 2m47.077s user 0m0.062s sys 0m0.235s kdevops@dani-kmod:/data/selftests/kselftest/kselftest_install$ uname -a Linux dani-kmod 6.14.0 #3 SMP PREEMPT_DYNAMIC Fri Jun 27 18:28:16 UTC 2025 x86_64 GNU/Linux ok 1 selftests: kmod: kmod.sh real 2m41.456s user 0m0.068s sys 0m0.241s kdevops@dani-kmod:/data/selftests/kselftest/kselftest_install$ uname -a Linux dani-kmod 6.13.0 #2 SMP PREEMPT_DYNAMIC Fri Jun 27 17:13:42 UTC 2025 x86_64 GNU/Linux Signed-off-by: Daniel Gomez <da.gomez@samsung.com>
The ns_bpf_qdisc selftest triggers a kernel panic: Oops[#1]: CPU 0 Unable to handle kernel paging request at virtual address 0000000000741d58, era == 90000000851b5ac0, ra == 90000000851b5aa4 CPU: 0 UID: 0 PID: 449 Comm: test_progs Tainted: G OE 6.16.0+ #3 PREEMPT(full) Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 2/2/2022 pc 90000000851b5ac0 ra 90000000851b5aa4 tp 90000001076b8000 sp 90000001076bb600 a0 0000000000741ce8 a1 0000000000000001 a2 90000001076bb5c0 a3 0000000000000008 a4 90000001004c4620 a5 9000000100741ce8 a6 0000000000000000 a7 0100000000000000 t0 0000000000000010 t1 0000000000000000 t2 9000000104d24d30 t3 0000000000000001 t4 4f2317da8a7e08c4 t5 fffffefffc002f00 t6 90000001004c4620 t7 ffffffffc61c5b3d t8 0000000000000000 u0 0000000000000001 s9 0000000000000050 s0 90000001075bc800 s1 0000000000000040 s2 900000010597c400 s3 0000000000000008 s4 90000001075bc880 s5 90000001075bc8f0 s6 0000000000000000 s7 0000000000741ce8 s8 0000000000000000 ra: 90000000851b5aa4 __qdisc_run+0xac/0x8d8 ERA: 90000000851b5ac0 __qdisc_run+0xc8/0x8d8 CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) PRMD: 00000004 (PPLV0 +PIE -PWE) EUEN: 00000007 (+FPE +SXE +ASXE -BTE) ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7) ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0) BADV: 0000000000741d58 PRID: 0014c010 (Loongson-64bit, Loongson-3A5000) Modules linked in: bpf_testmod(OE) [last unloaded: bpf_testmod(OE)] Process test_progs (pid: 449, threadinfo=000000009af02b3a, task=00000000e9ba4956) Stack : 0000000000000000 90000001075bc8ac 90000000869524a8 9000000100741ce8 90000001075bc800 9000000100415300 90000001075bc8ac 0000000000000000 900000010597c400 900000008694a000 0000000000000000 9000000105b59000 90000001075bc800 9000000100741ce8 0000000000000050 900000008513000c 9000000086936000 0000000100094d4c fffffff400676208 0000000000000000 9000000105b59000 900000008694a000 9000000086bf0dc0 9000000105b59000 9000000086bf0d68 9000000085147010 90000001075be788 0000000000000000 9000000086bf0f98 0000000000000001 0000000000000010 9000000006015840 0000000000000000 9000000086be6c40 0000000000000000 0000000000000000 0000000000000000 4f2317da8a7e08c4 0000000000000101 4f2317da8a7e08c4 ... Call Trace: [<90000000851b5ac0>] __qdisc_run+0xc8/0x8d8 [<9000000085130008>] __dev_queue_xmit+0x578/0x10f0 [<90000000853701c0>] ip6_finish_output2+0x2f0/0x950 [<9000000085374bc8>] ip6_finish_output+0x2b8/0x448 [<9000000085370b24>] ip6_xmit+0x304/0x858 [<90000000853c4438>] inet6_csk_xmit+0x100/0x170 [<90000000852b32f0>] __tcp_transmit_skb+0x490/0xdd0 [<90000000852b47fc>] tcp_connect+0xbcc/0x1168 [<90000000853b9088>] tcp_v6_connect+0x580/0x8a0 [<90000000852e7738>] __inet_stream_connect+0x170/0x480 [<90000000852e7a98>] inet_stream_connect+0x50/0x88 [<90000000850f2814>] __sys_connect+0xe4/0x110 [<90000000850f2858>] sys_connect+0x18/0x28 [<9000000085520c94>] do_syscall+0x94/0x1a0 [<9000000083df1fb8>] handle_syscall+0xb8/0x158 Code: 4001ad80 2400873 2400832d <240073cc> 001137ff 001133ff 6407b41f 001503cc 0280041d ---[ end trace 0000000000000000 ]--- The bpf_fifo_dequeue prog returns a skb which is a pointer. The pointer is treated as a 32bit value and sign extend to 64bit in epilogue. This behavior is right for most bpf prog types but wrong for struct ops which requires LoongArch ABI. So let's sign extend struct ops return values according to the LoongArch ABI ([1]) and return value spec in function model. [1]: https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html Cc: stable@vger.kernel.org Fixes: 6abf17d ("LoongArch: BPF: Add struct ops support for trampoline") Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
From v6.13 -> v6.14 the selftests kmod takes around 10s longer to run. Use KMOD_TIMEOUT Makefile cli variable to increase the default 165s timeout to 175s. Output: ok 1 selftests: kmod: kmod.sh real 2m47.077s user 0m0.062s sys 0m0.235s kdevops@dani-kmod:/data/selftests/kselftest/kselftest_install$ uname -a Linux dani-kmod 6.14.0 #3 SMP PREEMPT_DYNAMIC Fri Jun 27 18:28:16 UTC 2025 x86_64 GNU/Linux ok 1 selftests: kmod: kmod.sh real 2m41.456s user 0m0.068s sys 0m0.241s kdevops@dani-kmod:/data/selftests/kselftest/kselftest_install$ uname -a Linux dani-kmod 6.13.0 #2 SMP PREEMPT_DYNAMIC Fri Jun 27 17:13:42 UTC 2025 x86_64 GNU/Linux Signed-off-by: Daniel Gomez <da.gomez@samsung.com>
From v6.13 -> v6.14 the selftests kmod takes around 10s longer to run. Use KMOD_TIMEOUT Makefile cli variable to increase the default 165s timeout to 175s. Output: ok 1 selftests: kmod: kmod.sh real 2m47.077s user 0m0.062s sys 0m0.235s kdevops@dani-kmod:/data/selftests/kselftest/kselftest_install$ uname -a Linux dani-kmod 6.14.0 #3 SMP PREEMPT_DYNAMIC Fri Jun 27 18:28:16 UTC 2025 x86_64 GNU/Linux ok 1 selftests: kmod: kmod.sh real 2m41.456s user 0m0.068s sys 0m0.241s kdevops@dani-kmod:/data/selftests/kselftest/kselftest_install$ uname -a Linux dani-kmod 6.13.0 #2 SMP PREEMPT_DYNAMIC Fri Jun 27 17:13:42 UTC 2025 x86_64 GNU/Linux Signed-off-by: Daniel Gomez <da.gomez@samsung.com>
Michael Chan says: ==================== bnxt_en: Bug fixes Patches 1, 3, and 4 are bug fixes related to the FW log tracing driver coredump feature recently added in 6.13. Patch #1 adds the necessary call to shutdown the FW logging DMA during PCI shutdown. Patch #3 fixes a possible null pointer derefernce when using early versions of the FW with this feature. Patch #4 adds the coredump header information unconditionally to make it more robust. Patch #2 fixes a possible memory leak during PTP shutdown. Patch #5 eliminates a dmesg warning when doing devlink reload. ==================== Link: https://patch.msgid.link/20251104005700.542174-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
On completion of i915_vma_pin_ww(), a synchronous variant of dma_fence_work_commit() is called. When pinning a VMA to GGTT address space on a Cherry View family processor, or on a Broxton generation SoC with VTD enabled, i.e., when stop_machine() is then called from intel_ggtt_bind_vma(), that can potentially lead to lock inversion among reservation_ww and cpu_hotplug locks. [86.861179] ====================================================== [86.861193] WARNING: possible circular locking dependency detected [86.861209] 6.15.0-rc5-CI_DRM_16515-gca0305cadc2d+ #1 Tainted: G U [86.861226] ------------------------------------------------------ [86.861238] i915_module_loa/1432 is trying to acquire lock: [86.861252] ffffffff83489090 (cpu_hotplug_lock){++++}-{0:0}, at: stop_machine+0x1c/0x50 [86.861290] but task is already holding lock: [86.861303] ffffc90002e0b4c8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: i915_vma_pin.constprop.0+0x39/0x1d0 [i915] [86.862233] which lock already depends on the new lock. [86.862251] the existing dependency chain (in reverse order) is: [86.862265] -> #5 (reservation_ww_class_mutex){+.+.}-{3:3}: [86.862292] dma_resv_lockdep+0x19a/0x390 [86.862315] do_one_initcall+0x60/0x3f0 [86.862334] kernel_init_freeable+0x3cd/0x680 [86.862353] kernel_init+0x1b/0x200 [86.862369] ret_from_fork+0x47/0x70 [86.862383] ret_from_fork_asm+0x1a/0x30 [86.862399] -> #4 (reservation_ww_class_acquire){+.+.}-{0:0}: [86.862425] dma_resv_lockdep+0x178/0x390 [86.862440] do_one_initcall+0x60/0x3f0 [86.862454] kernel_init_freeable+0x3cd/0x680 [86.862470] kernel_init+0x1b/0x200 [86.862482] ret_from_fork+0x47/0x70 [86.862495] ret_from_fork_asm+0x1a/0x30 [86.862509] -> #3 (&mm->mmap_lock){++++}-{3:3}: [86.862531] down_read_killable+0x46/0x1e0 [86.862546] lock_mm_and_find_vma+0xa2/0x280 [86.862561] do_user_addr_fault+0x266/0x8e0 [86.862578] exc_page_fault+0x8a/0x2f0 [86.862593] asm_exc_page_fault+0x27/0x30 [86.862607] filldir64+0xeb/0x180 [86.862620] kernfs_fop_readdir+0x118/0x480 [86.862635] iterate_dir+0xcf/0x2b0 [86.862648] __x64_sys_getdents64+0x84/0x140 [86.862661] x64_sys_call+0x1058/0x2660 [86.862675] do_syscall_64+0x91/0xe90 [86.862689] entry_SYSCALL_64_after_hwframe+0x76/0x7e [86.862703] -> #2 (&root->kernfs_rwsem){++++}-{3:3}: [86.862725] down_write+0x3e/0xf0 [86.862738] kernfs_add_one+0x30/0x3c0 [86.862751] kernfs_create_dir_ns+0x53/0xb0 [86.862765] internal_create_group+0x134/0x4c0 [86.862779] sysfs_create_group+0x13/0x20 [86.862792] topology_add_dev+0x1d/0x30 [86.862806] cpuhp_invoke_callback+0x4b5/0x850 [86.862822] cpuhp_issue_call+0xbf/0x1f0 [86.862836] __cpuhp_setup_state_cpuslocked+0x111/0x320 [86.862852] __cpuhp_setup_state+0xb0/0x220 [86.862866] topology_sysfs_init+0x30/0x50 [86.862879] do_one_initcall+0x60/0x3f0 [86.862893] kernel_init_freeable+0x3cd/0x680 [86.862908] kernel_init+0x1b/0x200 [86.862921] ret_from_fork+0x47/0x70 [86.862934] ret_from_fork_asm+0x1a/0x30 [86.862947] -> #1 (cpuhp_state_mutex){+.+.}-{3:3}: [86.862969] __mutex_lock+0xaa/0xed0 [86.862982] mutex_lock_nested+0x1b/0x30 [86.862995] __cpuhp_setup_state_cpuslocked+0x67/0x320 [86.863012] __cpuhp_setup_state+0xb0/0x220 [86.863026] page_alloc_init_cpuhp+0x2d/0x60 [86.863041] mm_core_init+0x22/0x2d0 [86.863054] start_kernel+0x576/0xbd0 [86.863068] x86_64_start_reservations+0x18/0x30 [86.863084] x86_64_start_kernel+0xbf/0x110 [86.863098] common_startup_64+0x13e/0x141 [86.863114] -> #0 (cpu_hotplug_lock){++++}-{0:0}: [86.863135] __lock_acquire+0x1635/0x2810 [86.863152] lock_acquire+0xc4/0x2f0 [86.863166] cpus_read_lock+0x41/0x100 [86.863180] stop_machine+0x1c/0x50 [86.863194] bxt_vtd_ggtt_insert_entries__BKL+0x3b/0x60 [i915] [86.863987] intel_ggtt_bind_vma+0x43/0x70 [i915] [86.864735] __vma_bind+0x55/0x70 [i915] [86.865510] fence_work+0x26/0xa0 [i915] [86.866248] fence_notify+0xa1/0x140 [i915] [86.866983] __i915_sw_fence_complete+0x8f/0x270 [i915] [86.867719] i915_sw_fence_commit+0x39/0x60 [i915] [86.868453] i915_vma_pin_ww+0x462/0x1360 [i915] [86.869228] i915_vma_pin.constprop.0+0x133/0x1d0 [i915] [86.870001] initial_plane_vma+0x307/0x840 [i915] [86.870774] intel_initial_plane_config+0x33f/0x670 [i915] [86.871546] intel_display_driver_probe_nogem+0x1c6/0x260 [i915] [86.872330] i915_driver_probe+0x7fa/0xe80 [i915] [86.873057] i915_pci_probe+0xe6/0x220 [i915] [86.873782] local_pci_probe+0x47/0xb0 [86.873802] pci_device_probe+0xf3/0x260 [86.873817] really_probe+0xf1/0x3c0 [86.873833] __driver_probe_device+0x8c/0x180 [86.873848] driver_probe_device+0x24/0xd0 [86.873862] __driver_attach+0x10f/0x220 [86.873876] bus_for_each_dev+0x7f/0xe0 [86.873892] driver_attach+0x1e/0x30 [86.873904] bus_add_driver+0x151/0x290 [86.873917] driver_register+0x5e/0x130 [86.873931] __pci_register_driver+0x7d/0x90 [86.873945] i915_pci_register_driver+0x23/0x30 [i915] [86.874678] i915_init+0x37/0x120 [i915] [86.875347] do_one_initcall+0x60/0x3f0 [86.875369] do_init_module+0x97/0x2a0 [86.875385] load_module+0x2c54/0x2d80 [86.875398] init_module_from_file+0x96/0xe0 [86.875413] idempotent_init_module+0x117/0x330 [86.875426] __x64_sys_finit_module+0x77/0x100 [86.875440] x64_sys_call+0x24de/0x2660 [86.875454] do_syscall_64+0x91/0xe90 [86.875470] entry_SYSCALL_64_after_hwframe+0x76/0x7e [86.875486] other info that might help us debug this: [86.875502] Chain exists of: cpu_hotplug_lock --> reservation_ww_class_acquire --> reservation_ww_class_mutex [86.875539] Possible unsafe locking scenario: [86.875552] CPU0 CPU1 [86.875563] ---- ---- [86.875573] lock(reservation_ww_class_mutex); [86.875588] lock(reservation_ww_class_acquire); [86.875606] lock(reservation_ww_class_mutex); [86.875624] rlock(cpu_hotplug_lock); [86.875637] *** DEADLOCK *** [86.875650] 3 locks held by i915_module_loa/1432: [86.875663] #0: ffff888101f5c1b0 (&dev->mutex){....}-{3:3}, at: __driver_attach+0x104/0x220 [86.875699] #1: ffffc90002e0b4a0 (reservation_ww_class_acquire){+.+.}-{0:0}, at: i915_vma_pin.constprop.0+0x39/0x1d0 [i915] [86.876512] #2: ffffc90002e0b4c8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: i915_vma_pin.constprop.0+0x39/0x1d0 [i915] [86.877305] stack backtrace: [86.877326] CPU: 0 UID: 0 PID: 1432 Comm: i915_module_loa Tainted: G U 6.15.0-rc5-CI_DRM_16515-gca0305cadc2d+ #1 PREEMPT(voluntary) [86.877334] Tainted: [U]=USER [86.877336] Hardware name: /NUC5CPYB, BIOS PYBSWCEL.86A.0079.2020.0420.1316 04/20/2020 [86.877339] Call Trace: [86.877344] <TASK> [86.877353] dump_stack_lvl+0x91/0xf0 [86.877364] dump_stack+0x10/0x20 [86.877369] print_circular_bug+0x285/0x360 [86.877379] check_noncircular+0x135/0x150 [86.877390] __lock_acquire+0x1635/0x2810 [86.877403] lock_acquire+0xc4/0x2f0 [86.877408] ? stop_machine+0x1c/0x50 [86.877422] ? __pfx_bxt_vtd_ggtt_insert_entries__cb+0x10/0x10 [i915] [86.878173] cpus_read_lock+0x41/0x100 [86.878182] ? stop_machine+0x1c/0x50 [86.878191] ? __pfx_bxt_vtd_ggtt_insert_entries__cb+0x10/0x10 [i915] [86.878916] stop_machine+0x1c/0x50 [86.878927] bxt_vtd_ggtt_insert_entries__BKL+0x3b/0x60 [i915] [86.879652] intel_ggtt_bind_vma+0x43/0x70 [i915] [86.880375] __vma_bind+0x55/0x70 [i915] [86.881133] fence_work+0x26/0xa0 [i915] [86.881851] fence_notify+0xa1/0x140 [i915] [86.882566] __i915_sw_fence_complete+0x8f/0x270 [i915] [86.883286] i915_sw_fence_commit+0x39/0x60 [i915] [86.884003] i915_vma_pin_ww+0x462/0x1360 [i915] [86.884756] ? i915_vma_pin.constprop.0+0x6c/0x1d0 [i915] [86.885513] i915_vma_pin.constprop.0+0x133/0x1d0 [i915] [86.886281] initial_plane_vma+0x307/0x840 [i915] [86.887049] intel_initial_plane_config+0x33f/0x670 [i915] [86.887819] intel_display_driver_probe_nogem+0x1c6/0x260 [i915] [86.888587] i915_driver_probe+0x7fa/0xe80 [i915] [86.889293] ? mutex_unlock+0x12/0x20 [86.889301] ? drm_privacy_screen_get+0x171/0x190 [86.889308] ? acpi_dev_found+0x66/0x80 [86.889321] i915_pci_probe+0xe6/0x220 [i915] [86.890038] local_pci_probe+0x47/0xb0 [86.890049] pci_device_probe+0xf3/0x260 [86.890058] really_probe+0xf1/0x3c0 [86.890067] __driver_probe_device+0x8c/0x180 [86.890072] driver_probe_device+0x24/0xd0 [86.890078] __driver_attach+0x10f/0x220 [86.890083] ? __pfx___driver_attach+0x10/0x10 [86.890088] bus_for_each_dev+0x7f/0xe0 [86.890097] driver_attach+0x1e/0x30 [86.890101] bus_add_driver+0x151/0x290 [86.890107] driver_register+0x5e/0x130 [86.890113] __pci_register_driver+0x7d/0x90 [86.890119] i915_pci_register_driver+0x23/0x30 [i915] [86.890833] i915_init+0x37/0x120 [i915] [86.891482] ? __pfx_i915_init+0x10/0x10 [i915] [86.892135] do_one_initcall+0x60/0x3f0 [86.892145] ? __kmalloc_cache_noprof+0x33f/0x470 [86.892157] do_init_module+0x97/0x2a0 [86.892164] load_module+0x2c54/0x2d80 [86.892168] ? __kernel_read+0x15c/0x300 [86.892185] ? kernel_read_file+0x2b1/0x320 [86.892195] init_module_from_file+0x96/0xe0 [86.892199] ? init_module_from_file+0x96/0xe0 [86.892211] idempotent_init_module+0x117/0x330 [86.892224] __x64_sys_finit_module+0x77/0x100 [86.892230] x64_sys_call+0x24de/0x2660 [86.892236] do_syscall_64+0x91/0xe90 [86.892243] ? irqentry_exit+0x77/0xb0 [86.892249] ? sysvec_apic_timer_interrupt+0x57/0xc0 [86.892256] entry_SYSCALL_64_after_hwframe+0x76/0x7e [86.892261] RIP: 0033:0x7303e1b2725d [86.892271] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [86.892276] RSP: 002b:00007ffddd1fdb38 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [86.892281] RAX: ffffffffffffffda RBX: 00005d771d88fd90 RCX: 00007303e1b2725d [86.892285] RDX: 0000000000000000 RSI: 00005d771d893aa0 RDI: 000000000000000c [86.892287] RBP: 00007ffddd1fdbf0 R08: 0000000000000040 R09: 00007ffddd1fdb80 [86.892289] R10: 00007303e1c03b20 R11: 0000000000000246 R12: 00005d771d893aa0 [86.892292] R13: 0000000000000000 R14: 00005d771d88f0d0 R15: 00005d771d895710 [86.892304] </TASK> Call asynchronous variant of dma_fence_work_commit() in that case. v3: Provide more verbose in-line comment (Andi), - mention target environments in commit message. Fixes: 7d1c261 ("drm/i915: Take reservation lock around i915_vma_pin.") Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14985 Cc: Andi Shyti <andi.shyti@kernel.org> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Reviewed-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com> Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com> Acked-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://lore.kernel.org/r/20251023082925.351307-6-janusz.krzysztofik@linux.intel.com (cherry picked from commit 648ef13) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
From v6.13 -> v6.14 the selftests kmod takes around 10s longer to run. Use KMOD_TIMEOUT Makefile cli variable to increase the default 165s timeout to 175s. Output: ok 1 selftests: kmod: kmod.sh real 2m47.077s user 0m0.062s sys 0m0.235s kdevops@dani-kmod:/data/selftests/kselftest/kselftest_install$ uname -a Linux dani-kmod 6.14.0 #3 SMP PREEMPT_DYNAMIC Fri Jun 27 18:28:16 UTC 2025 x86_64 GNU/Linux ok 1 selftests: kmod: kmod.sh real 2m41.456s user 0m0.068s sys 0m0.241s kdevops@dani-kmod:/data/selftests/kselftest/kselftest_install$ uname -a Linux dani-kmod 6.13.0 #2 SMP PREEMPT_DYNAMIC Fri Jun 27 17:13:42 UTC 2025 x86_64 GNU/Linux Signed-off-by: Daniel Gomez <da.gomez@samsung.com>
Add VMX exit handlers for SEAMCALL and TDCALL to inject a #UD if a non-TD guest attempts to execute SEAMCALL or TDCALL. Neither SEAMCALL nor TDCALL is gated by any software enablement other than VMXON, and so will generate a VM-Exit instead of e.g. a native #UD when executed from the guest kernel. Note! No unprivileged DoS of the L1 kernel is possible as TDCALL and SEAMCALL #GP at CPL > 0, and the CPL check is performed prior to the VMX non-root (VM-Exit) check, i.e. userspace can't crash the VM. And for a nested guest, KVM forwards unknown exits to L1, i.e. an L2 kernel can crash itself, but not L1. Note #2! The Intel® Trust Domain CPU Architectural Extensions spec's pseudocode shows the CPL > 0 check for SEAMCALL coming _after_ the VM-Exit, but that appears to be a documentation bug (likely because the CPL > 0 check was incorrectly bundled with other lower-priority #GP checks). Testing on SPR and EMR shows that the CPL > 0 check is performed before the VMX non-root check, i.e. SEAMCALL #GPs when executed in usermode. Note #3! The aforementioned Trust Domain spec uses confusing pseudocode that says that SEAMCALL will #UD if executed "inSEAM", but "inSEAM" specifically means in SEAM Root Mode, i.e. in the TDX-Module. The long- form description explicitly states that SEAMCALL generates an exit when executed in "SEAM VMX non-root operation". But that's a moot point as the TDX-Module injects #UD if the guest attempts to execute SEAMCALL, as documented in the "Unconditionally Blocked Instructions" section of the TDX-Module base specification. Cc: stable@vger.kernel.org Cc: Kai Huang <kai.huang@intel.com> Cc: Xiaoyao Li <xiaoyao.li@intel.com> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Binbin Wu <binbin.wu@linux.intel.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Link: https://lore.kernel.org/r/20251016182148.69085-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
From v6.13 -> v6.14 the selftests kmod takes around 10s longer to run. Use KMOD_TIMEOUT Makefile cli variable to increase the default 165s timeout to 175s. Output: ok 1 selftests: kmod: kmod.sh real 2m47.077s user 0m0.062s sys 0m0.235s kdevops@dani-kmod:/data/selftests/kselftest/kselftest_install$ uname -a Linux dani-kmod 6.14.0 #3 SMP PREEMPT_DYNAMIC Fri Jun 27 18:28:16 UTC 2025 x86_64 GNU/Linux ok 1 selftests: kmod: kmod.sh real 2m41.456s user 0m0.068s sys 0m0.241s kdevops@dani-kmod:/data/selftests/kselftest/kselftest_install$ uname -a Linux dani-kmod 6.13.0 #2 SMP PREEMPT_DYNAMIC Fri Jun 27 17:13:42 UTC 2025 x86_64 GNU/Linux Signed-off-by: Daniel Gomez <da.gomez@samsung.com>
…/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 6.18, take #3 - Only adjust the ID registers when no irqchip has been created once per VM run, instead of doing it once per vcpu, as this otherwise triggers a pretty bad conbsistency check failure in the sysreg code. - Make sure the per-vcpu Fine Grain Traps are computed before we load the system registers on the HW, as we otherwise start running without anything set until the first preemption of the vcpu.
From v6.13 -> v6.14 the selftests kmod takes around 10s longer to run. Use KMOD_TIMEOUT Makefile cli variable to increase the default 165s timeout to 175s. Output: ok 1 selftests: kmod: kmod.sh real 2m47.077s user 0m0.062s sys 0m0.235s kdevops@dani-kmod:/data/selftests/kselftest/kselftest_install$ uname -a Linux dani-kmod 6.14.0 #3 SMP PREEMPT_DYNAMIC Fri Jun 27 18:28:16 UTC 2025 x86_64 GNU/Linux ok 1 selftests: kmod: kmod.sh real 2m41.456s user 0m0.068s sys 0m0.241s kdevops@dani-kmod:/data/selftests/kselftest/kselftest_install$ uname -a Linux dani-kmod 6.13.0 #2 SMP PREEMPT_DYNAMIC Fri Jun 27 17:13:42 UTC 2025 x86_64 GNU/Linux Signed-off-by: Daniel Gomez <da.gomez@samsung.com>
For some reason, of_find_node_with_property() is creating a spinlock recursion issue along with fwnode_count_parents(), and this issue is making all MediaTek boards unbootable. As of kernel v6.18-rc6, there are only three users of this function, one of which is this driver. Migrate away from of_find_node_with_property() by adding a local scpsys_get_legacy_regmap_node() function, which acts similarly to of_find_node_with_property(), and calling the former in place of the latter. This resolves the following spinlock recursion issue: [ 1.773979] BUG: spinlock recursion on CPU#2, kworker/u24:1/60 [ 1.790485] lock: devtree_lock+0x0/0x40, .magic: dead4ead, .owner: kworker/u24:1/60, .owner_cpu: 2 [ 1.791644] CPU: 2 UID: 0 PID: 60 Comm: kworker/u24:1 Tainted: G W 6.18.0-rc6 #3 PREEMPT [ 1.791649] Tainted: [W]=WARN [ 1.791650] Hardware name: MediaTek Genio-510 EVK (DT) [ 1.791653] Workqueue: events_unbound deferred_probe_work_func [ 1.791658] Call trace: [ 1.791659] show_stack+0x18/0x30 (C) [ 1.791664] dump_stack_lvl+0x68/0x94 [ 1.791668] dump_stack+0x18/0x24 [ 1.791672] spin_dump+0x78/0x88 [ 1.791678] do_raw_spin_lock+0x110/0x140 [ 1.791684] _raw_spin_lock_irqsave+0x58/0x6c [ 1.791690] of_get_parent+0x28/0x74 [ 1.791694] of_fwnode_get_parent+0x38/0x7c [ 1.791700] fwnode_count_parents+0x34/0xf0 [ 1.791705] fwnode_full_name_string+0x28/0x120 [ 1.791710] device_node_string+0x3e4/0x50c [ 1.791715] pointer+0x294/0x430 [ 1.791718] vsnprintf+0x21c/0x5bc [ 1.791722] vprintk_store+0x108/0x47c [ 1.791728] vprintk_emit+0xc4/0x350 [ 1.791732] vprintk_default+0x34/0x40 [ 1.791736] vprintk+0x24/0x30 [ 1.791740] _printk+0x60/0x8c [ 1.791744] of_node_release+0x154/0x194 [ 1.791749] kobject_put+0xa0/0x120 [ 1.791753] of_node_put+0x18/0x28 [ 1.791756] of_find_node_with_property+0x74/0x100 [ 1.791761] scpsys_probe+0x338/0x5e0 [ 1.791765] platform_probe+0x5c/0xa4 [ 1.791770] really_probe+0xbc/0x2ac [ 1.791774] __driver_probe_device+0x78/0x118 [ 1.791779] driver_probe_device+0x3c/0x170 [ 1.791783] __device_attach_driver+0xb8/0x150 [ 1.791788] bus_for_each_drv+0x88/0xe8 [ 1.791792] __device_attach+0x9c/0x1a0 [ 1.791796] device_initial_probe+0x14/0x20 [ 1.791801] bus_probe_device+0xa0/0xa4 [ 1.791805] deferred_probe_work_func+0x88/0xd0 [ 1.791809] process_one_work+0x1e8/0x448 [ 1.791813] worker_thread+0x1ac/0x340 [ 1.791816] kthread+0x138/0x220 [ 1.791821] ret_from_fork+0x10/0x20 Fixes: c29345f ("pmdomain: mediatek: Refactor bus protection regmaps retrieval") Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: Louis-Alexis Eyraud <louisalexis.eyraud@collabora.com> Tested-by: Macpaul Lin <macpaul.lin@mediatek.com> Reviewed-by: Macpaul Lin <macpaul.lin@mediatek.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The U-Blox EVK-M101 enumerates as 1546:0506 [1] with four FTDI interfaces: - EVK-M101 current sensors - EVK-M101 I2C - EVK-M101 UART - EVK-M101 port D Only the third USB interface is a UART. This change lets ftdi_sio probe the VID/PID and registers only interface #3 as a TTY, leaving the rest available for other drivers. [1] usb 5-1.3: new high-speed USB device number 11 using xhci_hcd usb 5-1.3: New USB device found, idVendor=1546, idProduct=0506, bcdDevice= 8.00 usb 5-1.3: New USB device strings: Mfr=1, Product=2, SerialNumber=0 usb 5-1.3: Product: EVK-M101 usb 5-1.3: Manufacturer: u-blox AG Datasheet: https://content.u-blox.com/sites/default/files/documents/EVK-M10_UserGuide_UBX-21003949.pdf Signed-off-by: Oleksandr Suvorov <cryosay@gmail.com> Link: https://lore.kernel.org/20250926060235.3442748-1-cryosay@gmail.com/ Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org>
From v6.13 -> v6.14 the selftests kmod takes around 10s longer to run. Use KMOD_TIMEOUT Makefile cli variable to increase the default 165s timeout to 175s. Output: ok 1 selftests: kmod: kmod.sh real 2m47.077s user 0m0.062s sys 0m0.235s kdevops@dani-kmod:/data/selftests/kselftest/kselftest_install$ uname -a Linux dani-kmod 6.14.0 #3 SMP PREEMPT_DYNAMIC Fri Jun 27 18:28:16 UTC 2025 x86_64 GNU/Linux ok 1 selftests: kmod: kmod.sh real 2m41.456s user 0m0.068s sys 0m0.241s kdevops@dani-kmod:/data/selftests/kselftest/kselftest_install$ uname -a Linux dani-kmod 6.13.0 #2 SMP PREEMPT_DYNAMIC Fri Jun 27 17:13:42 UTC 2025 x86_64 GNU/Linux Signed-off-by: Daniel Gomez <da.gomez@samsung.com>
We sometimes observe use-after-free when dereferencing a neighbour [1]. The problem seems to be that the driver stores a pointer to the neighbour, but without holding a reference on it. A reference is only taken when the neighbour is used by a nexthop. Fix by simplifying the reference counting scheme. Always take a reference when storing a neighbour pointer in a neighbour entry. Avoid taking a referencing when the neighbour is used by a nexthop as the neighbour entry associated with the nexthop already holds a reference. Tested by running the test that uncovered the problem over 300 times. Without this patch the problem was reproduced after a handful of iterations. [1] BUG: KASAN: slab-use-after-free in mlxsw_sp_neigh_entry_update+0x2d4/0x310 Read of size 8 at addr ffff88817f8e3420 by task ip/3929 CPU: 3 UID: 0 PID: 3929 Comm: ip Not tainted 6.18.0-rc4-virtme-g36b21a067510 #3 PREEMPT(full) Hardware name: Nvidia SN5600/VMOD0013, BIOS 5.13 05/31/2023 Call Trace: <TASK> dump_stack_lvl+0x6f/0xa0 print_address_description.constprop.0+0x6e/0x300 print_report+0xfc/0x1fb kasan_report+0xe4/0x110 mlxsw_sp_neigh_entry_update+0x2d4/0x310 mlxsw_sp_router_rif_gone_sync+0x35f/0x510 mlxsw_sp_rif_destroy+0x1ea/0x730 mlxsw_sp_inetaddr_port_vlan_event+0xa1/0x1b0 __mlxsw_sp_inetaddr_lag_event+0xcc/0x130 __mlxsw_sp_inetaddr_event+0xf5/0x3c0 mlxsw_sp_router_netdevice_event+0x1015/0x1580 notifier_call_chain+0xcc/0x150 call_netdevice_notifiers_info+0x7e/0x100 __netdev_upper_dev_unlink+0x10b/0x210 netdev_upper_dev_unlink+0x79/0xa0 vrf_del_slave+0x18/0x50 do_set_master+0x146/0x7d0 do_setlink.isra.0+0x9a0/0x2880 rtnl_newlink+0x637/0xb20 rtnetlink_rcv_msg+0x6fe/0xb90 netlink_rcv_skb+0x123/0x380 netlink_unicast+0x4a3/0x770 netlink_sendmsg+0x75b/0xc90 __sock_sendmsg+0xbe/0x160 ____sys_sendmsg+0x5b2/0x7d0 ___sys_sendmsg+0xfd/0x180 __sys_sendmsg+0x124/0x1c0 do_syscall_64+0xbb/0xfd0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 [...] Allocated by task 109: kasan_save_stack+0x30/0x50 kasan_save_track+0x14/0x30 __kasan_kmalloc+0x7b/0x90 __kmalloc_noprof+0x2c1/0x790 neigh_alloc+0x6af/0x8f0 ___neigh_create+0x63/0xe90 mlxsw_sp_nexthop_neigh_init+0x430/0x7e0 mlxsw_sp_nexthop_type_init+0x212/0x960 mlxsw_sp_nexthop6_group_info_init.constprop.0+0x81f/0x1280 mlxsw_sp_nexthop6_group_get+0x392/0x6a0 mlxsw_sp_fib6_entry_create+0x46a/0xfd0 mlxsw_sp_router_fib6_replace+0x1ed/0x5f0 mlxsw_sp_router_fib6_event_work+0x10a/0x2a0 process_one_work+0xd57/0x1390 worker_thread+0x4d6/0xd40 kthread+0x355/0x5b0 ret_from_fork+0x1d4/0x270 ret_from_fork_asm+0x11/0x20 Freed by task 154: kasan_save_stack+0x30/0x50 kasan_save_track+0x14/0x30 __kasan_save_free_info+0x3b/0x60 __kasan_slab_free+0x43/0x70 kmem_cache_free_bulk.part.0+0x1eb/0x5e0 kvfree_rcu_bulk+0x1f2/0x260 kfree_rcu_work+0x130/0x1b0 process_one_work+0xd57/0x1390 worker_thread+0x4d6/0xd40 kthread+0x355/0x5b0 ret_from_fork+0x1d4/0x270 ret_from_fork_asm+0x11/0x20 Last potentially related work creation: kasan_save_stack+0x30/0x50 kasan_record_aux_stack+0x8c/0xa0 kvfree_call_rcu+0x93/0x5b0 mlxsw_sp_router_neigh_event_work+0x67d/0x860 process_one_work+0xd57/0x1390 worker_thread+0x4d6/0xd40 kthread+0x355/0x5b0 ret_from_fork+0x1d4/0x270 ret_from_fork_asm+0x11/0x20 Fixes: 6cf3c97 ("mlxsw: spectrum_router: Add private neigh table") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/92d75e21d95d163a41b5cea67a15cd33f547cba6.1764695650.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Petr Machata says: ==================== selftests: forwarding: vxlan_bridge_1q_mc_ul: Fix flakiness The net/forwarding/vxlan_bridge_1q_mc_ul selftest runs an overlay traffic, forwarded over a multicast-routed VXLAN underlay. In order to determine whether packets reach their intended destination, it uses a TC match. For convenience, it uses a flower match, which however does not allow matching on the encapsulated packet. So various service traffic ends up being indistinguishable from the test packets, and ends up confusing the test. To alleviate the problem, the test uses sleep to allow the necessary service traffic to run and clear the channel, before running the test traffic. This worked for a while, but lately we have nevertheless seen flakiness of the test in the CI. In this patchset, first generalize tc_rule_stats_get() to support u32 in patch #1, then in patch #2 convert the test to use u32 to allow parsing deeper into the packet, and in #3 drop the now-unnecessary sleep. ==================== Link: https://patch.msgid.link/cover.1765289566.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fix a loop scenario of ethx:egress->ethx:egress
Example setup to reproduce:
tc qdisc add dev ethx root handle 1: drr
tc filter add dev ethx parent 1: protocol ip prio 1 matchall \
action mirred egress redirect dev ethx
Now ping out of ethx and you get a deadlock:
[ 116.892898][ T307] ============================================
[ 116.893182][ T307] WARNING: possible recursive locking detected
[ 116.893418][ T307] 6.18.0-rc6-01205-ge05021a829b8-dirty torvalds#204 Not tainted
[ 116.893682][ T307] --------------------------------------------
[ 116.893926][ T307] ping/307 is trying to acquire lock:
[ 116.894133][ T307] ffff88800c122908 (&sch->root_lock_key){+...}-{3:3}, at: __dev_queue_xmit+0x2210/0x3b50
[ 116.894517][ T307]
[ 116.894517][ T307] but task is already holding lock:
[ 116.894836][ T307] ffff88800c122908 (&sch->root_lock_key){+...}-{3:3}, at: __dev_queue_xmit+0x2210/0x3b50
[ 116.895252][ T307]
[ 116.895252][ T307] other info that might help us debug this:
[ 116.895608][ T307] Possible unsafe locking scenario:
[ 116.895608][ T307]
[ 116.895901][ T307] CPU0
[ 116.896057][ T307] ----
[ 116.896200][ T307] lock(&sch->root_lock_key);
[ 116.896392][ T307] lock(&sch->root_lock_key);
[ 116.896605][ T307]
[ 116.896605][ T307] *** DEADLOCK ***
[ 116.896605][ T307]
[ 116.896864][ T307] May be due to missing lock nesting notation
[ 116.896864][ T307]
[ 116.897123][ T307] 6 locks held by ping/307:
[ 116.897302][ T307] #0: ffff88800b4b0250 (sk_lock-AF_INET){+.+.}-{0:0}, at: raw_sendmsg+0xb20/0x2cf0
[ 116.897808][ T307] #1: ffffffff88c839c0 (rcu_read_lock){....}-{1:3}, at: ip_output+0xa9/0x600
[ 116.898138][ T307] #2: ffffffff88c839c0 (rcu_read_lock){....}-{1:3}, at: ip_finish_output2+0x2c6/0x1ee0
[ 116.898459][ T307] #3: ffffffff88c83960 (rcu_read_lock_bh){....}-{1:3}, at: __dev_queue_xmit+0x200/0x3b50
[ 116.898782][ T307] #4: ffff88800c122908 (&sch->root_lock_key){+...}-{3:3}, at: __dev_queue_xmit+0x2210/0x3b50
[ 116.899132][ T307] #5: ffffffff88c83960 (rcu_read_lock_bh){....}-{1:3}, at: __dev_queue_xmit+0x200/0x3b50
[ 116.899442][ T307]
[ 116.899442][ T307] stack backtrace:
[ 116.899667][ T307] CPU: 2 UID: 0 PID: 307 Comm: ping Not tainted 6.18.0-rc6-01205-ge05021a829b8-dirty torvalds#204 PREEMPT(voluntary)
[ 116.899672][ T307] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 116.899675][ T307] Call Trace:
[ 116.899678][ T307] <TASK>
[ 116.899680][ T307] dump_stack_lvl+0x6f/0xb0
[ 116.899688][ T307] print_deadlock_bug.cold+0xc0/0xdc
[ 116.899695][ T307] __lock_acquire+0x11f7/0x1be0
[ 116.899704][ T307] lock_acquire+0x162/0x300
[ 116.899707][ T307] ? __dev_queue_xmit+0x2210/0x3b50
[ 116.899713][ T307] ? srso_alias_return_thunk+0x5/0xfbef5
[ 116.899717][ T307] ? stack_trace_save+0x93/0xd0
[ 116.899723][ T307] _raw_spin_lock+0x30/0x40
[ 116.899728][ T307] ? __dev_queue_xmit+0x2210/0x3b50
[ 116.899731][ T307] __dev_queue_xmit+0x2210/0x3b50
Fixes: 178ca30 ("Revert "net/sched: Fix mirred deadlock on device recursion"")
Tested-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20251210162255.1057663-1-jhs@mojatatu.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reject attempts to disable KVM_MEM_GUEST_MEMFD on a memslot that was initially created with a guest_memfd binding, as KVM doesn't support toggling KVM_MEM_GUEST_MEMFD on existing memslots. KVM prevents enabling KVM_MEM_GUEST_MEMFD, but doesn't prevent clearing the flag. Failure to reject the new memslot results in a use-after-free due to KVM not unbinding from the guest_memfd instance. Unbinding on a FLAGS_ONLY change is easy enough, and can/will be done as a hardening measure (in anticipation of KVM supporting dirty logging on guest_memfd at some point), but fixing the use-after-free would only address the immediate symptom. ================================================================== BUG: KASAN: slab-use-after-free in kvm_gmem_release+0x362/0x400 [kvm] Write of size 8 at addr ffff8881111ae908 by task repro/745 CPU: 7 UID: 1000 PID: 745 Comm: repro Not tainted 6.18.0-rc6-115d5de2eef3-next-kasan #3 NONE Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 Call Trace: <TASK> dump_stack_lvl+0x51/0x60 print_report+0xcb/0x5c0 kasan_report+0xb4/0xe0 kvm_gmem_release+0x362/0x400 [kvm] __fput+0x2fa/0x9d0 task_work_run+0x12c/0x200 do_exit+0x6ae/0x2100 do_group_exit+0xa8/0x230 __x64_sys_exit_group+0x3a/0x50 x64_sys_call+0x737/0x740 do_syscall_64+0x5b/0x900 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7f581f2eac31 </TASK> Allocated by task 745 on cpu 6 at 9.746971s: kasan_save_stack+0x20/0x40 kasan_save_track+0x13/0x50 __kasan_kmalloc+0x77/0x90 kvm_set_memory_region.part.0+0x652/0x1110 [kvm] kvm_vm_ioctl+0x14b0/0x3290 [kvm] __x64_sys_ioctl+0x129/0x1a0 do_syscall_64+0x5b/0x900 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Freed by task 745 on cpu 6 at 9.747467s: kasan_save_stack+0x20/0x40 kasan_save_track+0x13/0x50 __kasan_save_free_info+0x37/0x50 __kasan_slab_free+0x3b/0x60 kfree+0xf5/0x440 kvm_set_memslot+0x3c2/0x1160 [kvm] kvm_set_memory_region.part.0+0x86a/0x1110 [kvm] kvm_vm_ioctl+0x14b0/0x3290 [kvm] __x64_sys_ioctl+0x129/0x1a0 do_syscall_64+0x5b/0x900 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Reported-by: Alexander Potapenko <glider@google.com> Fixes: a7800aa ("KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory") Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251202020334.1171351-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
From v6.13 -> v6.14 the selftests kmod takes around 10s longer to run. Use KMOD_TIMEOUT Makefile cli variable to increase the default 165s timeout to 175s. Output: ok 1 selftests: kmod: kmod.sh real 2m47.077s user 0m0.062s sys 0m0.235s kdevops@dani-kmod:/data/selftests/kselftest/kselftest_install$ uname -a Linux dani-kmod 6.14.0 #3 SMP PREEMPT_DYNAMIC Fri Jun 27 18:28:16 UTC 2025 x86_64 GNU/Linux ok 1 selftests: kmod: kmod.sh real 2m41.456s user 0m0.068s sys 0m0.241s kdevops@dani-kmod:/data/selftests/kselftest/kselftest_install$ uname -a Linux dani-kmod 6.13.0 #2 SMP PREEMPT_DYNAMIC Fri Jun 27 17:13:42 UTC 2025 x86_64 GNU/Linux Signed-off-by: Daniel Gomez <da.gomez@samsung.com>
From v6.13 -> v6.14 the selftests kmod takes around 10s longer to run. Use KMOD_TIMEOUT Makefile cli variable to increase the default 165s timeout to 175s. Output: ok 1 selftests: kmod: kmod.sh real 2m47.077s user 0m0.062s sys 0m0.235s kdevops@dani-kmod:/data/selftests/kselftest/kselftest_install$ uname -a Linux dani-kmod 6.14.0 #3 SMP PREEMPT_DYNAMIC Fri Jun 27 18:28:16 UTC 2025 x86_64 GNU/Linux ok 1 selftests: kmod: kmod.sh real 2m41.456s user 0m0.068s sys 0m0.241s kdevops@dani-kmod:/data/selftests/kselftest/kselftest_install$ uname -a Linux dani-kmod 6.13.0 #2 SMP PREEMPT_DYNAMIC Fri Jun 27 17:13:42 UTC 2025 x86_64 GNU/Linux Signed-off-by: Daniel Gomez <da.gomez@samsung.com>
Pull request for series with
subject: x86/module: use large ROX pages for text allocations
version: 4
url: https://patchwork.kernel.org/project/linux-modules/list/?series=896063