Update regcache-maple.c #1

namiltd · 2024-07-25T21:45:22Z

Initialize variables

…mode in i.MX 8QM

Fix the issue where MEM_TO_MEM transfers fail on i.MX8QM because both the source and destination addresses need to pass through the IOMMU. Typically, peripheral FIFO addresses bypass the IOMMU, so only one of the source or destination has to go through it.

Set "is_remote" to true to ensure both source and destination addresses pass through the IOMMU.

The iMX8 spec defines the "Local" and "Remote" buses as below.
Local bus: bypasses the IOMMU to directly access other peripheral registers, such as a FIFO.
Remote bus: goes through the IOMMU to access system memory.

The test failure log is as follows:
[ 66.268506] dmatest: dma0chan0-copy0: result #1: 'test timed out' with src_off=0x100 dst_off=0x80 len=0x3ec0 (0)
[ 66.278785] dmatest: dma0chan0-copy0: summary 1 tests, 1 failures 0.32 iops 4 KB/s (0)

Fixes: 72f5801 ("dmaengine: fsl-edma: integrate v3 support")
Signed-off-by: Joy Zou <joy.zou@nxp.com>
Cc: stable@vger.kernel.org
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20240510030959.703663-1-joy.zou@nxp.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
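The local/remote bus rule described in the i.MX 8QM commit above can be sketched as a small decision function. This is a minimal userspace model, not the fsl-edma driver code: the names `xfer_dir`, `bus_select` and `select_bus` are hypothetical, and the only rule encoded is the one stated in the commit message (memory addresses go through the IOMMU, peripheral FIFO addresses bypass it).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical transfer directions; not the dmaengine enum. */
enum xfer_dir {
	XFER_MEM_TO_DEV,
	XFER_DEV_TO_MEM,
	XFER_MEM_TO_MEM,
};

/* true means "remote bus": the address goes through the IOMMU. */
struct bus_select {
	bool src_remote;
	bool dst_remote;
};

/*
 * Memory endpoints use the remote bus, FIFO endpoints use the local bus.
 * For MEM_TO_MEM both endpoints are memory, so both must be remote.
 */
static struct bus_select select_bus(enum xfer_dir dir)
{
	struct bus_select sel = { false, false };

	switch (dir) {
	case XFER_MEM_TO_DEV:
		sel.src_remote = true;
		break;
	case XFER_DEV_TO_MEM:
		sel.dst_remote = true;
		break;
	case XFER_MEM_TO_MEM:
		sel.src_remote = true;
		sel.dst_remote = true;
		break;
	}
	return sel;
}
```

In this model, the failure described above corresponds to treating MEM_TO_MEM like a peripheral transfer, which would leave one of the two memory addresses on the local bus.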

Fix warning at drivers/pci/msi/msi.h:121.

Recently, I added a PCI to PCIe bridge adaptor and a PCIe NVME card to my rp3440. Then, I noticed this warning at boot:

WARNING: CPU: 0 PID: 10 at drivers/pci/msi/msi.h:121 pci_msi_setup_msi_irqs+0x68/0x90
CPU: 0 PID: 10 Comm: kworker/u32:0 Not tainted 6.9.7-parisc64 #1 Debian 6.9.7-1
Hardware name: 9000/800/rp3440
Workqueue: async async_run_entry_fn

We need to select PCI_MSI_ARCH_FALLBACKS when PCI_MSI is selected.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # v6.0+
Signed-off-by: Helge Deller <deller@gmx.de>

The lifetime of the TCP-AO static_key is the same as that of the last tcp_ao_info. On socket destruction, tcp_ao_info is freed after an RCU grace period, while the tcp-ao static branch is currently destroyed in a deferred manner. The static key definition is:

DEFINE_STATIC_KEY_DEFERRED_FALSE(tcp_ao_needed, HZ);

which means that if the RCU grace period is delayed by more than a second and tcp_ao_needed is in the process of being disabled, other CPUs may still see a tcp_ao_info which is not dead yet, but soon will be. That breaks the assumption of static_key_fast_inc_not_disabled().

This happened on the netdev test-bot [1], so it is not a theoretical issue:

[] jump_label: Fatal kernel bug, unexpected op at tcp_inbound_hash+0x1a7/0x870 [ffffffffa8c4e9b7] (eb 50 0f 1f 44 != 66 90 0f 1f 00)) size:2 type:1
[] ------------[ cut here ]------------
[] kernel BUG at arch/x86/kernel/jump_label.c:73!
[] Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
[] CPU: 3 PID: 243 Comm: kworker/3:3 Not tainted 6.10.0-virtme #1
[] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[] Workqueue: events jump_label_update_timeout
[] RIP: 0010:__jump_label_patch+0x2f6/0x350
...
[] Call Trace:
[] <TASK>
[] arch_jump_label_transform_queue+0x6c/0x110
[] __jump_label_update+0xef/0x350
[] __static_key_slow_dec_cpuslocked.part.0+0x3c/0x60
[] jump_label_update_timeout+0x2c/0x40
[] process_one_work+0xe3b/0x1670
[] worker_thread+0x587/0xce0
[] kthread+0x28a/0x350
[] ret_from_fork+0x31/0x70
[] ret_from_fork_asm+0x1a/0x30
[] </TASK>
[] Modules linked in: veth
[] ---[ end trace 0000000000000000 ]---
[] RIP: 0010:__jump_label_patch+0x2f6/0x350

[1]: https://netdev-3.bots.linux.dev/vmksft-tcp-ao-dbg/results/696681/5-connect-deny-ipv6/stderr

Cc: stable@kernel.org
Fixes: 67fa83f ("net/tcp: Add static_key for TCP-AO")
Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: NipaLocal <nipa@local>

skbuff_fclone_cache was created without defining a usercopy region, [1] unlike skbuff_head_cache which properly whitelists the cb[] field. [2] This causes a usercopy BUG() when CONFIG_HARDENED_USERCOPY is enabled and the kernel attempts to copy sk_buff.cb data to userspace via sock_recv_errqueue() -> put_cmsg().

The crash occurs when:
1. TCP allocates an skb using alloc_skb_fclone() (from skbuff_fclone_cache) [1]
2. The skb is cloned via skb_clone() using the pre-allocated fclone [3]
3. The cloned skb is queued to sk_error_queue for timestamp reporting
4. Userspace reads the error queue via recvmsg(MSG_ERRQUEUE)
5. sock_recv_errqueue() calls put_cmsg() to copy serr->ee from skb->cb [4]
6. __check_heap_object() fails because skbuff_fclone_cache has no usercopy whitelist [5]

When cloned skbs allocated from skbuff_fclone_cache are used in the socket error queue, accessing the sock_exterr_skb structure in skb->cb via put_cmsg() triggers a usercopy hardening violation:

[ 5.379589] usercopy: Kernel memory exposure attempt detected from SLUB object 'skbuff_fclone_cache' (offset 296, size 16)!
[ 5.382796] kernel BUG at mm/usercopy.c:102!
[ 5.383923] Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
[ 5.384903] CPU: 1 UID: 0 PID: 138 Comm: poc_put_cmsg Not tainted 6.12.57 #7
[ 5.384903] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 5.384903] RIP: 0010:usercopy_abort+0x6c/0x80
[ 5.384903] Code: 1a 86 51 48 c7 c2 40 15 1a 86 41 52 48 c7 c7 c0 15 1a 86 48 0f 45 d6 48 c7 c6 80 15 1a 86 48 89 c1 49 0f 45 f3 e8 84 27 88 ff <0f> 0b 490
[ 5.384903] RSP: 0018:ffffc900006f77a8 EFLAGS: 00010246
[ 5.384903] RAX: 000000000000006f RBX: ffff88800f0ad2a8 RCX: 1ffffffff0f72e74
[ 5.384903] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff87b973a0
[ 5.384903] RBP: 0000000000000010 R08: 0000000000000000 R09: fffffbfff0f72e74
[ 5.384903] R10: 0000000000000003 R11: 79706f6372657375 R12: 0000000000000001
[ 5.384903] R13: ffff88800f0ad2b8 R14: ffffea00003c2b40 R15: ffffea00003c2b00
[ 5.384903] FS: 0000000011bc4380(0000) GS:ffff8880bf100000(0000) knlGS:0000000000000000
[ 5.384903] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5.384903] CR2: 000056aa3b8e5fe4 CR3: 000000000ea26004 CR4: 0000000000770ef0
[ 5.384903] PKRU: 55555554
[ 5.384903] Call Trace:
[ 5.384903] <TASK>
[ 5.384903] __check_heap_object+0x9a/0xd0
[ 5.384903] __check_object_size+0x46c/0x690
[ 5.384903] put_cmsg+0x129/0x5e0
[ 5.384903] sock_recv_errqueue+0x22f/0x380
[ 5.384903] tls_sw_recvmsg+0x7ed/0x1960
[ 5.384903] ? srso_alias_return_thunk+0x5/0xfbef5
[ 5.384903] ? schedule+0x6d/0x270
[ 5.384903] ? srso_alias_return_thunk+0x5/0xfbef5
[ 5.384903] ? mutex_unlock+0x81/0xd0
[ 5.384903] ? __pfx_mutex_unlock+0x10/0x10
[ 5.384903] ? __pfx_tls_sw_recvmsg+0x10/0x10
[ 5.384903] ? _raw_spin_lock_irqsave+0x8f/0xf0
[ 5.384903] ? _raw_read_unlock_irqrestore+0x20/0x40
[ 5.384903] ? srso_alias_return_thunk+0x5/0xfbef5

The crash offset 296 corresponds to skb2->cb within skbuff_fclones:
- sizeof(struct sk_buff) = 232
- offsetof(struct sk_buff, cb) = 40
- offset of skb2.cb in fclones = 232 + 40 = 272
- crash offset 296 = 272 + 24 (inside sock_exterr_skb.ee)

This patch uses a local stack variable as a bounce buffer to avoid the hardened usercopy check failure.

[1] https://elixir.bootlin.com/linux/v6.12.62/source/net/ipv4/tcp.c#L885
[2] https://elixir.bootlin.com/linux/v6.12.62/source/net/core/skbuff.c#L5104
[3] https://elixir.bootlin.com/linux/v6.12.62/source/net/core/skbuff.c#L5566
[4] https://elixir.bootlin.com/linux/v6.12.62/source/net/core/skbuff.c#L5491
[5] https://elixir.bootlin.com/linux/v6.12.62/source/mm/slub.c#L5719

Fixes: 6d07d1c ("usercopy: Restrict non-usercopy caches to size 0")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: NipaLocal <nipa@local>
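The offset arithmetic and the bounce-buffer idea in the commit message above can be checked with a small userspace model. This is only a sketch under stated assumptions: the sizes 232, 40, 24 and 16 are the values quoted in the message and log for that particular 6.12.57 build, not portable constants, and `ee_model`, `fclone_skb2_cb_offset`, `crash_offset` and `copy_via_bounce` are hypothetical names. The real patch is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Values quoted in the commit message and log; build-specific. */
enum {
	SK_BUFF_SIZE      = 232, /* sizeof(struct sk_buff) */
	SK_BUFF_CB_OFFSET = 40,  /* offsetof(struct sk_buff, cb) */
	EE_OFFSET_IN_CB   = 24,  /* where the copied field starts inside cb */
	EE_SIZE           = 16,  /* "size 16" reported by usercopy */
};

/* Offset of the second skb's cb[] inside the fclone pair. */
static size_t fclone_skb2_cb_offset(void)
{
	return SK_BUFF_SIZE + SK_BUFF_CB_OFFSET;
}

/* Offset reported by the usercopy check. */
static size_t crash_offset(void)
{
	return fclone_skb2_cb_offset() + EE_OFFSET_IN_CB;
}

/* Hypothetical stand-in for the 16 bytes copied out of skb->cb. */
struct ee_model {
	unsigned char bytes[EE_SIZE];
};

/*
 * Bounce-buffer pattern: copy the field from the slab-backed cb[] into a
 * stack variable first, then hand the stack copy to the consumer. In this
 * model "out" stands in for the destination of the later user copy.
 */
static void copy_via_bounce(const unsigned char *cb, struct ee_model *out)
{
	struct ee_model bounce;

	memcpy(&bounce, cb + EE_OFFSET_IN_CB, sizeof(bounce));
	*out = bounce;
}
```

With these constants, `fclone_skb2_cb_offset()` evaluates to 272 and `crash_offset()` to 296, matching the figures in the message. The point of the bounce copy is that the source of the final copy is a stack object rather than an object in a slab cache without a usercopy region.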

Fix assert lock warning while calling devl_param_driverinit_value_set() in ena.

WARNING: net/devlink/core.c:261 at devl_assert_locked+0x62/0x90, CPU#0: kworker/0:0/9
CPU: 0 UID: 0 PID: 9 Comm: kworker/0:0 Not tainted 6.19.0-rc2+ #1 PREEMPT(lazy)
Hardware name: Amazon EC2 m8i-flex.4xlarge/, BIOS 1.0 10/16/2017
Workqueue: events work_for_cpu_fn
RIP: 0010:devl_assert_locked+0x62/0x90
Call Trace:
<TASK>
devl_param_driverinit_value_set+0x15/0x1c0
ena_devlink_alloc+0x18c/0x220 [ena]
? __pfx_ena_devlink_alloc+0x10/0x10 [ena]
? trace_hardirqs_on+0x18/0x140
? lockdep_hardirqs_on+0x8c/0x130
? __raw_spin_unlock_irqrestore+0x5d/0x80
? __raw_spin_unlock_irqrestore+0x46/0x80
? devm_ioremap_wc+0x9a/0xd0
ena_probe+0x4d2/0x1b20 [ena]
? __lock_acquire+0x56a/0xbd0
? __pfx_ena_probe+0x10/0x10 [ena]
? local_clock+0x15/0x30
? __lock_release.isra.0+0x1c9/0x340
? mark_held_locks+0x40/0x70
? lockdep_hardirqs_on_prepare.part.0+0x92/0x170
? trace_hardirqs_on+0x18/0x140
? lockdep_hardirqs_on+0x8c/0x130
? __raw_spin_unlock_irqrestore+0x5d/0x80
? __raw_spin_unlock_irqrestore+0x46/0x80
? __pfx_ena_probe+0x10/0x10 [ena]
......
</TASK>

Fixes: 816b526 ("net: ena: Control PHC enable through devlink")
Signed-off-by: Frank Liang <xiliang@redhat.com>
Reviewed-by: David Arinzon <darinzon@amazon.com>
Signed-off-by: NipaLocal <nipa@local>
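The warning above comes from a setter that asserts its instance lock is held. The commit message does not say how the ena driver resolves it, so the following is only a toy userspace model of the general pattern (take the lock around the call); every name in it (`toy_devlink`, `toy_lock`, `toy_unlock`, `toy_param_set`, `toy_alloc_locked`) is hypothetical and none of it is the devlink API.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy instance with a lock flag and one parameter. */
struct toy_devlink {
	bool locked;
	int phc_enable;
};

static void toy_lock(struct toy_devlink *dl)
{
	dl->locked = true;
}

static void toy_unlock(struct toy_devlink *dl)
{
	dl->locked = false;
}

/*
 * Setter that requires the lock. Returning false models the point where
 * the kernel would emit the lock-assertion warning.
 */
static bool toy_param_set(struct toy_devlink *dl, int value)
{
	if (!dl->locked)
		return false;
	dl->phc_enable = value;
	return true;
}

/* Caller that satisfies the setter's locking requirement. */
static bool toy_alloc_locked(struct toy_devlink *dl, int value)
{
	bool ok;

	toy_lock(dl);
	ok = toy_param_set(dl, value);
	toy_unlock(dl);
	return ok;
}
```

Calling `toy_param_set()` directly on an unlocked instance fails in this model, which mirrors the call path shown in the trace (probe, then alloc, then the setter) running without the lock.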

…te in qfq_reset

`qfq_class->leaf_qdisc->q.qlen > 0` does not imply that the class itself is active.

Two qfq_class objects may point to the same leaf_qdisc. This happens when:
1. one QFQ qdisc is attached to the dev as the root qdisc, and
2. another QFQ qdisc is temporarily referenced (e.g., via qdisc_get() / qdisc_put()) and is pending to be destroyed, as in function tc_new_tfilter.

When packets are enqueued through the root QFQ qdisc, the shared leaf_qdisc->q.qlen increases. At the same time, the second QFQ qdisc triggers qdisc_put and qdisc_destroy: the qdisc enters qfq_reset() with its own q->q.qlen == 0, but its class's leaf qdisc->q.qlen > 0. Therefore, the qfq_reset would wrongly deactivate an inactive aggregate and trigger a null-deref in qfq_deactivate_agg:

[ 0.977749] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KAI
[ 0.978440] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[ 0.978875] CPU: 0 UID: 0 PID: 135 Comm: exploit Not tainted 6.12.57 #3
[ 0.979270] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.or4
[ 0.979913] RIP: qfq_deactivate_agg+0x187/0xca0
[ 0.980200] Code: 00 fc ff df 48 89 fe 48 c1 ee 03 80 3c 16 00 0f 85 1d 09 00 00 48 be 00 00 00 00 00 fc ff df 48 80

Code starting with the faulting instruction
===========================================
   0:   00 fc                   add    %bh,%ah
   2:   ff                      lcall  (bad)
   3:   df 48 89                fisttps -0x77(%rax)
   6:   fe 48 c1                decb   -0x3f(%rax)
   9:   ee                      out    %al,(%dx)
   a:   03 80 3c 16 00 0f       add    0xf00163c(%rax),%eax
  10:   85 1d 09 00 00 48       test   %ebx,0x48000009(%rip) # 0x4800001f
  16:   be 00 00 00 00          mov    $0x0,%esi
  1b:   00 fc                   add    %bh,%ah
  1d:   ff                      lcall  (bad)
  1e:   df 48 80                fisttps -0x80(%rax)

[ 0.981234] RSP: 0018:ffff8880106d73f8 EFLAGS: 00010246
[ 0.981517] RAX: 0000000000000000 RBX: ffff88800c518000 RCX: ffff888010bc1358
[ 0.981943] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000000000
[ 0.982336] RBP: ffff888010bc1340 R08: ffff88800c518158 R09: ffff88800c518158
[ 0.982734] R10: 1ffff110018a302c R11: ffffffff89689156 R12: 0000000000000000
[ 0.983140] R13: ffff888010bc0180 R14: 0000000000000000 R15: ffff888010bc1350
[ 0.983521] FS: 0000000009737380(0000) GS:ffff8880bf000000(0000) knlGS:0000000000000000
[ 0.983955] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.984270] CR2: 00000000097c1000 CR3: 000000000ee7c004 CR4: 0000000000772ef0
[ 0.984654] PKRU: 55555554
[ 0.984804] Call Trace:
[ 0.984957] <TASK>
[ 0.985084] qfq_reset_qdisc+0x27c/0x3e0
[ 0.985316] ? __pfx_mutex_lock+0x10/0x10
[ 0.985541] qdisc_reset+0x9d/0x590
[ 0.985736] ? __tcf_block_put+0x2e/0x2b0
[ 0.985980] ? __pfx_mutex_unlock+0x10/0x10
[ 0.986237] ? __tcf_chain_put+0x4a/0x880
[ 0.986465] __qdisc_destroy+0xb2/0x280
[ 0.986686] tc_new_tfilter+0x9af/0x2180
[ 0.986932] ? __pfx_stack_trace_consume_entry+0x10/0x10
[ 0.987216] ? __pfx_stack_trace_consume_entry+0x10/0x10
[ 0.987505] ? __pfx_tc_new_tfilter+0x10/0x10
[ 0.987755] ? unwind_get_return_address+0x5e/0xa0
[ 0.988025] ? arch_stack_walk+0xac/0x100
[ 0.988241] ? stack_depot_save_flags+0x29/0x7e0
[ 0.988506] ? stack_trace_save+0x94/0xd0
[ 0.988722] ? security_capable+0xda/0x160
[ 0.988970] rtnetlink_rcv_msg+0x543/0xc50
[ 0.989204] ? __pfx_rtnetlink_rcv_msg+0x10/0x10
[ 0.989458] netlink_rcv_skb+0x134/0x370
[ 0.989676] ? __pfx_rtnetlink_rcv_msg+0x10/0x10
[ 0.989951] ? __pfx_netlink_rcv_skb+0x10/0x10
[ 0.990190] ? __pfx___netlink_lookup+0x10/0x10
[ 0.990440] ? kasan_save_track+0x14/0x30
[ 0.990659] ? _copy_from_iter+0x214/0x1100
[ 0.990886] netlink_unicast+0x6db/0xa20
[ 0.991116] ? __pfx_netlink_unicast+0x10/0x10
[ 0.991355] ? unwind_get_return_address+0x5e/0xa0
[ 0.991616] ? arch_stack_walk+0xac/0x100
[ 0.991854] ? __check_object_size+0x46c/0x690
[ 0.992091] netlink_sendmsg+0x72b/0xbd0
[ 0.992301] ? __pfx_netlink_sendmsg+0x10/0x10
[ 0.992545] ? __pfx_aa_file_perm+0x10/0x10
[ 0.992793] sock_write_iter+0x489/0x560
[ 0.993043] ? kmem_cache_free+0x249/0x4b0
[ 0.993282] ? __pfx_sock_write_iter+0x10/0x10
[ 0.993565] ? security_file_permission+0x7e/0xe0
[ 0.993922] ? rw_verify_area+0x70/0x4d0
[ 0.994192] vfs_write+0x930/0xea0
[ 0.994439] ? __pfx_vfs_write+0x10/0x10
[ 0.994642] ? fdget_pos+0x57/0x4f0
[ 0.994810] ? __call_rcu_common.constprop.0+0x247/0x7a0
[ 0.995105] ksys_write+0x17c/0x1d0
[ 0.995290] ? __pfx_ksys_write+0x10/0x10
[ 0.995511] ? __x64_sys_close+0x7c/0xd0
[ 0.995732] do_syscall_64+0x58/0x120
[ 0.995959] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 0.996233] RIP: 0033:0x424c34
[ 0.996397] Code: 89 02 48 c7 c0 ff ff ff ff eb bd 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 80 3d 2d 44 09 00 09

Code starting with the faulting instruction
===========================================
   0:   89 02                   mov    %eax,(%rdx)
   2:   48 c7 c0 ff ff ff ff    mov    $0xffffffffffffffff,%rax
   9:   eb bd                   jmp    0xffffffffffffffc8
   b:   66 2e 0f 1f 84 00 00    cs nopw 0x0(%rax,%rax,1)
  12:   00 00 00
  15:   90                      nop
  16:   f3 0f 1e fa             endbr64
  1a:   80 3d 2d 44 09 00 09    cmpb   $0x9,0x9442d(%rip) # 0x9444e

[ 0.997360] RSP: 002b:00007ffea27af418 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[ 0.997746] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 0000000000424c34
[ 0.998123] RDX: 000000000000003c RSI: 00000000097bf5d0 RDI: 0000000000000003
[ 0.998495] RBP: 00007ffea27af460 R08: 0000000000000000 R09: 0000000000000000
[ 0.998862] R10: 0000000000000001 R11: 0000000000000202 R12: 00007ffea27af5c8
[ 0.999224] R13: 00007ffea27af5d8 R14: 00000000004b3828 R15: 0000000000000001
[ 0.999592] </TASK>
[ 0.999711] Modules linked in:
[ 0.999899] ---[ end trace 0000000000000000 ]---
[ 1.000143] RIP: qfq_deactivate_agg+0x187/0xca0
[ 1.000396] Code: 00 fc ff df 48 89 fe 48 c1 ee 03 80 3c 16 00 0f 85 1d 09 00 00 48 be 00 00 00 00 00 fc ff df 48 80

Code starting with the faulting instruction
===========================================
   0:   00 fc                   add    %bh,%ah
   2:   ff                      lcall  (bad)
   3:   df 48 89                fisttps -0x77(%rax)
   6:   fe 48 c1                decb   -0x3f(%rax)
   9:   ee                      out    %al,(%dx)
   a:   03 80 3c 16 00 0f       add    0xf00163c(%rax),%eax
  10:   85 1d 09 00 00 48       test   %ebx,0x48000009(%rip) # 0x4800001f
  16:   be 00 00 00 00          mov    $0x0,%esi
  1b:   00 fc                   add    %bh,%ah
  1d:   ff                      lcall  (bad)
  1e:   df 48 80                fisttps -0x80(%rax)

[ 1.001456] RSP: 0018:ffff8880106d73f8 EFLAGS: 00010246
[ 1.001735] RAX: 0000000000000000 RBX: ffff88800c518000 RCX: ffff888010bc1358
[ 1.002107] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000000000
[ 1.002478] RBP: ffff888010bc1340 R08: ffff88800c518158 R09: ffff88800c518158
[ 1.002853] R10: 1ffff110018a302c R11: ffffffff89689156 R12: 0000000000000000
[ 1.003204] R13: ffff888010bc0180 R14: 0000000000000000 R15: ffff888010bc1350
[ 1.003559] FS: 0000000009737380(0000) GS:ffff8880bf000000(0000) knlGS:0000000000000000
[ 1.003962] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1.004243] CR2: 00000000097c1000 CR3: 000000000ee7c004 CR4: 0000000000772ef0
[ 1.004599] PKRU: 55555554
[ 1.004740] Kernel panic - not syncing: Fatal exception
[ 1.005071] Kernel Offset: disabled

Fixes: 0545a30 ("pkt_sched: QFQ - quick fair queue scheduler")
Signed-off-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: NipaLocal <nipa@local>
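The core observation of the QFQ commit above, that a non-empty shared leaf queue says nothing about whether a given class is active, can be illustrated with a toy userspace model. The commit message does not show the actual fix, so this is only a sketch of the distinction: `toy_leaf`, `toy_class`, the `active` flag and both predicate functions are hypothetical names, not sch_qfq code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy leaf queue that may be shared by classes of two different qdiscs. */
struct toy_leaf {
	unsigned int qlen;
};

/* Toy class: a pointer to its leaf plus its own activity state. */
struct toy_class {
	struct toy_leaf *leaf;
	bool active; /* set only when this class itself was activated */
};

/*
 * Flawed test: infers activity from the leaf backlog. If the leaf is
 * shared, packets enqueued via another qdisc make this return true.
 */
static bool should_deactivate_by_qlen(const struct toy_class *cl)
{
	return cl->leaf->qlen > 0;
}

/* Safer test in this model: consult the class's own state. */
static bool should_deactivate_by_state(const struct toy_class *cl)
{
	return cl->active;
}
```

In a scenario with one leaf holding three packets, an active class on the root qdisc and an inactive class on the qdisc being destroyed, the backlog-based test selects the inactive class for deactivation, while the state-based test does not.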


skbuff_fclone_cache was created without defining a usercopy region, [1] unlike skbuff_head_cache which properly whitelists the cb[] field. [2] This causes a usercopy BUG() when CONFIG_HARDENED_USERCOPY is enabled and the kernel attempts to copy sk_buff.cb data to userspace via sock_recv_errqueue() -> put_cmsg(). The crash occurs when: 1. TCP allocates an skb using alloc_skb_fclone() (from skbuff_fclone_cache) [1] 2. The skb is cloned via skb_clone() using the pre-allocated fclone [3] 3. The cloned skb is queued to sk_error_queue for timestamp reporting 4. Userspace reads the error queue via recvmsg(MSG_ERRQUEUE) 5. sock_recv_errqueue() calls put_cmsg() to copy serr->ee from skb->cb [4] 6. __check_heap_object() fails because skbuff_fclone_cache has no usercopy whitelist [5] When cloned skbs allocated from skbuff_fclone_cache are used in the socket error queue, accessing the sock_exterr_skb structure in skb->cb via put_cmsg() triggers a usercopy hardening violation: [ 5.379589] usercopy: Kernel memory exposure attempt detected from SLUB object 'skbuff_fclone_cache' (offset 296, size 16)! [ 5.382796] kernel BUG at mm/usercopy.c:102! 
[ 5.383923] Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI [ 5.384903] CPU: 1 UID: 0 PID: 138 Comm: poc_put_cmsg Not tainted 6.12.57 torvalds#7 [ 5.384903] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 [ 5.384903] RIP: 0010:usercopy_abort+0x6c/0x80 [ 5.384903] Code: 1a 86 51 48 c7 c2 40 15 1a 86 41 52 48 c7 c7 c0 15 1a 86 48 0f 45 d6 48 c7 c6 80 15 1a 86 48 89 c1 49 0f 45 f3 e8 84 27 88 ff <0f> 0b 490 [ 5.384903] RSP: 0018:ffffc900006f77a8 EFLAGS: 00010246 [ 5.384903] RAX: 000000000000006f RBX: ffff88800f0ad2a8 RCX: 1ffffffff0f72e74 [ 5.384903] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff87b973a0 [ 5.384903] RBP: 0000000000000010 R08: 0000000000000000 R09: fffffbfff0f72e74 [ 5.384903] R10: 0000000000000003 R11: 79706f6372657375 R12: 0000000000000001 [ 5.384903] R13: ffff88800f0ad2b8 R14: ffffea00003c2b40 R15: ffffea00003c2b00 [ 5.384903] FS: 0000000011bc4380(0000) GS:ffff8880bf100000(0000) knlGS:0000000000000000 [ 5.384903] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 5.384903] CR2: 000056aa3b8e5fe4 CR3: 000000000ea26004 CR4: 0000000000770ef0 [ 5.384903] PKRU: 55555554 [ 5.384903] Call Trace: [ 5.384903] <TASK> [ 5.384903] __check_heap_object+0x9a/0xd0 [ 5.384903] __check_object_size+0x46c/0x690 [ 5.384903] put_cmsg+0x129/0x5e0 [ 5.384903] sock_recv_errqueue+0x22f/0x380 [ 5.384903] tls_sw_recvmsg+0x7ed/0x1960 [ 5.384903] ? srso_alias_return_thunk+0x5/0xfbef5 [ 5.384903] ? schedule+0x6d/0x270 [ 5.384903] ? srso_alias_return_thunk+0x5/0xfbef5 [ 5.384903] ? mutex_unlock+0x81/0xd0 [ 5.384903] ? __pfx_mutex_unlock+0x10/0x10 [ 5.384903] ? __pfx_tls_sw_recvmsg+0x10/0x10 [ 5.384903] ? _raw_spin_lock_irqsave+0x8f/0xf0 [ 5.384903] ? _raw_read_unlock_irqrestore+0x20/0x40 [ 5.384903] ? 
srso_alias_return_thunk+0x5/0xfbef5 The crash offset 296 corresponds to skb2->cb within skbuff_fclones: - sizeof(struct sk_buff) = 232 - offsetof(struct sk_buff, cb) = 40 - offset of skb2.cb in fclones = 232 + 40 = 272 - crash offset 296 = 272 + 24 (inside sock_exterr_skb.ee) This patch uses a local stack variable as a bounce buffer to avoid the hardened usercopy check failure. [1] https://elixir.bootlin.com/linux/v6.12.62/source/net/ipv4/tcp.c#L885 [2] https://elixir.bootlin.com/linux/v6.12.62/source/net/core/skbuff.c#L5104 [3] https://elixir.bootlin.com/linux/v6.12.62/source/net/core/skbuff.c#L5566 [4] https://elixir.bootlin.com/linux/v6.12.62/source/net/core/skbuff.c#L5491 [5] https://elixir.bootlin.com/linux/v6.12.62/source/mm/slub.c#L5719 Fixes: 6d07d1c ("usercopy: Restrict non-usercopy caches to size 0") Reported-by: Xiang Mei <xmei5@asu.edu> Signed-off-by: Weiming Shi <bestswngs@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: NipaLocal <nipa@local>

Fix assert lock warning while calling devl_param_driverinit_value_set() in ena.

WARNING: net/devlink/core.c:261 at devl_assert_locked+0x62/0x90, CPU#0: kworker/0:0/9
CPU: 0 UID: 0 PID: 9 Comm: kworker/0:0 Not tainted 6.19.0-rc2+ #1 PREEMPT(lazy)
Hardware name: Amazon EC2 m8i-flex.4xlarge/, BIOS 1.0 10/16/2017
Workqueue: events work_for_cpu_fn
RIP: 0010:devl_assert_locked+0x62/0x90
Call Trace:
 <TASK>
 devl_param_driverinit_value_set+0x15/0x1c0
 ena_devlink_alloc+0x18c/0x220 [ena]
 ? __pfx_ena_devlink_alloc+0x10/0x10 [ena]
 ? trace_hardirqs_on+0x18/0x140
 ? lockdep_hardirqs_on+0x8c/0x130
 ? __raw_spin_unlock_irqrestore+0x5d/0x80
 ? __raw_spin_unlock_irqrestore+0x46/0x80
 ? devm_ioremap_wc+0x9a/0xd0
 ena_probe+0x4d2/0x1b20 [ena]
 ? __lock_acquire+0x56a/0xbd0
 ? __pfx_ena_probe+0x10/0x10 [ena]
 ? local_clock+0x15/0x30
 ? __lock_release.isra.0+0x1c9/0x340
 ? mark_held_locks+0x40/0x70
 ? lockdep_hardirqs_on_prepare.part.0+0x92/0x170
 ? trace_hardirqs_on+0x18/0x140
 ? lockdep_hardirqs_on+0x8c/0x130
 ? __raw_spin_unlock_irqrestore+0x5d/0x80
 ? __raw_spin_unlock_irqrestore+0x46/0x80
 ? __pfx_ena_probe+0x10/0x10 [ena]
 ......
 </TASK>

Fixes: 816b526 ("net: ena: Control PHC enable through devlink")
Signed-off-by: Frank Liang <xiliang@redhat.com>
Reviewed-by: David Arinzon <darinzon@amazon.com>
Signed-off-by: NipaLocal <nipa@local>

…te in qfq_reset

`qfq_class->leaf_qdisc->q.qlen > 0` does not imply that the class itself is active.

Two qfq_class objects may point to the same leaf_qdisc. This happens when:

1. one QFQ qdisc is attached to the dev as the root qdisc, and
2. another QFQ qdisc is temporarily referenced (e.g., via qdisc_get() / qdisc_put()) and is pending to be destroyed, as in function tc_new_tfilter.

When packets are enqueued through the root QFQ qdisc, the shared leaf_qdisc->q.qlen increases. At the same time, the second QFQ qdisc triggers qdisc_put and qdisc_destroy: the qdisc enters qfq_reset() with its own q->q.qlen == 0, but its class's leaf qdisc->q.qlen > 0. Therefore, the qfq_reset would wrongly deactivate an inactive aggregate and trigger a null-deref in qfq_deactivate_agg:

[ 0.977749] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KAI
[ 0.978440] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[ 0.978875] CPU: 0 UID: 0 PID: 135 Comm: exploit Not tainted 6.12.57 #3
[ 0.979270] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.or4
[ 0.979913] RIP: qfq_deactivate_agg+0x187/0xca0
[ 0.980200] Code: 00 fc ff df 48 89 fe 48 c1 ee 03 80 3c 16 00 0f 85 1d 09 00 00 48 be 00 00 00 00 00 fc ff df 48 80
Code starting with the faulting instruction
===========================================
   0: 00 fc                 add    %bh,%ah
   2: ff                    lcall  (bad)
   3: df 48 89              fisttps -0x77(%rax)
   6: fe 48 c1              decb   -0x3f(%rax)
   9: ee                    out    %al,(%dx)
   a: 03 80 3c 16 00 0f     add    0xf00163c(%rax),%eax
  10: 85 1d 09 00 00 48     test   %ebx,0x48000009(%rip)        # 0x4800001f
  16: be 00 00 00 00        mov    $0x0,%esi
  1b: 00 fc                 add    %bh,%ah
  1d: ff                    lcall  (bad)
  1e: df 48 80              fisttps -0x80(%rax)
[ 0.981234] RSP: 0018:ffff8880106d73f8 EFLAGS: 00010246
[ 0.981517] RAX: 0000000000000000 RBX: ffff88800c518000 RCX: ffff888010bc1358
[ 0.981943] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000000000
[ 0.982336] RBP: ffff888010bc1340 R08: ffff88800c518158 R09: ffff88800c518158
[ 0.982734] R10: 1ffff110018a302c R11: ffffffff89689156 R12: 0000000000000000
[ 0.983140] R13: ffff888010bc0180 R14: 0000000000000000 R15: ffff888010bc1350
[ 0.983521] FS: 0000000009737380(0000) GS:ffff8880bf000000(0000) knlGS:0000000000000000
[ 0.983955] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.984270] CR2: 00000000097c1000 CR3: 000000000ee7c004 CR4: 0000000000772ef0
[ 0.984654] PKRU: 55555554
[ 0.984804] Call Trace:
[ 0.984957] <TASK>
[ 0.985084] qfq_reset_qdisc+0x27c/0x3e0
[ 0.985316] ? __pfx_mutex_lock+0x10/0x10
[ 0.985541] qdisc_reset+0x9d/0x590
[ 0.985736] ? __tcf_block_put+0x2e/0x2b0
[ 0.985980] ? __pfx_mutex_unlock+0x10/0x10
[ 0.986237] ? __tcf_chain_put+0x4a/0x880
[ 0.986465] __qdisc_destroy+0xb2/0x280
[ 0.986686] tc_new_tfilter+0x9af/0x2180
[ 0.986932] ? __pfx_stack_trace_consume_entry+0x10/0x10
[ 0.987216] ? __pfx_stack_trace_consume_entry+0x10/0x10
[ 0.987505] ? __pfx_tc_new_tfilter+0x10/0x10
[ 0.987755] ? unwind_get_return_address+0x5e/0xa0
[ 0.988025] ? arch_stack_walk+0xac/0x100
[ 0.988241] ? stack_depot_save_flags+0x29/0x7e0
[ 0.988506] ? stack_trace_save+0x94/0xd0
[ 0.988722] ? security_capable+0xda/0x160
[ 0.988970] rtnetlink_rcv_msg+0x543/0xc50
[ 0.989204] ? __pfx_rtnetlink_rcv_msg+0x10/0x10
[ 0.989458] netlink_rcv_skb+0x134/0x370
[ 0.989676] ? __pfx_rtnetlink_rcv_msg+0x10/0x10
[ 0.989951] ? __pfx_netlink_rcv_skb+0x10/0x10
[ 0.990190] ? __pfx___netlink_lookup+0x10/0x10
[ 0.990440] ? kasan_save_track+0x14/0x30
[ 0.990659] ? _copy_from_iter+0x214/0x1100
[ 0.990886] netlink_unicast+0x6db/0xa20
[ 0.991116] ? __pfx_netlink_unicast+0x10/0x10
[ 0.991355] ? unwind_get_return_address+0x5e/0xa0
[ 0.991616] ? arch_stack_walk+0xac/0x100
[ 0.991854] ? __check_object_size+0x46c/0x690
[ 0.992091] netlink_sendmsg+0x72b/0xbd0
[ 0.992301] ? __pfx_netlink_sendmsg+0x10/0x10
[ 0.992545] ? __pfx_aa_file_perm+0x10/0x10
[ 0.992793] sock_write_iter+0x489/0x560
[ 0.993043] ? kmem_cache_free+0x249/0x4b0
[ 0.993282] ? __pfx_sock_write_iter+0x10/0x10
[ 0.993565] ? security_file_permission+0x7e/0xe0
[ 0.993922] ? rw_verify_area+0x70/0x4d0
[ 0.994192] vfs_write+0x930/0xea0
[ 0.994439] ? __pfx_vfs_write+0x10/0x10
[ 0.994642] ? fdget_pos+0x57/0x4f0
[ 0.994810] ? __call_rcu_common.constprop.0+0x247/0x7a0
[ 0.995105] ksys_write+0x17c/0x1d0
[ 0.995290] ? __pfx_ksys_write+0x10/0x10
[ 0.995511] ? __x64_sys_close+0x7c/0xd0
[ 0.995732] do_syscall_64+0x58/0x120
[ 0.995959] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 0.996233] RIP: 0033:0x424c34
[ 0.996397] Code: 89 02 48 c7 c0 ff ff ff ff eb bd 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 80 3d 2d 44 09 00 09
Code starting with the faulting instruction
===========================================
   0: 89 02                 mov    %eax,(%rdx)
   2: 48 c7 c0 ff ff ff ff  mov    $0xffffffffffffffff,%rax
   9: eb bd                 jmp    0xffffffffffffffc8
   b: 66 2e 0f 1f 84 00 00  cs nopw 0x0(%rax,%rax,1)
  12: 00 00 00
  15: 90                    nop
  16: f3 0f 1e fa           endbr64
  1a: 80 3d 2d 44 09 00 09  cmpb   $0x9,0x9442d(%rip)        # 0x9444e
[ 0.997360] RSP: 002b:00007ffea27af418 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[ 0.997746] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 0000000000424c34
[ 0.998123] RDX: 000000000000003c RSI: 00000000097bf5d0 RDI: 0000000000000003
[ 0.998495] RBP: 00007ffea27af460 R08: 0000000000000000 R09: 0000000000000000
[ 0.998862] R10: 0000000000000001 R11: 0000000000000202 R12: 00007ffea27af5c8
[ 0.999224] R13: 00007ffea27af5d8 R14: 00000000004b3828 R15: 0000000000000001
[ 0.999592] </TASK>
[ 0.999711] Modules linked in:
[ 0.999899] ---[ end trace 0000000000000000 ]---
[ 1.000143] RIP: qfq_deactivate_agg+0x187/0xca0
[ 1.000396] Code: 00 fc ff df 48 89 fe 48 c1 ee 03 80 3c 16 00 0f 85 1d 09 00 00 48 be 00 00 00 00 00 fc ff df 48 80
Code starting with the faulting instruction
===========================================
   0: 00 fc                 add    %bh,%ah
   2: ff                    lcall  (bad)
   3: df 48 89              fisttps -0x77(%rax)
   6: fe 48 c1              decb   -0x3f(%rax)
   9: ee                    out    %al,(%dx)
   a: 03 80 3c 16 00 0f     add    0xf00163c(%rax),%eax
  10: 85 1d 09 00 00 48     test   %ebx,0x48000009(%rip)        # 0x4800001f
  16: be 00 00 00 00        mov    $0x0,%esi
  1b: 00 fc                 add    %bh,%ah
  1d: ff                    lcall  (bad)
  1e: df 48 80              fisttps -0x80(%rax)
[ 1.001456] RSP: 0018:ffff8880106d73f8 EFLAGS: 00010246
[ 1.001735] RAX: 0000000000000000 RBX: ffff88800c518000 RCX: ffff888010bc1358
[ 1.002107] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000000000
[ 1.002478] RBP: ffff888010bc1340 R08: ffff88800c518158 R09: ffff88800c518158
[ 1.002853] R10: 1ffff110018a302c R11: ffffffff89689156 R12: 0000000000000000
[ 1.003204] R13: ffff888010bc0180 R14: 0000000000000000 R15: ffff888010bc1350
[ 1.003559] FS: 0000000009737380(0000) GS:ffff8880bf000000(0000) knlGS:0000000000000000
[ 1.003962] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1.004243] CR2: 00000000097c1000 CR3: 000000000ee7c004 CR4: 0000000000772ef0
[ 1.004599] PKRU: 55555554
[ 1.004740] Kernel panic - not syncing: Fatal exception
[ 1.005071] Kernel Offset: disabled

Fixes: 0545a30 ("pkt_sched: QFQ - quick fair queue scheduler")
Signed-off-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: NipaLocal <nipa@local>
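The mismatch described in this message, where a shared leaf's queue length says nothing about whether a given class is active, can be illustrated with a toy userspace model. Everything here is hypothetical (`model_leaf`, `model_class`, the `active` flag); it is not the sch_qfq code and not the patch itself, only a sketch of why a per-class state test and a leaf qlen test can disagree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical leaf qdisc: only the queue length matters here. */
struct model_leaf {
	unsigned int qlen;
};

/* Hypothetical class: points at a (possibly shared) leaf and tracks
 * its own activation state separately. */
struct model_class {
	struct model_leaf *leaf;
	bool active;
};

/* The condition the message calls insufficient: it looks at the leaf,
 * which another class may have filled. */
static bool model_leaf_has_packets(const struct model_class *cl)
{
	assert(cl != NULL && cl->leaf != NULL);
	return cl->leaf->qlen > 0;
}

/* A per-class condition: true only if this class was activated itself. */
static bool model_class_is_active(const struct model_class *cl)
{
	assert(cl != NULL);
	return cl->active;
}

/* Enqueue through one class: the leaf grows and that class becomes active. */
static void model_enqueue(struct model_class *cl)
{
	assert(cl != NULL && cl->leaf != NULL);
	cl->leaf->qlen++;
	cl->active = true;
}
```

If two `model_class` objects share one `model_leaf` and a packet is enqueued through the first, `model_leaf_has_packets()` returns true for the second class as well, while `model_class_is_active()` stays false for it. A reset path keyed on the leaf length would then try to deactivate a class that was never activated.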

skbuff_fclone_cache was created without defining a usercopy region, [1] unlike skbuff_head_cache which properly whitelists the cb[] field. [2] This causes a usercopy BUG() when CONFIG_HARDENED_USERCOPY is enabled and the kernel attempts to copy sk_buff.cb data to userspace via sock_recv_errqueue() -> put_cmsg().

The crash occurs when:

1. TCP allocates an skb using alloc_skb_fclone() (from skbuff_fclone_cache) [1]
2. The skb is cloned via skb_clone() using the pre-allocated fclone [3]
3. The cloned skb is queued to sk_error_queue for timestamp reporting
4. Userspace reads the error queue via recvmsg(MSG_ERRQUEUE)
5. sock_recv_errqueue() calls put_cmsg() to copy serr->ee from skb->cb [4]
6. __check_heap_object() fails because skbuff_fclone_cache has no usercopy whitelist [5]

When cloned skbs allocated from skbuff_fclone_cache are used in the socket error queue, accessing the sock_exterr_skb structure in skb->cb via put_cmsg() triggers a usercopy hardening violation:

[ 5.379589] usercopy: Kernel memory exposure attempt detected from SLUB object 'skbuff_fclone_cache' (offset 296, size 16)!
[ 5.382796] kernel BUG at mm/usercopy.c:102!
[ 5.383923] Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
[ 5.384903] CPU: 1 UID: 0 PID: 138 Comm: poc_put_cmsg Not tainted 6.12.57 #7
[ 5.384903] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 5.384903] RIP: 0010:usercopy_abort+0x6c/0x80
[ 5.384903] Code: 1a 86 51 48 c7 c2 40 15 1a 86 41 52 48 c7 c7 c0 15 1a 86 48 0f 45 d6 48 c7 c6 80 15 1a 86 48 89 c1 49 0f 45 f3 e8 84 27 88 ff <0f> 0b 490
[ 5.384903] RSP: 0018:ffffc900006f77a8 EFLAGS: 00010246
[ 5.384903] RAX: 000000000000006f RBX: ffff88800f0ad2a8 RCX: 1ffffffff0f72e74
[ 5.384903] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff87b973a0
[ 5.384903] RBP: 0000000000000010 R08: 0000000000000000 R09: fffffbfff0f72e74
[ 5.384903] R10: 0000000000000003 R11: 79706f6372657375 R12: 0000000000000001
[ 5.384903] R13: ffff88800f0ad2b8 R14: ffffea00003c2b40 R15: ffffea00003c2b00
[ 5.384903] FS: 0000000011bc4380(0000) GS:ffff8880bf100000(0000) knlGS:0000000000000000
[ 5.384903] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5.384903] CR2: 000056aa3b8e5fe4 CR3: 000000000ea26004 CR4: 0000000000770ef0
[ 5.384903] PKRU: 55555554
[ 5.384903] Call Trace:
[ 5.384903] <TASK>
[ 5.384903] __check_heap_object+0x9a/0xd0
[ 5.384903] __check_object_size+0x46c/0x690
[ 5.384903] put_cmsg+0x129/0x5e0
[ 5.384903] sock_recv_errqueue+0x22f/0x380
[ 5.384903] tls_sw_recvmsg+0x7ed/0x1960
[ 5.384903] ? srso_alias_return_thunk+0x5/0xfbef5
[ 5.384903] ? schedule+0x6d/0x270
[ 5.384903] ? srso_alias_return_thunk+0x5/0xfbef5
[ 5.384903] ? mutex_unlock+0x81/0xd0
[ 5.384903] ? __pfx_mutex_unlock+0x10/0x10
[ 5.384903] ? __pfx_tls_sw_recvmsg+0x10/0x10
[ 5.384903] ? _raw_spin_lock_irqsave+0x8f/0xf0
[ 5.384903] ? _raw_read_unlock_irqrestore+0x20/0x40
[ 5.384903] ? srso_alias_return_thunk+0x5/0xfbef5

The crash offset 296 corresponds to skb2->cb within skbuff_fclones:
- sizeof(struct sk_buff) = 232
- offsetof(struct sk_buff, cb) = 40
- offset of skb2.cb in fclones = 232 + 40 = 272
- crash offset 296 = 272 + 24 (inside sock_exterr_skb.ee)

This patch uses a local stack variable as a bounce buffer to avoid the hardened usercopy check failure.

[1] https://elixir.bootlin.com/linux/v6.12.62/source/net/ipv4/tcp.c#L885
[2] https://elixir.bootlin.com/linux/v6.12.62/source/net/core/skbuff.c#L5104
[3] https://elixir.bootlin.com/linux/v6.12.62/source/net/core/skbuff.c#L5566
[4] https://elixir.bootlin.com/linux/v6.12.62/source/net/core/skbuff.c#L5491
[5] https://elixir.bootlin.com/linux/v6.12.62/source/mm/slub.c#L5719

Fixes: 6d07d1c ("usercopy: Restrict non-usercopy caches to size 0")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: NipaLocal <nipa@local>
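The bounce-buffer idea mentioned in the message can be sketched in userspace. This is a toy model under stated assumptions, not the kernel patch: `model_cache`, `model_ee`, `model_copy_direct()` and `model_copy_via_bounce()` are hypothetical names, and the whitelist test only imitates the rule that a copy out of a slab object must fall entirely inside the cache's usercopy region. A copy staged through a local variable no longer originates from the slab object, so that per-cache whitelist is not consulted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a slab cache's usercopy whitelist. */
struct model_cache {
	size_t useroffset; /* start of the whitelisted region in each object */
	size_t usersize;   /* its length; 0 means nothing may be copied out */
};

/* Hypothetical 16-byte payload, standing in for the ee field. */
struct model_ee {
	unsigned char bytes[16];
};

/* Imitates the whitelist rule: [off, off + len) must lie inside the region. */
static bool model_heap_copy_allowed(const struct model_cache *c, size_t off, size_t len)
{
	return off >= c->useroffset && len <= c->usersize &&
	       off - c->useroffset <= c->usersize - len;
}

/* Direct copy out of the heap object: subject to the whitelist.
 * Returns false where the kernel would report a violation. */
static bool model_copy_direct(const struct model_cache *c, const unsigned char *obj,
			      size_t off, struct model_ee *dst)
{
	assert(c != NULL && obj != NULL && dst != NULL);
	if (!model_heap_copy_allowed(c, off, sizeof(*dst)))
		return false;
	memcpy(dst, obj + off, sizeof(*dst));
	return true;
}

/* Bounce copy: stage the bytes in a local first, then copy from the local.
 * The second copy's source is not a slab object, so the cache whitelist
 * does not apply to it in this model. */
static bool model_copy_via_bounce(const unsigned char *obj, size_t off, struct model_ee *dst)
{
	struct model_ee tmp;

	assert(obj != NULL && dst != NULL);
	memcpy(&tmp, obj + off, sizeof(tmp)); /* in-kernel copy, unchecked */
	memcpy(dst, &tmp, sizeof(tmp));       /* stands in for the copy to userspace */
	return true;
}
```

In this model a cache with `usersize == 0` rejects the direct copy at offset 296, while the bounce variant succeeds and delivers the same bytes. A cache that whitelists cb (offset 40, size 48) accepts a direct copy at offset 64.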

Fix assert lock warning while calling devl_param_driverinit_value_set() in ena.

WARNING: net/devlink/core.c:261 at devl_assert_locked+0x62/0x90, CPU#0: kworker/0:0/9
CPU: 0 UID: 0 PID: 9 Comm: kworker/0:0 Not tainted 6.19.0-rc2+ #1 PREEMPT(lazy)
Hardware name: Amazon EC2 m8i-flex.4xlarge/, BIOS 1.0 10/16/2017
Workqueue: events work_for_cpu_fn
RIP: 0010:devl_assert_locked+0x62/0x90
Call Trace:
 <TASK>
 devl_param_driverinit_value_set+0x15/0x1c0
 ena_devlink_alloc+0x18c/0x220 [ena]
 ? __pfx_ena_devlink_alloc+0x10/0x10 [ena]
 ? trace_hardirqs_on+0x18/0x140
 ? lockdep_hardirqs_on+0x8c/0x130
 ? __raw_spin_unlock_irqrestore+0x5d/0x80
 ? __raw_spin_unlock_irqrestore+0x46/0x80
 ? devm_ioremap_wc+0x9a/0xd0
 ena_probe+0x4d2/0x1b20 [ena]
 ? __lock_acquire+0x56a/0xbd0
 ? __pfx_ena_probe+0x10/0x10 [ena]
 ? local_clock+0x15/0x30
 ? __lock_release.isra.0+0x1c9/0x340
 ? mark_held_locks+0x40/0x70
 ? lockdep_hardirqs_on_prepare.part.0+0x92/0x170
 ? trace_hardirqs_on+0x18/0x140
 ? lockdep_hardirqs_on+0x8c/0x130
 ? __raw_spin_unlock_irqrestore+0x5d/0x80
 ? __raw_spin_unlock_irqrestore+0x46/0x80
 ? __pfx_ena_probe+0x10/0x10 [ena]
 ......
 </TASK>

Fixes: 816b526 ("net: ena: Control PHC enable through devlink")
Signed-off-by: Frank Liang <xiliang@redhat.com>
Reviewed-by: David Arinzon <darinzon@amazon.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: NipaLocal <nipa@local>
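The warning above comes from a lock assertion: the trace shows devl_param_driverinit_value_set() reaching devl_assert_locked() while the caller did not hold the devlink instance lock. The message does not show the change itself, so the following is only a userspace sketch of that calling convention. All `model_*` names are hypothetical, and a plain flag stands in for the real lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical devlink instance: a flag models the instance lock and a
 * counter records how often the lock assertion would have warned. */
struct model_devlink {
	bool locked;
	unsigned int warnings;
};

static void model_devl_lock(struct model_devlink *dl)
{
	assert(dl != NULL && !dl->locked);
	dl->locked = true;
}

static void model_devl_unlock(struct model_devlink *dl)
{
	assert(dl != NULL && dl->locked);
	dl->locked = false;
}

/* Models the assertion: warn (count) if the caller does not hold the lock. */
static void model_devl_assert_locked(struct model_devlink *dl)
{
	if (!dl->locked)
		dl->warnings++;
}

/* Models a setter that expects the caller to already hold the lock. */
static void model_devl_param_set(struct model_devlink *dl, int *param, int value)
{
	model_devl_assert_locked(dl);
	*param = value;
}

/* One way to satisfy the assertion: take the lock around the call. */
static void model_param_set_locked(struct model_devlink *dl, int *param, int value)
{
	model_devl_lock(dl);
	model_devl_param_set(dl, param, value);
	model_devl_unlock(dl);
}
```

Calling `model_devl_param_set()` on an unlocked instance bumps the warning counter, while `model_param_set_locked()` leaves it unchanged. Whether the real fix takes the lock around the call or moves the call to a point where the lock is already held is not stated in the message.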