forked from torvalds/linux
-
Notifications
You must be signed in to change notification settings - Fork 6
adapt patches from ubuntu redfs fuse for RHEL 10 #13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
When writing back pages while using writeback caching the code did a copy of data into temporary pages to avoid a deadlock in reclaiming of memory. This is an adaptation and backport of a patch by Joanne Koong joannelkoong@gmail.com. Since we use pinned memory with io_uring we don't need the temporary copies and we don't use the AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM flag in the pagemap. Link: https://www.spinics.net/lists/linux-mm/msg407405.html Signed-off-by: Horst Birthelmer <hbirthelmer@ddn.com>
When calling the fuse server with a dlm request and the fuse server responds with some other error than ENOSYS most likely the lock size will be set to zero. In that case the kernel will abort the fuse connection. This is completely unnecessary. Signed-off-by: Horst Birthelmer <hbirthelmer@ddn.com>
Check whether dlm is still enabled when interpreting the returned error from fuse server. Signed-off-by: Horst Birthelmer <hbirthelmer@ddn.com>
Sometimes the file offset alignment needs to be opt-in to achieve the optimum performance at the backend store. For example when ErasureCode [1] is used at the backend store, the optimum write performance is achieved when the WRITE request is aligned with the stripe size of ErasureCode. Otherwise a non-aligned WRITE request needs to be split at the stripe size boundary. It is quite costly to handle these split partial requests, as firstly the whole stripe to which the split partial request belongs needs to be read out, then overwrite the read stripe buffer with the request, and finally write the whole stripe back to the persistent storage. Thus the backend store can suffer severe performance degradation when WRITE requests can not fit into one stripe exactly. The write performance can be 10x slower when the request is 256KB in size given 4MB stripe size. Also there can be 50% performance degradation in theory if the request is not stripe boundary aligned. Besides, the conveyed test indicates that, the non-alignment issue becomes more severe when decreasing fuse's max_ratio, maybe partly because the background writeback now is more likely to run parallelly with the dirtier. fuse's max_ratio ratio of aligned WRITE requests ---------------- ------------------------------- 70 99.9% 40 74% 20 45% 10 20% With the patched version, which makes the alignment constraint opt-in when constructing WRITE requests, the ratio of aligned WRITE requests increases to 98% (previously 20%) when fuse's max_ratio is 10. fuse: fix alignment to work with redfs ubuntu - small fix to make the fuse alignment patch work with redfs ubuntu 6.8.x - add writeback_control to fuse_writepage_need_send() to make more accurate decisions about when to skip sending data - fix shift number for FUSE_ALIGN_PG_ORDER - remove test code [1] https://lore.kernel.org/linux-fsdevel/20240124070512.52207-1-jefflexu@linux.alibaba.com/T/#m9bce469998ea6e4f911555c6f7be1e077ce3d8b4 Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> Signed-off-by: Bernd Schubert <bschubert@ddn.com> Signed-off-by: Horst Birthelmer <hbirthelmer@ddn.com>
Collaborator
|
Would be good to change "fuse: fix commit 5c021bd to work for RHEL10" to be more descriptive - the commit might changes on rebase. |
Collaborator
Author
oh yes ... forgot about that .. done! |
bsbernd
previously approved these changes
Sep 10, 2025
Fuse in the linux kernel changed slightly since the patches to remove tmp page copies were developed for linux 6.8. The main logic did not change only some convenience functions were added to the kernel which the patch did not use for compatibility reasons since they were not present in lunx 6.8. - use fuse_writepage_finish_stat() where appropriate - use fuse_writepage_args_setup() to setup the args Signed-off-by: Horst Birthelmer <hbirthelmer@ddn.com>
b871d69 to
dc6429f
Compare
bsbernd
pushed a commit
that referenced
this pull request
Nov 7, 2025
jira LE-1907 cve CVE-2024-41041 Rebuild_History Non-Buildable kernel-5.14.0-427.33.1.el9_4 commit-author Kuniyuki Iwashima <kuniyu@amazon.com> commit 5c0b485 syzkaller triggered the warning [0] in udp_v4_early_demux(). In udp_v[46]_early_demux() and sk_lookup(), we do not touch the refcount of the looked-up sk and use sock_pfree() as skb->destructor, so we check SOCK_RCU_FREE to ensure that the sk is safe to access during the RCU grace period. Currently, SOCK_RCU_FREE is flagged for a bound socket after being put into the hash table. Moreover, the SOCK_RCU_FREE check is done too early in udp_v[46]_early_demux() and sk_lookup(), so there could be a small race window: CPU1 CPU2 ---- ---- udp_v4_early_demux() udp_lib_get_port() | |- hlist_add_head_rcu() |- sk = __udp4_lib_demux_lookup() | |- DEBUG_NET_WARN_ON_ONCE(sk_is_refcounted(sk)); `- sock_set_flag(sk, SOCK_RCU_FREE) We had the same bug in TCP and fixed it in commit 871019b ("net: set SOCK_RCU_FREE before inserting socket into hashtable"). Let's apply the same fix for UDP. [0]: WARNING: CPU: 0 PID: 11198 at net/ipv4/udp.c:2599 udp_v4_early_demux+0x481/0xb70 net/ipv4/udp.c:2599 Modules linked in: CPU: 0 PID: 11198 Comm: syz-executor.1 Not tainted 6.9.0-g93bda33046e7 #13 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:udp_v4_early_demux+0x481/0xb70 net/ipv4/udp.c:2599 Code: c5 7a 15 fe bb 01 00 00 00 44 89 e9 31 ff d3 e3 81 e3 bf ef ff ff 89 de e8 2c 74 15 fe 85 db 0f 85 02 06 00 00 e8 9f 7a 15 fe <0f> 0b e8 98 7a 15 fe 49 8d 7e 60 e8 4f 39 2f fe 49 c7 46 60 20 52 RSP: 0018:ffffc9000ce3fa58 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff8318c92c RDX: ffff888036ccde00 RSI: ffffffff8318c2f1 RDI: 0000000000000001 RBP: ffff88805a2dd6e0 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0001ffffffffffff R12: ffff88805a2dd680 R13: 0000000000000007 R14: ffff88800923f900 R15: ffff88805456004e FS: 00007fc449127640(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fc449126e38 CR3: 000000003de4b002 CR4: 0000000000770ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 PKRU: 55555554 Call Trace: <TASK> ip_rcv_finish_core.constprop.0+0xbdd/0xd20 net/ipv4/ip_input.c:349 ip_rcv_finish+0xda/0x150 net/ipv4/ip_input.c:447 NF_HOOK include/linux/netfilter.h:314 [inline] NF_HOOK include/linux/netfilter.h:308 [inline] ip_rcv+0x16c/0x180 net/ipv4/ip_input.c:569 __netif_receive_skb_one_core+0xb3/0xe0 net/core/dev.c:5624 __netif_receive_skb+0x21/0xd0 net/core/dev.c:5738 netif_receive_skb_internal net/core/dev.c:5824 [inline] netif_receive_skb+0x271/0x300 net/core/dev.c:5884 tun_rx_batched drivers/net/tun.c:1549 [inline] tun_get_user+0x24db/0x2c50 drivers/net/tun.c:2002 tun_chr_write_iter+0x107/0x1a0 drivers/net/tun.c:2048 new_sync_write fs/read_write.c:497 [inline] vfs_write+0x76f/0x8d0 fs/read_write.c:590 ksys_write+0xbf/0x190 fs/read_write.c:643 __do_sys_write fs/read_write.c:655 [inline] __se_sys_write fs/read_write.c:652 [inline] __x64_sys_write+0x41/0x50 fs/read_write.c:652 x64_sys_call+0xe66/0x1990 arch/x86/include/generated/asm/syscalls_64.h:2 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x4b/0x110 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7fc44a68bc1f Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 e9 cf f5 ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 3c d0 f5 ff 48 RSP: 002b:00007fc449126c90 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00000000004bc050 RCX: 00007fc44a68bc1f RDX: 0000000000000032 RSI: 00000000200000c0 RDI: 00000000000000c8 RBP: 00000000004bc050 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000032 R11: 0000000000000293 R12: 0000000000000000 R13: 000000000000000b R14: 00007fc44a5ec530 R15: 0000000000000000 </TASK> Fixes: 6acc9b4 ("bpf: Add helper to retrieve socket in BPF") Reported-by: syzkaller <syzkaller@googlegroups.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20240709191356.24010-1-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> (cherry picked from commit 5c0b485) Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.