Skip to content

Conversation

@avagin
Copy link
Owner

@avagin avagin commented Nov 20, 2025

No description provided.

avagin pushed a commit that referenced this pull request Nov 22, 2025
Handle skb allocation failures in RX path, to avoid NULL pointer
dereference and RX stalls under memory pressure. If the refill fails
with -ENOMEM, complete napi polling and wake up later to retry via timer.
Also explicitly re-enable RX DMA after oom, so the dmac doesn't remain
stopped in this situation.

Previously, memory pressure could lead to skb allocation failures and
subsequent Oops like:

	Oops: Kernel access of bad area, sig: 11 [#2]
	Hardware name: SonyPS3 Cell Broadband Engine 0x701000 PS3
	NIP [c0003d0000065900] gelic_net_poll+0x6c/0x2d0 [ps3_gelic] (unreliable)
	LR [c0003d00000659c4] gelic_net_poll+0x130/0x2d0 [ps3_gelic]
	Call Trace:
	  gelic_net_poll+0x130/0x2d0 [ps3_gelic] (unreliable)
	  __napi_poll+0x44/0x168
	  net_rx_action+0x178/0x290

Steps to reproduce the issue:
	1. Start a continuous network traffic, like scp of a 20GB file
	2. Inject failslab errors using the kernel fault injection:
	    echo -1 > /sys/kernel/debug/failslab/times
	    echo 30 > /sys/kernel/debug/failslab/interval
	    echo 100 > /sys/kernel/debug/failslab/probability
	3. After some time, traces start to appear, kernel Oopses
	   and the system stops

Step 2 is not always necessary, as it is usually already triggered by
the transfer of a big enough file.

Fixes: 02c1889 ("ps3: gigabit ethernet driver for PS3, take3")
Signed-off-by: Florian Fuchs <fuchsfl@gmail.com>
Link: https://patch.msgid.link/20251113181000.3914980-1-fuchsfl@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
@avagin avagin force-pushed the wip/misc-cgroup-hwcap-mask branch 3 times, most recently from 51f4ff5 to 5ab372a Compare December 4, 2025 23:47
avagin pushed a commit that referenced this pull request Dec 4, 2025
It's possible that the auxiliary proxy device we add when setting up the
GPIO controller exposing shared pins, will get matched and probed
immediately. The gpio-proxy-driver will then retrieve the shared
descriptor structure. That will cause a recursive mutex locking and
a deadlock because we're already holding the gpio_shared_lock in
gpio_device_setup_shared() and try to take it again in
devm_gpiod_shared_get() like this:

[    4.298346] gpiolib_shared: GPIO 130 owned by f100000.pinctrl is shared by multiple consumers
[    4.307157] gpiolib_shared: Setting up a shared GPIO entry for speaker@0,3
[    4.314604]
[    4.316146] ============================================
[    4.321600] WARNING: possible recursive locking detected
[    4.327054] 6.18.0-rc7-next-20251125-g3f300d0674f6-dirty #3887 Not tainted
[    4.334115] --------------------------------------------
[    4.339566] kworker/u32:3/71 is trying to acquire lock:
[    4.344931] ffffda019ba71850 (gpio_shared_lock){+.+.}-{4:4}, at: devm_gpiod_shared_get+0x34/0x2e0
[    4.354057]
[    4.354057] but task is already holding lock:
[    4.360041] ffffda019ba71850 (gpio_shared_lock){+.+.}-{4:4}, at: gpio_device_setup_shared+0x30/0x268
[    4.369421]
[    4.369421] other info that might help us debug this:
[    4.376126]  Possible unsafe locking scenario:
[    4.376126]
[    4.382198]        CPU0
[    4.384711]        ----
[    4.387223]   lock(gpio_shared_lock);
[    4.390992]   lock(gpio_shared_lock);
[    4.394761]
[    4.394761]  *** DEADLOCK ***
[    4.394761]
[    4.400832]  May be due to missing lock nesting notation
[    4.400832]
[    4.407802] 5 locks held by kworker/u32:3/71:
[    4.412279]  #0: ffff000080020948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x194/0x64c
[    4.422650]  #1: ffff800080963d60 (deferred_probe_work){+.+.}-{0:0}, at: process_one_work+0x1bc/0x64c
[    4.432117]  #2: ffff00008165c8f8 (&dev->mutex){....}-{4:4}, at: __device_attach+0x3c/0x198
[    4.440700]  #3: ffffda019ba71850 (gpio_shared_lock){+.+.}-{4:4}, at: gpio_device_setup_shared+0x30/0x268
[    4.450523]  #4: ffff0000810fe918 (&dev->mutex){....}-{4:4}, at: __device_attach+0x3c/0x198
[    4.459103]
[    4.459103] stack backtrace:
[    4.463581] CPU: 6 UID: 0 PID: 71 Comm: kworker/u32:3 Not tainted 6.18.0-rc7-next-20251125-g3f300d0674f6-dirty #3887 PREEMPT
[    4.463589] Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
[    4.463593] Workqueue: events_unbound deferred_probe_work_func
[    4.463602] Call trace:
[    4.463604]  show_stack+0x18/0x24 (C)
[    4.463617]  dump_stack_lvl+0x70/0x98
[    4.463627]  dump_stack+0x18/0x24
[    4.463636]  print_deadlock_bug+0x224/0x238
[    4.463643]  __lock_acquire+0xe4c/0x15f0
[    4.463648]  lock_acquire+0x1cc/0x344
[    4.463653]  __mutex_lock+0xb8/0x840
[    4.463661]  mutex_lock_nested+0x24/0x30
[    4.463667]  devm_gpiod_shared_get+0x34/0x2e0
[    4.463674]  gpio_shared_proxy_probe+0x18/0x138
[    4.463682]  auxiliary_bus_probe+0x40/0x78
[    4.463688]  really_probe+0xbc/0x2c0
[    4.463694]  __driver_probe_device+0x78/0x120
[    4.463701]  driver_probe_device+0x3c/0x160
[    4.463708]  __device_attach_driver+0xb8/0x140
[    4.463716]  bus_for_each_drv+0x88/0xe8
[    4.463723]  __device_attach+0xa0/0x198
[    4.463729]  device_initial_probe+0x14/0x20
[    4.463737]  bus_probe_device+0xb4/0xc0
[    4.463743]  device_add+0x578/0x76c
[    4.463747]  __auxiliary_device_add+0x40/0xac
[    4.463752]  gpio_device_setup_shared+0x1f8/0x268
[    4.463758]  gpiochip_add_data_with_key+0xdac/0x10ac
[    4.463763]  devm_gpiochip_add_data_with_key+0x30/0x80
[    4.463768]  msm_pinctrl_probe+0x4b0/0x5e0
[    4.463779]  sm8250_pinctrl_probe+0x18/0x40
[    4.463784]  platform_probe+0x5c/0xa4
[    4.463793]  really_probe+0xbc/0x2c0
[    4.463800]  __driver_probe_device+0x78/0x120
[    4.463807]  driver_probe_device+0x3c/0x160
[    4.463814]  __device_attach_driver+0xb8/0x140
[    4.463821]  bus_for_each_drv+0x88/0xe8
[    4.463827]  __device_attach+0xa0/0x198
[    4.463834]  device_initial_probe+0x14/0x20
[    4.463841]  bus_probe_device+0xb4/0xc0
[    4.463847]  deferred_probe_work_func+0x90/0xcc
[    4.463854]  process_one_work+0x214/0x64c
[    4.463860]  worker_thread+0x1bc/0x360
[    4.463866]  kthread+0x14c/0x220
[    4.463871]  ret_from_fork+0x10/0x20
[   77.265041] random: crng init done

Fortunately, at the time of creating of the auxiliary device, we already
know the correct entry so let's store it as the device's platform data.
We don't need to hold gpio_shared_lock in devm_gpiod_shared_get() as
we're not removing the entry or traversing the list anymore but we still
need to protect it from concurrent modification of its fields so add a
more fine-grained mutex.

Fixes: a060b8c ("gpiolib: implement low-level, shared GPIO support")
Reported-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Closes: https://lore.kernel.org/all/fimuvblfy2cmn7o4wzcxjzrux5mwhvlvyxfsgeqs6ore2xg75i@ax46d3sfmdux/
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Tested-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20251128-gpio-shared-deadlock-v2-1-9f3ae8ddcb09@linaro.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
@avagin avagin force-pushed the wip/misc-cgroup-hwcap-mask branch 2 times, most recently from 21ee7c7 to 13629ad Compare December 6, 2025 05:13
jhovold and others added 23 commits December 23, 2025 15:59
This reverts commit 775fae5.

The new buffer management code that this relies on is broken so revert
for now.

It also looks like the error handling needs some more thought as the
message out size is not reset on errors.

Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/20251222152204.2846-3-johan@kernel.org
This reverts commit db00286.

The new buffer management code that this feature relies on is broken so
revert for now.

As for the in buffer, nothing prevents the out message size and buffer
from being modified while the message is being processed due to lack of
serialisation.

Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/20251222152204.2846-4-johan@kernel.org
…d message out fields"

This reverts commit 3e08297.

The new buffer management code has not been tested or reviewed properly
and breaks boot of machines like the Lenovo ThinkPad X13s:

    Unable to handle kernel NULL pointer dereference at virtual address
    0000000000000000

    CPU: 0 UID: 0 PID: 813 Comm: kworker/0:3 Not tainted 6.19.0-rc2 torvalds#26 PREEMPT
    Hardware name: LENOVO 21BYZ9SRUS/21BYZ9SRUS, BIOS N3HET87W (1.59 ) 12/05/2023
    Workqueue: events ucsi_handle_connector_change [typec_ucsi]

    Call trace:
     ucsi_sync_control_common+0xe4/0x1ec [typec_ucsi] (P)
     ucsi_run_command+0xcc/0x194 [typec_ucsi]
     ucsi_send_command_common+0x84/0x2a0 [typec_ucsi]
     ucsi_get_connector_status+0x48/0x78 [typec_ucsi]
     ucsi_handle_connector_change+0x5c/0x4f4 [typec_ucsi]
     process_one_work+0x208/0x60c
     worker_thread+0x244/0x388

The new code completely ignores concurrency so that the message length
can be updated while a transaction is ongoing. In the above case, the
length ends up being modified by another thread while processing an ack
so that the NULL cci pointer is dereferenced.

Fixing this will require designing a proper interface for managing these
transactions, something which most likely involves reverting most of the
offending commit anyway.

Revert the broken code to fix the regression and let Intel come up with
a properly tested implementation for a later kernel.

Fixes: 3e08297 ("usb: typec: ucsi: Update UCSI structure to have message in and message out fields")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/20251222152204.2846-5-johan@kernel.org
Johan Hovold <johan@kernel.org> says:

The new buffer management code has not been tested or reviewed properly
and breaks boot of machines like the Lenovo ThinkPad X13s.

Fixing this will require designing a proper interface for managing these
transactions, something which most likely involves reverting most of the
offending commit anyway.

Revert the broken code to fix the regression and let Intel come up with
a properly tested implementation for a later kernel.

Link: https://lore.kernel.org/r/20251222152204.2846-1-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There exists a null pointer dereference in simple_xattrs_free() as
part of the __kernfs_new_node() routine. Within __kernfs_new_node(),
err_out4 calls simple_xattr_free(), but kn->iattr may be NULL if
__kernfs_setattr() was never called. As a result, the first argument to
simple_xattrs_free() may be NULL + 0x38, and no NULL check is done
internally, causing an incorrect pointer dereference.

Add a check to ensure kn->iattr is not NULL, meaning __kernfs_setattr()
has been called and kn->iattr is allocated. Note that struct kernfs_node
kn is allocated with kmem_cache_zalloc, so we can assume kn->iattr will
be NULL if not allocated.

An alternative fix could be to not call simple_xattrs_free() at all. As
was previously discussed during the initial patch, simple_xattrs_free()
is not strictly needed and is included to be consistent with
kernfs_free_rcu(), which also helps the function maintain correctness if
changes are made in __kernfs_new_node().

Reported-by: syzbot+6aaf7f48ae034ab0ea97@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=6aaf7f48ae034ab0ea97
Fixes: 382b1e8 ("kernfs: fix memory leak of kernfs_iattrs in __kernfs_new_node")
Signed-off-by: Will Rosenberg <whrosenb@asu.edu>
Link: https://patch.msgid.link/20251217060107.4171558-1-whrosenb@asu.edu
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge series from Mateusz Litwin <mateusz.litwin@nokia.com>:

On the Stratix10 platform, indirect reads can become very slow due to lost
interrupts and/or missed `complete()` calls, causing
`wait_for_completion_timeout()` to expire.

Three issues were identified:
1) A race condition exists between the read loop and IRQ `complete()`
   call:
   An IRQ can call `complete()` after the inner loop ends, but before
   `reinit_completion()`, losing the completion event and leading to
   `wait_for_completion_timeout()` expire. This function will not return
   an error because `bytes_to_read` > 0 (indicating data is already in the
   FIFO) and the final `ret` value is overwritten by
   `cqspi_wait_for_bit()` return value (indicating request completion),
   masking the timeout.

   For test purpose, logging was added to print the count of timeouts and
   the outer loop count.
   $ dd if=/dev/mtd0 of=/dev/null bs=64M count=1
   [ 2232.925219] cadence-qspi ff8d2000.spi: Indirect read error timeout
    (1) loop (12472)
   [ 2236.200391] cadence-qspi ff8d2000.spi: Indirect read error timeout
    (1) loop (12460)
   [ 2239.482836] cadence-qspi ff8d2000.spi: Indirect read error timeout
    (5) loop (12450)
   This indicates that such an event is rare, but possible.
   Tested on the Stratix10 platform.

2) The quirk assumes the indirect read path never leaves the inner loop on
   SoCFPGA. This assumption is incorrect when using slow flash. Disabling
   IRQs in the inner loop can cause lost interrupts.

3) The `CQSPI_SLOW_SRAM` quirk disables `CQSPI_REG_IRQ_IND_COMP` (indirect
   completion) interrupt, relying solely on the `CQSPI_REG_IRQ_WATERMARK`
   (FIFO watermark) interrupt. For small transfers sizes, the final data
   read might not fill the FIFO sufficiently to trigger the watermark,
   preventing completion and leading to wait_for_completion_timeout()
   expiration.

Two patches have been prepared to resolve these issues.
-  [1/2] spi: cadence-quadspi: Prevent lost complete() call during
   indirect read
   Moving `reinit_completion()` before the inner loop prevents a race
   condition. This might cause a premature IRQ complete() call to occur;
   however, in the worst case, this will result in a spurious wakeup and
   another wait cycle, which is preferable to waiting for a timeout.

-  [2/2] spi: cadence-quadspi: Improve CQSPI_SLOW_SRAM quirk if flash is
   slow
   Re-enabling `CQSPI_REG_IRQ_IND_COMP` interrupt resolves the problem for
   small reads and removes the disabling of interrupts, addressing the
   issue with lost interrupts. This marginally increases the IRQ count.

   Test:
   $ dd if=/dev/mtd0 of=/dev/null bs=1M count=64
   Results from the Stratix10 platform with mt25qu02g flash.
   FIFO size in all tests: 128

   Serviced interrupt call counts:
   Without `CQSPI_SLOW_SRAM` quirk: 16 668 850
   With `CQSPI_SLOW_SRAM` quirk: 204 176
   With `CQSPI_SLOW_SRAM` and this patch: 224 528

Patch 2/2: Delivers a substantial read‑performance improvement for the
Cadence QSPI controller on the Stratix10 platform. Patch 1/2: Applies to
all platforms and should yield a modest performance gain, most noticeable
with large `CQSPI_READ_TIMEOUT_MS` values and workloads dominated by many
small reads.
…el/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "Likely the last pull request in 2025, again a collection of lots of
  small fixes. Most of them are various device-specific small fixes:

   - An ASoC core fix for correcting the clamping behavior of *_SX mixer
     elements

   - Various fixes for ASoC fsl, SOF, etc

   - Usual HD- and USB-audio quirks / fix-ups

   - A couple of error-handling fixes for legacy PCMCIA drivers"

* tag 'sound-6.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (35 commits)
  ALSA: hda/realtek: fix PCI SSID for one of the HP 200 G2i laptop
  ASoC: ops: fix snd_soc_get_volsw for sx controls
  ALSA: hda/realtek: Add Asus quirk for TAS amplifiers
  ASoC: Intel: soc-acpi-intel-mtl-match: Add 6 amp CS35L63 with feedback
  ASoC: Intel: soc-acpi-intel-mtl-match: Add 6 amp CS35L56 with feedback
  ASoC: fsl-asoc-card: Use of_property_present() for non-boolean properties
  ASoC: rt1320: update VC blind write settings
  ASoC: fsl_xcvr: provide regmap names
  ASoC: fsl_sai: Add missing registers to cache default
  ASoC: ak4458: remove the reset operation in probe and remove
  ASoC: fsl_asrc_dma: fix duplicate debugfs directory error
  ASoC: fsl_easrc: fix duplicate debugfs directory error
  ALSA: hda/realtek: fix micmute LED reversed on HP Abe and Bantie
  ALSA: hda/realtek: Add support for HP Clipper Laptop
  ALSA: hda/realtek: Add support for HP Trekker Laptop
  ALSA: usb-mixer: us16x08: validate meter packet indices
  ASoC: Intel: soc-acpi-intel-nvl-match: Drop rt722 l3 from the match table
  ASoC: soc-acpi / SOF: Add best_effort flag to get_function_tplg_files op
  ASoC: SOF: Intel: pci-mtl: Change the topology path to intel/sof-ipc4-tplg
  ASoC: SOF: ipc4-topology: set playback channel mask
  ...
Ensure the perf.data output when checking permissions is written to
/dev/null so that it isn't left in the directory the test is run.

Fixes: b582615 ("perf test kvm: Add some basic perf kvm test coverage")
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
With sufficient tests running the load causes the top test fails with:
```
123: perf top tests                                                  : FAILED!
 --- start ---
test child forked, pid 629856
Basic perf top test
Basic perf top test [Failed: no sample percentage found]
 ---- end(-1) ----
```
Mark the test exclusive to avoid flakes.

Fixes: 75e9617 ("perf tests top: Add basic perf top coverage test")
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Add the part number and MIDR for NVIDIA Olympus.

Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Add NVIDIA Olympus MIDR to neoverse_spe range list.

Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/r/20251125132908.847055-1-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Peter Xu <peterx@redhat.com>
Acked-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
WARNING: include/linux/genalloc.h:52 function parameter 'start_addr' not described in 'genpool_algo_t'

Fixes: 52fbf11 ("lib/genalloc.c: fix allocation of aligned buffer from non-aligned chunk")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: https://lkml.kernel.org/r/20251127130624.563597e3@canb.auug.org.au
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Alexey Skidanov <alexey.skidanov@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
My linaro address will stop working tonight. Update my mailmap entry.

Link: https://lkml.kernel.org/r/20251128133318.44912-1-brgl@bgdev.pl
Signed-off-by: Bartosz Golaszewski <brgl@kernel.org>
Cc: Hans Verkuil <hverkuil@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
If you use an IDR with a non-zero base, and specify a range that lies
entirely below the base, 'max - base' becomes very large and
idr_get_free() can return an ID that lies outside of the requested range.

Link: https://lkml.kernel.org/r/20251128161853.3200058-1-willy@infradead.org
Fixes: 6ce711f ("idr: Make 1-based IDRs more efficient")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reported-by: Jan Sokolowski <jan.sokolowski@intel.com>
Reported-by: Koen Koning <koen.koning@intel.com>
Reported-by: Peter Senna Tschudin <peter.senna@linux.intel.com>
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6449
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "kasan: vmalloc: Fixes for the percpu allocator and
vrealloc", v3.

Patches fix two issues related to KASAN and vmalloc.

The first one, a KASAN tag mismatch, possibly resulting in a kernel panic,
can be observed on systems with a tag-based KASAN enabled and with
multiple NUMA nodes.  Initially it was only noticed on x86 [1] but later a
similar issue was also reported on arm64 [2].

Specifically the problem is related to how vm_structs interact with
pcpu_chunks - both when they are allocated, assigned and when pcpu_chunk
addresses are derived.

When vm_structs are allocated they are unpoisoned, each with a different
random tag, if vmalloc support is enabled along the KASAN mode.  Later
when first pcpu chunk is allocated it gets its 'base_addr' field set to
the first allocated vm_struct.  With that it inherits that vm_struct's
tag.

When pcpu_chunk addresses are later derived (by pcpu_chunk_addr(), for
example in pcpu_alloc_noprof()) the base_addr field is used and offsets
are added to it.  If the initial conditions are satisfied then some of the
offsets will point into memory allocated with a different vm_struct.  So
while the lower bits will get accurately derived the tag bits in the top
of the pointer won't match the shadow memory contents.

The solution (proposed at v2 of the x86 KASAN series [3]) is to unpoison
the vm_structs with the same tag when allocating them for the per cpu
allocator (in pcpu_get_vm_areas()).

The second one reported by syzkaller [4] is related to vrealloc and
happens because of random tag generation when unpoisoning memory without
allocating new pages.  This breaks shadow memory tracking and needs to
reuse the existing tag instead of generating a new one.  At the same time
an inconsistency in used flags is corrected.


This patch (of 3):

Syzkaller reported a memory out-of-bounds bug [4].  This patch fixes two
issues:

1. In vrealloc the KASAN_VMALLOC_VM_ALLOC flag is missing when
   unpoisoning the extended region. This flag is required to correctly
   associate the allocation with KASAN's vmalloc tracking.

   Note: In contrast, vzalloc (via __vmalloc_node_range_noprof)
   explicitly sets KASAN_VMALLOC_VM_ALLOC and calls
   kasan_unpoison_vmalloc() with it.  vrealloc must behave consistently --
   especially when reusing existing vmalloc regions -- to ensure KASAN can
   track allocations correctly.

2. When vrealloc reuses an existing vmalloc region (without allocating
   new pages) KASAN generates a new tag, which breaks tag-based memory
   access tracking.

Introduce KASAN_VMALLOC_KEEP_TAG, a new KASAN flag that allows reusing the
tag already attached to the pointer, ensuring consistent tag behavior
during reallocation.

Pass KASAN_VMALLOC_KEEP_TAG and KASAN_VMALLOC_VM_ALLOC to the
kasan_unpoison_vmalloc inside vrealloc_node_align_noprof().

Link: https://lkml.kernel.org/r/cover.1765978969.git.m.wieczorretman@pm.me
Link: https://lkml.kernel.org/r/38dece0a4074c43e48150d1e242f8242c73bf1a5.1764874575.git.m.wieczorretman@pm.me
Link: https://lore.kernel.org/all/e7e04692866d02e6d3b32bb43b998e5d17092ba4.1738686764.git.maciej.wieczor-retman@intel.com/ [1]
Link: https://lore.kernel.org/all/aMUrW1Znp1GEj7St@MiWiFi-R3L-srv/ [2]
Link: https://lore.kernel.org/all/CAPAsAGxDRv_uFeMYu9TwhBVWHCCtkSxoWY4xmFB_vowMbi8raw@mail.gmail.com/ [3]
Link: https://syzkaller.appspot.com/bug?extid=997752115a851cb0cf36 [4]
Fixes: a0309fa ("mm: vmalloc: support more granular vrealloc() sizing")
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Co-developed-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Reported-by: syzbot+997752115a851cb0cf36@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68e243a2.050a0220.1696c6.007d.GAE@google.com/T/
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Marco Elver <elver@google.com>
Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
A KASAN tag mismatch, possibly causing a kernel panic, can be observed
on systems with a tag-based KASAN enabled and with multiple NUMA nodes.
It was reported on arm64 and reproduced on x86. It can be explained in
the following points:

1. There can be more than one virtual memory chunk.
2. Chunk's base address has a tag.
3. The base address points at the first chunk and thus inherits
   the tag of the first chunk.
4. The subsequent chunks will be accessed with the tag from the
   first chunk.
5. Thus, the subsequent chunks need to have their tag set to
   match that of the first chunk.

Refactor code by reusing __kasan_unpoison_vmalloc in a new helper in
preparation for the actual fix.

Link: https://lkml.kernel.org/r/eb61d93b907e262eefcaa130261a08bcb6c5ce51.1764874575.git.m.wieczorretman@pm.me
Fixes: 1d96320 ("kasan, vmalloc: add vmalloc tagging for SW_TAGS")
Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Jiayuan Chen <jiayuan.chen@linux.dev>
Cc: Kees Cook <kees@kernel.org>
Cc: Marco Elver <elver@google.com>
Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: <stable@vger.kernel.org>	[6.1+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
A KASAN tag mismatch, possibly causing a kernel panic, can be observed on
systems with a tag-based KASAN enabled and with multiple NUMA nodes.  It
was reported on arm64 and reproduced on x86.  It can be explained in the
following points:

1. There can be more than one virtual memory chunk.
2. Chunk's base address has a tag.
3. The base address points at the first chunk and thus inherits
   the tag of the first chunk.
4. The subsequent chunks will be accessed with the tag from the
   first chunk.
5. Thus, the subsequent chunks need to have their tag set to
   match that of the first chunk.

Use the new vmalloc flag that disables random tag assignment in
__kasan_unpoison_vmalloc() - pass the same random tag to all the
vm_structs by tagging the pointers before they go inside
__kasan_unpoison_vmalloc().  Assigning a common tag resolves the pcpu
chunk address mismatch.

[akpm@linux-foundation.org: use WARN_ON_ONCE(), per Andrey]
  Link: https://lkml.kernel.org/r/CA+fCnZeuGdKSEm11oGT6FS71_vGq1vjq-xY36kxVdFvwmag2ZQ@mail.gmail.com
[maciej.wieczor-retman@intel.com: remove unneeded pr_warn()]
  Link: https://lkml.kernel.org/r/919897daaaa3c982a27762a2ee038769ad033991.1764945396.git.m.wieczorretman@pm.me
Link: https://lkml.kernel.org/r/873821114a9f722ffb5d6702b94782e902883fdf.1764874575.git.m.wieczorretman@pm.me
Fixes: 1d96320 ("kasan, vmalloc: add vmalloc tagging for SW_TAGS")
Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Jiayuan Chen <jiayuan.chen@linux.dev>
Cc: Kees Cook <kees@kernel.org>
Cc: Marco Elver <elver@google.com>
Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: <stable@vger.kernel.org>	[6.1+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Modify the kernel-doc function parameter names to prevent kernel-doc
warnings:

Warning: include/linux/leafops.h:135 function parameter 'entry' not
 described in 'leafent_type'
Warning: include/linux/leafops.h:540 function parameter 'pte' not
 described in 'pte_is_uffd_marker'

Link: https://lkml.kernel.org/r/20251214201517.2187051-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
When a page is freed it coalesces with a buddy into a higher order page
while possible.  When the buddy page migrate type differs, it is expected
to be updated to match the one of the page being freed.

However, only the first pageblock of the buddy page is updated, while the
rest of the pageblocks are left unchanged.

That causes warnings in later expand() and other code paths (like below),
since an inconsistency between migration type of the list containing the
page and the page-owned pageblocks migration types is introduced.

[  308.986589] ------------[ cut here ]------------
[  308.987227] page type is 0, passed migratetype is 1 (nr=256)
[  308.987275] WARNING: CPU: 1 PID: 5224 at mm/page_alloc.c:812 expand+0x23c/0x270
[  308.987293] Modules linked in: algif_hash(E) af_alg(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nf_tables(E) s390_trng(E) vfio_ccw(E) mdev(E) vfio_iommu_type1(E) vfio(E) sch_fq_codel(E) drm(E) i2c_core(E) drm_panel_orientation_quirks(E) loop(E) nfnetlink(E) vsock_loopback(E) vmw_vsock_virtio_transport_common(E) vsock(E) ctcm(E) fsm(E) diag288_wdt(E) watchdog(E) zfcp(E) scsi_transport_fc(E) ghash_s390(E) prng(E) aes_s390(E) des_generic(E) des_s390(E) libdes(E) sha3_512_s390(E) sha3_256_s390(E) sha_common(E) paes_s390(E) crypto_engine(E) pkey_cca(E) pkey_ep11(E) zcrypt(E) rng_core(E) pkey_pckmo(E) pkey(E) autofs4(E)
[  308.987439] Unloaded tainted modules: hmac_s390(E):2
[  308.987650] CPU: 1 UID: 0 PID: 5224 Comm: mempig_verify Kdump: loaded Tainted: G            E       6.18.0-gcc-bpf-debug torvalds#431 PREEMPT
[  308.987657] Tainted: [E]=UNSIGNED_MODULE
[  308.987661] Hardware name: IBM 3906 M04 704 (z/VM 7.3.0)
[  308.987666] Krnl PSW : 0404f00180000000 00000349976fa600 (expand+0x240/0x270)
[  308.987676]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:3 PM:0 RI:0 EA:3
[  308.987682] Krnl GPRS: 0000034980000004 0000000000000005 0000000000000030 000003499a0e6d88
[  308.987688]            0000000000000005 0000034980000005 000002be803ac000 0000023efe6c8300
[  308.987692]            0000000000000008 0000034998d57290 000002be00000100 0000023e00000008
[  308.987696]            0000000000000000 0000000000000000 00000349976fa5fc 000002c99b1eb6f0
[  308.987708] Krnl Code: 00000349976fa5f0: c020008a02f2	larl	%r2,000003499883abd4
                          00000349976fa5f6: c0e5ffe3f4b5	brasl	%r14,0000034997378f60
                         #00000349976fa5fc: af000000		mc	0,0
                         >00000349976fa600: a7f4ff4c		brc	15,00000349976fa498
                          00000349976fa604: b9040026		lgr	%r2,%r6
                          00000349976fa608: c0300088317f	larl	%r3,0000034998800906
                          00000349976fa60e: c0e5fffdb6e1	brasl	%r14,00000349976b13d0
                          00000349976fa614: af000000		mc	0,0
[  308.987734] Call Trace:
[  308.987738]  [<00000349976fa600>] expand+0x240/0x270
[  308.987744] ([<00000349976fa5fc>] expand+0x23c/0x270)
[  308.987749]  [<00000349976ff95e>] rmqueue_bulk+0x71e/0x940
[  308.987754]  [<00000349976ffd7e>] __rmqueue_pcplist+0x1fe/0x2a0
[  308.987759]  [<0000034997700966>] rmqueue.isra.0+0xb46/0xf40
[  308.987763]  [<0000034997703ec8>] get_page_from_freelist+0x198/0x8d0
[  308.987768]  [<0000034997706fa8>] __alloc_frozen_pages_noprof+0x198/0x400
[  308.987774]  [<00000349977536f8>] alloc_pages_mpol+0xb8/0x220
[  308.987781]  [<0000034997753bf6>] folio_alloc_mpol_noprof+0x26/0xc0
[  308.987786]  [<0000034997753e4c>] vma_alloc_folio_noprof+0x6c/0xa0
[  308.987791]  [<0000034997775b22>] vma_alloc_anon_folio_pmd+0x42/0x240
[  308.987799]  [<000003499777bfea>] __do_huge_pmd_anonymous_page+0x3a/0x210
[  308.987804]  [<00000349976cb08e>] __handle_mm_fault+0x4de/0x500
[  308.987809]  [<00000349976cb14c>] handle_mm_fault+0x9c/0x3a0
[  308.987813]  [<000003499734d70e>] do_exception+0x1de/0x540
[  308.987822]  [<0000034998387390>] __do_pgm_check+0x130/0x220
[  308.987830]  [<000003499839a934>] pgm_check_handler+0x114/0x160
[  308.987838] 3 locks held by mempig_verify/5224:
[  308.987842]  #0: 0000023ea44c1e08 (vm_lock){++++}-{0:0}, at: lock_vma_under_rcu+0xb2/0x2a0
[  308.987859]  #1: 0000023ee4d41b18 (&pcp->lock){+.+.}-{2:2}, at: rmqueue.isra.0+0xad6/0xf40
[  308.987871]  #2: 0000023efe6c8998 (&zone->lock){..-.}-{2:2}, at: rmqueue_bulk+0x5a/0x940
[  308.987886] Last Breaking-Event-Address:
[  308.987890]  [<0000034997379096>] __warn_printk+0x136/0x140
[  308.987897] irq event stamp: 52330356
[  308.987901] hardirqs last  enabled at (52330355): [<000003499838742e>] __do_pgm_check+0x1ce/0x220
[  308.987907] hardirqs last disabled at (52330356): [<000003499839932e>] _raw_spin_lock_irqsave+0x9e/0xe0
[  308.987913] softirqs last  enabled at (52329882): [<0000034997383786>] handle_softirqs+0x2c6/0x530
[  308.987922] softirqs last disabled at (52329859): [<0000034997382f86>] __irq_exit_rcu+0x126/0x140
[  308.987929] ---[ end trace 0000000000000000 ]---
[  308.987936] ------------[ cut here ]------------
[  308.987940] page type is 0, passed migratetype is 1 (nr=256)
[  308.987951] WARNING: CPU: 1 PID: 5224 at mm/page_alloc.c:860 __del_page_from_free_list+0x1be/0x1e0
[  308.987960] Modules linked in: algif_hash(E) af_alg(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nf_tables(E) s390_trng(E) vfio_ccw(E) mdev(E) vfio_iommu_type1(E) vfio(E) sch_fq_codel(E) drm(E) i2c_core(E) drm_panel_orientation_quirks(E) loop(E) nfnetlink(E) vsock_loopback(E) vmw_vsock_virtio_transport_common(E) vsock(E) ctcm(E) fsm(E) diag288_wdt(E) watchdog(E) zfcp(E) scsi_transport_fc(E) ghash_s390(E) prng(E) aes_s390(E) des_generic(E) des_s390(E) libdes(E) sha3_512_s390(E) sha3_256_s390(E) sha_common(E) paes_s390(E) crypto_engine(E) pkey_cca(E) pkey_ep11(E) zcrypt(E) rng_core(E) pkey_pckmo(E) pkey(E) autofs4(E)
[  308.988070] Unloaded tainted modules: hmac_s390(E):2
[  308.988087] CPU: 1 UID: 0 PID: 5224 Comm: mempig_verify Kdump: loaded Tainted: G        W   E       6.18.0-gcc-bpf-debug torvalds#431 PREEMPT
[  308.988095] Tainted: [W]=WARN, [E]=UNSIGNED_MODULE
[  308.988100] Hardware name: IBM 3906 M04 704 (z/VM 7.3.0)
[  308.988105] Krnl PSW : 0404f00180000000 00000349976f9e32 (__del_page_from_free_list+0x1c2/0x1e0)
[  308.988118]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:3 PM:0 RI:0 EA:3
[  308.988127] Krnl GPRS: 0000034980000004 0000000000000005 0000000000000030 000003499a0e6d88
[  308.988133]            0000000000000005 0000034980000005 0000034998d57290 0000023efe6c8300
[  308.988139]            0000000000000001 0000000000000008 000002be00000100 000002be803ac000
[  308.988144]            0000000000000000 0000000000000001 00000349976f9e2e 000002c99b1eb728
[  308.988153] Krnl Code: 00000349976f9e22: c020008a06d9	larl	%r2,000003499883abd4
                          00000349976f9e28: c0e5ffe3f89c	brasl	%r14,0000034997378f60
                         #00000349976f9e2e: af000000		mc	0,0
                         >00000349976f9e32: a7f4ff4e		brc	15,00000349976f9cce
                          00000349976f9e36: b904002b		lgr	%r2,%r11
                          00000349976f9e3a: c030008a06e7	larl	%r3,000003499883ac08
                          00000349976f9e40: c0e5fffdbac8	brasl	%r14,00000349976b13d0
                          00000349976f9e46: af000000		mc	0,0
[  308.988184] Call Trace:
[  308.988188]  [<00000349976f9e32>] __del_page_from_free_list+0x1c2/0x1e0
[  308.988195] ([<00000349976f9e2e>] __del_page_from_free_list+0x1be/0x1e0)
[  308.988202]  [<00000349976ff946>] rmqueue_bulk+0x706/0x940
[  308.988208]  [<00000349976ffd7e>] __rmqueue_pcplist+0x1fe/0x2a0
[  308.988214]  [<0000034997700966>] rmqueue.isra.0+0xb46/0xf40
[  308.988221]  [<0000034997703ec8>] get_page_from_freelist+0x198/0x8d0
[  308.988227]  [<0000034997706fa8>] __alloc_frozen_pages_noprof+0x198/0x400
[  308.988233]  [<00000349977536f8>] alloc_pages_mpol+0xb8/0x220
[  308.988240]  [<0000034997753bf6>] folio_alloc_mpol_noprof+0x26/0xc0
[  308.988247]  [<0000034997753e4c>] vma_alloc_folio_noprof+0x6c/0xa0
[  308.988253]  [<0000034997775b22>] vma_alloc_anon_folio_pmd+0x42/0x240
[  308.988260]  [<000003499777bfea>] __do_huge_pmd_anonymous_page+0x3a/0x210
[  308.988267]  [<00000349976cb08e>] __handle_mm_fault+0x4de/0x500
[  308.988273]  [<00000349976cb14c>] handle_mm_fault+0x9c/0x3a0
[  308.988279]  [<000003499734d70e>] do_exception+0x1de/0x540
[  308.988286]  [<0000034998387390>] __do_pgm_check+0x130/0x220
[  308.988293]  [<000003499839a934>] pgm_check_handler+0x114/0x160
[  308.988300] 3 locks held by mempig_verify/5224:
[  308.988305]  #0: 0000023ea44c1e08 (vm_lock){++++}-{0:0}, at: lock_vma_under_rcu+0xb2/0x2a0
[  308.988322]  #1: 0000023ee4d41b18 (&pcp->lock){+.+.}-{2:2}, at: rmqueue.isra.0+0xad6/0xf40
[  308.988334]  #2: 0000023efe6c8998 (&zone->lock){..-.}-{2:2}, at: rmqueue_bulk+0x5a/0x940
[  308.988346] Last Breaking-Event-Address:
[  308.988350]  [<0000034997379096>] __warn_printk+0x136/0x140
[  308.988356] irq event stamp: 52330356
[  308.988360] hardirqs last  enabled at (52330355): [<000003499838742e>] __do_pgm_check+0x1ce/0x220
[  308.988366] hardirqs last disabled at (52330356): [<000003499839932e>] _raw_spin_lock_irqsave+0x9e/0xe0
[  308.988373] softirqs last  enabled at (52329882): [<0000034997383786>] handle_softirqs+0x2c6/0x530
[  308.988380] softirqs last disabled at (52329859): [<0000034997382f86>] __irq_exit_rcu+0x126/0x140
[  308.988388] ---[ end trace 0000000000000000 ]---

Link: https://lkml.kernel.org/r/20251215081002.3353900A9c-agordeev@linux.ibm.com
Link: https://lkml.kernel.org/r/20251212151457.3898073Add-agordeev@linux.ibm.com
Fixes: e6cf9e1 ("mm: page_alloc: fix up block types when merging compatible blocks")
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Closes: https://lore.kernel.org/linux-mm/87wmalyktd.fsf@linux.ibm.com/
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Marc Hartmayer <mhartmay@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The entry for the Qualcomm bluetooth driver only now got sent upstream and
still has my old address.  Update it to use the kernel.org one.

Link: https://lkml.kernel.org/r/20251204104531.22045-1-bartosz.golaszewski@oss.qualcomm.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Cc: Hans Verkuil <hverkuil@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
…entry()

If the PTE page table lock is acquired by pte_offset_map_lock(), the lock
must be released via pte_unmap_unlock().

However, in damos_va_migrate_pmd_entry(), if damos_va_filter_out() returns
true, it immediately returns without releasing the lock.

This fixes the issue by not stopping page table traversal when
damos_va_filter_out() returns true and ensuring that the lock is released.

Link: https://lkml.kernel.org/r/20251209151034.77221-1-akinobu.mita@gmail.com
Fixes: 09efc56 ("mm/damon/vaddr: consistently use only pmd_entry for damos_migrate")
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Since commit 01ef029 (".mailmap: add entry for WangYuli") was merged
into mainline, I've received feedback from former colleagues: They believe
the change to .mailmap affects git log based statistics, which in turn
reduces the reported “contributions from uniontech” in the Linux
commit tree, and they think it's difficult to explain to everyone that
future statistics must be generated with the --no-use-mailmap option.

I don't have a strong opinion either way, but since my commit has caused
them trouble, I'm now requesting that this line be removed to bring a
little more LOVE AND PEACE to the world :-)

Link: https://lkml.kernel.org/r/20251208025730.33881-1-wangyuli@aosc.io
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
Cc: Carlos Bilbao <carlos.bilbao@kernel.org>
Cc: Hans Verkuil <hverkuil@kernel.org>
Cc: Martin Kepplinger <martink@posteo.de>
Cc: Shannon Nelson <sln@onemain.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
calebsander and others added 15 commits January 1, 2026 08:16
…st()

io_uring_validate_mmap_request() doesn't use its size_t sz argument, so
remove it.

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The function bfq_bfqq_may_idle() was renamed as bfq_better_to_idle()
in commit 277a4a9 ("block, bfq: give a better name to
bfq_bfqq_may_idle").  Update the comment accordingly.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
…h/cifs-2.6

Pull smb client fixes from Steve French:

 - Fix array out of bounds error in copy_file_range

 - Add tracepoint to help debug ioctl failures

* tag 'v6.19-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb: client: fix UBSAN array-index-out-of-bounds in smb2_copychunk_range
  smb3 client: add missing tracepoint for unsupported ioctls
Pull smb server fixes from Steve French:

 - Fix memory leak

 - Fix two refcount leaks

 - Fix error path in create_smb2_pipe

* tag 'v6.19-rc3-smb3-server-fixes' of git://git.samba.org/ksmbd:
  smb/server: fix refcount leak in smb2_open()
  smb/server: fix refcount leak in parse_durable_handle_context()
  smb/server: call ksmbd_session_rpc_close() on error path in create_smb2_pipe()
  ksmbd: Fix memory leak in get_file_all_info()
…m/kernel

Pull drm fixes from Dave Airlie:
 "Happy New Year, jetlagged fixes from me, still pretty quiet, xe is
  most of this, with i915/nouveau/imagination fixes and some shmem
  cleanups.

  shmem:
   - docs and MODULE_LICENSE fix

  xe:
   - Ensure svm device memory is idle before migration completes
   - Fix a SVM debug printout
   - Use READ_ONCE() / WRITE_ONCE() for g2h_fence

  i915:
   - Fix eb_lookup_vmas() failure path

  nouveau:
   - fix prepare_fb warnings

  imagination:
   - prevent export of protected objects"

* tag 'drm-fixes-2026-01-02' of https://gitlab.freedesktop.org/drm/kernel:
  drm/i915/gem: Zero-initialize the eb.vma array in i915_gem_do_execbuffer
  drm/xe/guc: READ/WRITE_ONCE g2h_fence->done
  drm/pagemap, drm/xe: Ensure that the devmem allocation is idle before use
  drm/xe/svm: Fix a debug printout
  drm/gem-shmem: Fix the MODULE_LICENSE() string
  drm/gem-shmem: Fix typos in documentation
  drm/nouveau/dispnv50: Don't call drm_atomic_get_crtc_state() in prepare_fb
  drm/imagination: Disallow exporting of PM/FW protected objects
…nux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch fixes from Huacai Chen:
 "Complete CPUCFG registers definition, set correct protection_map[] for
  VM_NONE/VM_SHARED, fix some bugs in the orc stack unwinder, ftrace and
  BPF JIT"

* tag 'loongarch-fixes-6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  samples/ftrace: Adjust LoongArch register restore order in direct calls
  LoongArch: BPF: Enhance the bpf_arch_text_poke() function
  LoongArch: BPF: Enable trampoline-based tracing for module functions
  LoongArch: BPF: Adjust the jump offset of tail calls
  LoongArch: BPF: Save return address register ra to t0 before trampoline
  LoongArch: BPF: Zero-extend bpf_tail_call() index
  LoongArch: BPF: Sign extend kfunc call arguments
  LoongArch: Refactor register restoration in ftrace_common_return
  LoongArch: Enable exception fixup for specific ADE subcode
  LoongArch: Remove unnecessary checks for ORC unwinder
  LoongArch: Remove is_entry_func() and kernel_entry_end
  LoongArch: Use UNWIND_HINT_END_OF_STACK for entry points
  LoongArch: Set correct protection_map[] for VM_NONE/VM_SHARED
  LoongArch: Complete CPUCFG registers definition
…ux/kernel/git/tip/tip

Pull x86 fix from Ingo Molnar:
 "Fix the AMD microcode Entrysign signature checking code to include
  more models"

* tag 'x86-urgent-2026-01-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/microcode/AMD: Fix Entrysign revision check for Zen5/Strix Halo
…nux/kernel/git/axboe/linux

Pull io_uring fixes from Jens Axboe:

 - Removed dead argument length for io_uring_validate_mmap_request()

 - Use GFP_NOWAIT for overflow CQEs on legacy ring setups rather than
   GFP_ATOMIC, which makes it play nicer with memcg limits

 - Fix a potential circular locking issue with tctx node removal and
   exec based cancelations

* tag 'io_uring-6.19-20260102' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  io_uring/memmap: drop unused sz param in io_uring_validate_mmap_request()
  io_uring/tctx: add separate lock for list of tctx's in ctx
  io_uring: use GFP_NOWAIT for overflow CQEs on legacy rings
…/kernel/git/axboe/linux

Pull block fixes from Jens Axboe:

 - Scan partition tables asynchronously for ublk, similarly to how nvme
   does it. This avoids potential deadlocks, which is why nvme does it
   that way too. Includes a set of selftests as well.

 - MD pull request via Yu:
     - Fix null-pointer dereference in raid5 sysfs group_thread_cnt
       store (Tuo Li)
     - Fix possible mempool corruption during raid1 raid_disks update
       via sysfs (FengWei Shih)
     - Fix logical_block_size configuration being overwritten during
       super_1_validate() (Li Nan)
     - Fix forward incompatibility with configurable logical block size:
       arrays assembled on new kernels could not be assembled on older
       kernels (v6.18 and before) due to non-zero reserved pad rejection
       (Li Nan)
     - Fix static checker warning about iterator not incremented (Li Nan)

 - Skip CPU offlining notifications on unmapped hardware queues

 - bfq-iosched block stats fix

 - Fix outdated comment in bfq-iosched

* tag 'block-6.19-20260102' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  block, bfq: update outdated comment
  blk-mq: skip CPU offline notify on unmapped hctx
  selftests/ublk: fix Makefile to rebuild on header changes
  selftests/ublk: add test for async partition scan
  ublk: scan partition in async way
  block,bfq: fix aux stat accumulation destination
  md: Fix forward incompatibility from configurable logical block size
  md: Fix logical_block_size configuration being overwritten
  md: suspend array while updating raid_disks via sysfs
  md/raid5: fix possible null-pointer dereferences in raid5_store_group_thread_cnt()
  md: Fix static checker warning in analyze_sbs
…b/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest fixes from Shuah Khan:

 - Fix for build failures in tests that use an empty FIXTURE() seen in
   Android's build environment, which uses -D_FORTIFY_SOURCE=3, a build
   failure occurs in tests that use an empty FIXTURE()

 - Fix func_traceonoff_triggers.tc sometimes failures on Kunpeng-920
   board resulting from including transient trace file name in checksum
   compare

 - Fix to remove available_events requirement from toplevel-enable for
   instance as it isn't a valid requirement for this test

* tag 'linux_kselftest-fixes-6.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kselftest/harness: Use helper to avoid zero-size memset warning
  selftests/ftrace: Test toplevel-enable for instance
  selftests/ftrace: traceonoff_triggers: strip off names
…t/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:

 - Fix several syzkaller found bugs:
    - Poor parsing of the RDMA_NL_LS_OP_IP_RESOLVE netlink
    - GID entry refcount leaking when CM destruction races with
      multicast establishment
    - Missing refcount put in ib_del_sub_device_and_put()

 - Fixup recently introduced uABI padding for 32 bit consistency

 - Avoid user triggered math overflow in MANA and AFA

 - Reading invalid netdev data during an event

 - kdoc fixes

 - Fix never-working gid copying in ib_get_gids_from_rdma_hdr

 - Typo in bnxt when validating the BAR

 - bnxt mis-parsed IB_SEND_IP_CSUM so it didn't work always

 - bnxt out of bounds access in bnxt related to the counters on new
   devices

 - Allocate the bnxt PDE table with the right sizing

 - Use dma_free_coherent() correctly in bnxt

 - Allow rxe to be unloadable when CONFIG_PROVE_LOCKING by adjusting the
   tracking of the global sockets it uses

 - Missing unlocking on error path in rxe

 - Compute the right number of pages in a MR in rtrs

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/bnxt_re: fix dma_free_coherent() pointer
  RDMA/rtrs: Fix clt_path::max_pages_per_mr calculation
  IB/rxe: Fix missing umem_odp->umem_mutex unlock on error path
  RDMA/bnxt_re: Fix to use correct page size for PDE table
  RDMA/bnxt_re: Fix OOB write in bnxt_re_copy_err_stats()
  RDMA/bnxt_re: Fix IB_SEND_IP_CSUM handling in post_send
  RDMA/core: always drop device refcount in ib_del_sub_device_and_put()
  RDMA/rxe: let rxe_reclassify_recv_socket() call sk_owner_put()
  RDMA/bnxt_re: Fix incorrect BAR check in bnxt_qplib_map_creq_db()
  RDMA/core: Fix logic error in ib_get_gids_from_rdma_hdr()
  RDMA/efa: Remove possible negative shift
  RTRS/rtrs: clean up rtrs headers kernel-doc
  RDMA/irdma: avoid invalid read in irdma_net_event
  RDMA/mana_ib: check cqe length for kernel CQs
  RDMA/irdma: Fix irdma_alloc_ucontext_resp padding
  RDMA/ucma: Fix rdma_ucm_query_ib_service_resp struct padding
  RDMA/cm: Fix leaking the multicast GID table reference
  RDMA/core: Check for the presence of LS_NLA_TYPE_DGID correctly
…/linux/kernel/git/ebiggers/linux

Pull crypto library fix from Eric Biggers:
 "Fix the kunit_run_irq_test() function (which I recently added for the
  CRC and crypto tests) to be less timing-dependent.

  This fixes flakiness in the polyval kunit test suite"

* tag 'libcrypto-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
  kunit: Enforce task execution in {soft,hard}irq contexts
…git/rafael/linux-pm

Pull power management fix from Rafael Wysocki:
 "Fix a recent regression that affects system suspend testing
  at the 'core' level (Rafael Wysocki)"

* tag 'pm-6.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: sleep: Fix suspend_test() at the TEST_CORE level
In unittest_data_add(), if of_resolve_phandles() fails, the allocated
unittest_data is not freed, leading to a memory leak.

Fix this by using scope-based cleanup helper __free(kfree) for automatic
resource cleanup. This ensures unittest_data is automatically freed when
it goes out of scope in error paths.

For the success path, use retain_and_null_ptr() to transfer ownership
of the memory to the device tree and prevent double freeing.

Fixes: 2eb46da ("of/selftest: Use the resolver to fixup phandles")
Suggested-by: Rob Herring <robh@kernel.org>
Co-developed-by: Jianhao Xu <jianhao.xu@seu.edu.cn>
Signed-off-by: Jianhao Xu <jianhao.xu@seu.edu.cn>
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Link: https://patch.msgid.link/20251231114915.234638-1-zilin@seu.edu.cn
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
….org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tool fixes and from Namhyung Kim:

 - skip building BPF skeletons if libopenssl is missing

 - a couple of test updates

 - handle error cases of filename__read_build_id()

 - support NVIDIA Olympus for ARM SPE profiling

 - update tool headers to sync with the kernel

* tag 'perf-tools-fixes-for-v6.19-2026-01-02' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
  tools build: Fix the common set of features test wrt libopenssl
  tools headers: Sync syscall table with kernel sources
  tools headers: Sync linux/socket.h with kernel sources
  tools headers: Sync linux/gfp_types.h with kernel sources
  tools headers: Sync arm64 headers with kernel sources
  tools headers: Sync x86 headers with kernel sources
  tools headers: Sync UAPI sound/asound.h with kernel sources
  tools headers: Sync UAPI linux/mount.h with kernel sources
  tools headers: Sync UAPI linux/fs.h with kernel sources
  tools headers: Sync UAPI linux/fcntl.h with kernel sources
  tools headers: Sync UAPI KVM headers with kernel sources
  tools headers: Sync UAPI drm/drm.h with kernel sources
  perf arm-spe: Add NVIDIA Olympus to neoverse list
  tools headers arm64: Add NVIDIA Olympus part
  perf tests top: Make the test exclusive
  perf tests kvm: Avoid leaving perf.data.guest file around
  perf symbol: Fix ENOENT case for filename__read_build_id
  perf tools: Disable BPF skeleton if no libopenssl found
  tools/build: Add a feature test for libopenssl
@avagin avagin force-pushed the wip/misc-cgroup-hwcap-mask branch from cef7fc3 to 5cceb81 Compare January 2, 2026 23:55
torvalds and others added 5 commits January 3, 2026 09:18
…kernel/git/ulfh/linux-pm

Pull pmdomain fixes from Ulf Hansson:

 - mediatek: Fix spinlock recursion fix during probe

 - imx: Fix reference count leak during probe

* tag 'pmdomain-v6.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
  pmdomain: imx: Fix reference count leak in imx_gpc_probe()
  pmdomain: mtk-pm-domains: Fix spinlock recursion fix in probe
…/linux/kernel/git/tip/tip

Pull core entry fix from Borislav Petkov:

 - Make sure clang inlines trivial local_irq_* helpers

* tag 'core_urgent_for_v6.19_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  entry: Always inline local_irq_{enable,disable}_exit_to_user()
…cm/linux/kernel/git/robh/linux

Pull devicetree fixes from Rob Herring:

 - Fix an error path memory leak in DT unittest

 - Update Saravana's bouncing email

* tag 'devicetree-fixes-for-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  of: unittest: Fix memory leak in unittest_data_add()
  MAINTAINERS: Update Saravana Kannan's email address
The PR_SET_MM_AUXV option allows a process to update its auxiliary vector,
which is used to communicate kernel/platform information to userspace.

The current requirement for CAP_SYS_RESOURCE is unnecessarily
restrictive.  Modifying the auxiliary vector only affects userspace's
view of its environment and does not affect the kernel behaviour. Plus,
PR_SET_MM_MAP already allows processes to update their auxiliary vector
without CAP_SYS_RESOURCE.

Align PR_SET_MM_AUXV with PR_SET_MM_MAP by moving the AUVX handler
before the capability check in prctl_set_mm().

Signed-off-by: Andrei Vagin <avagin@google.com>
@avagin avagin force-pushed the wip/misc-cgroup-hwcap-mask branch 4 times, most recently from 474938e to 45979f5 Compare January 7, 2026 00:57
avagin added 4 commits January 7, 2026 01:20
…CAP4

Commit 4e6e8c2 ("binfmt_elf: Wire up AT_HWCAP3 at AT_HWCAP4") added
support for AT_HWCAP3 and AT_HWCAP4, but it missed updating the AUX
vector size calculation in create_elf_fdpic_tables() and
AT_VECTOR_SIZE_BASE in include/linux/auxvec.h.

Similar to the fix for ELF_HWCAP2 in commit c6a09e3
("binfmt_elf_fdpic: fix AUXV size calculation when ELF_HWCAP2 is defined"),
this omission leads to a mismatch between the reserved space and the
actual number of AUX entries, eventually triggering a kernel BUG_ON(csp != sp).

Fix this by incrementing nitems when ELF_HWCAP3 or ELF_HWCAP4 are defined
and updating AT_VECTOR_SIZE_BASE.

Cc: Mark Brown <broonie@kernel.org>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Fixes: 4e6e8c2 ("binfmt_elf: Wire up AT_HWCAP3 at AT_HWCAP4")
Signed-off-by: Andrei Vagin <avagin@google.com>
Introduces a mechanism to inherit hardware capabilities (AT_HWCAP,
AT_HWCAP2, etc.) from a parent process when they have been modified via
prctl.

To support C/R operations (snapshots, live migration) in heterogeneous
clusters, we must ensure that processes utilize CPU features available
on all potential target nodes. To solve this, we need to advertise a
common feature set across the cluster.

This patch adds a new mm flag MMF_USER_HWCAP, which is set when the
auxiliary vector is modified via prctl(PR_SET_MM, PR_SET_MM_AUXV).  When
execve() is called, if the current process has MMF_USER_HWCAP set, the
HWCAP values are extracted from the current auxiliary vector and stored
in the linux_binprm structure. These values are then used to populate
the auxiliary vector of the new process, effectively inheriting the
hardware capabilities.

The inherited HWCAPs are masked with the hardware capabilities supported
by the current kernel to ensure that we don't report more features than
actually supported. This is important to avoid unexpected behavior,
especially for processes with additional privileges.

Signed-off-by: Andrei Vagin <avagin@google.com>
Verify that HWCAPs are correctly inherited from the parent process when
modified via prctl(PR_SET_MM_AUXV).

Signed-off-by: Andrei Vagin <avagin@google.com>
@avagin avagin force-pushed the wip/misc-cgroup-hwcap-mask branch from 45979f5 to d4003c7 Compare January 7, 2026 01:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.