Skip to content

[6.6-velinux] - Intel RDT monitoring support on Sub-NUMA Cluster (SNC) enabled systems#11

Merged
guojinhui-liam merged 68 commits into
6.6-velinuxfrom
6.6-velinux-rdt-snc
Dec 11, 2024
Merged

[6.6-velinux] - Intel RDT monitoring support on Sub-NUMA Cluster (SNC) enabled systems#11
guojinhui-liam merged 68 commits into
6.6-velinuxfrom
6.6-velinux-rdt-snc

Conversation

@xiaochenshen
Copy link
Copy Markdown
Contributor

About Intel RDT with Sub-NUMA Cluster (SNC) enabled

Supported Intel platforms: ICX, SPR, EMR, GNR and etc.

The Sub-NUMA Cluster (SNC) feature on some Intel processors partitions the
CPUs that share an L3 cache into two or more sets. This plays havoc with
the Resource Director Technology (RDT) monitoring features. Prior to this
patch Intel has advised that SNC and RDT are incompatible.

Some of these CPUs support an MSR that can partition the RMID counters in
the same way. This allows monitoring features to be used. Legacy monitoring
files provide the sum of counters from each SNC node for backwards
compatibility. Additional files per SNC node provide details per node.

With Sub-NUMA Cluster (SNC) mode enabled the scope of monitoring resources
is per-NODE instead of per-L3 cache. Backwards compatibility is maintained
by providing files in the mon_L3_XX directories that sum event counts
for all SNC nodes sharing an L3 cache.

The top-level monitoring files in each "mon_L3_XX" directory provide the
sum of data across all SNC nodes sharing an L3 cache instance.
Users who bind tasks to the CPUs of a specific Sub-NUMA node can read the
"llc_occupancy", "mbm_total_bytes", and "mbm_local_bytes" in the
"mon_sub_L3_YY" directories to get node local data.

For a system with two SNC nodes per L3, the monitor reporting files like this:

# cd /sys/fs/resctrl/mon_data
# tree mon_L3_00
mon_L3_00                  <- 00 here is L3 cache id
├── llc_occupancy              \  These files provide legacy support
├── mbm_local_bytes             > for non-SNC aware monitor apps
├── mbm_total_bytes            /  that expect data at L3 cache level
├── mon_sub_L3_00          <- 00 here is SNC node id
│   ├── llc_occupancy          \  These files are finer grained
│   ├── mbm_local_bytes         > data from each SNC node
│   └── mbm_total_bytes        /
└── mon_sub_L3_01
    ├── llc_occupancy          \
    ├── mbm_local_bytes         > As above, but for node 1.
    └── mbm_total_bytes        /

Note for Intel RDT control features:
Memory bandwidth allocation is still performed at the L3 cache level. I.e.
throttling controls are applied to all SNC nodes.

L3 cache allocation bitmaps also apply to all SNC nodes. But note that the
amount of L3 cache represented by each bit is divided by the number of SNC
nodes per L3 cache. E.g. with a 100MB cache on a system with 10-bit
allocation masks each bit normally represents 10MB. With SNC mode enabled
with two SNC nodes per L3 cache, each bit only represents 5MB.

About the patches
There are 68 backporting upstream patches against 6.6 LTS kernel. The patch set consists of 2 parts:
(1) Dependencies: x86/resctrl incremental patches until 6.11 upstream kernel.
(2) Intel RDT monitoring support patches for Sub-NUMA Cluster (SNC) enabled systems.

Passed tests
Passed Intel RDT with SNC test cases on ICX, EMR and GNR platforms (e.g., SNC2, SNC3 and etc.)

maciejwieczorretman and others added 30 commits September 18, 2024 13:45
commit f05fd4c upstream.

The kernel test robot reported kernel-doc warnings here:

  arch/x86/kernel/cpu/resctrl/rdtgroup.c:915: warning: Function parameter or member 'of' not described in 'rdt_bit_usage_show'
  arch/x86/kernel/cpu/resctrl/rdtgroup.c:915: warning: Function parameter or member 'seq' not described in 'rdt_bit_usage_show'
  arch/x86/kernel/cpu/resctrl/rdtgroup.c:915: warning: Function parameter or member 'v' not described in 'rdt_bit_usage_show'
  arch/x86/kernel/cpu/resctrl/rdtgroup.c:1144: warning: Function parameter or member 'type' not described in '__rdtgroup_cbm_overlaps'
  arch/x86/kernel/cpu/resctrl/rdtgroup.c:1224: warning: Function parameter or member 'rdtgrp' not described in 'rdtgroup_mode_test_exclusive'
  arch/x86/kernel/cpu/resctrl/rdtgroup.c:1261: warning: Function parameter or member 'of' not described in 'rdtgroup_mode_write'
  arch/x86/kernel/cpu/resctrl/rdtgroup.c:1261: warning: Function parameter or member 'buf' not described in 'rdtgroup_mode_write'
  arch/x86/kernel/cpu/resctrl/rdtgroup.c:1261: warning: Function parameter or member 'nbytes' not described in 'rdtgroup_mode_write'
  arch/x86/kernel/cpu/resctrl/rdtgroup.c:1261: warning: Function parameter or member 'off' not described in 'rdtgroup_mode_write'
  arch/x86/kernel/cpu/resctrl/rdtgroup.c:1370: warning: Function parameter or member 'of' not described in 'rdtgroup_size_show'
  arch/x86/kernel/cpu/resctrl/rdtgroup.c:1370: warning: Function parameter or member 's' not described in 'rdtgroup_size_show'
  arch/x86/kernel/cpu/resctrl/rdtgroup.c:1370: warning: Function parameter or member 'v' not described in 'rdtgroup_size_show'

The first two functions are missing an argument description while the
other three are file callbacks and don't require a kernel-doc comment.

Intel-SIG: commit f05fd4c x86/resctrl: Fix remaining kernel-doc warnings.
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Closes: https://lore.kernel.org/oe-kbuild-all/202310070434.mD8eRNAz-lkp@intel.com/
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Newman <peternewman@google.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/r/20231011064843.246592-1-maciej.wieczor-retman@intel.com
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit fe2a20e upstream.

The resctrl task assignment for monitor or control group needs to be
done one at a time. For example:

  $mount -t resctrl resctrl /sys/fs/resctrl/
  $mkdir /sys/fs/resctrl/ctrl_grp1
  $echo 123 > /sys/fs/resctrl/ctrl_grp1/tasks
  $echo 456 > /sys/fs/resctrl/ctrl_grp1/tasks
  $echo 789 > /sys/fs/resctrl/ctrl_grp1/tasks

This is not user-friendly when dealing with hundreds of tasks.

Support multiple task assignment in one command with tasks ids separated
by commas. For example:

  $echo 123,456,789 > /sys/fs/resctrl/ctrl_grp1/tasks

Intel-SIG: commit fe2a20e x86/resctrl: Add multiple tasks to the resctrl group at once.
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Peter Newman <peternewman@google.com>
Reviewed-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com>
Link: https://lore.kernel.org/r/20231017002308.134480-2-babu.moger@amd.com
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit 6846dc1 upstream.

The rftype flags are bitmaps used for adding files under the resctrl
filesystem. Some of these bitmap defines have one extra level of
indirection which is not necessary.

Drop the RF_* defines and simplify the macros.

  [ bp: Massage commit message. ]

Intel-SIG: commit 6846dc1 x86/resctrl: Simplify rftype flag definitions.
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Peter Newman <peternewman@google.com>
Reviewed-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com>
Link: https://lore.kernel.org/r/20231017002308.134480-3-babu.moger@amd.com
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit d415924 upstream.

resctrl associates rftype flags with its files so that files can be chosen
based on the resource, whether it is info or base, and if it is control
or monitor type file. These flags use the RF_ as well as RFTYPE_ prefixes.

Change the prefix to RFTYPE_ for all these flags to be consistent.

Intel-SIG: commit d415924 x86/resctrl: Rename rftype flags for consistency.
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Peter Newman <peternewman@google.com>
Reviewed-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com>
Link: https://lore.kernel.org/r/20231017002308.134480-4-babu.moger@amd.com
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit df5f3a1 upstream.

rdt_enable_ctx() enables the features provided during resctrl mount.

Additions to rdt_enable_ctx() are required to also modify error paths
of rdt_enable_ctx() callers to ensure correct unwinding if errors
are encountered after calling rdt_enable_ctx(). This is error prone.

Introduce rdt_disable_ctx() to refactor the error unwinding of
rdt_enable_ctx() to simplify future additions. This also simplifies
cleanup in rdt_kill_sb().

Intel-SIG: commit df5f3a1 x86/resctrl: Unwind properly from rdt_enable_ctx().
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Peter Newman <peternewman@google.com>
Reviewed-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com>
Link: https://lore.kernel.org/r/20231017002308.134480-5-babu.moger@amd.com
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit d27567a upstream.

The default resource group and its files are created during kernel init
time. Upcoming changes will make some resctrl files optional based on
a mount parameter. If optional files are to be added to the default
group based on the mount option, then each new file needs to be created
separately and call kernfs_activate() again.

Create all files of the default resource group during resctrl mount,
destroyed during unmount, to avoid scattering resctrl file addition
across two separate code flows.

Intel-SIG: commit d27567a x86/resctrl: Move default group file creation to mount.
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Peter Newman <peternewman@google.com>
Reviewed-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com>
Link: https://lore.kernel.org/r/20231017002308.134480-6-babu.moger@amd.com
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit cb07d71 upstream.

Add "-o debug" option to mount resctrl filesystem in debug mode.  When
in debug mode resctrl displays files that have the new RFTYPE_DEBUG flag
to help resctrl debugging.

Intel-SIG: commit cb07d71 x86/resctrl: Introduce "-o debug" mount option.
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Peter Newman <peternewman@google.com>
Reviewed-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com>
Link: https://lore.kernel.org/r/20231017002308.134480-7-babu.moger@amd.com
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit ca8dad2 upstream.

In x86, hardware uses CLOSID to identify a control group. When a user
creates a control group this information is not visible to the user. It
can help resctrl debugging.

Add CLOSID(ctrl_hw_id) to the control groups display in the resctrl
interface. Users can see this detail when resctrl is mounted with the
"-o debug" option.

Other architectures do not use "CLOSID". Use the names ctrl_hw_id to refer
to "CLOSID" in an effort to keep the naming generic.

For example:
  $cat /sys/fs/resctrl/ctrl_grp1/ctrl_hw_id
  1

Intel-SIG: commit ca8dad2 x86/resctrl: Display CLOSID for resource group.
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Peter Newman <peternewman@google.com>
Reviewed-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com>
Link: https://lore.kernel.org/r/20231017002308.134480-8-babu.moger@amd.com
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit 918f211 upstream.

Files unique to monitoring groups have the RFTYPE_MON flag. When a new
monitoring group is created the resctrl files with flags RFTYPE_BASE
(files common to all resource groups) and RFTYPE_MON (files unique to
monitoring groups) are created to support interacting with the new
monitoring group.

A resource group can support both monitoring and control, also termed
a CTRL_MON resource group. CTRL_MON groups should get both monitoring
and control resctrl files but that is not the case. Only the
RFTYPE_BASE and RFTYPE_CTRL files are created for CTRL_MON groups.

Ensure that files with the RFTYPE_MON flag are created for CTRL_MON groups.

Intel-SIG: commit 918f211 x86/resctrl: Add support for the files of MON groups only.
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Peter Newman <peternewman@google.com>
Reviewed-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com>
Link: https://lore.kernel.org/r/20231017002308.134480-9-babu.moger@amd.com
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit 4cee14b upstream.

In x86, hardware uses RMID to identify a monitoring group. When a user
creates a monitor group these details are not visible. These details
can help resctrl debugging.

Add RMID(mon_hw_id) to the monitor groups display in the resctrl interface.
Users can see these details when resctrl is mounted with "-o debug" option.

Add RFTYPE_MON_BASE that complements existing RFTYPE_CTRL_BASE and
represents files belonging to monitoring groups.

Other architectures do not use "RMID". Use the name mon_hw_id to refer
to "RMID" in an effort to keep the naming generic.

For example:
  $cat /sys/fs/resctrl/mon_groups/mon_grp1/mon_hw_id
  3

Intel-SIG: commit 4cee14b x86/resctrl: Display RMID of resource group.
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Peter Newman <peternewman@google.com>
Reviewed-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com>
Link: https://lore.kernel.org/r/20231017002308.134480-10-babu.moger@amd.com
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit 1b908de upstream.

In a "W=1" build gcc throws a warning:

  arch/x86/kernel/cpu/resctrl/core.c: In function ‘cache_alloc_hsw_probe’:
  arch/x86/kernel/cpu/resctrl/core.c:139:16: warning: variable ‘h’ set but not used

Switch from wrmsr_safe() to wrmsrl_safe(), and from rdmsr() to rdmsrl()
using a single u64 argument for the MSR value instead of the pair of u32
for the high and low halves.

Intel-SIG: commit 1b908de x86/resctrl: Fix unused variable warning in cache_alloc_hsw_probe().
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Acked-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/r/ZULCd/TGJL9Dmncf@agluck-desk3
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit fc747ee upstream.

The kernel test robot reported the following warning after commit

  54e35eb ("x86/resctrl: Read supported bandwidth sources from CPUID").

even though the issue is present even in the original commit

  92bd5a1 ("x86/resctrl: Add interface to write mbm_total_bytes_config")

which added this function. The reported warning is:

  $ make C=1 CHECK=scripts/coccicheck arch/x86/kernel/cpu/resctrl/rdtgroup.o
  ...
  arch/x86/kernel/cpu/resctrl/rdtgroup.c:1621:5-8: Unneeded variable: "ret". Return "0" on line 1655

Remove the local variable 'ret'.

  [ bp: Massage commit message, make mbm_config_write_domain() void. ]

Intel-SIG: commit fc747ee x86/resctrl: Remove redundant variable in mbm_config_write_domain().
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Fixes: 92bd5a1 ("x86/resctrl: Add interface to write mbm_total_bytes_config")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202401241810.jbd8Ipa1-lkp@intel.com/
Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/r/202401241810.jbd8Ipa1-lkp@intel.com
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit 31a5c0b upstream.

tick_nohz_full_mask lists the CPUs that are nohz_full. This is only needed when
CONFIG_NO_HZ_FULL is defined. tick_nohz_full_cpu() allows a specific CPU to be
tested against the mask, and evaluates to false when CONFIG_NO_HZ_FULL is not
defined.

The resctrl code needs to pick a CPU to run some work on, a new helper prefers
housekeeping CPUs by examining the tick_nohz_full_mask. Hiding the declaration
behind #ifdef CONFIG_NO_HZ_FULL forces all the users to be behind an #ifdef
too.

Move the tick_nohz_full_mask declaration, this lets callers drop the #ifdef,
and guard access to tick_nohz_full_mask with IS_ENABLED() or something like
tick_nohz_full_cpu().

The definition does not need to be moved as any callers should be removed at
compile time unless CONFIG_NO_HZ_FULL is defined.

Intel-SIG: commit 31a5c0b tick/nohz: Move tick_nohz_full_mask declaration outside the #ifdef.
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Reinette Chatre <reinette.chatre@intel.com> # for resctrl dependency
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-2-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit 3f7b073 upstream.

rmid_ptrs[] is allocated from dom_data_init() but never free()d.

While the exit text ends up in the linker script's DISCARD section,
the direction of travel is for resctrl to be/have loadable modules.

Add resctrl_put_mon_l3_config() to cleanup any memory allocated
by rdt_get_mon_l3_config().

There is no reason to backport this to a stable kernel.

Intel-SIG: commit 3f7b073 x86/resctrl: Free rmid_ptrs from resctrl_exit().
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-3-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit b1de313 upstream.

When monitoring is supported, each monitor and control group is allocated an
RMID. For control groups, rdtgroup_mkdir_ctrl_mon() later goes on to allocate
the CLOSID.

MPAM's equivalent of RMID are not an independent number, so can't be allocated
until the CLOSID is known. An RMID allocation for one CLOSID may fail, whereas
another may succeed depending on how many monitor groups a control group has.

The RMID allocation needs to move to be after the CLOSID has been allocated.

Move the RMID allocation and mondata dir creation to a helper.

Intel-SIG: commit b1de313 x86/resctrl: Create helper for RMID allocation and mondata dir creation.
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-4-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit 311639e upstream.

RMIDs are allocated for each monitor or control group directory, because
each of these needs its own RMID. For control groups,
rdtgroup_mkdir_ctrl_mon() later goes on to allocate the CLOSID.

MPAM's equivalent of RMID is not an independent number, so can't be
allocated until the CLOSID is known. An RMID allocation for one CLOSID
may fail, whereas another may succeed depending on how many monitor
groups a control group has.

The RMID allocation needs to move to be after the CLOSID has been
allocated.

Move the RMID allocation out of mkdir_rdt_prepare() to occur in its caller,
after the mkdir_rdt_prepare() call. This allows the RMID allocator to
know the CLOSID.

Intel-SIG: commit 311639e x86/resctrl: Move RMID allocation out of mkdir_rdt_prepare().
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-5-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit 40fc735 upstream.

x86's RMID are independent of the CLOSID. An RMID can be allocated,
used and freed without considering the CLOSID.

MPAM's equivalent feature is PMG, which is not an independent number,
it extends the CLOSID/PARTID space. For MPAM, only PMG-bits worth of
'RMID' can be allocated for a single CLOSID.
i.e. if there is 1 bit of PMG space, then each CLOSID can have two
monitor groups.

To allow resctrl to disambiguate RMID values for different CLOSID,
everything in resctrl that keeps an RMID value needs to know the CLOSID
too. This will always be ignored on x86.

Intel-SIG: commit 40fc735 x86/resctrl: Track the closid with the rmid.
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Xin Hao <xhao@linux.alibaba.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-6-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit 6791e0e upstream.

x86 systems identify traffic using the CLOSID and RMID. The CLOSID is
used to lookup the control policy, the RMID is used for monitoring. For
x86 these are independent numbers.
Arm's MPAM has equivalent features PARTID and PMG, where the PARTID is
used to lookup the control policy. The PMG in contrast is a small number
of bits that are used to subdivide PARTID when monitoring. The
cache-occupancy monitors require the PARTID to be specified when
monitoring.

This means MPAM's PMG field is not unique. There are multiple PMG-0, one
per allocated CLOSID/PARTID. If PMG is treated as equivalent to RMID, it
cannot be allocated as an independent number. Bitmaps like rmid_busy_llc
need to be sized by the number of unique entries for this resource.

Treat the combined CLOSID and RMID as an index, and provide architecture
helpers to pack and unpack an index. This makes the MPAM values unique.
The domain's rmid_busy_llc and rmid_ptrs[] are then sized by index, as
are domain mbm_local[] and mbm_total[].

x86 can ignore the CLOSID field when packing and unpacking an index, and
report as many indexes as RMID.

Intel-SIG: commit 6791e0e x86/resctrl: Access per-rmid structures by index.
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-7-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit c4c0376 upstream.

MPAMs RMID values are not unique unless the CLOSID is considered as well.

alloc_rmid() expects the RMID to be an independent number.

Pass the CLOSID in to alloc_rmid(). Use this to compare indexes when
allocating. If the CLOSID is not relevant to the index, this ends up comparing
the free RMID with itself, and the first free entry will be used. With MPAM the
CLOSID is included in the index, so this becomes a walk of the free RMID
entries, until one that matches the supplied CLOSID is found.

Intel-SIG: commit c4c0376 x86/resctrl: Allow RMID allocation to be scoped by CLOSID.
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-8-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit b30a55d upstream.

MPAM's PMG bits extend its PARTID space, meaning the same PMG value can be
used for different control groups.

This means once a CLOSID is allocated, all its monitoring ids may still be
dirty, and held in limbo.

Keep track of the number of RMID held in limbo each CLOSID has. This will
allow a future helper to find the 'cleanest' CLOSID when allocating.

The array is only needed when CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID is
defined. This will never be the case on x86.

Intel-SIG: commit b30a55d x86/resctrl: Track the number of dirty RMID a CLOSID has.
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-9-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit 5d920b6 upstream.

The resctrl CLOSID allocator uses a single 32bit word to track which
CLOSID are free. The setting and clearing of bits is open coded.

Convert the existing open coded bit manipulations of closid_free_map
to use __set_bit() and friends. These don't need to be atomic as this
list is protected by the mutex.

Intel-SIG: commit 5d920b6 x86/resctrl: Use __set_bit()/__clear_bit() instead of open coding.
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-10-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
…ty_rmid

commit 6eac36b upstream.

MPAM's PMG bits extend its PARTID space, meaning the same PMG value can be used
for different control groups.

This means once a CLOSID is allocated, all its monitoring ids may still be
dirty, and held in limbo.

Instead of allocating the first free CLOSID, on architectures where
CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID is enabled, search
closid_num_dirty_rmid[] to find the cleanest CLOSID.

The CLOSID found is returned to closid_alloc() for the free list
to be updated.

Intel-SIG: commit 6eac36b x86/resctrl: Allocate the cleanest CLOSID by searching closid_num_dirty_rmid.
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-11-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit 6eca639 upstream.

When switching tasks, the CLOSID and RMID that the new task should use
are stored in struct task_struct. For x86 the CLOSID known by resctrl,
the value in task_struct, and the value written to the CPU register are
all the same thing.

MPAM's CPU interface has two different PARTIDs - one for data accesses
the other for instruction fetch. Storing resctrl's CLOSID value in
struct task_struct implies the arch code knows whether resctrl is using
CDP.

Move the matching and setting of the struct task_struct properties to
use helpers. This allows arm64 to store the hardware format of the
register, instead of having to convert it each time.

__rdtgroup_move_task()s use of READ_ONCE()/WRITE_ONCE() ensures torn
values aren't seen as another CPU may schedule the task being moved
while the value is being changed. MPAM has an additional corner-case
here as the PMG bits extend the PARTID space.

If the scheduler sees a new-CLOSID but old-RMID, the task will dirty an
RMID that the limbo code is not watching causing an inaccurate count.

x86's RMID are independent values, so the limbo code will still be
watching the old-RMID in this circumstance.

To avoid this, arm64 needs both the CLOSID/RMID WRITE_ONCE()d together.
Both values must be provided together.

Because MPAM's RMID values are not unique, the CLOSID must be provided
when matching the RMID.

Intel-SIG: commit 6eca639 x86/resctrl: Move CLOSID/RMID matching and setting to use helpers.
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-12-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit a4846aa upstream.

The limbo and overflow code picks a CPU to use from the domain's list of online
CPUs. Work is then scheduled on these CPUs to maintain the limbo list and any
counters that may overflow.

cpumask_any() may pick a CPU that is marked nohz_full, which will either
penalise the work that CPU was dedicated to, or delay the processing of limbo
list or counters that may overflow. Perhaps indefinitely. Delaying the overflow
handling will skew the bandwidth values calculated by mba_sc, which expects to
be called once a second.

Add cpumask_any_housekeeping() as a replacement for cpumask_any() that prefers
housekeeping CPUs. This helper will still return a nohz_full CPU if that is the
only option. The CPU to use is re-evaluated each time the limbo/overflow work
runs. This ensures the work will move off a nohz_full CPU once a housekeeping
CPU is available.

Intel-SIG: commit a4846aa x86/resctrl: Add cpumask_any_housekeeping() for limbo/overflow.
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-13-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit 09909e0 upstream.

Intel is blessed with an abundance of monitors, one per RMID, that can be
read from any CPU in the domain. MPAMs monitors reside in the MMIO MSC,
the number implemented is up to the manufacturer. This means when there are
fewer monitors than needed, they need to be allocated and freed.

MPAM's CSU monitors are used to back the 'llc_occupancy' monitor file. The
CSU counter is allowed to return 'not ready' for a small number of
micro-seconds after programming. To allow one CSU hardware monitor to be
used for multiple control or monitor groups, the CPU accessing the
monitor needs to be able to block when configuring and reading the
counter.

Worse, the domain may be broken up into slices, and the MMIO accesses
for each slice may need performing from different CPUs.

These two details mean MPAMs monitor code needs to be able to sleep, and
IPI another CPU in the domain to read from a resource that has been sliced.

mon_event_read() already invokes mon_event_count() via IPI, which means
this isn't possible. On systems using nohz-full, some CPUs need to be
interrupted to run kernel work as they otherwise stay in user-space
running realtime workloads. Interrupting these CPUs should be avoided,
and scheduling work on them may never complete.

Change mon_event_read() to pick a housekeeping CPU, (one that is not using
nohz_full) and schedule mon_event_count() and wait. If all the CPUs
in a domain are using nohz-full, then an IPI is used as the fallback.

This function is only used in response to a user-space filesystem request
(not the timing sensitive overflow code).

This allows MPAM to hide the slice behaviour from resctrl, and to keep
the monitor-allocation in monitor.c. When the IPI fallback is used on
machines where MPAM needs to make an access on multiple CPUs, the counter
read will always fail.

Intel-SIG: commit 09909e0 x86/resctrl: Queue mon_event_read() instead of sending an IPI.
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Peter Newman <peternewman@google.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-14-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit 6fde142 upstream.

MPAM's cache occupancy counters can take a little while to settle once the
monitor has been configured. The maximum settling time is described to the
driver via a firmware table. The value could be large enough that it makes
sense to sleep. To avoid exposing this to resctrl, it should be hidden behind
MPAM's resctrl_arch_rmid_read().

resctrl_arch_rmid_read() may be called via IPI meaning it is unable to sleep.
In this case, it should return an error if it needs to sleep. This will only
affect MPAM platforms where the cache occupancy counter isn't available
immediately, nohz_full is in use, and there are no housekeeping CPUs in the
necessary domain.

There are three callers of resctrl_arch_rmid_read(): __mon_event_count() and
__check_limbo() are both called from a non-migrateable context.
mon_event_read() invokes __mon_event_count() using smp_call_on_cpu(), which
adds work to the target CPUs workqueue.  rdtgroup_mutex() is held, meaning this
cannot race with the resctrl cpuhp callback. __check_limbo() is invoked via
schedule_delayed_work_on() also adds work to a per-cpu workqueue.

The remaining call is add_rmid_to_limbo() which is called in response to
a user-space syscall that frees an RMID. This opportunistically reads the LLC
occupancy counter on the current domain to see if the RMID is over the dirty
threshold. This has to disable preemption to avoid reading the wrong domain's
value. Disabling preemption here prevents resctrl_arch_rmid_read() from
sleeping.

add_rmid_to_limbo() walks each domain, but only reads the counter on one
domain. If the system has more than one domain, the RMID will always be added
to the limbo list. If the RMIDs usage was not over the threshold, it will be
removed from the list when __check_limbo() runs.  Make this the default
behaviour. Free RMIDs are always added to the limbo list for each domain.

The user visible effect of this is that a clean RMID is not available for
re-allocation immediately after 'rmdir()' completes. This behaviour was never
portable as it never happened on a machine with multiple domains.

Removing this path allows resctrl_arch_rmid_read() to sleep if its called with
interrupts unmasked. Document this is the expected behaviour, and add
a might_sleep() annotation to catch changes that won't work on arm64.

Intel-SIG: commit 6fde142 x86/resctrl: Allow resctrl_arch_rmid_read() to sleep.
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-15-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
…d_read()

commit e557999 upstream.

Depending on the number of monitors available, Arm's MPAM may need to
allocate a monitor prior to reading the counter value. Allocating a
contended resource may involve sleeping.

__check_limbo() and mon_event_count() each make multiple calls to
resctrl_arch_rmid_read(), to avoid extra work on contended systems,
the allocation should be valid for multiple invocations of
resctrl_arch_rmid_read().

The memory or hardware allocated is not specific to a domain.

Add arch hooks for this allocation, which need calling before
resctrl_arch_rmid_read(). The allocated monitor is passed to
resctrl_arch_rmid_read(), then freed again afterwards. The helper
can be called on any CPU, and can sleep.

Intel-SIG: commit e557999 x86/resctrl: Allow arch to allocate memory needed in resctrl_arch_rmid_read().
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-16-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit 13e5769 upstream.

The rdt_enable_key is switched when resctrl is mounted, and used to prevent
a second mount of the filesystem. It also enables the architecture's context
switch code.

This requires another architecture to have the same set of static keys, as
resctrl depends on them too. The existing users of these static keys are
implicitly also checking if the filesystem is mounted.

Make the resctrl_mounted checks explicit: resctrl can keep track of whether it
has been mounted once. This doesn't need to be combined with whether the arch
code is context switching the CLOSID.

rdt_mon_enable_key is never used just to test that resctrl is mounted, but does
also have this implication. Add a resctrl_mounted to all uses of
rdt_mon_enable_key.

This will allow the static key changing to be moved behind resctrl_arch_ calls.

Intel-SIG: commit 13e5769 x86/resctrl: Make resctrl_mounted checks explicit.
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-17-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit 5db6a4a upstream.

resctrl enables three static keys depending on the features it has enabled.
Another architecture's context switch code may look different, any static keys
that control it should be buried behind helpers.

Move the alloc/mon logic into arch-specific helpers as a preparatory step for
making the rdt_enable_key's status something the arch code decides.

This means other architectures don't have to mirror the static keys.

Intel-SIG: commit 5db6a4a x86/resctrl: Move alloc/mon static keys into helpers.
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-18-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit 0a2f4d9 upstream.

rdt_enable_key is switched when resctrl is mounted. It was also previously used
to prevent a second mount of the filesystem.

Any other architecture that wants to support resctrl has to provide identical
static keys.

Now that there are helpers for enabling and disabling the alloc/mon keys,
resctrl doesn't need to switch this extra key, it can be done by the arch code.
Use the static-key increment and decrement helpers, and change resctrl to
ensure the calls are balanced.

Intel-SIG: commit 0a2f4d9 x86/resctrl: Make rdt_enable_key the arch's decision to switch.
Incremental backporting patches for Intel RDT on Intel Xeon platform.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-19-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit 0158ed6 upstream.

When SNC mode is enabled, create subdirectories and files to monitor at the SNC
node granularity. Legacy behavior is preserved by tagging the monitor files at
the L3 granularity with the "sum" attribute.  When the user reads these files
the kernel will read monitor data from all SNC nodes that share the same L3
cache instance and return the aggregated value to the user.

Note that the "domid" field for files that must sum across SNC domains has the
L3 cache instance id, while non-summing files use the domain id.

The "sum" files do not need to make a call to mon_event_read() to initialize
the MBM counters. This will be handled by initializing the individual SNC nodes
that share the L3.

Intel-SIG: commit 0158ed6 x86/resctrl: Create Sub-NUMA Cluster (SNC) monitor files.
Backporting patches for Intel RDT monitoring with SNC on Intel Xeon platform.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/20240628215619.76401-14-tony.luck@intel.com
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit 6b48b80 upstream.

In SNC mode, there are multiple subdirectories in each L3 level monitor
directory (one for each SNC node). If all the CPUs in an SNC node are taken
offline, just remove the SNC directory for that node. In non-SNC mode, or when
the last SNC node directory is removed, remove the L3 monitor directory.

Add a helper function to avoid duplicated code.

Intel-SIG: commit 6b48b80 x86/resctrl: Handle removing directories in Sub-NUMA Cluster (SNC) mode.
Backporting patches for Intel RDT monitoring with SNC on Intel Xeon platform.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/r/20240702173820.90368-2-tony.luck@intel.com
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
…ounter

commit c8c7d3d upstream.

mon_event_read() fills out most fields of the struct rmid_read that is passed
via an smp_call*() function to a CPU that is part of the correct domain to
read the monitor counters.

With Sub-NUMA Cluster (SNC) mode there are now two cases to handle:

1) Reading a file that returns a value for a single domain.
   + Choose the CPU to execute from the domain cpu_mask

2) Reading a file that must sum across domains sharing an L3 cache
   instance.
   + Indicate to called code that a sum is needed by passing a NULL
     rdt_mon_domain pointer.
   + Choose the CPU from the L3 shared_cpu_map.

Intel-SIG: commit c8c7d3d x86/resctrl: Fill out rmid_read structure for smp_call*() to read a counter.
Backporting patches for Intel RDT monitoring with SNC on Intel Xeon platform.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/20240628215619.76401-16-tony.luck@intel.com
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit 9fbb303 upstream.

Legacy resctrl monitor files must provide the sum of event values across
all Sub-NUMA Cluster (SNC) domains that share an L3 cache instance.

There are now two cases:
1) A specific domain is provided in struct rmid_read
   This is either a non-SNC system, or the request is to read data
   from just one SNC node.
2) Domain pointer is NULL. In this case the cacheinfo field in struct
   rmid_read indicates that all SNC nodes that share that L3 cache
   instance should have the event read and return the sum of all
   values.

Update the CPU sanity check. The existing check that an event is read
from a CPU in the requested domain still applies when reading a single
domain. But when summing across domains a more relaxed check that the
current CPU is in the scope of the L3 cache instance is appropriate
since the MSRs to read events are scoped at L3 cache level.

Intel-SIG: commit 9fbb303 x86/resctrl: Make __mon_event_count() handle sum domains.
Backporting patches for Intel RDT monitoring with SNC on Intel Xeon platform.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/20240628215619.76401-17-tony.luck@intel.com
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit 21b362c upstream.

Hardware has two RMID configuration options for SNC systems. The default
mode divides RMID counters between SNC nodes. E.g. with 200 RMIDs and
two SNC nodes per L3 cache RMIDs 0..99 are used on node 0, and 100..199
on node 1. This isn't compatible with Linux resctrl usage. On this
example system a process using RMID 5 would only update monitor counters
while running on SNC node 0.

The other mode is "RMID Sharing Mode". This is enabled by clearing bit
0 of the RMID_SNC_CONFIG (0xCA0) model specific register. In this mode
the number of logical RMIDs is the number of physical RMIDs (from CPUID
leaf 0xF) divided by the number of SNC nodes per L3 cache instance. A
process can use the same RMID across different SNC nodes.

See the "Intel Resource Director Technology Architecture Specification"
for additional details.

When SNC is enabled, update the MSR when a monitor domain is marked
online. Technically this is overkill. It only needs to be done once
per L3 cache instance rather than per SNC domain. But there is no harm
in doing it more than once, and this is not in a critical path.

Intel-SIG: commit 21b362c x86/resctrl: Enable shared RMID mode on Sub-NUMA Cluster (SNC) systems.
Backporting patches for Intel RDT monitoring with SNC on Intel Xeon platform.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/r/20240702173820.90368-3-tony.luck@intel.com
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit 1348815 upstream.

There isn't a simple hardware bit that indicates whether a CPU is running in
Sub-NUMA Cluster (SNC) mode. Infer the state by comparing the number of CPUs
sharing the L3 cache with CPU0 to the number of CPUs in the same NUMA node as
CPU0.

Add the missing definition of pr_fmt() to monitor.c. This wasn't noticed
before as there are only "can't happen" console messages from this file.

  [ bp: Massage commit message. ]

Intel-SIG: commit 1348815 x86/resctrl: Detect Sub-NUMA Cluster (SNC) mode.
Backporting patches for Intel RDT monitoring with SNC on Intel Xeon platform.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/20240628215619.76401-19-tony.luck@intel.com
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
commit ea34999 upstream.

With Sub-NUMA Cluster (SNC) mode enabled, the scope of monitoring resources
is per-NODE instead of per-L3 cache. Backwards compatibility is maintained
by providing files in the mon_L3_XX directories that sum event counts
for all SNC nodes sharing an L3 cache.

New files provide per-SNC node event counts.

Users should be aware that SNC mode also affects the amount of L3 cache
available for allocation within each SNC node.

Intel-SIG: commit ea34999 x86/resctrl: Update documentation with Sub-NUMA cluster changes.
Backporting patches for Intel RDT monitoring with SNC on Intel Xeon platform.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/20240628215619.76401-20-tony.luck@intel.com
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
@xiaochenshen xiaochenshen changed the title 6.6 - Intel RDT monitoring support on Sub-NUMA Cluster (SNC) enabled systems [6.6-velinux] - Intel RDT monitoring support on Sub-NUMA Cluster (SNC) enabled systems Oct 30, 2024
commit a547a58 upstream.

When using resctrl on systems with Sub-NUMA Clustering enabled, monitoring
groups may be allocated RMID values which would overrun the
arch_mbm_{local,total} arrays.

This is due to inconsistencies in whether the SNC-adjusted num_rmid value or
the unadjusted value in resctrl_arch_system_num_rmid_idx() is used. The
num_rmid value for the L3 resource is currently:

  resctrl_arch_system_num_rmid_idx() / snc_nodes_per_l3_cache

As a simple fix, make resctrl_arch_system_num_rmid_idx() return the
SNC-adjusted, L3 num_rmid value on x86.

Intel-SIG: commit a547a58 x86/resctrl: Fix arch_mbm_* array overrun on SNC.
Backporting patches for Intel RDT monitoring with SNC on Intel Xeon platform.

Fixes: e13db55 ("x86/resctrl: Introduce snc_nodes_per_l3_cache")
Signed-off-by: Peter Newman <peternewman@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/r/20240822190212.1848788-1-peternewman@google.com
[ Xiaochen Shen: amend commit log ]
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
@xiaochenshen
Copy link
Copy Markdown
Contributor Author

Who could help have a look at error in "Pull Request Checking / BuildAndTestKernel / build-and-test-kernel (x86_64, buster) (pull_request)"?

It seems to have nothing to do with the code change. Is there something wrong with the build environment?

Makefile.config:597: No sys/sdt.h found, no SDT events are defined, please install systemtap-sdt-devel or systemtap-sdt-dev
Makefile.config:978: No libzstd found, disables trace compression, please install libzstd-dev[el] and/or set LIBZSTD_DIR
Makefile.config:989: No libcap found, disables capability support, please install libcap-devel/libcap-dev
Makefile.config:1061: No libbabeltrace found, disables 'perf data' CTF format support, please install libbabeltrace-dev[el]/libbabeltrace-ctf-dev
Makefile.config:1095: No alternatives command found, you need to set JDIR= to point to the root of your Java directory
Makefile.config:1126: libpfm4 not found, disables libpfm4 support. Please install libpfm4-dev
Makefile.config:1144: *** ERROR: libtraceevent is missing. Please install libtraceevent-dev/libtraceevent-devel or build with NO_LIBTRACEEVENT=1.  Stop.
make[6]: *** [Makefile.perf:242: sub-make] Error 2
make[5]: *** [Makefile:113: install] Error 2
make[4]: *** [Makefile:2044: run-command] Error 2
make[3]: *** [debian/rules:17: binary-arch] Error 2
dpkg-buildpackage: error: make -f debian/rules binary subprocess returned exit status 2

@kchuyizhou
Copy link
Copy Markdown

LGTM

@guojinhui-liam guojinhui-liam merged commit 82f3c42 into openvelinux:6.6-velinux Dec 11, 2024
guojinhui-liam added a commit that referenced this pull request Dec 11, 2024
[6.6-velinux] - Intel RDT monitoring support on Sub-NUMA Cluster (SNC) enabled systems
PvsNarasimha pushed a commit to PvsNarasimha/kernel that referenced this pull request Dec 27, 2024
commit 99d4850 upstream

Found by leak sanitizer:
```
==1632594==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 21 byte(s) in 1 object(s) allocated from:
    #0 0x7f2953a7077b in __interceptor_strdup ../../../../src/libsanitizer/asan/asan_interceptors.cpp:439
    openvelinux#1 0x556701d6fbbf in perf_env__read_cpuid util/env.c:369
    openvelinux#2 0x556701d70589 in perf_env__cpuid util/env.c:465
    openvelinux#3 0x55670204bba2 in x86__is_amd_cpu arch/x86/util/env.c:14
    #4 0x5567020487a2 in arch__post_evsel_config arch/x86/util/evsel.c:83
    #5 0x556701d8f78b in evsel__config util/evsel.c:1366
    openvelinux#6 0x556701ef5872 in evlist__config util/record.c:108
    openvelinux#7 0x556701cd6bcd in test__PERF_RECORD tests/perf-record.c:112
    openvelinux#8 0x556701cacd07 in run_test tests/builtin-test.c:236
    openvelinux#9 0x556701cacfac in test_and_print tests/builtin-test.c:265
    openvelinux#10 0x556701cadddb in __cmd_test tests/builtin-test.c:402
    openvelinux#11 0x556701caf2aa in cmd_test tests/builtin-test.c:559
    openvelinux#12 0x556701d3b557 in run_builtin tools/perf/perf.c:323
    openvelinux#13 0x556701d3bac8 in handle_internal_command tools/perf/perf.c:377
    openvelinux#14 0x556701d3be90 in run_argv tools/perf/perf.c:421
    openvelinux#15 0x556701d3c3f8 in main tools/perf/perf.c:537
    openvelinux#16 0x7f2952a46189 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

SUMMARY: AddressSanitizer: 21 byte(s) leaked in 1 allocation(s).
```

Fixes: f7b58cb ("perf mem/c2c: Add load store event mappings for AMD")
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Ravi Bangoria <ravi.bangoria@amd.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20230613235416.1650755-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
PvsNarasimha pushed a commit to PvsNarasimha/kernel that referenced this pull request Jan 7, 2025
commit 99d4850 upstream

Found by leak sanitizer:
```
==1632594==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 21 byte(s) in 1 object(s) allocated from:
    #0 0x7f2953a7077b in __interceptor_strdup ../../../../src/libsanitizer/asan/asan_interceptors.cpp:439
    openvelinux#1 0x556701d6fbbf in perf_env__read_cpuid util/env.c:369
    openvelinux#2 0x556701d70589 in perf_env__cpuid util/env.c:465
    openvelinux#3 0x55670204bba2 in x86__is_amd_cpu arch/x86/util/env.c:14
    #4 0x5567020487a2 in arch__post_evsel_config arch/x86/util/evsel.c:83
    #5 0x556701d8f78b in evsel__config util/evsel.c:1366
    openvelinux#6 0x556701ef5872 in evlist__config util/record.c:108
    openvelinux#7 0x556701cd6bcd in test__PERF_RECORD tests/perf-record.c:112
    openvelinux#8 0x556701cacd07 in run_test tests/builtin-test.c:236
    openvelinux#9 0x556701cacfac in test_and_print tests/builtin-test.c:265
    openvelinux#10 0x556701cadddb in __cmd_test tests/builtin-test.c:402
    openvelinux#11 0x556701caf2aa in cmd_test tests/builtin-test.c:559
    openvelinux#12 0x556701d3b557 in run_builtin tools/perf/perf.c:323
    openvelinux#13 0x556701d3bac8 in handle_internal_command tools/perf/perf.c:377
    openvelinux#14 0x556701d3be90 in run_argv tools/perf/perf.c:421
    openvelinux#15 0x556701d3c3f8 in main tools/perf/perf.c:537
    openvelinux#16 0x7f2952a46189 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

SUMMARY: AddressSanitizer: 21 byte(s) leaked in 1 allocation(s).
```

Fixes: f7b58cb ("perf mem/c2c: Add load store event mappings for AMD")
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Ravi Bangoria <ravi.bangoria@amd.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20230613235416.1650755-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
PvsNarasimha pushed a commit to PvsNarasimha/kernel that referenced this pull request Jan 7, 2025
commit 99d4850 upstream

Found by leak sanitizer:
```
==1632594==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 21 byte(s) in 1 object(s) allocated from:
    #0 0x7f2953a7077b in __interceptor_strdup ../../../../src/libsanitizer/asan/asan_interceptors.cpp:439
    openvelinux#1 0x556701d6fbbf in perf_env__read_cpuid util/env.c:369
    openvelinux#2 0x556701d70589 in perf_env__cpuid util/env.c:465
    openvelinux#3 0x55670204bba2 in x86__is_amd_cpu arch/x86/util/env.c:14
    #4 0x5567020487a2 in arch__post_evsel_config arch/x86/util/evsel.c:83
    #5 0x556701d8f78b in evsel__config util/evsel.c:1366
    openvelinux#6 0x556701ef5872 in evlist__config util/record.c:108
    openvelinux#7 0x556701cd6bcd in test__PERF_RECORD tests/perf-record.c:112
    openvelinux#8 0x556701cacd07 in run_test tests/builtin-test.c:236
    openvelinux#9 0x556701cacfac in test_and_print tests/builtin-test.c:265
    openvelinux#10 0x556701cadddb in __cmd_test tests/builtin-test.c:402
    openvelinux#11 0x556701caf2aa in cmd_test tests/builtin-test.c:559
    openvelinux#12 0x556701d3b557 in run_builtin tools/perf/perf.c:323
    openvelinux#13 0x556701d3bac8 in handle_internal_command tools/perf/perf.c:377
    openvelinux#14 0x556701d3be90 in run_argv tools/perf/perf.c:421
    openvelinux#15 0x556701d3c3f8 in main tools/perf/perf.c:537
    openvelinux#16 0x7f2952a46189 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

SUMMARY: AddressSanitizer: 21 byte(s) leaked in 1 allocation(s).
```

Fixes: f7b58cb ("perf mem/c2c: Add load store event mappings for AMD")
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Ravi Bangoria <ravi.bangoria@amd.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20230613235416.1650755-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
PvsNarasimha pushed a commit to PvsNarasimha/kernel that referenced this pull request Jan 7, 2025
commit 99d4850 upstream

Found by leak sanitizer:
```
==1632594==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 21 byte(s) in 1 object(s) allocated from:
    #0 0x7f2953a7077b in __interceptor_strdup ../../../../src/libsanitizer/asan/asan_interceptors.cpp:439
    openvelinux#1 0x556701d6fbbf in perf_env__read_cpuid util/env.c:369
    openvelinux#2 0x556701d70589 in perf_env__cpuid util/env.c:465
    openvelinux#3 0x55670204bba2 in x86__is_amd_cpu arch/x86/util/env.c:14
    #4 0x5567020487a2 in arch__post_evsel_config arch/x86/util/evsel.c:83
    #5 0x556701d8f78b in evsel__config util/evsel.c:1366
    openvelinux#6 0x556701ef5872 in evlist__config util/record.c:108
    openvelinux#7 0x556701cd6bcd in test__PERF_RECORD tests/perf-record.c:112
    openvelinux#8 0x556701cacd07 in run_test tests/builtin-test.c:236
    openvelinux#9 0x556701cacfac in test_and_print tests/builtin-test.c:265
    openvelinux#10 0x556701cadddb in __cmd_test tests/builtin-test.c:402
    openvelinux#11 0x556701caf2aa in cmd_test tests/builtin-test.c:559
    openvelinux#12 0x556701d3b557 in run_builtin tools/perf/perf.c:323
    openvelinux#13 0x556701d3bac8 in handle_internal_command tools/perf/perf.c:377
    openvelinux#14 0x556701d3be90 in run_argv tools/perf/perf.c:421
    openvelinux#15 0x556701d3c3f8 in main tools/perf/perf.c:537
    openvelinux#16 0x7f2952a46189 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

SUMMARY: AddressSanitizer: 21 byte(s) leaked in 1 allocation(s).
```

Fixes: f7b58cb ("perf mem/c2c: Add load store event mappings for AMD")
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Ravi Bangoria <ravi.bangoria@amd.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20230613235416.1650755-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: PvsNarasimha <PVS.NarasimhaRao@amd.com>
PvsNarasimha pushed a commit to PvsNarasimha/kernel that referenced this pull request Jan 30, 2025
commit 99d4850 upstream

Found by leak sanitizer:
```
==1632594==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 21 byte(s) in 1 object(s) allocated from:
    #0 0x7f2953a7077b in __interceptor_strdup ../../../../src/libsanitizer/asan/asan_interceptors.cpp:439
    openvelinux#1 0x556701d6fbbf in perf_env__read_cpuid util/env.c:369
    openvelinux#2 0x556701d70589 in perf_env__cpuid util/env.c:465
    openvelinux#3 0x55670204bba2 in x86__is_amd_cpu arch/x86/util/env.c:14
    #4 0x5567020487a2 in arch__post_evsel_config arch/x86/util/evsel.c:83
    #5 0x556701d8f78b in evsel__config util/evsel.c:1366
    openvelinux#6 0x556701ef5872 in evlist__config util/record.c:108
    openvelinux#7 0x556701cd6bcd in test__PERF_RECORD tests/perf-record.c:112
    openvelinux#8 0x556701cacd07 in run_test tests/builtin-test.c:236
    openvelinux#9 0x556701cacfac in test_and_print tests/builtin-test.c:265
    openvelinux#10 0x556701cadddb in __cmd_test tests/builtin-test.c:402
    openvelinux#11 0x556701caf2aa in cmd_test tests/builtin-test.c:559
    openvelinux#12 0x556701d3b557 in run_builtin tools/perf/perf.c:323
    openvelinux#13 0x556701d3bac8 in handle_internal_command tools/perf/perf.c:377
    openvelinux#14 0x556701d3be90 in run_argv tools/perf/perf.c:421
    openvelinux#15 0x556701d3c3f8 in main tools/perf/perf.c:537
    openvelinux#16 0x7f2952a46189 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

SUMMARY: AddressSanitizer: 21 byte(s) leaked in 1 allocation(s).
```

Fixes: f7b58cb ("perf mem/c2c: Add load store event mappings for AMD")
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Ravi Bangoria <ravi.bangoria@amd.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20230613235416.1650755-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: PvsNarasimha <PVS.NarasimhaRao@amd.com>
PvsNarasimha pushed a commit to PvsNarasimha/kernel that referenced this pull request Feb 5, 2025
commit 99d4850 upstream.

Found by leak sanitizer:
```
==1632594==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 21 byte(s) in 1 object(s) allocated from:
    #0 0x7f2953a7077b in __interceptor_strdup ../../../../src/libsanitizer/asan/asan_interceptors.cpp:439
    openvelinux#1 0x556701d6fbbf in perf_env__read_cpuid util/env.c:369
    openvelinux#2 0x556701d70589 in perf_env__cpuid util/env.c:465
    openvelinux#3 0x55670204bba2 in x86__is_amd_cpu arch/x86/util/env.c:14
    #4 0x5567020487a2 in arch__post_evsel_config arch/x86/util/evsel.c:83
    #5 0x556701d8f78b in evsel__config util/evsel.c:1366
    openvelinux#6 0x556701ef5872 in evlist__config util/record.c:108
    openvelinux#7 0x556701cd6bcd in test__PERF_RECORD tests/perf-record.c:112
    openvelinux#8 0x556701cacd07 in run_test tests/builtin-test.c:236
    openvelinux#9 0x556701cacfac in test_and_print tests/builtin-test.c:265
    openvelinux#10 0x556701cadddb in __cmd_test tests/builtin-test.c:402
    openvelinux#11 0x556701caf2aa in cmd_test tests/builtin-test.c:559
    openvelinux#12 0x556701d3b557 in run_builtin tools/perf/perf.c:323
    openvelinux#13 0x556701d3bac8 in handle_internal_command tools/perf/perf.c:377
    openvelinux#14 0x556701d3be90 in run_argv tools/perf/perf.c:421
    openvelinux#15 0x556701d3c3f8 in main tools/perf/perf.c:537
    openvelinux#16 0x7f2952a46189 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

SUMMARY: AddressSanitizer: 21 byte(s) leaked in 1 allocation(s).
```

Fixes: f7b58cb ("perf mem/c2c: Add load store event mappings for AMD")
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Ravi Bangoria <ravi.bangoria@amd.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20230613235416.1650755-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: PvsNarasimha <PVS.NarasimhaRao@amd.com>
guojinhui-liam pushed a commit that referenced this pull request Feb 11, 2025
commit 99d4850 upstream.

Found by leak sanitizer:
```
==1632594==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 21 byte(s) in 1 object(s) allocated from:
    #0 0x7f2953a7077b in __interceptor_strdup ../../../../src/libsanitizer/asan/asan_interceptors.cpp:439
    #1 0x556701d6fbbf in perf_env__read_cpuid util/env.c:369
    #2 0x556701d70589 in perf_env__cpuid util/env.c:465
    #3 0x55670204bba2 in x86__is_amd_cpu arch/x86/util/env.c:14
    #4 0x5567020487a2 in arch__post_evsel_config arch/x86/util/evsel.c:83
    #5 0x556701d8f78b in evsel__config util/evsel.c:1366
    #6 0x556701ef5872 in evlist__config util/record.c:108
    #7 0x556701cd6bcd in test__PERF_RECORD tests/perf-record.c:112
    #8 0x556701cacd07 in run_test tests/builtin-test.c:236
    #9 0x556701cacfac in test_and_print tests/builtin-test.c:265
    #10 0x556701cadddb in __cmd_test tests/builtin-test.c:402
    #11 0x556701caf2aa in cmd_test tests/builtin-test.c:559
    #12 0x556701d3b557 in run_builtin tools/perf/perf.c:323
    #13 0x556701d3bac8 in handle_internal_command tools/perf/perf.c:377
    #14 0x556701d3be90 in run_argv tools/perf/perf.c:421
    #15 0x556701d3c3f8 in main tools/perf/perf.c:537
    #16 0x7f2952a46189 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

SUMMARY: AddressSanitizer: 21 byte(s) leaked in 1 allocation(s).
```

Fixes: f7b58cb ("perf mem/c2c: Add load store event mappings for AMD")
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Ravi Bangoria <ravi.bangoria@amd.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20230613235416.1650755-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: PvsNarasimha <PVS.NarasimhaRao@amd.com>
guojinhui-liam pushed a commit that referenced this pull request May 8, 2025
commit a699781 upstream.

A sysfs reader can race with a device reset or removal, attempting to
read device state when the device is not actually present. eg:

     [exception RIP: qed_get_current_link+17]
  #8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
  #9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
 #10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
 #11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
 #12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
 #13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
 #14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
 #15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
 #16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
 #17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

 crash> struct net_device.state ffff9a9d21336000
    state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
The device is not present, note lack of __LINK_STATE_PRESENT (0b10).

This is the same sort of panic as observed in commit 4224cfd
("net-sysfs: add check for netdevice being present to speed_show").

There are many other callers of __ethtool_get_link_ksettings() which
don't have a device presence check.

Move this check into ethtool to protect all callers.

Fixes: d519e17 ("net: export device speed and duplex via sysfs")
Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Link: https://patch.msgid.link/8bae218864beaa44ed01628140475b9bf641c5b0.1724393671.git.jamie.bainbridge@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit a699781)
Signed-off-by: Tao Ma <matao.9@bytedance.com>
guojinhui-liam pushed a commit that referenced this pull request May 8, 2025
commit 2545deb upstream.

Couple of error paths in do_core_test() was returning directly without
doing a necessary cpus_read_unlock().

Following lockdep warning was observed when exercising these scenarios
with PROVE_RAW_LOCK_NESTING enabled:

[  139.304775] ================================================
[  139.311185] WARNING: lock held when returning to user space!
[  139.317593] 6.6.0-rc2ifs01+ #11 Tainted: G S      W I
[  139.324499] ------------------------------------------------
[  139.330908] bash/11476 is leaving the kernel with locks still held!
[  139.338000] 1 lock held by bash/11476:
[  139.342262]  #0: ffffffffaa26c930 (cpu_hotplug_lock){++++}-{0:0}, at:
do_core_test+0x35/0x1c0 [intel_ifs]

Fix the flow so that all scenarios release the lock prior to returning
from the function.

Fixes: 5210fb4 ("platform/x86/intel/ifs: Sysfs interface for Array BIST")
Cc: stable@vger.kernel.org
Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
Link: https://lore.kernel.org/r/20230927184824.2566086-1-jithu.joseph@intel.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
guojinhui-liam pushed a commit that referenced this pull request Nov 18, 2025
commit 23cc6ef upstream.

Running syzkaller with the newly enabled signed integer overflow
sanitizer produces this report:

[  195.401651] ------------[ cut here ]------------
[  195.404808] UBSAN: signed-integer-overflow in ../fs/open.c:321:15
[  195.408739] 9223372036854775807 + 562984447377399 cannot be represented in type 'loff_t' (aka 'long long')
[  195.414683] CPU: 1 PID: 703 Comm: syz-executor.0 Not tainted 6.8.0-rc2-00039-g14de58dbe653-dirty #11
[  195.420138] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[  195.425804] Call Trace:
[  195.427360]  <TASK>
[  195.428791]  dump_stack_lvl+0x93/0xd0
[  195.431150]  handle_overflow+0x171/0x1b0
[  195.433640]  vfs_fallocate+0x459/0x4f0
...
[  195.490053] ------------[ cut here ]------------
[  195.493146] UBSAN: signed-integer-overflow in ../fs/open.c:321:61
[  195.497030] 9223372036854775807 + 562984447377399 cannot be represented in type 'loff_t' (aka 'long long)
[  195.502940] CPU: 1 PID: 703 Comm: syz-executor.0 Not tainted 6.8.0-rc2-00039-g14de58dbe653-dirty #11
[  195.508395] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[  195.514075] Call Trace:
[  195.515636]  <TASK>
[  195.517000]  dump_stack_lvl+0x93/0xd0
[  195.519255]  handle_overflow+0x171/0x1b0
[  195.521677]  vfs_fallocate+0x4cb/0x4f0
[  195.524033]  __x64_sys_fallocate+0xb2/0xf0

Historically, the signed integer overflow sanitizer did not work in the
kernel due to its interaction with `-fwrapv` but this has since been
changed [1] in the newest version of Clang. It was re-enabled in the
kernel with Commit 557f8c5 ("ubsan: Reintroduce signed overflow
sanitizer").

Let's use the check_add_overflow helper to first verify the addition
stays within the bounds of its type (long long); then we can use that
sum for the following check.

Link: llvm/llvm-project#82432 [1]
Closes: KSPP/linux#356
Cc: linux-hardening@vger.kernel.org
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/r/20240513-b4-sio-vfs_fallocate-v2-1-db415872fb16@google.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Xin Yin <yinxin.x@bytedance.com>
guojinhui-liam pushed a commit that referenced this pull request Nov 18, 2025
commit e972b08 upstream.

We're seeing crashes from rq_qos_wake_function that look like this:

  BUG: unable to handle page fault for address: ffffafe180a40084
  #PF: supervisor write access in kernel mode
  #PF: error_code(0x0002) - not-present page
  PGD 100000067 P4D 100000067 PUD 10027c067 PMD 10115d067 PTE 0
  Oops: Oops: 0002 [#1] PREEMPT SMP PTI
  CPU: 17 UID: 0 PID: 0 Comm: swapper/17 Not tainted 6.12.0-rc3-00013-geca631b8fe80 #11
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
  RIP: 0010:_raw_spin_lock_irqsave+0x1d/0x40
  Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 54 9c 41 5c fa 65 ff 05 62 97 30 4c 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 0a 4c 89 e0 41 5c c3 cc cc cc cc 89 c6 e8 2c 0b 00
  RSP: 0018:ffffafe180580ca0 EFLAGS: 00010046
  RAX: 0000000000000000 RBX: ffffafe180a3f7a8 RCX: 0000000000000011
  RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffffafe180a40084
  RBP: 0000000000000000 R08: 00000000001e7240 R09: 0000000000000011
  R10: 0000000000000028 R11: 0000000000000888 R12: 0000000000000002
  R13: ffffafe180a40084 R14: 0000000000000000 R15: 0000000000000003
  FS:  0000000000000000(0000) GS:ffff9aaf1f280000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: ffffafe180a40084 CR3: 000000010e428002 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  PKRU: 55555554
  Call Trace:
   <IRQ>
   try_to_wake_up+0x5a/0x6a0
   rq_qos_wake_function+0x71/0x80
   __wake_up_common+0x75/0xa0
   __wake_up+0x36/0x60
   scale_up.part.0+0x50/0x110
   wb_timer_fn+0x227/0x450
   ...

So rq_qos_wake_function() calls wake_up_process(data->task), which calls
try_to_wake_up(), which faults in raw_spin_lock_irqsave(&p->pi_lock).

p comes from data->task, and data comes from the waitqueue entry, which
is stored on the waiter's stack in rq_qos_wait(). Analyzing the core
dump with drgn, I found that the waiter had already woken up and moved
on to a completely unrelated code path, clobbering what was previously
data->task. Meanwhile, the waker was passing the clobbered garbage in
data->task to wake_up_process(), leading to the crash.

What's happening is that in between rq_qos_wake_function() deleting the
waitqueue entry and calling wake_up_process(), rq_qos_wait() is finding
that it already got a token and returning. The race looks like this:

rq_qos_wait()                           rq_qos_wake_function()
==============================================================
prepare_to_wait_exclusive()
                                        data->got_token = true;
                                        list_del_init(&curr->entry);
if (data.got_token)
        break;
finish_wait(&rqw->wait, &data.wq);
  ^- returns immediately because
     list_empty_careful(&wq_entry->entry)
     is true
... return, go do something else ...
                                        wake_up_process(data->task)
                                          (NO LONGER VALID!)-^

Normally, finish_wait() is supposed to synchronize against the waker.
But, as noted above, it is returning immediately because the waitqueue
entry has already been removed from the waitqueue.

The bug is that rq_qos_wake_function() is accessing the waitqueue entry
AFTER deleting it. Note that autoremove_wake_function() wakes the waiter
and THEN deletes the waitqueue entry, which is the proper order.

Fix it by swapping the order. We also need to use
list_del_init_careful() to match the list_empty_careful() in
finish_wait().

Fixes: 38cfb5a ("blk-wbt: improve waking of tasks")
Cc: stable@vger.kernel.org
Signed-off-by: Omar Sandoval <osandov@fb.com>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/d3bee2463a67b1ee597211823bf7ad3721c26e41.1729014591.git.osandov@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Julian Sun <sunjunchao@bytedance.com>
PvsNarasimha pushed a commit to PvsNarasimha/kernel that referenced this pull request Jan 23, 2026
…th()

[ Upstream commit 39dfc971e42d886e7df01371cd1bef505076d84c ]

KASAN reports a stack-out-of-bounds read in regs_get_kernel_stack_nth().

Call Trace:
[   97.283505] BUG: KASAN: stack-out-of-bounds in regs_get_kernel_stack_nth+0xa8/0xc8
[   97.284677] Read of size 8 at addr ffff800089277c10 by task 1.sh/2550
[   97.285732]
[   97.286067] CPU: 7 PID: 2550 Comm: 1.sh Not tainted 6.6.0+ openvelinux#11
[   97.287032] Hardware name: linux,dummy-virt (DT)
[   97.287815] Call trace:
[   97.288279]  dump_backtrace+0xa0/0x128
[   97.288946]  show_stack+0x20/0x38
[   97.289551]  dump_stack_lvl+0x78/0xc8
[   97.290203]  print_address_description.constprop.0+0x84/0x3c8
[   97.291159]  print_report+0xb0/0x280
[   97.291792]  kasan_report+0x84/0xd0
[   97.292421]  __asan_load8+0x9c/0xc0
[   97.293042]  regs_get_kernel_stack_nth+0xa8/0xc8
[   97.293835]  process_fetch_insn+0x770/0xa30
[   97.294562]  kprobe_trace_func+0x254/0x3b0
[   97.295271]  kprobe_dispatcher+0x98/0xe0
[   97.295955]  kprobe_breakpoint_handler+0x1b0/0x210
[   97.296774]  call_break_hook+0xc4/0x100
[   97.297451]  brk_handler+0x24/0x78
[   97.298073]  do_debug_exception+0xac/0x178
[   97.298785]  el1_dbg+0x70/0x90
[   97.299344]  el1h_64_sync_handler+0xcc/0xe8
[   97.300066]  el1h_64_sync+0x78/0x80
[   97.300699]  kernel_clone+0x0/0x500
[   97.301331]  __arm64_sys_clone+0x70/0x90
[   97.302084]  invoke_syscall+0x68/0x198
[   97.302746]  el0_svc_common.constprop.0+0x11c/0x150
[   97.303569]  do_el0_svc+0x38/0x50
[   97.304164]  el0_svc+0x44/0x1d8
[   97.304749]  el0t_64_sync_handler+0x100/0x130
[   97.305500]  el0t_64_sync+0x188/0x190
[   97.306151]
[   97.306475] The buggy address belongs to stack of task 1.sh/2550
[   97.307461]  and is located at offset 0 in frame:
[   97.308257]  __se_sys_clone+0x0/0x138
[   97.308910]
[   97.309241] This frame has 1 object:
[   97.309873]  [48, 184) 'args'
[   97.309876]
[   97.310749] The buggy address belongs to the virtual mapping at
[   97.310749]  [ffff800089270000, ffff800089279000) created by:
[   97.310749]  dup_task_struct+0xc0/0x2e8
[   97.313347]
[   97.313674] The buggy address belongs to the physical page:
[   97.314604] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x14f69a
[   97.315885] flags: 0x15ffffe00000000(node=1|zone=2|lastcpupid=0xfffff)
[   97.316957] raw: 015ffffe00000000 0000000000000000 dead000000000122 0000000000000000
[   97.318207] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
[   97.319445] page dumped because: kasan: bad access detected
[   97.320371]
[   97.320694] Memory state around the buggy address:
[   97.321511]  ffff800089277b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   97.322681]  ffff800089277b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   97.323846] >ffff800089277c00: 00 00 f1 f1 f1 f1 f1 f1 00 00 00 00 00 00 00 00
[   97.325023]                          ^
[   97.325683]  ffff800089277c80: 00 00 00 00 00 00 00 00 00 f3 f3 f3 f3 f3 f3 f3
[   97.326856]  ffff800089277d00: f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 00 00

This issue seems to be related to the behavior of some gcc compilers and
was also fixed on the s390 architecture before:

 commit d93a855c31b7 ("s390/ptrace: Avoid KASAN false positives in regs_get_kernel_stack_nth()")

As described in that commit, regs_get_kernel_stack_nth() has confirmed that
`addr` is on the stack, so reading the value at `*addr` should be allowed.
Use READ_ONCE_NOCHECK() helper to silence the KASAN check for this case.

Fixes: 0a8ea52 ("arm64: Add HAVE_REGS_AND_STACK_ACCESS_API feature")
Signed-off-by: Tengda Wu <wutengda@huaweicloud.com>
Link: https://lore.kernel.org/r/20250604005533.1278992-1-wutengda@huaweicloud.com
[will: Use '*addr' as the argument to READ_ONCE_NOCHECK()]
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
guojinhui-liam pushed a commit that referenced this pull request Mar 3, 2026
commit d21d406 upstream.

syzkaller reported infinite recursive calls of fib6_dump_done() during
netlink socket destruction.  [1]

From the log, syzkaller sent an AF_UNSPEC RTM_GETROUTE message, and then
the response was generated.  The following recvmmsg() resumed the dump
for IPv6, but the first call of inet6_dump_fib() failed at kzalloc() due
to the fault injection.  [0]

  12:01:34 executing program 3:
  r0 = socket$nl_route(0x10, 0x3, 0x0)
  sendmsg$nl_route(r0, ... snip ...)
  recvmmsg(r0, ... snip ...) (fail_nth: 8)

Here, fib6_dump_done() was set to nlk_sk(sk)->cb.done, and the next call
of inet6_dump_fib() set it to nlk_sk(sk)->cb.args[3].  syzkaller stopped
receiving the response halfway through, and finally netlink_sock_destruct()
called nlk_sk(sk)->cb.done().

fib6_dump_done() calls fib6_dump_end() and nlk_sk(sk)->cb.done() if it
is still not NULL.  fib6_dump_end() rewrites nlk_sk(sk)->cb.done() by
nlk_sk(sk)->cb.args[3], but it has the same function, not NULL, calling
itself recursively and hitting the stack guard page.

To avoid the issue, let's set the destructor after kzalloc().

[0]:
FAULT_INJECTION: forcing a failure.
name failslab, interval 1, probability 0, space 0, times 0
CPU: 1 PID: 432110 Comm: syz-executor.3 Not tainted 6.8.0-12821-g537c2e91d354-dirty #11
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl (lib/dump_stack.c:117)
 should_fail_ex (lib/fault-inject.c:52 lib/fault-inject.c:153)
 should_failslab (mm/slub.c:3733)
 kmalloc_trace (mm/slub.c:3748 mm/slub.c:3827 mm/slub.c:3992)
 inet6_dump_fib (./include/linux/slab.h:628 ./include/linux/slab.h:749 net/ipv6/ip6_fib.c:662)
 rtnl_dump_all (net/core/rtnetlink.c:4029)
 netlink_dump (net/netlink/af_netlink.c:2269)
 netlink_recvmsg (net/netlink/af_netlink.c:1988)
 ____sys_recvmsg (net/socket.c:1046 net/socket.c:2801)
 ___sys_recvmsg (net/socket.c:2846)
 do_recvmmsg (net/socket.c:2943)
 __x64_sys_recvmmsg (net/socket.c:3041 net/socket.c:3034 net/socket.c:3034)

[1]:
BUG: TASK stack guard page was hit at 00000000f2fa9af1 (stack is 00000000b7912430..000000009a436beb)
stack guard page: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 223719 Comm: kworker/1:3 Not tainted 6.8.0-12821-g537c2e91d354-dirty #11
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
Workqueue: events netlink_sock_destruct_work
RIP: 0010:fib6_dump_done (net/ipv6/ip6_fib.c:570)
Code: 3c 24 e8 f3 e9 51 fd e9 28 fd ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 41 57 41 56 41 55 41 54 55 48 89 fd <53> 48 8d 5d 60 e8 b6 4d 07 fd 48 89 da 48 b8 00 00 00 00 00 fc ff
RSP: 0018:ffffc9000d980000 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffffffff84405990 RCX: ffffffff844059d3
RDX: ffff8881028e0000 RSI: ffffffff84405ac2 RDI: ffff88810c02f358
RBP: ffff88810c02f358 R08: 0000000000000007 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000224 R12: 0000000000000000
R13: ffff888007c82c78 R14: ffff888007c82c68 R15: ffff888007c82c68
FS:  0000000000000000(0000) GS:ffff88811b100000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffc9000d97fff8 CR3: 0000000102309002 CR4: 0000000000770ef0
PKRU: 55555554
Call Trace:
 <#DF>
 </#DF>
 <TASK>
 fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1))
 fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1))
 ...
 fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1))
 fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1))
 netlink_sock_destruct (net/netlink/af_netlink.c:401)
 __sk_destruct (net/core/sock.c:2177 (discriminator 2))
 sk_destruct (net/core/sock.c:2224)
 __sk_free (net/core/sock.c:2235)
 sk_free (net/core/sock.c:2246)
 process_one_work (kernel/workqueue.c:3259)
 worker_thread (kernel/workqueue.c:3329 kernel/workqueue.c:3416)
 kthread (kernel/kthread.c:388)
 ret_from_fork (arch/x86/kernel/process.c:153)
 ret_from_fork_asm (arch/x86/entry/entry_64.S:256)
Modules linked in:

Fixes: 1da177e ("Linux-2.6.12-rc2")
Reported-by: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20240401211003.25274-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Dongdong Wang <wangdongdong.6@bytedance.com>
guojinhui-liam pushed a commit that referenced this pull request Mar 3, 2026
[ Upstream commit c8e008b60492cf6fd31ef127aea6d02fd3d314cd ]

Once inside 'ext4_xattr_inode_dec_ref_all' we should
ignore xattrs entries past the 'end' entry.

This fixes the following KASAN reported issue:

==================================================================
BUG: KASAN: slab-use-after-free in ext4_xattr_inode_dec_ref_all+0xb8c/0xe90
Read of size 4 at addr ffff888012c120c4 by task repro/2065

CPU: 1 UID: 0 PID: 2065 Comm: repro Not tainted 6.13.0-rc2+ #11
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x1fd/0x300
 ? tcp_gro_dev_warn+0x260/0x260
 ? _printk+0xc0/0x100
 ? read_lock_is_recursive+0x10/0x10
 ? irq_work_queue+0x72/0xf0
 ? __virt_addr_valid+0x17b/0x4b0
 print_address_description+0x78/0x390
 print_report+0x107/0x1f0
 ? __virt_addr_valid+0x17b/0x4b0
 ? __virt_addr_valid+0x3ff/0x4b0
 ? __phys_addr+0xb5/0x160
 ? ext4_xattr_inode_dec_ref_all+0xb8c/0xe90
 kasan_report+0xcc/0x100
 ? ext4_xattr_inode_dec_ref_all+0xb8c/0xe90
 ext4_xattr_inode_dec_ref_all+0xb8c/0xe90
 ? ext4_xattr_delete_inode+0xd30/0xd30
 ? __ext4_journal_ensure_credits+0x5f0/0x5f0
 ? __ext4_journal_ensure_credits+0x2b/0x5f0
 ? inode_update_timestamps+0x410/0x410
 ext4_xattr_delete_inode+0xb64/0xd30
 ? ext4_truncate+0xb70/0xdc0
 ? ext4_expand_extra_isize_ea+0x1d20/0x1d20
 ? __ext4_mark_inode_dirty+0x670/0x670
 ? ext4_journal_check_start+0x16f/0x240
 ? ext4_inode_is_fast_symlink+0x2f2/0x3a0
 ext4_evict_inode+0xc8c/0xff0
 ? ext4_inode_is_fast_symlink+0x3a0/0x3a0
 ? do_raw_spin_unlock+0x53/0x8a0
 ? ext4_inode_is_fast_symlink+0x3a0/0x3a0
 evict+0x4ac/0x950
 ? proc_nr_inodes+0x310/0x310
 ? trace_ext4_drop_inode+0xa2/0x220
 ? _raw_spin_unlock+0x1a/0x30
 ? iput+0x4cb/0x7e0
 do_unlinkat+0x495/0x7c0
 ? try_break_deleg+0x120/0x120
 ? 0xffffffff81000000
 ? __check_object_size+0x15a/0x210
 ? strncpy_from_user+0x13e/0x250
 ? getname_flags+0x1dc/0x530
 __x64_sys_unlinkat+0xc8/0xf0
 do_syscall_64+0x65/0x110
 entry_SYSCALL_64_after_hwframe+0x67/0x6f
RIP: 0033:0x434ffd
Code: 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 8
RSP: 002b:00007ffc50fa7b28 EFLAGS: 00000246 ORIG_RAX: 0000000000000107
RAX: ffffffffffffffda RBX: 00007ffc50fa7e18 RCX: 0000000000434ffd
RDX: 0000000000000000 RSI: 0000000020000240 RDI: 0000000000000005
RBP: 00007ffc50fa7be0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: 00007ffc50fa7e08 R14: 00000000004bbf30 R15: 0000000000000001
 </TASK>

The buggy address belongs to the object at ffff888012c12000
 which belongs to the cache filp of size 360
The buggy address is located 196 bytes inside of
 freed 360-byte region [ffff888012c12000, ffff888012c12168)

The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x12c12
head: order:1 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
flags: 0x40(head|node=0|zone=0)
page_type: f5(slab)
raw: 0000000000000040 ffff888000ad7640 ffffea0000497a00 dead000000000004
raw: 0000000000000000 0000000000100010 00000001f5000000 0000000000000000
head: 0000000000000040 ffff888000ad7640 ffffea0000497a00 dead000000000004
head: 0000000000000000 0000000000100010 00000001f5000000 0000000000000000
head: 0000000000000001 ffffea00004b0481 ffffffffffffffff 0000000000000000
head: 0000000000000002 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff888012c11f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff888012c12000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ffff888012c12080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                           ^
 ffff888012c12100: fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc
 ffff888012c12180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================

Reported-by: syzbot+b244bda78289b00204ed@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b244bda78289b00204ed
Suggested-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Signed-off-by: Bhupesh <bhupesh@igalia.com>
Link: https://patch.msgid.link/20250128082751.124948-2-bhupesh@igalia.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Fengnan Chang <changfengnan@bytedance.com>
jackYoung0915 pushed a commit to jackYoung0915/kernel that referenced this pull request Mar 18, 2026
[ Upstream commit 163e5f2b96632b7fb2eaa965562aca0dbdf9f996 ]

When using perf record with the `--overwrite` option, a segmentation fault
occurs if an event fails to open. For example:

  perf record -e cycles-ct -F 1000 -a --overwrite
  Error:
  cycles-ct:H: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'
  perf: Segmentation fault
      #0 0x6466b6 in dump_stack debug.c:366
      #1 0x646729 in sighandler_dump_stack debug.c:378
      openvelinux#2 0x453fd1 in sigsegv_handler builtin-record.c:722
      openvelinux#3 0x7f8454e65090 in __restore_rt libc-2.32.so[54090]
      #4 0x6c5671 in __perf_event__synthesize_id_index synthetic-events.c:1862
      #5 0x6c5ac0 in perf_event__synthesize_id_index synthetic-events.c:1943
      openvelinux#6 0x458090 in record__synthesize builtin-record.c:2075
      openvelinux#7 0x45a85a in __cmd_record builtin-record.c:2888
      openvelinux#8 0x45deb6 in cmd_record builtin-record.c:4374
      openvelinux#9 0x4e5e33 in run_builtin perf.c:349
      openvelinux#10 0x4e60bf in handle_internal_command perf.c:401
      openvelinux#11 0x4e6215 in run_argv perf.c:448
      openvelinux#12 0x4e653a in main perf.c:555
      openvelinux#13 0x7f8454e4fa72 in __libc_start_main libc-2.32.so[3ea72]
      openvelinux#14 0x43a3ee in _start ??:0

The --overwrite option implies --tail-synthesize, which collects non-sample
events reflecting the system status when recording finishes. However, when
evsel opening fails (e.g., unsupported event 'cycles-ct'), session->evlist
is not initialized and remains NULL. The code unconditionally calls
record__synthesize() in the error path, which iterates through the NULL
evlist pointer and causes a segfault.

To fix it, move the record__synthesize() call inside the error check block, so
it's only called when there was no error during recording, ensuring that evlist
is properly initialized.

Fixes: 4ea648a ("perf record: Add --tail-synthesize option")
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
guojinhui-liam pushed a commit that referenced this pull request Apr 16, 2026
commit 23cc6ef upstream.

Running syzkaller with the newly enabled signed integer overflow
sanitizer produces this report:

[  195.401651] ------------[ cut here ]------------
[  195.404808] UBSAN: signed-integer-overflow in ../fs/open.c:321:15
[  195.408739] 9223372036854775807 + 562984447377399 cannot be represented in type 'loff_t' (aka 'long long')
[  195.414683] CPU: 1 PID: 703 Comm: syz-executor.0 Not tainted 6.8.0-rc2-00039-g14de58dbe653-dirty #11
[  195.420138] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[  195.425804] Call Trace:
[  195.427360]  <TASK>
[  195.428791]  dump_stack_lvl+0x93/0xd0
[  195.431150]  handle_overflow+0x171/0x1b0
[  195.433640]  vfs_fallocate+0x459/0x4f0
...
[  195.490053] ------------[ cut here ]------------
[  195.493146] UBSAN: signed-integer-overflow in ../fs/open.c:321:61
[  195.497030] 9223372036854775807 + 562984447377399 cannot be represented in type 'loff_t' (aka 'long long)
[  195.502940] CPU: 1 PID: 703 Comm: syz-executor.0 Not tainted 6.8.0-rc2-00039-g14de58dbe653-dirty #11
[  195.508395] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[  195.514075] Call Trace:
[  195.515636]  <TASK>
[  195.517000]  dump_stack_lvl+0x93/0xd0
[  195.519255]  handle_overflow+0x171/0x1b0
[  195.521677]  vfs_fallocate+0x4cb/0x4f0
[  195.524033]  __x64_sys_fallocate+0xb2/0xf0

Historically, the signed integer overflow sanitizer did not work in the
kernel due to its interaction with `-fwrapv` but this has since been
changed [1] in the newest version of Clang. It was re-enabled in the
kernel with Commit 557f8c5 ("ubsan: Reintroduce signed overflow
sanitizer").

Let's use the check_add_overflow helper to first verify the addition
stays within the bounds of its type (long long); then we can use that
sum for the following check.

Link: llvm/llvm-project#82432 [1]
Closes: KSPP/linux#356
Cc: linux-hardening@vger.kernel.org
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/r/20240513-b4-sio-vfs_fallocate-v2-1-db415872fb16@google.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Xin Yin <yinxin.x@bytedance.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants