@MingcongBai
Contributor

This workaround causes instability for LoongArch/Loongson (MIPS) devices based on the 7A1000/2000 chipset under heavy I/O load.

FIXME: Disable this workaround until we find a better fix (possibly in the platform-specific PCI code).

@deepin-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign yukarichiba for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@FlyGoat
Contributor

FlyGoat commented Jul 16, 2024

MACH_LOONGSON64 is enough to cover the affected platforms on both LA and MIPS.

… Loongson64

This workaround causes instability for LoongArch/Loongson (MIPS) devices
based on the 7A1000/2000 chipset under heavy I/O load.

FIXME: Disable this workaround until we find a better fix (possibly in the
platform-specific PCI code).

Signed-off-by: Mingcong Bai <baimingcong@uniontech.com>
@MingcongBai MingcongBai force-pushed the bai/linux-6.6.y/loongarch64-amdgpu-workaround branch from edce4eb to 8f09e0a Compare July 16, 2024 06:45
@deepin-ci-robot

deepin pr auto review

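Key summary:

  • In AMDGPU's ring_emit_fence_gfx function, a conditional-compilation directive was added to disable the workaround for the LoongArch/Loongson (MIPS) 7A1000/2000 chipsets, which helps address the instability seen under heavy load.
  • The comment reads "FIXME: Disable this workaround until we find a better fix (possibly in the platform-specific PCI code)". This indicates that disabling the workaround is a stopgap and that a better fix still needs to be found.

Is an immediate change recommended:

  • Yes. Making the change now avoids potential system instability. The "FIXME" in the comment should also be tracked so that the underlying problem gets a proper fix in time.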

@MingcongBai
Contributor Author

MACH_LOONGSON64 is enough to cover the affected platforms on both LA and MIPS.

Understood, I have updated the patch accordingly.

@opsiff opsiff merged commit 46ba4ef into linux-6.6.y Jul 16, 2024
@MingcongBai MingcongBai deleted the bai/linux-6.6.y/loongarch64-amdgpu-workaround branch July 22, 2024 06:06
opsiff pushed a commit to opsiff/UOS-kernel that referenced this pull request Dec 29, 2025
… Loongson64

This workaround causes instability for LoongArch/Loongson (MIPS) devices
based on the 7A1000/2000 chipset under heavy I/O load.

FIXME: Disable this workaround until we find a better fix (possibly in the
platform-specific PCI code).

Signed-off-by: Mingcong Bai <baimingcong@uniontech.com>
Link: deepin-community#323
(cherry picked from commit 7060b71)
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>

Conflicts:
	drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
opsiff pushed a commit to opsiff/UOS-kernel that referenced this pull request Jan 4, 2026
commit ef7f38df890f5dcd2ae62f8dbde191d72f3bebae upstream.

Synthetic events currently do not have a function to register perf events.
This leads to calling the tracepoint register functions with a NULL
function pointer which triggers:

 ------------[ cut here ]------------
 WARNING: kernel/tracepoint.c:175 at tracepoint_add_func+0x357/0x370, CPU#2: perf/2272
 Modules linked in: kvm_intel kvm irqbypass
 CPU: 2 UID: 0 PID: 2272 Comm: perf Not tainted 6.18.0-ftest-11964-ge022764176fc-dirty #323 PREEMPTLAZY
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.17.0-debian-1.17.0-1 04/01/2014
 RIP: 0010:tracepoint_add_func+0x357/0x370
 Code: 28 9c e8 4c 0b f5 ff eb 0f 4c 89 f7 48 c7 c6 80 4d 28 9c e8 ab 89 f4 ff 31 c0 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc cc <0f> 0b 49 c7 c6 ea ff ff ff e9 ee fe ff ff 0f 0b e9 f9 fe ff ff 0f
 RSP: 0018:ffffabc0c44d3c40 EFLAGS: 00010246
 RAX: 0000000000000001 RBX: ffff9380aa9e4060 RCX: 0000000000000000
 RDX: 000000000000000a RSI: ffffffff9e1d4a98 RDI: ffff937fcf5fd6c8
 RBP: 0000000000000001 R08: 0000000000000007 R09: ffff937fcf5fc780
 R10: 0000000000000003 R11: ffffffff9c193910 R12: 000000000000000a
 R13: ffffffff9e1e5888 R14: 0000000000000000 R15: ffffabc0c44d3c78
 FS:  00007f6202f5f340(0000) GS:ffff93819f00f000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 000055d3162281a8 CR3: 0000000106a56003 CR4: 0000000000172ef0
 Call Trace:
  <TASK>
  tracepoint_probe_register+0x5d/0x90
  synth_event_reg+0x3c/0x60
  perf_trace_event_init+0x204/0x340
  perf_trace_init+0x85/0xd0
  perf_tp_event_init+0x2e/0x50
  perf_try_init_event+0x6f/0x230
  ? perf_event_alloc+0x4bb/0xdc0
  perf_event_alloc+0x65a/0xdc0
  __se_sys_perf_event_open+0x290/0x9f0
  do_syscall_64+0x93/0x7b0
  ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
  ? trace_hardirqs_off+0x53/0xc0
  entry_SYSCALL_64_after_hwframe+0x76/0x7e

Instead, have the code return -ENODEV, which doesn't warn and has perf
error out with:

 # perf record -e synthetic:futex_wait
Error:
The sys_perf_event_open() syscall returned with 19 (No such device) for event (synthetic:futex_wait).
"dmesg | grep -i perf" may provide additional information.

Ideally perf should support synthetic events, but for now just fix the
warning. The support can come later.

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://patch.msgid.link/20251216182440.147e4453@gandalf.local.home
Fixes: 4b14793 ("tracing: Add support for 'synthetic' events")
Reported-by: Ian Rogers <irogers@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 6df47e5bb9b62d72f186f826ab643ea1856877c7)
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
lanlanxiyiji pushed a commit that referenced this pull request Jan 4, 2026
… Loongson64

This workaround causes instability for LoongArch/Loongson (MIPS) devices
based on the 7A1000/2000 chipset under heavy I/O load.

FIXME: Disable this workaround until we find a better fix (possibly in the
platform-specific PCI code).

Signed-off-by: Mingcong Bai <baimingcong@uniontech.com>
Link: #323
(cherry picked from commit 7060b71)
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>

Conflicts:
	drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c