Suppress sysrq Kernel panics for Kdump tests #4473

Open
SRIKKANTH wants to merge 7 commits into main from smyakam/2026_05_08/suppress_sysrq_for_kdumps

@SRIKKANTH
Collaborator

Description

In certain scenarios (e.g., 160 vCore VMs), panics triggered via sysrq are incorrectly reported as genuine kernel panics. Kdump tests trigger these panics intentionally, so this change excludes sysrq-triggered panics from panic detection during kdump testing.

Related Issue

Type of Change

  • Bug fix
  • New feature
  • Breaking change
  • Refactoring
  • Documentation update

Checklist

  • Description is filled in above
  • No credentials, secrets, or internal details are included
  • Peer review requested (if not, add required peer reviewers after raising PR)
  • Tests executed and results posted below

Test Validation

Key Test Cases:
verify_kdumpcrash_on_random_cpu

Impacted LISA Features:
KDUMP

Tested Azure Marketplace Images:

  • resf rockylinux-x86_64 9-base 9.6.20250531
  • canonical ubuntu-24_04-lts server 24.04.202603120

Test Results

| Image | VM Size | Result |
|---|---|---|
| verify_kdumpcrash_on_random_cpu | Standard_L160ias_v5 | PASSED |
| verify_kdumpcrash_on_random_cpu | Standard_L160ias_v5 | PASSED |

@SRIKKANTH SRIKKANTH requested a review from LiliDeng as a code owner May 11, 2026 03:51
Copilot AI review requested due to automatic review settings May 11, 2026 03:51
@github-actions

✅ AI Test Selection — PASSED

1 test case(s) selected (view run)

Marketplace image: canonical 0001-com-ubuntu-server-jammy 22_04-lts-gen2 latest

| | Count |
|---|---|
| ✅ Passed | 0 |
| ❌ Failed | 0 |
| ⏭️ Skipped | 1 |
| Total | 1 |

Test case details

| Test Case | Status | Time (s) | Message |
|---|---|---|---|
| verify_mshv_crash (lisa_0_0) | ⏭️ SKIPPED | 6.582 | before_case skipped: MSHV_DIAG not enabled, skip |

Contributor

Copilot AI left a comment

Pull request overview

This PR updates LISA’s kdump test flow to avoid treating sysrq-triggered (intentional) crashes as unexpected kernel panics during post-case serial-console panic scanning.

Changes:

  • Adds a kdump helper to extend SerialConsole.panic_ignorable_patterns with sysrq-related patterns.
  • Invokes the suppression logic at the end of kdump_test() after a successful dump verification and cleanup.
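
The interaction between panic patterns and ignorable patterns described above can be sketched as follows. This is a minimal illustration, not LISA's implementation: `find_panics`, `PANIC_PATTERNS`, and `IGNORABLE_PATTERNS` are hypothetical names, and the real scanning logic in `SerialConsole` may differ.

```python
import re
from typing import List, Pattern

# Hypothetical stand-ins for the detector's pattern lists.
PANIC_PATTERNS: List[Pattern[str]] = [
    re.compile(r"^(.*Kernel panic - not syncing:.*)$", re.MULTILINE),
    re.compile(r"^(.*RIP:.*)$", re.MULTILINE),
]
IGNORABLE_PATTERNS: List[Pattern[str]] = [
    re.compile(
        r"^(.*Kernel panic - not syncing: sysrq triggered crash.*)$",
        re.MULTILINE,
    ),
]


def find_panics(
    log: str, panic: List[Pattern[str]], ignorable: List[Pattern[str]]
) -> List[str]:
    """Return lines that match a panic pattern and no ignorable pattern."""
    hits = [m.group(1) for p in panic for m in p.finditer(log)]
    return [line for line in hits if not any(i.search(line) for i in ignorable)]


log = (
    "[   17.519617] Kernel panic - not syncing: sysrq triggered crash\n"
    "[   17.532453] RIP: 0033:0x75f94711c5a4\n"
)
remaining = find_panics(log, PANIC_PATTERNS, IGNORABLE_PATTERNS)
print(remaining)
```

In this sketch the panic line is filtered out, but the accompanying RIP line is still reported, which is why the RIP line also needs an ignorable pattern of its own.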

Comment thread lisa/tools/kdump.py
Comment thread lisa/tools/kdump.py Outdated
Comment on lines +919 to +920
# The RIP line accompanies the sysrq-triggered panic.
re.compile(r"^(.*RIP:.*)$", re.MULTILINE),
Collaborator

@SRIKKANTH RIP is too broad

Collaborator Author

There is not much to compare after RIP.
[ 17.532453] RIP: 0033:0x75f94711c5a4

Collaborator

RIP is too broad. kdump is run by many teams, and with this change we will start getting false panic detection. I had a discussion with Vijay/Tyle some months ago about kernel panic detection, and the meeting decided that we should not add overly broad detection and that every change to kernel panic detection should go through the kernel team for review.
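
The concern above can be illustrated with a short check. The second log line is a made-up example of a RIP line from an unrelated crash; it is not taken from this PR.

```python
import re

broad = re.compile(r"^(.*RIP:.*)$", re.MULTILINE)

log = "\n".join(
    [
        # From the sysrq-triggered crash discussed in this thread.
        "[   17.532453] RIP: 0033:0x75f94711c5a4",
        # Hypothetical line from an unrelated, genuine panic.
        "[  321.000000] RIP: 0010:some_driver_func+0x1a/0x40",
    ]
)

# findall returns the captured group for every matching line.
ignored = broad.findall(log)
print(len(ignored))
```

Both lines are captured, so a detector that treats this pattern as ignorable would drop the RIP line of the unrelated panic as well.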

Collaborator

You could do it like this:
re.compile(
    r"(?ms)^.*Kernel panic - not syncing: sysrq triggered crash.*\n"
    r"(?:^.*\n){0,80}?"
    r"^.*sysrq_handle_crash.*\n"
    r"(?:^.*\n){0,80}?"
    r"^(.*RIP: 0033:.*)$"
)

It does not ignore every RIP: line.
It does not require sysrq_handle_crash to be on the RIP: line.
It scopes the ignored RIP: 0033: to a panic block that is clearly the intentional sysrq-triggered kdump crash.
It still returns only the RIP: line as the ignored candidate, so it fits the current behavior.
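
A block-scoped pattern of this kind can be tried out in isolation. The sketch below is a variant of the suggestion, not the suggestion itself: it drops the `s` (DOTALL) flag, because with that flag `.` also matches newlines and `.*` can run across lines. It also assumes the pattern is applied to the whole log text at once; whether LISA's scanner does that is not established in this thread. The `other_log` line is made up.

```python
import re

# Variant without DOTALL, so "." stops at line ends.
scoped = re.compile(
    r"^.*Kernel panic - not syncing: sysrq triggered crash.*\n"
    r"(?:.*\n){0,80}?"
    r".*sysrq_handle_crash.*\n"
    r"(?:.*\n){0,80}?"
    r"(.*RIP: 0033:.*)$",
    re.MULTILINE,
)

sysrq_log = "\n".join(
    [
        "[   17.519617] Kernel panic - not syncing: sysrq triggered crash",
        "[   17.521245] Call Trace:",
        "[   17.522468]  sysrq_handle_crash+0x15/0x20",
        "[   17.532196]  entry_SYSCALL_64_after_hwframe+0x76/0x7e",
        "[   17.532453] RIP: 0033:0x75f94711c5a4",
        "[   17.536469]  </TASK>",
    ]
)
# Hypothetical lone RIP line with no preceding sysrq panic block.
other_log = "[   99.000000] RIP: 0033:0x7f0000000000"

match = scoped.search(sysrq_log)
print(match.group(1) if match else None)
print(scoped.search(other_log))
```

Only the RIP line is captured for the sysrq block, and the lone RIP line produces no match.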

Collaborator Author

The given regex does not catch the sysrq panic.

Added '0033:', which restricts the regex to kernel panics triggered from user space, such as 'echo c > /proc/sysrq-trigger':
re.compile(r"^(.*RIP: 0033:.*)$", re.MULTILINE),
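
A small check of what the `0033:` restriction does. On x86_64 Linux the value before the colon is the code segment selector, typically `0033` for 64-bit user mode and `0010` for kernel mode. The `classify_rip` helper and the kernel-mode sample line are illustrative assumptions, not part of the PR.

```python
import re
from typing import Optional

RIP_SELECTOR = re.compile(r"RIP:\s*([0-9a-fA-F]{4}):")


def classify_rip(line: str) -> Optional[str]:
    """Return 'user', 'kernel', or None based on the RIP segment selector."""
    m = RIP_SELECTOR.search(line)
    if m is None:
        return None
    selector = int(m.group(1), 16)
    # The low two bits of a selector hold the requested privilege level.
    return "user" if selector & 0x3 == 3 else "kernel"


narrow = re.compile(r"^(.*RIP: 0033:.*)$", re.MULTILINE)
user_line = "[   17.532453] RIP: 0033:0x75f94711c5a4"
# Hypothetical kernel-mode RIP line from an unrelated panic.
kernel_line = "[  321.000000] RIP: 0010:some_driver_func+0x1a/0x40"

print(classify_rip(user_line), bool(narrow.search(user_line)))
print(classify_rip(kernel_line), bool(narrow.search(kernel_line)))
```

A user-mode RIP line can also close the call trace of other crashes that happen in syscall context, so this narrows the match without tying it to sysrq specifically.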

Comment thread lisa/tools/kdump.py Outdated
),
re.compile(r"^(.*sysrq: SysRq : Trigger a crash.*)$", re.MULTILINE),
# The RIP line accompanies the sysrq-triggered panic.
re.compile(r"^(.*RIP:.*)$", re.MULTILINE),
Collaborator

@kanchansenlaskar kanchansenlaskar May 11, 2026

re.compile(r"^(.*RIP:.*)$", re.MULTILINE)
This pattern matches any RIP: line in the entire console log, not just the one from the sysrq crash. If a genuine, unrelated kernel panic also produces a RIP: line (which it almost always does), it would be silently ignored. Please scope this to the sysrq crash handler:
re.compile(r"^(.*RIP:.*sysrq_handle_crash.*)$", re.MULTILINE)

fixed the typo

Collaborator Author

Sample sysrq generated kernel panic. There is not much to compare after RIP.

[   17.519224] sysrq: Trigger a crash
[   17.519617] Kernel panic - not syncing: sysrq triggered crash
[   17.519964] CPU: 18 UID: 0 PID: 8948 Comm: echo Kdump: loaded Not tainted 6.17.0-1008-azure #8~24.04.1-Ubuntu VOLUNTARY 
[   17.520541] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 02/25/2026
[   17.521245] Call Trace:
[   17.521366]  <TASK>
[   17.521510]  dump_stack_lvl+0x27/0x70
[   17.521758]  dump_stack+0x10/0x20
[   17.521996]  vpanic+0x31f/0x3b0
[   17.522236]  panic+0x5f/0x60
[   17.522468]  sysrq_handle_crash+0x15/0x20
[   17.522726]  __handle_sysrq+0xe0/0x250
[   17.522931]  write_sysrq_trigger+0x5c/0x80
[   17.523182]  proc_reg_write+0x5e/0xa0
[   17.523443]  ? __cond_resched+0x1a/0x50
[   17.523654]  vfs_write+0xf9/0x440
[   17.523868]  ? srso_return_thunk+0x5/0x5f
[   17.524077]  ? xas_load+0x17/0x100
[   17.524290]  ? get_page_from_freelist+0x471/0x6c0
[   17.524601]  ? srso_return_thunk+0x5/0x5f
[   17.524808]  ? __cond_resched+0x1a/0x50
[   17.525025]  ? srso_return_thunk+0x5/0x5f
[   17.525234]  ? mutex_lock+0x12/0x40
[   17.525444]  ksys_write+0x71/0xf0
[   17.525625]  __x64_sys_write+0x19/0x20
[   17.525791]  x64_sys_call+0x79/0x20d0
[   17.526007]  do_syscall_64+0x7b/0xb70
[   17.526173]  ? srso_return_thunk+0x5/0x5f
[   17.526487]  ? cp_new_stat+0x141/0x170
[   17.526720]  ? srso_return_thunk+0x5/0x5f
[   17.526958]  ? __do_sys_newfstat+0x4c/0x80
[   17.527207]  ? srso_return_thunk+0x5/0x5f
[   17.527499]  ? srso_return_thunk+0x5/0x5f
[   17.527781]  ? arch_exit_to_user_mode_prepare.isra.0+0xd/0xc0
[   17.528261]  ? srso_return_thunk+0x5/0x5f
[   17.528528]  ? do_syscall_64+0xad/0xb70
[   17.528740]  ? srso_return_thunk+0x5/0x5f
[   17.528996]  ? count_memcg_events+0xba/0x1a0
[   17.529252]  ? srso_return_thunk+0x5/0x5f
[   17.529472]  ? handle_mm_fault+0x1d3/0x2d0
[   17.529677]  ? srso_return_thunk+0x5/0x5f
[   17.529939]  ? do_user_addr_fault+0x1b9/0x860
[   17.530191]  ? srso_return_thunk+0x5/0x5f
[   17.530460]  ? arch_exit_to_user_mode_prepare.isra.0+0xd/0xe0
[   17.530811]  ? srso_return_thunk+0x5/0x5f
[   17.531025]  ? irqentry_exit_to_user_mode+0x2d/0x1b0
[   17.531283]  ? srso_return_thunk+0x5/0x5f
[   17.531443]  ? irqentry_exit+0x1d/0x30
[   17.531705]  ? srso_return_thunk+0x5/0x5f
[   17.531957]  ? exc_page_fault+0x84/0x150
[   17.532196]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[   17.532453] RIP: 0033:0x75f94711c5a4
[   17.532661] Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d a5 ea 0e 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89
[   17.533822] RSP: 002b:00007ffe3b627158 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[   17.534173] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 000075f94711c5a4
[   17.534609] RDX: 0000000000000002 RSI: 000063539a6b10b0 RDI: 0000000000000001
[   17.535042] RBP: 00007ffe3b627180 R08: 0000000000000000 R09: 0000000000000410
[   17.535587] R10: 0000000000000001 R11: 0000000000000202 R12: 0000000000000002
[   17.536028] R13: 000063539a6b10b0 R14: 000075f9472045c0 R15: 000075f947201ee0
[   17.536469]  </TASK>
[   17.542975] Kernel Offset: 0x28200000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
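
The point about the RIP line can be checked against a few lines of the sample above. The pattern names are illustrative; `handler_anywhere` is included only to show where the symbol does appear.

```python
import re

handler_on_rip = re.compile(
    r"^(.*RIP:.*sysrq_handle_crash.*)$", re.MULTILINE
)
handler_anywhere = re.compile(r"^(.*sysrq_handle_crash.*)$", re.MULTILINE)

# Lines copied from the sample log above.
excerpt = "\n".join(
    [
        "[   17.522236]  panic+0x5f/0x60",
        "[   17.522468]  sysrq_handle_crash+0x15/0x20",
        "[   17.522726]  __handle_sysrq+0xe0/0x250",
        "[   17.532453] RIP: 0033:0x75f94711c5a4",
    ]
)

print(handler_on_rip.findall(excerpt))
print(handler_anywhere.findall(excerpt))
```

In this sample `sysrq_handle_crash` shows up only in the call trace, while the RIP line carries a bare user-space address, so a pattern that requires both on one line finds nothing.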

Collaborator

There was a typo in my comment; I have corrected it. Please check now.

Collaborator

Please scope this to the sysrq crash handler:
re.compile(r"^(.*RIP:.*sysrq_handle_crash.*)$", re.MULTILINE)

Collaborator

Can you try changing it
from:
re.compile(r"^(.*RIP:.*)$", re.MULTILINE)
to:
re.compile(r"^(.*RIP: 0033:.*)$", re.MULTILINE)

Comment thread lisa/tools/kdump.py
@github-actions

✅ AI Test Selection — PASSED

1 test case(s) selected (view run)

Marketplace image: canonical 0001-com-ubuntu-server-jammy 22_04-lts-gen2 latest

| | Count |
|---|---|
| ✅ Passed | 0 |
| ❌ Failed | 0 |
| ⏭️ Skipped | 1 |
| Total | 1 |

Test case details

| Test Case | Status | Time (s) | Message |
|---|---|---|---|
| verify_mshv_crash (lisa_0_0) | ⏭️ SKIPPED | 10.142 | before_case skipped: MSHV_DIAG not enabled, skip |

Copilot AI review requested due to automatic review settings May 11, 2026 05:15
@github-actions

✅ AI Test Selection — PASSED

1 test case(s) selected (view run)

Marketplace image: canonical 0001-com-ubuntu-server-jammy 22_04-lts-gen2 latest

| | Count |
|---|---|
| ✅ Passed | 0 |
| ❌ Failed | 0 |
| ⏭️ Skipped | 1 |
| Total | 1 |

Test case details

| Test Case | Status | Time (s) | Message |
|---|---|---|---|
| verify_mshv_crash (lisa_0_0) | ⏭️ SKIPPED | 6.839 | before_case skipped: MSHV_DIAG not enabled, skip |

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.

Comment thread lisa/tools/kdump.py
Comment on lines +981 to +988
re.compile(r"^(.*sysrq: SysRq : Trigger a crash.*)$", re.MULTILINE),
# The RIP line accompanies the sysrq-triggered panic.
re.compile(r"^(.*RIP:.*)$", re.MULTILINE),
]
# Shadow the class attribute on the instance so other nodes are unaffected.
existing = list(serial_console.panic_ignorable_patterns)
existing.extend(expected_patterns)
serial_console.panic_ignorable_patterns = existing
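
The copy-then-assign step in the excerpt above can be sketched in isolation. `FakeSerialConsole` and `suppress` are hypothetical stand-ins; only the attribute name `panic_ignorable_patterns` mirrors the PR, and the placeholder pattern is invented.

```python
import re
from typing import List, Pattern


class FakeSerialConsole:
    """Hypothetical stand-in; only the attribute name mirrors the PR."""

    panic_ignorable_patterns: List[Pattern[str]] = [
        re.compile(r"^(.*placeholder ignorable.*)$", re.MULTILINE),
    ]


def suppress(console: FakeSerialConsole, extra: List[Pattern[str]]) -> None:
    # Copy first: extending the class-level list in place would leak
    # the new patterns to every other console instance.
    existing = list(console.panic_ignorable_patterns)
    existing.extend(extra)
    # Assigning on the instance shadows the class attribute.
    console.panic_ignorable_patterns = existing


kdump_console = FakeSerialConsole()
other_console = FakeSerialConsole()
suppress(kdump_console, [re.compile(r"^(.*RIP: 0033:.*)$", re.MULTILINE)])

print(len(kdump_console.panic_ignorable_patterns))
print(len(other_console.panic_ignorable_patterns))
```

Only the console that went through `suppress` sees the extra pattern; the class-level list and other instances keep their original contents.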
Comment thread lisa/tools/kdump.py
@github-actions

✅ AI Test Selection — PASSED

1 test case(s) selected (view run)

Marketplace image: canonical 0001-com-ubuntu-server-jammy 22_04-lts-gen2 latest

| | Count |
|---|---|
| ✅ Passed | 0 |
| ❌ Failed | 0 |
| ⏭️ Skipped | 1 |
| Total | 1 |

Test case details

| Test Case | Status | Time (s) | Message |
|---|---|---|---|
| verify_mshv_crash (lisa_0_0) | ⏭️ SKIPPED | 9.189 | before_case skipped: MSHV_DIAG not enabled, skip |

Collaborator

@kanchansenlaskar kanchansenlaskar left a comment

The solution may not be perfect at the moment, but right now it only impacts kdump tests.

@SRIKKANTH SRIKKANTH marked this pull request as draft May 12, 2026 15:08
Copilot AI review requested due to automatic review settings May 12, 2026 15:09
@github-actions

✅ AI Test Selection — PASSED

1 test case(s) selected (view run)

Marketplace image: canonical 0001-com-ubuntu-server-jammy 22_04-lts-gen2 latest

| | Count |
|---|---|
| ✅ Passed | 0 |
| ❌ Failed | 0 |
| ⏭️ Skipped | 1 |
| Total | 1 |

Test case details

| Test Case | Status | Time (s) | Message |
|---|---|---|---|
| verify_mshv_crash (lisa_0_0) | ⏭️ SKIPPED | 9.486 | before_case skipped: MSHV_DIAG not enabled, skip |

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 3 comments.

Comment thread lisa/tools/kdump.py Outdated
Comment on lines +978 to +982
r"(?ms)^.*Kernel panic - not syncing: sysrq triggered crash.*\n"
r"(?:^.*\n){0,80}?"
r"^.*sysrq_handle_crash.*\n"
r"(?:^.*\n){0,80}?"
r"^(.*RIP: 0033:.*)$"
Comment thread lisa/tools/kdump.py
# [ 17.536469] </TASK>
# [ 17.542975] Kernel Offset: 0x28200000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) # noqa: E501

expected_patterns: List[re.Pattern[str]] = [
Comment thread lisa/tools/kdump.py
@github-actions

✅ AI Test Selection — PASSED

1 test case(s) selected (view run)

Marketplace image: canonical 0001-com-ubuntu-server-jammy 22_04-lts-gen2 latest

| | Count |
|---|---|
| ✅ Passed | 0 |
| ❌ Failed | 0 |
| ⏭️ Skipped | 1 |
| Total | 1 |

Test case details

| Test Case | Status | Time (s) | Message |
|---|---|---|---|
| verify_mshv_crash (lisa_0_0) | ⏭️ SKIPPED | 7.819 | before_case skipped: MSHV_DIAG not enabled, skip |

@SRIKKANTH SRIKKANTH marked this pull request as ready for review May 13, 2026 05:42
Copilot AI review requested due to automatic review settings May 13, 2026 05:42
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.

Comment thread lisa/tools/kdump.py
Comment on lines +978 to +983
r"^(.*Kernel panic - not syncing: sysrq triggered crash.*)$",
re.MULTILINE,
),
re.compile(r"^(.*sysrq: SysRq : Trigger a crash.*)$", re.MULTILINE),
# The RIP line accompanies the sysrq-triggered panic.
re.compile(r"^(.*RIP: 0033:.*)$", re.MULTILINE),
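
As a quick sanity check, the three patterns in this excerpt can be run against lines from the sample log posted earlier in the thread. The variable names are illustrative, and the check only covers that one sample (Ubuntu 24.04, kernel 6.17).

```python
import re

final_patterns = [
    re.compile(
        r"^(.*Kernel panic - not syncing: sysrq triggered crash.*)$",
        re.MULTILINE,
    ),
    re.compile(r"^(.*sysrq: SysRq : Trigger a crash.*)$", re.MULTILINE),
    re.compile(r"^(.*RIP: 0033:.*)$", re.MULTILINE),
]

# Lines copied from the sample sysrq panic log in this thread.
sample_lines = [
    "[   17.519224] sysrq: Trigger a crash",
    "[   17.519617] Kernel panic - not syncing: sysrq triggered crash",
    "[   17.532453] RIP: 0033:0x75f94711c5a4",
]

# Map each sample line to whether any of the patterns covers it.
covered = {
    line: any(p.search(line) for p in final_patterns) for line in sample_lines
}
for line, hit in covered.items():
    print(hit, line)
```

The panic line and the RIP line are covered. The sample prints `sysrq: Trigger a crash` without the `SysRq :` prefix, so the second pattern does not match that form; whether that line is ever flagged by the panic detector is not shown in this thread.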
5 participants