
[LNL] hang, timeout or crash in pause-release #5109

@marc-hb

Description


I spent a long time looking for an existing bug and found many similar issues (see the list below), but I think this one is either brand new or was never filed. Part of the problem was that the Expect part of the pause-resume test was utterly buggy, which didn't help. Now that I have rewritten it in thesofproject/sof-test#1218, we can finally pay less attention to the test code and a bit more attention to the product.

The WARNING: received == PAUSE == while in state recording! Ignoring. message is very surprising. It showed up more than once.
EDIT: this message is gone since the "fail fast" test fix thesofproject/sof-test#1226. Never look past the very first error, which is now usually file descriptor in bad state.
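To make the surprising warning concrete: the real check lives in apause.exp (Tcl/Expect), but here is a hypothetical Python sketch of the state machine involved. All names are illustrative, not the actual test code; the idea is that the script tracks whether arecord should currently be paused or recording, and flags a `=== PAUSE ===` marker that arrives in the wrong state.

```python
# Illustrative sketch only -- the real logic is in
# sof-test/case-lib/apause.exp (Tcl/Expect), not this Python.

RECORDING, PAUSED = "recording", "paused"

def handle_output(state, token):
    """Return (new_state, warning_or_None) for one matched output token.

    'PAUSE' stands for a matched '=== PAUSE ===' line, 'VU' for a
    matched volume-meter line (arecord -vv output while capturing).
    """
    if token == "PAUSE":
        if state == RECORDING:
            # The surprising case from the logs above: a pause marker
            # arrived while the test believed arecord was recording.
            return state, ("WARNING: received == PAUSE == while in state "
                           "recording! Ignoring.")
        return state, None      # pause marker while paused: expected
    if token == "VU":           # volume meter seen: arecord is capturing
        return RECORDING, None
    return state, None

state, warn = handle_output(RECORDING, "PAUSE")
print(warn)
```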

2024-07-16T13:31:16-07:00

Linux Branch: topic/sof-dev
Commit: 1998ade4783a
Kconfig Branch: master
Kconfig Commit: 8189104a4f38
SOF Branch: main
Commit: 3051607efb4f
Zephyr Commit: 650227d8c47f

https://sof-ci.01.org/softestpr/PR1218/build632/devicetest/index.html?model=LNLM_RVP_NOCODEC&testcase=multiple-pause-resume-50
https://sof-ci.01.org/softestpr/PR966/build659/devicetest/index.html
https://sof-ci.01.org/softestpr/PR812/build656/devicetest/index.html
https://sof-ci.01.org/linuxpr/PR5106/build4026/devicetest/index.html?model=LNLM_RVP_NOCODEC&testcase=multiple-pause-resume-50
https://sof-ci.01.org/sofpr/PR9305/build6551/devicetest/index.html

https://sof-ci.01.org/softestpr/PR1224/build686/devicetest/index.html

https://sof-ci.01.org/sofpr/PR9335/build6748/devicetest/index.html

The logs don't all look the same, but here's a typical one:

t=7412 ms: cmd3 arecord Port0 2nd Capture: (16/50) Found volume ### | xx%, recording for 38 ms
t=7538 ms: cmd1 arecord Port0: (19/50) Found   === PAUSE ===  ,  pausing for 36 ms
t=7420 ms: cmd3 arecord Port0 2nd Capture: (16/50) Found   === PAUSE ===  ,  pausing for 22 ms
t=7553 ms: cmd3 arecord Port0 2nd Capture: (17/50) Found volume ### | xx%, recording for 47 ms
t=7553 ms: cmd3 arecord Port0 2nd Capture: WARNING: received == PAUSE == while in state recording! Ignoring.
t=7794 ms: cmd1 arecord Port0: (20/50) Found volume ### | xx%, recording for 36 ms
t=7919 ms: cmd1 arecord Port0: (20/50) Found   === PAUSE ===  ,  pausing for 44 ms
t=8170 ms: cmd1 arecord Port0: (21/50) Found volume ### | xx%, recording for 27 ms
t=8295 ms: cmd1 arecord Port0: (21/50) Found   === PAUSE ===  ,  pausing for 26 ms
t=8546 ms: cmd1 arecord Port0: (22/50) Found volume ### | xx%, recording for 29 ms


t=18714 ms: cmd1 arecord Port0: (49/50) Found volume ### | xx%, recording for 20 ms
t=18839 ms: cmd1 arecord Port0: (49/50) Found   === PAUSE ===  ,  pausing for 39 ms
t=19090 ms: cmd1 arecord Port0: (50/50) Found volume ### | xx%, recording for 29 ms
t=19215 ms: cmd1 arecord Port0: (50/50) Found   === PAUSE ===  ,  pausing for 33 ms
t=19215 ms: cmd1 arecord Port0: WARNING: volume was always 00%!
t=19215 ms: cmd1 arecord Port0: SUCCESS: /home/ubuntu/sof-test/case-lib/apause.exp arecord -D hw:0,0 -r 48000 -c 2 -f S16_LE -vv -i /dev/null

2024-07-16 07:53:47 UTC [REMOTE_ERROR] Still have expect process not finished after wait for 250
11568 expect /home/ubuntu/sof-test/case-lib/apause.exp cmd3 arecord Port0 2nd Capture 50 20 30 arecord -D hw:0,12 -r 48000 -c 2 -f S16_LE -vv -i /dev/null
11570 arecord -D hw:0,12 -r 48000 -c 2 -f S16_LE -vv -i /dev/null
2024-07-16 07:53:47 UTC [REMOTE_INFO] Starting func_exit_handler(1)
2024-07-16 07:53:47 UTC [REMOTE_ERROR] Starting func_exit_handler(), exit status=1, FUNCNAME stack:
2024-07-16 07:53:47 UTC [REMOTE_ERROR]  main()  @  /home/ubuntu/sof-test/test-case/multiple-pause-resume.sh
2024-07-16 07:53:48 UTC [REMOTE_INFO] pkill -TERM -f mtrace-reader.py
2024-07-16 07:53:48 UTC [REMOTE_INFO] nlines=24148 /home/ubuntu/sof-test/logs/multiple-pause-resume/2024-07-16-07:45:35-16732/mtrace.txt
+ grep -B 2 -A 1 -i --word-regexp -e ERR -e ERROR -e '' -e OSError /home/ubuntu/sof-test/logs/multiple-pause-resume/2024-07-16-07:45:35-16732/mtrace.txt
2024-07-16 07:53:50 UTC [REMOTE_INFO] ktime=1779 sof-test PID=10390: ending
2024-07-16 07:53:50 UTC [REMOTE_WARNING] Process(es) started by /home/ubuntu/sof-test/test-case/multiple-pause-resume.sh are still active, killing these process(es):
2024-07-16 07:53:50 UTC [REMOTE_WARNING] Catch pid: 11568 expect /home/ubuntu/sof-test/case-lib/apause.exp cmd3 arecord Port0 2nd Capture 50 20 30 arecord -D hw:0,12 -r 48000 -c 2 -f S16_LE -vv -i /dev/null
2024-07-16 07:53:50 UTC [REMOTE_WARNING] Kill cmd:'expect /home/ubuntu/sof-test/case-lib/apause.exp cmd3 arecord Port0 2nd Capture 50 20 30 arecord -D hw:0,12 -r 48000 -c 2 -f S16_LE -vv -i /dev/null' by kill -9
/home/ubuntu/sof-test/case-lib/hijack.sh: line 204: 11568 Killed                  "$TOPDIR"/case-lib/apause.exp "$shortname" "$repeat_count" "$rnd_min" "$rnd_range" "$cmd" -D "$dev" -r "$rate" -c "$channel" -f "$fmt" -vv -i "$file"
2024-07-16 07:53:50 UTC [REMOTE_INFO] Test Result: FAIL!
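As a side note, the hijack.sh line above shows how the positional arguments of apause.exp map to variables (`$repeat_count`, `$rnd_min`, `$rnd_range`): in the failing command they are 50, 20 and 30, which matches the 20-49 ms pause/record durations in the log. A minimal sketch of that randomized-duration scheme, under my interpretation of those arguments (not the actual Tcl code):

```python
import random

def random_duration_ms(rnd_min=20, rnd_range=30):
    """Draw a duration as rnd_min + [0, rnd_range) ms.

    With the 50/20/30 arguments seen in the log, this yields the
    20-49 ms 'pausing for N ms' / 'recording for N ms' values above.
    (Interpretation of the apause.exp arguments; illustrative only.)
    """
    return rnd_min + random.randrange(rnd_range)

durations = [random_duration_ms() for _ in range(50)]
assert all(20 <= d <= 49 for d in durations)
```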
[ 1682.256786] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 2079, max 2127, overruns 0
[ 1683.280786] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 2079, max 2117, overruns 0
[ 1684.304788] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 2079, max 2120, overruns 0
[ 1685.328786] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 2079, max 2128, overruns 0
[ 1686.352788] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 2079, max 2117, overruns 0
[ 1687.376788] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 2079, max 2123, overruns 0
[ 1688.400788] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 2079, max 2121, overruns 0
[ 1689.424788] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 2079, max 2343, overruns 0
[ 1690.448788] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 2079, max 2120, overruns 0
[ 1691.472788] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 2079, max 2117, overruns 0
[ 1692.496788] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 2079, max 2130, overruns 0
[ 1693.520788] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 2079, max 2122, overruns 0
[ 1694.544788] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 2079, max 2127, overruns 0
[ 1695.568788] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 2079, max 2117, overruns 0
[ 1696.592788] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 2079, max 2128, overruns 0
[ 1697.616788] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 2079, max 2126, overruns 0
[ 1698.640788] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 2079, max 2123, overruns 0
[ 1699.664788] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 2079, max 2119, overruns 0
[ 1700.688786] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 2079, max 2116, overruns 0
[ 1701.712788] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 2079, max 2127, overruns 0
[ 1702.736788] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 2079, max 2124, overruns 0
[ 1703.760788] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 2079, max 2127, overruns 0
[ 1704.784788] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 2079, max 2117, overruns 0
[ 1705.808786] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 2079, max 2126, overruns 0
[ 1706.832788] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 2079, max 2117, overruns 0
[ 1707.856786] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 2079, max 2126, overruns 0
[ 1708.880786] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 2079, max 2128, overruns 0
[ 1709.904788] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 2079, max 2126, overruns 0

etc.


Labels: LNL (Applies to Lunar Lake platform), bug (Something isn't working)