Skip to content

Conversation

@fredoh9
Copy link
Contributor

@fredoh9 fredoh9 commented Apr 22, 2021

In error case, no kernel check point causes TIMEOUT in SOF CI. The below
line will be last log without verdict, then ends with TIMEOUT.

[ERROR] Kernel check point "KERNEL_CHECKPOINT" is not properly set

Signed-off-by: Fred Oh fred.oh@linux.intel.com

In error case, no kernel check point causes TIMEOUT in SOF CI. The below
line will be last log without verdict, then ends with TIMEOUT.

[ERROR] Kernel check point "KERNEL_CHECKPOINT" is not properly set

Signed-off-by: Fred Oh <fred.oh@linux.intel.com>
@fredoh9 fredoh9 requested a review from a team as a code owner April 22, 2021 20:39
@fredoh9
Copy link
Contributor Author

fredoh9 commented Apr 22, 2021

I was able to test with TGLU_RVP_NOCODEC. With this PR, there is verdict at the end. Without this PR, last error was [ERROR] Kernel check point "KERNEL_CHECKPOINT" is not properly set

2021-04-22 20:18:47 UTC Sub-Test: [INFO] ===== Testing: (Round: 1/1) (PCM: smart-nocodec [hw:0,2]<playback>) (Loop: 1/1) =====
2021-04-22 20:18:47 UTC Sub-Test: [COMMAND] aplay   -Dhw:0,2 -r 48000 -c 2 -f S16_LE -d 3 /dev/zero -v -q
aplay: main:830: audio open error: Invalid argument
lsof: WARNING: can't stat() fuse.gvfsd-fuse file system /run/user/120/gvfs
      Output information may be incomplete.
2021-04-22 20:18:47 UTC Sub-Test: [ERROR] aplay on PCM hw:0,2 failed at 1/1.
2021-04-22 20:18:47 UTC Sub-Test: [ERROR] Starting func_exit_handler(), exit status=1, FUNCNAME stack:
2021-04-22 20:18:47 UTC Sub-Test: [ERROR]  die()  @  /home/ubuntu/sof-test/test-case/../case-lib/lib.sh
2021-04-22 20:18:47 UTC Sub-Test: [ERROR]  main()  @  /home/ubuntu/sof-test/test-case/check-playback.sh:99
2021-04-22 20:18:47 UTC Sub-Test: [INFO] Starting /usr/bin/sof-logger  -l /etc/sof/sof-tgl.ldc -o /home/ubuntu/sof-test/logs/check-kmod-load-unload-after-playback/2021-04-22-20:18:47-11971/etrace.txt
2021-04-22 20:18:48 UTC Sub-Test: [INFO] pkill -TERM sof-logger
Terminated
2021-04-22 20:18:49 UTC Sub-Test: [INFO] nlines=3 /home/ubuntu/sof-test/logs/check-kmod-load-unload-after-playback/2021-04-22-20:18:47-11971/etrace.txt
2021-04-22 20:18:49 UTC Sub-Test: [INFO] Test Result: FAIL!
2021-04-22 20:18:49 UTC [ERROR] aplay check failed
2021-04-22 20:18:49 UTC [ERROR] Starting func_exit_handler(), exit status=1, FUNCNAME stack:
2021-04-22 20:18:49 UTC [ERROR]  die()  @  /home/ubuntu/sof-test/test-case/../case-lib/lib.sh
2021-04-22 20:18:49 UTC [ERROR]  main()  @  /home/ubuntu/sof-test/test-case/check-kmod-load-unload-after-playback.sh:67
2021-04-22 20:18:49 UTC [INFO] Test Result: FAIL!

And TIMEOUT errors can be found in internal daily test 3430 with TGLU_RVP_NOCODEC also.


func_opt_parse_option "$@"
fxunc_opt_parse_option "$@"
setup_kernel_check_point
Copy link
Member

@plbossart plbossart Apr 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fredoh9 @marc-hb maybe a stupid question, but there was a similar change (EDIT: c76cb0b) from @aiChaoSONG for pm_runtime tests yesterday, so should we add this in ALL tests?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more or less what I asked in #639 (comment)

Some (future?) tests may want to scan the entire log since boot though. For instance we discussed one "empty" test only for the purpose of checking SOF modules loaded at boot time. It's still not clear what such a test should do wrt checkpoints, @aiChaoSONG

After excluding aliases and non-audio tests I found only 2 tests that don't invoke setup_kernel_check_point() directly:

+test-case/check-audio-equalizer.sh
+test-case/check-volume-levels.sh

Would these two tests also timeout if they fail?

Copy link

@aiChaoSONG aiChaoSONG Apr 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@plbossart @marc-hb Previously, when we were using DMESG_KERNEL_LAST_LINE which acted like kernel check point, this variable is implicitly set up in lib.sh. But I think it is better to set up check point explicitly in every test case, this will tell where the log collect is started.

In PR #653, I add an option to disable check point, with it, we are able to collect log from kernel boot

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so should we add this in ALL tests

I have a round of clean up on this within the journalctl PR, only minor number of test case(s) which is not run in PR test or daily test may need this. And also, when this happens now, some boundary condition is triggered.

@marc-hb
Copy link
Collaborator

marc-hb commented Apr 22, 2021

@fredoh9 , @aiChaoSONG can you say more about this TIMEOUT, why does it happen, where and how long it is?

Can we please change this checkpoint failure so it fails immediately? EDIT: done in separate #661

@marc-hb
Copy link
Collaborator

marc-hb commented Apr 22, 2021

In https://sof-ci.01.org/softestpr/PR655/build666/devicetest and https://sof-ci.01.org/softestpr/PR655/build687/devicetest/ the modified test failed on all platforms with: check-kmod-load-unload-after-playback.sh: line 51: fxunc_opt_parse_option: command not found

It's very good that it failed, no green failure for a change.

It's not so good that it timed-out. Why did the exit_handler not run? (it ran). Can we just print some magic string at the end?

@marc-hb
Copy link
Collaborator

marc-hb commented Apr 22, 2021

Some (future?) tests may want to scan the entire log since boot though. For instance we discussed one "empty" test only for the purpose of checking SOF modules loaded at boot time. It's still not clear what such a test should do wrt checkpoints, @aiChaoSONG

Already answered 3 days ago by @aiChaoSONG in #653, sorry I've been running behind with code reviews.

@marc-hb marc-hb changed the title sof-test: set the kernel check point before starting a test check-kmod: set the kernel check point before starting a test Apr 23, 2021
@marc-hb
Copy link
Collaborator

marc-hb commented Apr 23, 2021

It's not so good that it timed-out. Why did the exit_handler not run? (it ran). Can we just print some magic string at the end?

All checkpoint timeouts fixed in new #661 "don't die in the die handler"

OPT_HAS_ARG['p']=0 OPT_VAL['p']=1

func_opt_parse_option "$@"
fxunc_opt_parse_option "$@"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While fixing this typo, can you also add set -e if testing is OK?

If set -e requires some other changes then forget it, does not really belong to this PR.

@fredoh9
Copy link
Contributor Author

fredoh9 commented Apr 26, 2021

#661 is merged

@fredoh9 fredoh9 closed this Apr 26, 2021
@marc-hb
Copy link
Collaborator

marc-hb commented Apr 29, 2021

I'm confused @fredoh9 , @aiChaoSONG : don't we still want this?

@aiChaoSONG
Copy link

I'm confused @fredoh9 , @aiChaoSONG : don't we still want this?

I think we still need this, we don't die but only print error messages, without this, we may see the error messages printed out.

@marc-hb marc-hb reopened this Apr 29, 2021
@marc-hb
Copy link
Collaborator

marc-hb commented Nov 1, 2021

SOFCI TEST

EDIT: the fxunc_opt_parse_option typo still fails across all platforms.

@fredoh9 fredoh9 closed this Jun 3, 2022
@marc-hb
Copy link
Collaborator

marc-hb commented Jun 16, 2022

New KERNEL_CHECKPOINT discussion in

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants