Skip to content

Conversation

@marc-hb
Copy link
Collaborator

@marc-hb marc-hb commented Jun 10, 2022

This should help identifying suspend/resume issues unrelated to audio

2 commits must be reviewed separately

Per rationale detailed in thesofproject#740.

Also add new $TOPDIR constant.

Absolutely zero functional change.

Signed-off-by: Marc Herbert <marc.herbert@intel.com>
@marc-hb marc-hb marked this pull request as ready for review June 10, 2022 23:00
@marc-hb marc-hb requested a review from a team as a code owner June 10, 2022 23:00
This should help identifying suspend/resume issues unrelated to audio

Signed-off-by: Marc Herbert <marc.herbert@intel.com>
@marc-hb marc-hb force-pushed the suspend-without-audio branch from 7ef3c8a to 87064e9 Compare June 13, 2022 16:40
@marc-hb
Copy link
Collaborator Author

marc-hb commented Jun 13, 2022

Unrelated snap / SELinux issue on https://sof-ci.01.org/softestpr/PR918/build49/devicetest/?model=TGLU_UP_HDA_ZEPHYR&testcase=verify-kernel-boot-log, working on it. everything else green.

Jun 13 17:06:56 sh-tglu-up-hda-04 snapd-desktop-integration.snapd-desktop-integration[1419]: cmd_run.go:1044: WARNING: cannot create user data directory: failed to verify SELinux context of /home/ubuntu/snap: exec: "matchpathcon": executable file not found in $PATH

EDIT: fixed by removing all snaps

do
sleep_once "$i"
done

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should there be a checkpoint here before reloading audio drivers?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There could be but it's not really necessary because one of the last thing sleep_once did is to check the kernel logs and die if that failed. So if the execution makes it to this point it means the kernel logs are free of errors so far.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't disagree, it's more that if you get an error while loading audio drivers it might be a bit difficult to check if the test worked until the point where you start loading driver. You'd get a fail and no intermediate information, that's it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You will see the difference because of the different die error message; one is dump_and_die "Caught error in kernel log" whereas the other one below is die "Found kernel error after reloading audio drivers"

Setting the $KERNEL_CHECKPOINT is completely quiet so it won't help in that respect.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think of $KERNEL_CHECKPOINT as a way to avoid errors from previous tests but inside the same test looking for errors and dying is actually a "stronger" checkpoint.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will have to trust you on that, I don't have any context here.

Copy link
Member

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@marc-hb
Copy link
Collaborator Author

marc-hb commented Jun 16, 2022

There is another, theoretical problem with $KERNEL_CHECKPOINT: it's not "atomic". In other words, kernel errors happening between sof-kernel-log-check.sh $KERNEL_CHECKPOINT and $KERNEL_CHECKPOINT=now are theoretically missed! This is not a real problem in practice because:

  • we usually do nothing in that window
  • we "cheat" and actually check KERNEL_CHECKPOINT - N seconds behind the scenes! Ugly workaround.

But still, journalctl performance issues and other exceptions aside I think we should generally set KERNEL_CHECKPOINT only once at the start of a test to be really sure not to miss any error.

@aiChaoSONG , @fredoh9 thoughts?

@marc-hb marc-hb merged commit 4c324a4 into thesofproject:main Jun 17, 2022
@marc-hb marc-hb deleted the suspend-without-audio branch June 17, 2022 17:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants