trace: enable trace after it is ready #4636
src/trace/dma-trace.c
So calling dma_trace_init_complete() turns tracing on, but calling trace_init() does not.
I feel that the normal sequence for enabling what I think is called the DMA trace should be:
dma_trace_init_complete()
trace_on()
And for the other trace we should do:
trace_init()
trace_on()
Good point. Moving the trace_on() call to the end of primary_core_init(), so that it is invoked in one place, looks better to me.
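For illustration only, the ordering discussed above could be sketched roughly as follows. This is not the SOF implementation: `struct demo_trace` and every `demo_*` function are hypothetical stand-ins for `trace_init()`, `trace_on()` and `primary_core_init()`, and the real functions may have different signatures and side effects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the firmware trace state; not the real SOF struct. */
struct demo_trace {
	bool ready;   /* set once the trace backend has been initialized */
	bool enabled; /* set only by demo_trace_on() */
};

/* Illustrative counterpart of trace_init(): prepares state, leaves tracing off. */
void demo_trace_init(struct demo_trace *t)
{
	assert(t != NULL);
	t->ready = true;
	t->enabled = false;
}

/* Illustrative counterpart of trace_on(): refuses to enable before init. */
bool demo_trace_on(struct demo_trace *t)
{
	assert(t != NULL);
	if (!t->ready)
		return false;
	t->enabled = true;
	return true;
}

/* Illustrative counterpart of primary_core_init(): tracing is enabled last. */
bool demo_primary_core_init(struct demo_trace *t)
{
	demo_trace_init(t);
	/* ... other subsystem initialization would run here ... */
	return demo_trace_on(t);
}
```

With this shape there is a single place where tracing is switched on, and enabling it before initialization is rejected rather than silently accepted.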
lgirdwood
left a comment
@dbaluta good for you now? Want to get this in the v1.9 stable branch.
@keyonjie seeing this in the CI for CML. Will rerun CI tests to rule out script or DUT.
SOFCI TEST
@lgirdwood it looks like the failure on CML module unload/reload is valid; let me try to figure it out tomorrow.
@lgirdwood looks good to me. When do you plan to create the v1.9 branch/tag? Good to merge once @keyonjie sorts out the CI failure.
@lgirdwood @dbaluta it looks like the CI failure is not related. I just rebased and force-pushed to trigger a re-test.
@keyonjie looks like a null pointer deref race after the latest CI results.
Sorry, now done (everyone is on holiday so my workload is high). We have the stable-v1.9 branch; there are a couple of fixes pending though prior to rc1. I've no issues if you need to do an rc1 tag. The Intel fixes will probably land next week (which I could tag as rc2).
Thanks, no need for tagging on our side. I will go with your pace for tagging and RCs.
If the log tracing (e.g. tr_err()) is called before the trace itself is available, the FW will crash and FW boot fail happen. Enable the trace after it is ready, and don't try to perform tracing when it is unavailable. We have the empty version of trace_init/on(), so the extra "#ifdef" in primary_core_init() is superfluous. Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
That's not my experience; I've seen tr_xxx() called thousands of times before the trace was initialized, and this never caused any issue because the code was smart enough not to do anything in that case. See for instance PR #4334. What crashed exactly, and where? Bug fixes should fix well-identified bug sequences, not just "something crashes".
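To make the "smart enough not to do anything" behaviour concrete, here is a minimal sketch of a logging call that is a no-op until the trace context exists and is enabled. It is an assumption about the general pattern, not the actual SOF code: `demo_tr_err()`, `demo_trace_setup()`, `struct demo_trace` and the in-memory `last` buffer are all hypothetical names invented for this example.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical trace context; the real firmware structure differs. */
struct demo_trace {
	bool enabled;
	char last[128]; /* last formatted message, kept here for illustration */
};

/* NULL until demo_trace_setup() runs, mimicking a not-yet-initialized trace. */
static struct demo_trace *demo_trace_ctx;

/* Returns the formatted length, or 0 when the call was silently dropped. */
int demo_tr_err(const char *fmt, ...)
{
	va_list args;
	int len;

	assert(fmt != NULL);

	/* Guard: do nothing if the trace is not available yet. */
	if (demo_trace_ctx == NULL || !demo_trace_ctx->enabled)
		return 0;

	va_start(args, fmt);
	len = vsnprintf(demo_trace_ctx->last, sizeof(demo_trace_ctx->last),
			fmt, args);
	va_end(args);
	return len;
}

/* Hypothetical setup step: makes the context visible and enables logging. */
void demo_trace_setup(struct demo_trace *t)
{
	assert(t != NULL);
	t->enabled = true;
	t->last[0] = '\0';
	demo_trace_ctx = t;
}
```

In this sketch an early call such as `demo_tr_err("early %d", 1)` returns 0 and touches no memory, which is the property the comment above describes; whether a given firmware path really has such a guard is exactly what a reproduction would need to establish.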
... and now the trace is crashing!! See the report in #4676. Revert this ASAP?
Easy: merging bug fixes for bugs that have no description, no test and no reproduction steps.
I hit this during memory allocation debugging; let me submit a PR to demonstrate it.
I just rechecked the CI results of the PR. We didn't capture failures on multiple platforms, but the issue was observed on ADLP_RVP_NOCODEC; maybe the "always failing" status of the _ZEPHYR platforms reduced reviewers' attention and it slipped through.
Another regression in #4699
I don't know that yet, I'm just getting back into it. I was just connecting github dots above.
I don't understand what this means. If you look at the sof-logger results in some random, recent PR you can see that the logger.etrace header is present on every platform: https://sof-ci.01.org/sofpr/PR4701/build10153/devicetest/
@marc-hb Mailbox trace being disabled by default means that etrace will not get the regular traces, although error traces might go through it anyway. That's the default behaviour I observe on my end.
Yes that's correct, I remember now. I had forgotten that very poor name in the Kconfig interface, if someone had said TRACEM instead I would have understood immediately :-)
As reported in thesofproject#4759, thesofproject#4636 and a few others linked from there. Signed-off-by: Marc Herbert <marc.herbert@intel.com>
This shows that thesofproject#4636 did not fix any real-world issue. thesofproject#4636 changed some sof->trace logic, however sof->trace does not exist before trace_init() and calling tr_err() immediately after trace_init() works. Note DSP panics were detected immediately after thesofproject#4636 was merged, see reports in thesofproject#4676 Signed-off-by: Marc Herbert <marc.herbert@intel.com>
Finally submitted a revert in #4760
As reported in thesofproject#4759, thesofproject#4636 and a few others linked from there. Signed-off-by: Marc Herbert <marc.herbert@intel.com> (cherry picked from commit 3ff1dc0)
If the log tracing (e.g. tr_err()) is called before the trace itself is
available, the FW will crash and FW boot fail happen.
Enable the trace after it is ready, and don't try to perform tracing
when it is unavailable.
Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>