-
Notifications
You must be signed in to change notification settings - Fork 349
[Not For Merge] PR to demonstrate trace crash issue #4759
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
marc-hb
added a commit
to marc-hb/sof
that referenced
this pull request
Sep 14, 2021
As reported in thesofproject#4759, thesofproject#4636 and a few others linked from there. Signed-off-by: Marc Herbert <marc.herbert@intel.com>
marc-hb
added a commit
to marc-hb/sof
that referenced
this pull request
Sep 14, 2021
This reverts commit 7df3674. This restores the ability to use CONFIG_TRACEM (copy everything to mailbox) without crashing, in other words it fixes thesofproject#4699 This also fixes the other DSP panic thesofproject#4676 and removes the need for logical changes in thesofproject#4678, which can be reverted too. commit 7df3674 ("trace: enable trace after it is ready") was meant to fix a crash when tr_xxx() was used early. However I've used very early tracing for months and it never caused any crash (see thesofproject#4334) I tried adding a tr_err() statement immediately after trace_init(sof) in primary_core_init() and it works just fine. primary_core_init() runs extremely early so I don't think it's too demanding not to use an tr_XXX() before the trace even exists. The reverted commits confused initializing and enabling. Reproduction thesofproject#4683 did not seem to demonstrate anything obvious, there's not even a link to a failed test run. I don't understand how playing with spin locks is relevant to this. Later, reproduction thesofproject#4759 finally demonstrated the real issue: through DEBUG_TRACE_PTR(), some tr_XXX() can indeed be called (in very unusal debug circumstances specific to the original author) before the trace is initialized. The previous commit in this series fixes that by simply guarding it with if(trace_get()) -------- I am _not_ pretending that these reverts make the tracing code bug-free and perfect again, absolutely not and very far from it. I'm merely saying that: - The first reverted commit caused at least two regressions: thesofproject#4676 and thesofproject#4699 - These two commits added yet another variable (time) in an already complex situation with an already existing combinatorial "explosion": compile-time Kconfigs, run-time settings, platform-specific bugs (thesofproject#4333, thesofproject#4573, ...), various races, mbox + DMA, different DMA engines, Zephyr vs XTOS, etc. - Last but not least, we don't want to invest in making the exist trace implementation better. We want to switch to the Zephyr implementation instead So let's go back to a previous known good state, I mean _relatively_ good and stay there if we can. Signed-off-by: Marc Herbert <marc.herbert@intel.com>
lgirdwood
pushed a commit
that referenced
this pull request
Sep 15, 2021
This reverts commit 7df3674. This restores the ability to use CONFIG_TRACEM (copy everything to mailbox) without crashing, in other words it fixes #4699 This also fixes the other DSP panic #4676 and removes the need for logical changes in #4678, which can be reverted too. commit 7df3674 ("trace: enable trace after it is ready") was meant to fix a crash when tr_xxx() was used early. However I've used very early tracing for months and it never caused any crash (see #4334) I tried adding a tr_err() statement immediately after trace_init(sof) in primary_core_init() and it works just fine. primary_core_init() runs extremely early so I don't think it's too demanding not to use an tr_XXX() before the trace even exists. The reverted commits confused initializing and enabling. Reproduction #4683 did not seem to demonstrate anything obvious, there's not even a link to a failed test run. I don't understand how playing with spin locks is relevant to this. Later, reproduction #4759 finally demonstrated the real issue: through DEBUG_TRACE_PTR(), some tr_XXX() can indeed be called (in very unusal debug circumstances specific to the original author) before the trace is initialized. The previous commit in this series fixes that by simply guarding it with if(trace_get()) -------- I am _not_ pretending that these reverts make the tracing code bug-free and perfect again, absolutely not and very far from it. I'm merely saying that: - The first reverted commit caused at least two regressions: #4676 and #4699 - These two commits added yet another variable (time) in an already complex situation with an already existing combinatorial "explosion": compile-time Kconfigs, run-time settings, platform-specific bugs (#4333, #4573, ...), various races, mbox + DMA, different DMA engines, Zephyr vs XTOS, etc. - Last but not least, we don't want to invest in making the exist trace implementation better. We want to switch to the Zephyr implementation instead So let's go back to a previous known good state, I mean _relatively_ good and stay there if we can. Signed-off-by: Marc Herbert <marc.herbert@intel.com>
marc-hb
added a commit
to marc-hb/sof
that referenced
this pull request
Sep 21, 2021
As reported in thesofproject#4759, thesofproject#4636 and a few others linked from there. Signed-off-by: Marc Herbert <marc.herbert@intel.com> (cherry picked from commit 3ff1dc0)
marc-hb
added a commit
to marc-hb/sof
that referenced
this pull request
Sep 21, 2021
This reverts commit 7df3674. This restores the ability to use CONFIG_TRACEM (copy everything to mailbox) without crashing, in other words it fixes thesofproject#4699 This also fixes the other DSP panic thesofproject#4676 and removes the need for logical changes in thesofproject#4678, which can be reverted too. commit 7df3674 ("trace: enable trace after it is ready") was meant to fix a crash when tr_xxx() was used early. However I've used very early tracing for months and it never caused any crash (see thesofproject#4334) I tried adding a tr_err() statement immediately after trace_init(sof) in primary_core_init() and it works just fine. primary_core_init() runs extremely early so I don't think it's too demanding not to use an tr_XXX() before the trace even exists. The reverted commits confused initializing and enabling. Reproduction thesofproject#4683 did not seem to demonstrate anything obvious, there's not even a link to a failed test run. I don't understand how playing with spin locks is relevant to this. Later, reproduction thesofproject#4759 finally demonstrated the real issue: through DEBUG_TRACE_PTR(), some tr_XXX() can indeed be called (in very unusal debug circumstances specific to the original author) before the trace is initialized. The previous commit in this series fixes that by simply guarding it with if(trace_get()) -------- I am _not_ pretending that these reverts make the tracing code bug-free and perfect again, absolutely not and very far from it. I'm merely saying that: - The first reverted commit caused at least two regressions: thesofproject#4676 and thesofproject#4699 - These two commits added yet another variable (time) in an already complex situation with an already existing combinatorial "explosion": compile-time Kconfigs, run-time settings, platform-specific bugs (thesofproject#4333, thesofproject#4573, ...), various races, mbox + DMA, different DMA engines, Zephyr vs XTOS, etc. - Last but not least, we don't want to invest in making the exist trace implementation better. We want to switch to the Zephyr implementation instead So let's go back to a previous known good state, I mean _relatively_ good and stay there if we can. Signed-off-by: Marc Herbert <marc.herbert@intel.com> (cherry picked from commit f2c13f5)
keyonjie
pushed a commit
that referenced
this pull request
Nov 2, 2021
This reverts commit 7df3674. This restores the ability to use CONFIG_TRACEM (copy everything to mailbox) without crashing, in other words it fixes #4699 This also fixes the other DSP panic #4676 and removes the need for logical changes in #4678, which can be reverted too. commit 7df3674 ("trace: enable trace after it is ready") was meant to fix a crash when tr_xxx() was used early. However I've used very early tracing for months and it never caused any crash (see #4334) I tried adding a tr_err() statement immediately after trace_init(sof) in primary_core_init() and it works just fine. primary_core_init() runs extremely early so I don't think it's too demanding not to use an tr_XXX() before the trace even exists. The reverted commits confused initializing and enabling. Reproduction #4683 did not seem to demonstrate anything obvious, there's not even a link to a failed test run. I don't understand how playing with spin locks is relevant to this. Later, reproduction #4759 finally demonstrated the real issue: through DEBUG_TRACE_PTR(), some tr_XXX() can indeed be called (in very unusal debug circumstances specific to the original author) before the trace is initialized. The previous commit in this series fixes that by simply guarding it with if(trace_get()) -------- I am _not_ pretending that these reverts make the tracing code bug-free and perfect again, absolutely not and very far from it. I'm merely saying that: - The first reverted commit caused at least two regressions: #4676 and #4699 - These two commits added yet another variable (time) in an already complex situation with an already existing combinatorial "explosion": compile-time Kconfigs, run-time settings, platform-specific bugs (#4333, #4573, ...), various races, mbox + DMA, different DMA engines, Zephyr vs XTOS, etc. - Last but not least, we don't want to invest in making the exist trace implementation better. We want to switch to the Zephyr implementation instead So let's go back to a previous known good state, I mean _relatively_ good and stay there if we can. Signed-off-by: Marc Herbert <marc.herbert@intel.com> (cherry picked from commit f2c13f5)
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
@marc-hb as my branch of previous PR #4683 was used for another PR, here create a new PR for your reference.
Hope this help!