Skip to content

Conversation

@marc-hb
Copy link
Collaborator

@marc-hb marc-hb commented Sep 13, 2021

Purely for extra testing of #4760

@keyonjie
Copy link
Contributor

@marc-hb as my branch of previous PR #4683 was used for another PR, here create a new PR for your reference.

Please see the boot failure on APL platforms.

Hope this help!

#4759

@marc-hb marc-hb requested a review from plbossart September 14, 2021 01:48
As reported in thesofproject#4759, thesofproject#4636 and a few others linked from there.

Signed-off-by: Marc Herbert <marc.herbert@intel.com>
This reverts commit 7df3674.

This restores the ability to use CONFIG_TRACEM (copy everything to
mailbox) without crashing, in other words it fixes thesofproject#4699

This also fixes the other DSP panic thesofproject#4676 and removes the need for
logical changes in thesofproject#4678, which can be reverted too.

commit 7df3674 ("trace: enable trace after it is ready") was meant
to fix a crash when tr_xxx() was used early. However I've used very
early tracing for months and it never caused any crash (see thesofproject#4334)

I tried adding a tr_err() statement immediately after trace_init(sof) in
primary_core_init() and it works just fine. primary_core_init() runs
extremely early so I don't think it's too demanding not to use an
tr_XXX() before the trace even exists.

The reverted commits confused initializing and enabling.

Reproduction thesofproject#4683 did not seem to demonstrate anything obvious,
there's not even a link to a failed test run. I don't understand how
playing with spin locks is relevant to this.

Later, reproduction thesofproject#4759 finally demonstrated the real issue: through
DEBUG_TRACE_PTR(), some tr_XXX() can indeed be called (in very unusal
debug circumstances specific to the original author) before the trace is
initialized. The previous commit in this series fixes that by simply
guarding it with if(trace_get())

     --------

I am _not_ pretending that these reverts make the tracing code bug-free
and perfect again, absolutely not and very far from it. I'm merely
saying that:

- The first reverted commit caused at least two regressions: thesofproject#4676 and
  thesofproject#4699

- These two commits added yet another variable (time) in an already
  complex situation with an already existing combinatorial "explosion":
  compile-time Kconfigs, run-time settings, platform-specific bugs
  (thesofproject#4333, thesofproject#4573, ...), various races, mbox + DMA, different DMA engines,
  Zephyr vs XTOS, etc.

- Last but not least, we don't want to invest in making the exist trace
  implementation better. We want to switch to the Zephyr implementation
  instead

So let's go back to a previous known good state, I mean _relatively_
good and stay there if we can.

Signed-off-by: Marc Herbert <marc.herbert@intel.com>
This reverts commit 89ec377.

As commit 7df3674 ("trace: enable trace after it is ready") is
reverted this is not required anymore. See long previous commit message.

Signed-off-by: Marc Herbert <marc.herbert@intel.com>
This shows that thesofproject#4636 did not fix any real-world issue. thesofproject#4636 changed
some sof->trace logic, however sof->trace does not exist before
trace_init() and calling tr_err() immediately after trace_init() works.

Note DSP panics were detected immediately after thesofproject#4636 was merged, see
reports in thesofproject#4676

Signed-off-by: Marc Herbert <marc.herbert@intel.com>
No crash when guarded with if(trace_get())

Signed-off-by: Marc Herbert <marc.herbert@intel.com>
@marc-hb
Copy link
Collaborator Author

marc-hb commented Sep 14, 2021

https://sof-ci.01.org/sofpr/PR4758/build10325/devicetest/?model=BDW_WSB_RT286&testcase=check-sof-logger is the old stuck DMA issue #4333. Everything else is green and shows SUPER EARLY trace does not crash

Example:

https://sof-ci.01.org/sofpr/PR4758/build10325/devicetest/?model=TGLU_RVP_SDW&testcase=check-sof-logger

 TIMESTAMP      (us)           DELTA  C# COMPONENT          LOCATION                      CONTENT	ktime=102.940s  @  2021-09-14 05:35:02 +0000 UTC
[           2.604] (           0.000) c0 buffer                      src/init/init.c:146  ERROR SUPER EARLY trace does not crash
[   191893022.427] (   191893024.000) c0 dma-trace             src/trace/dma-trace.c:386  INFO SHM: FW ABI 0x3013000 DBG ABI 0x5003000 tag v1.9-rc1-9-g8d7546172218 src hash 0x608e882a (ldc hash 0x608e882a)
Skipped 8124 bytes after the last statement.

Older https://sof-ci.01.org/sofpr/PR4758/build10320/devicetest/ is all green and tests show the expected in the mailbox trace:

 TIMESTAMP      (us)           DELTA  C# COMPONENT          LOCATION                      CONTENT	ktime=131.041s  @  2021-09-13 23:50:30 +0000 UTC
[           2.604] (           0.000) c0 buffer                      src/init/init.c:146  ERROR SUPER EARLY trace does not crash
[   124791585.354] (   124791584.000) c0 dma-trace             src/trace/dma-trace.c:386  INFO SHM: FW ABI 0x3013000 DBG ABI 0x5003000 tag v1.9-rc1-7-g1ad368b06eaa src hash 0x00e0bd9a (ldc hash 0x00e0bd9a)
Skipped 8124 bytes after the last statement.

More DEBUG_TRACE_PTR() testing in the 3rd force push.

@marc-hb
Copy link
Collaborator Author

marc-hb commented Sep 14, 2021

https://sof-ci.01.org/sofpr/PR4758/build10327/devicetest/ is all green and is has the expected traces, for instance in https://sof-ci.01.org/sofpr/PR4758/build10327/devicetest/?model=CML_RVP_SDW&testcase=check-sof-logger

 TIMESTAMP      (us)           DELTA  C# COMPONENT          LOCATION                      CONTENT	ktime=185.422s  @  2021-09-14 06:47:01 +0000 UTC
[   225575306.974] (           0.000) c0 dma-trace             src/trace/dma-trace.c:392  INFO DMA: FW ABI 0x3013000 DBG ABI 0x5003000 tag v1.9-rc1-9-ge3234fe475ec src hash 0xdbe1afa7 (ldc hash 0xdbe1afa7)
[          14.115] (          14.115) c0 memory                      src/lib/alloc.c:787  INFO MARC: allocated ptr 0xbe090300 0x1800 bytes at zone 0x0
[         169.844] (         155.729) c0 ll-schedule        ./schedule/ll_schedule.c:390  INFO task add 0x9e17e340 dma-trace-task <2b972272-c5b1-4b7e-926f-0fc5cb4c4690>
 TIMESTAMP      (us)           DELTA  C# COMPONENT          LOCATION                      CONTENT	ktime=187.450s  @  2021-09-14 06:47:03 +0000 UTC
[           2.604] (           0.000) c0 buffer                      src/init/init.c:146  ERROR SUPER EARLY trace does not crash
[   225575294.109] (   225575296.000) c0 dma-trace             src/trace/dma-trace.c:386  INFO SHM: FW ABI 0x3013000 DBG ABI 0x5003000 tag v1.9-rc1-9-ge3234fe475ec src hash 0xdbe1afa7 (ldc hash 0xdbe1afa7)
Skipped 8124 bytes after the last statement.

@marc-hb marc-hb changed the title [DRAFT][TEST] Super early trace [DRAFT][TEST][DNM] Super early trace Sep 14, 2021
@marc-hb marc-hb closed this Sep 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants