DMA: Fix fw panic after release on SdW platforms #2673
Conversation
d954a81 to
8f6ff11
Compare
Bug was caused because the SDW controller is stopped first and the DSP is stopped afterwards, so the DW FIFO is never consumed, a timeout occurs, and the watchdog resets the hardware. Moreover, polling for FIFO empty in such a place can only succeed when the pause takes less than 1 ms, which is not a reasonable value. Signed-off-by: Karol Trzcinski <karolx.trzcinski@linux.intel.com>
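The failure mode described in the commit message can be illustrated with a small host-side model. This is only a sketch under stated assumptions: `sim_chan`, `sim_tick`, `pause_with_poll` and `pause_no_poll` are hypothetical names invented for illustration, not functions from the SOF DW DMA driver, and the model ignores real timing. It shows that a bounded poll for FIFO empty can never succeed once the link-side consumer has already been stopped, whereas a pause that does not wait returns immediately. Whether dropping the wait is exactly what this PR does is not confirmed by the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a playback DMA channel feeding a link (not SOF code). */
struct sim_chan {
	uint32_t fifo_level;	/* entries still queued in the channel FIFO */
	bool link_running;	/* true while the link side still consumes data */
	bool enabled;		/* channel enable state */
};

/* One poll interval: the FIFO drains only while the consumer is running. */
static void sim_tick(struct sim_chan *c)
{
	if (c->link_running && c->fifo_level > 0)
		c->fifo_level--;
}

/*
 * Pause that polls for FIFO empty at most max_polls times.
 * Returns 0 once the FIFO is empty, -1 on timeout.
 */
static int pause_with_poll(struct sim_chan *c, unsigned int max_polls)
{
	assert(max_polls > 0);

	for (unsigned int i = 0; i < max_polls; i++) {
		if (c->fifo_level == 0)
			return 0;
		sim_tick(c);
	}

	return c->fifo_level == 0 ? 0 : -1;
}

/* Pause that does not wait for the FIFO: disable the channel and return. */
static int pause_no_poll(struct sim_chan *c)
{
	c->enabled = false;
	return 0;
}
```

With `link_running == false` and a non-empty FIFO, `pause_with_poll()` always runs to the timeout regardless of `max_polls`, which corresponds to the case where the SDW controller was stopped before the DSP side.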
8f6ff11 to
f01e607
Compare
lgirdwood
left a comment
Shouldn't the fix be stopping DMA then SDW ? What do we do with DMIC and SSP ?
@plbossart would that be a good solution in your eyes? I'm not entirely sure we could guarantee correct flow if we were to do that. |
|
@slawblauciak @ktrzcinx we need to have something consistent and not reorder the ops depending on DAI. e.g.
Can you confirm the stop flow as it is today? |
|
Thing is, the SDW controller is managed on the host side. So, this really has to be synched with kernel. |
|
@slawblauciak which parts of SDW are controlled by host and which are FW ? |
|
The entirety of SDW is controlled by host. The FW only handles the DMA. |
|
@slawblauciak I'm assuming DW DMA is used here ? The programming partitioning flow with host would be
|
|
That's right, we use DW DMA for SDW/ALH. |
|
@slawblauciak can you confirm the current flow like 1 - 6 above, it may mean we need to make driver and FW changes here. |
|
Yeah, I think that's the kind of flow we'd want here. |
|
Sorry I don't understand this entire thread. We ALREADY stop the DMA before stopping the SoundWire transfers, and that creates a pop noise. thesofproject/linux#1897 |
|
@plbossart so does thesofproject/linux#1897 fix the pop issue for you ? |
No, the RT1308 will deal with this. What it can't deal with is a sustained output to -1 as is currently the case.
I have no idea how this error happens since it's different from what is being used today. Unless this is with the python scripts? At any rate I started an Intel internal thread on the recommended programming sequence. let's not continue here until we know what direction to take. |
|
I haven't seen any DSP panics actually. Haven't tried to reproduce the problem so far. |
ok, then the PR title is misleading. Please align internally with @plbossart, I will close this for the moment, we can re-open once solution is agreed. |
|
Uhm, I believe there's a misunderstanding, this is not my PR :) |
|
@ktrzcinx You do still get the FW panics right? If so please reopen if this is the fix you want to go forward with. |
|
the DSP panic is caused by FW for thesofproject/linux#1897 |
|
@RanderWang please provide more logs, I don't see a panic above. I only see that we cannot release a DMA channel. Please discuss internally with @plbossart |
|
This code for sure causes an agent panic. Such a long wait shouldn't be done with interrupts disabled. Every timeout will kill the DSP immediately. I don't understand why we're trying to close this. |
|
@tlauda it was closed until internal alignment. |
@lgirdwood Pierre, @slawblauciak and I came to the conclusion that we need to change the start/stop sequence for SDW. When the HDA sequence was changed as well (only for testing; HDA itself is not changed, and HDA doesn't use GP-DMA), QA also reported a DSP panic. All the kernel and FW logs are at https://sof-ci.01.org/linuxpr/PR1897/build3520/devicetest/CML_RVP_SDW/check-pause-resume-playback-10/ |
|
@RanderWang I've not seen any internal alignment. Please make sure I am on any email alongside @plbossart and @lbetlej . |
@lgirdwood we discussed in Microsoft Teams. Please check the picture. |
|
@RanderWang this makes no sense, it's unreadable. |
@lgirdwood the sequence used so far for SoundWire is broken and there's consensus to change it. thesofproject/linux#1897 was updated to use the same sequence for all DAIs, and keep the weird sequence for HDAudio - we still don't have a clue about the firmware underflows and panic issues for the HDaudio link DMA. |
lgirdwood
left a comment
@ktrzcinx what happens here when the DMA channel FIFO has data when the stream is stopped ? How can the data be cleared and channel recovered ?
|
@lgirdwood As far as I know, the FIFO should be cleared after disabling the DMA enable bit, which is going to be done in |
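For readers unfamiliar with the mechanism referred to here, the following is a minimal host-side sketch of clearing a channel enable bit through a write-enable mask, the pattern used by the DesignWare channel enable register. Everything in it is an assumption made for illustration: `sim_dmac`, `sim_write_ch_en` and `chan_disable` are hypothetical names, the register is simulated in memory, and the FIFO flush on disable is modelled only because the comment above states that this is the expected hardware behaviour, not because it has been verified.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_NUM_CHANS	8
#define CH_EN(ch)	(1u << (ch))		/* channel enable bit */
#define CH_EN_WE(ch)	(1u << ((ch) + 8))	/* write-enable for that bit */

/* Hypothetical in-memory model of the controller (not SOF code). */
struct sim_dmac {
	uint32_t ch_en;				/* current enable bits */
	uint32_t fifo_level[SIM_NUM_CHANS];	/* queued entries per channel */
};

/* Simulated register write: only bits whose write-enable is set change. */
static void sim_write_ch_en(struct sim_dmac *d, uint32_t val)
{
	for (unsigned int ch = 0; ch < SIM_NUM_CHANS; ch++) {
		if (!(val & CH_EN_WE(ch)))
			continue;

		if (val & CH_EN(ch)) {
			d->ch_en |= CH_EN(ch);
		} else {
			d->ch_en &= ~CH_EN(ch);
			/* Assumption from the comment above: disable flushes the FIFO. */
			d->fifo_level[ch] = 0;
		}
	}
}

/* Disable one channel without touching the others. */
static void chan_disable(struct sim_dmac *d, unsigned int ch)
{
	assert(ch < SIM_NUM_CHANS);
	sim_write_ch_en(d, CH_EN_WE(ch));	/* WE set, EN clear */
}
```

The write-enable mask means a single write can disable one channel without a read-modify-write of the shared register, so other running channels are unaffected.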
|
SOFCI TEST |
|
@xiulipan I guess Jenkins is back now ? Do we need to restart CI ? |
|
@lgirdwood I think for this PR, we may need to restart the test. |
|
SOFCI TEST |
|
CI known issues. |
