Skip to content

Conversation

@mrajwa
Copy link

@mrajwa mrajwa commented Jun 10, 2020

This patch fixes the suspend & resume procedure to allow entry into the
low power states. In order to enter low power state all active links as
well as streams need to be power down.

Signed-off-by: Marcin Rajwa marcin.rajwa@linux.intel.com

Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great find @mrajwa , thanks! This is a pretty big change to S0ix support. Initial comment sinline

}

/* Power up streams */
hda_trigger_ignored_streams(sdev, SNDRV_PCM_TRIGGER_START);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrajwa This is pretty interesting. Given this has worked without the power-down on older platforms, should this be made conditional (per platform), or should we use this flow for all now? We need to review the surrounding code documentation as well (the ignored streams are not ignored anymore). @keyonjie @ranj063 , thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, this does require a lot more explanations on why this becomes necessary and what exactly sending a trigger command does. It honestly feels like the firmware does not properly restore its state if it needs the driver to resend a command.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kv2019i , this flow is for all platforms. It worked for older platform because of several reasons. For example, different s0ix HW qualifications and the possibility to mask certain conditions which are not available on all platforms (TGL is an example here).

@plbossart, you got confuse because the function naming is misleading... I should have rename it as well (will do in the update). In essence the snd_sof_pcm_platform_trigger() function doesn't do any triggering but powers up & down host DMA.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, here's the code:

	/* cmd must be for audio stream */
	switch (cmd) {
	case SNDRV_PCM_TRIGGER_RESUME:
	case SNDRV_PCM_TRIGGER_PAUSE_RELEASE:
	case SNDRV_PCM_TRIGGER_START:
		snd_sof_dsp_update_bits(sdev, HDA_DSP_HDA_BAR, SOF_HDA_INTCTL,
					1 << hstream->index,
					1 << hstream->index);

		snd_sof_dsp_update_bits(sdev, HDA_DSP_HDA_BAR,
					sd_offset,
					SOF_HDA_SD_CTL_DMA_START |
					SOF_HDA_CL_DMA_SD_INT_MASK,
					SOF_HDA_SD_CTL_DMA_START |
					SOF_HDA_CL_DMA_SD_INT_MASK);

		ret = snd_sof_dsp_read_poll_timeout(sdev,
					HDA_DSP_HDA_BAR,
					sd_offset, run,
					((run &	dma_start) == dma_start),
					HDA_DSP_REG_POLL_INTERVAL_US,
					HDA_DSP_STREAM_RUN_TIMEOUT);

All we do is enable an interrupt and start/stop the DMA. Where do you see a power-related field or register?

Copy link
Author

@mrajwa mrajwa Jun 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you clear this DMA bit the DMA engine associated with this stream will be power gated so that why I wrote "Power up" since it will be the end effect.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's only if power gating is enabled.
You really want to describe the sequence and underlying behavior of the hardware. I am not saying you are wrong, but you are taking shortcuts in your explanations.

if (ret < 0) {
dev_dbg(sdev->dev,
"error %d in %s: failed to power down links",
ret, __func__);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should block suspend and return an error in this case.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kv2019i , well this is something I was thinking about too.

int ret;
struct snd_sof_pcm *spcm;
struct snd_sof_pcm_stream *stream;
struct snd_pcm_substream *substream;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverse-xmas order for declarations.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please fix as asked by @kv2019i


/* Power down s0ix capable streams */
hda_trigger_ignored_streams(sdev, SNDRV_PCM_TRIGGER_SUSPEND);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this is a variant of hda_dsp_ctrl_stop_chip() now -- there's not a lot missing in the end (position buffer is not cleared, but otherwise closed). Maybe this code could be shared with exception case for the S0ix flow.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, this is duplicating existing code.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kv2019i , some steps are similar but honesty this is quite different than hda_dsp_ctrl_stop_chip()

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrajwa I agree we should unify the flow here for all suspend. I'd modify hda_suspend()/hda_resume() to account for the target state and do the above in there. All of the above are just part of hda_suspend() already

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrajwa Hmm, it seems stop_chip() would need some rework now. So I think the cleanest way would be to group individual steps like clearing interrupt status and disabling interrupts into common helpers functions that could be called from stop_chip() as well as here. This is something hdac_controller.c already does and is cleaner. I think this goes out of this PR though, so I'm good to go ahead with the current one.

Copy link
Member

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrajwa maybe this solves a problem but this patch raises a lot of questions and will need a lot more work to explain what the programming sequence needs to be, and why this is necessary for new platforms.

return 0;
}

/* Trigger streams which are ignored / stay active during suspend.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this sentence, what does this '/' represent or mean?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It reads "..streams which are ignored or saying it differently "stay active" during suspend shall be...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trigger streams which are ignored during suspend and remain active

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment remains valid...


/* Trigger streams which are ignored / stay active during suspend.
* These streams needs to be powered down and up during suspend and resume
* respectively.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't have power control for streams, I don't understand this comment either.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do, all we do here is an update of HD Audio Stream Registers.

if (ret < 0) {
dev_dbg(sdev->dev,
"error %d in %s: failed to trigger substream %s",
ret, __func__, substream->name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we do a loop for the stream direction? the code seems to be an exact duplication and we have macros to go through all PLAYBACK and CAPTURE.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@plbossart do you consider this a duplicate of snd_sof_stream_suspend_ignored()? If so we can not use it here because you need to pass substream to ``snd_sof_pcm_platform_trigger` if you loop throughout all PLAYBACK & CAPTURE streams how can you distinguish which one should be triggered?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was only talking of using a for loop here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrajwa @plbossart is referring to for_each_pcm_streams()

ret, __func__);
}

/* Power up streams */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a trigger has nothing to do with power.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have you checked what snd_sof_pcm_platform_trigger actually do? It does power down/up streams depending on command passed.

hda_codec_i915_display_power(sdev, true);

snd_hdac_bus_init_cmd_io(bus);
ret = snd_hdac_ext_bus_link_power_up_all(bus);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you explain why this is necessary? this may end-up resetting some things and conflict with other resume sequences.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is needed because of very simple reason - in order to enter low power state you have to have all links power gated. Your concern is reasonable however please note that during suspend with "active" pipelines any other link is down. Therefore there will be no resume issue.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrajwa why power up all? And if you do power up all, dont you have to power down the ones that werent used before suspend?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we have power down all at suspend. If we resume from suspend and some additional stream will be started its link must be ready as our trigger procedure doesn't care about links.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrajwa streams arent going to be magically started upon resuming. So we should power down the links that werent used before suspend and only power up the ones needed. The new ones that are started after resuming will go through the normal route for powering up the link

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here's the reference to what we do for the S3 suspend case:

/* turn off the links that were off before suspend */

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the logic is right, the links needs to be powered down via LCTL.SPA at S0ix suspend. But @mrajwa , I agree with @ranj063 , we need to mimic the logic in hda_resume() and shut down any links that were not active when we entered suspend. Otherwise S0ix resume will result in more links being powered on than what were powered on at S0ix suspend, and this is not right. But copying the logic in hda_resume() should be sufficient.

Copy link
Author

@mrajwa mrajwa Jun 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm why on the resume we do /* turn off the links that were off before suspend */ What is the point of power gating links on resume? Shouldn't that be the part of suspend?

PS: Also, this comment and whole action seems weird to me - we turn off links that are already off? oO

Copy link
Collaborator

@ranj063 ranj063 Jun 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrajwa whats weird to you? You power up all links upon resume then power down the ones that were not powered up before suspend? Seems totally logical to me

}

/* Power up streams */
hda_trigger_ignored_streams(sdev, SNDRV_PCM_TRIGGER_START);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, this does require a lot more explanations on why this becomes necessary and what exactly sending a trigger command does. It honestly feels like the firmware does not properly restore its state if it needs the driver to resend a command.


/* Power down s0ix capable streams */
hda_trigger_ignored_streams(sdev, SNDRV_PCM_TRIGGER_SUSPEND);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, this is duplicating existing code.

SOF_HDA_CL_DMA_SD_INT_MASK);
}

/* Power down s0ix capable streams */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again a trigger is not related to power management.

* These streams needs to be powered down and up during suspend and resume
* respectively.
*/
static void hda_trigger_ignored_streams(struct snd_sof_dev *sdev, int cmd)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrajwa this whole function seems to be in the wrong place. Why not just do platform_trigger() when the component driver receives the SUSPEND trigger in pcm.c? Instead of returning as below , do the platform trigger and then return?

return 0;

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed earlier, the problem is that during entry to suspend application detects that and tries to resume the stream. So if you place the code into the pcm.c (which I agree is tempting, I myself started from that place) your suspend will last few milliseconds instead of the time until detection happens.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Humm, what? The rule in power management is that once you commit to a transaction you complete it to avoid zombie states.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected that with new patch set

@plbossart
Copy link
Member

@mrajwa failed build.

@mrajwa
Copy link
Author

mrajwa commented Jun 11, 2020

@plbossart, yes nocodec built is failing I will fix that alongside next update.

if (ret < 0) {
dev_dbg(sdev->dev,
"error %d in %s: failed to trigger substream %s",
ret, __func__, substream->name);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrajwa @plbossart is referring to for_each_pcm_streams()

@mrajwa
Copy link
Author

mrajwa commented Jun 17, 2020

Further updated

Copy link
Member

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NAK on the addition of PCM_INFO_RESUME.

This was removed on purpose after very long threads with Takashi Iwai, and shall only be used when we can guarantee a sample-accurate restart when resuming. This is not the case and shall not be used on any Intel platform.

@mrajwa
Copy link
Author

mrajwa commented Jun 17, 2020

@plbossart without PCM_INFO_RESUME the flow is broken and works only according to current implementation of arecord.

Simple example: let's say you have whatever application, you suspend the system and during that initial suspend steps the application keeps sending RESUME request - with current implementation the second request will succeed. It only works because arecord currently sends that request once..

@plbossart
Copy link
Member

@plbossart without PCM_INFO_RESUME the flow is broken and works only according to current implementation of arecord.

Simple example: let's say you have whatever application, you suspend the system and during that initial suspend steps the application keeps sending RESUME request - with current implementation the second request will succeed. It only works because arecord currently sends that request once..

sorry man, it's not negotiable. When INFO_RESUME is not supported, the ALSA core will send a PREPARE and TRIGGER_START. It's fully documented and is the right way to do this.

@mrajwa
Copy link
Author

mrajwa commented Jun 17, 2020

@plbossart, heh okay I see you are very convinced to this approach so I will remove my changes related to INFO_RESUME. However I am afraid it may come back to us at some point.

I've simplified my patch as much as possible while keeping the s0ix functional.

@mrajwa mrajwa force-pushed the s0ix_entry_issue branch 2 times, most recently from df68ff2 to af52ec3 Compare June 17, 2020 09:29
@keyonjie
Copy link

@mrajwa can you try if simplify it more as below can also fix the issue:

diff --git a/sound/soc/sof/intel/hda-dsp.c b/sound/soc/sof/intel/hda-dsp.c
index 9e5ff8c18f99..e093c0aee3d2 100644
--- a/sound/soc/sof/intel/hda-dsp.c
+++ b/sound/soc/sof/intel/hda-dsp.c
@@ -692,6 +692,10 @@ int hda_dsp_resume(struct snd_sof_dev *sdev)
 {
        struct sof_intel_hda_dev *hda = sdev->pdata->hw_pdata;
        struct pci_dev *pci = to_pci_dev(sdev->dev);
+#if IS_ENABLED(CONFIG_SND_SOC_SOF_HDA)
+       struct hdac_bus *bus = sof_to_bus(sdev);
+       struct hdac_ext_link *hlink;
+#endif
        const struct sof_dsp_power_state target_state = {
                .state = SOF_DSP_PM_D0,
                .substate = SOF_HDA_DSP_PM_D0I0,
@@ -719,6 +723,18 @@ int hda_dsp_resume(struct snd_sof_dev *sdev)
                /* restore and disable the system wakeup */
                pci_restore_state(pci);
                disable_irq_wake(pci->irq);
+
+#if IS_ENABLED(CONFIG_SND_SOC_SOF_HDA)
+               /* turn on the links that were on before suspend */
+               list_for_each_entry(hlink, &bus->hlink_list, list) {
+                       if (hlink->ref_count)
+                               snd_hdac_ext_bus_link_power_up(hlink);
+               }
+
+               /* set up CORB/RIRB buffers if was on before suspend */
+               if (bus->cmd_dma_state)
+                       snd_hdac_bus_init_cmd_io(bus);
+#endif
                return 0;
        }
 
@@ -808,6 +824,14 @@ int hda_dsp_suspend(struct snd_sof_dev *sdev, u32 target_state)
                                                HDA_VS_INTEL_EM2_L1SEN,
                                                HDA_VS_INTEL_EM2_L1SEN);
 
+#if IS_ENABLED(CONFIG_SND_SOC_SOF_HDA)
+               /* power down all hda link */
+               snd_hdac_ext_bus_link_power_down_all(bus);
+
+               /* stop the CORB/RIRB DMA if it is On */
+               if (bus->cmd_dma_state)
+                       snd_hdac_bus_stop_cmd_io(bus);
+#endif
                /* enable the system waking up via IPC IRQ */
                enable_irq_wake(pci->irq);
                pci_save_state(pci);

@plbossart
Copy link
Member

@plbossart, heh okay I see you are very convinced to this approach so I will remove my changes related to INFO_RESUME. However I am afraid it may come back to us at some point.

I've simplified my patch as much as possible while keeping the s0ix functional.

I am not clear on what you are referring to. The system suspend/resume is handled when we get a callback from ACPI, along with the intended target state. I don't understand the points you made since last week about application behavior.

@plbossart
Copy link
Member

@plbossart, heh okay I see you are very convinced to this approach so I will remove my changes related to INFO_RESUME. However I am afraid it may come back to us at some point.
I've simplified my patch as much as possible while keeping the s0ix functional.

I am not clear on what you are referring to. The system suspend/resume is handled when we get a callback from ACPI, along with the intended target state. I don't understand the points you made since last week about application behavior.

see alsa-lib documentation:

\par SND_PCM_STATE_SUSPENDED
The device is in the suspend state provoked with the power management
system. The stream can be resumed using #snd_pcm_resume()
call, but not all hardware supports this feature. Application should check
the capability with the #snd_pcm_hw_params_can_resume().
In other case, the calls #snd_pcm_prepare(),
#snd_pcm_drop(), #snd_pcm_drain() can be used
to leave this state.

@mrajwa
Copy link
Author

mrajwa commented Jun 17, 2020

@keyonjie, as for your proposal - no it won't work, it is against HW design which requires links to be power gated after CORB/RIRB/DMA.

@plbossart, the problem I see is that we "lie" to the application. As you said/quoted we don't support resume so application sends PREPARE followed by TRIG_START commands. We detects that the stream they are referring to is the one which should remain active until we get a detection event (a callback from ACPI) and here we return 0 (success) - while w haven't resumed, we are still in suspend. I think the better approach is to return EAGAIN for resume requests.

The bottom line is, while it works I am afraid other application may have some problems with it. I think it is worth to put that in sod-docs - maybe in WoW integration chapter

BTW, can you paste a link to the doc you quoted above?

@plbossart
Copy link
Member

@keyonjie, as for your proposal - no it won't work, it is against HW design which requires links to be power gated after CORB/RIRB/DMA.

@plbossart, the problem I see is that we "lie" to the application. As you said/quoted we don't support resume so application sends PREPARE followed by TRIG_START commands. We detects that the stream they are referring to is the one which should remain active until we get a detection event (a callback from ACPI) and here we return 0 (success) - while w haven't resumed, we are still in suspend. I think the better approach is to return EAGAIN for resume requests.

The bottom line is, while it works I am afraid other application may have some problems with it. I think it is worth to put that in sod-docs - maybe in WoW integration chapter

BTW, can you paste a link to the doc you quoted above?

We do support resume, but the flow requires a prepare and trig-start. It's perfectly legal and applications shall support this mode. I don't understand what is broken here.

@mrajwa
Copy link
Author

mrajwa commented Jun 17, 2020

We do support resume, but the flow requires a prepare and trig-start.

What flow? The resume flow doesn't need this. This prepare and trig_start is a side action performed by "confused" application.
Anyway, we drifted away from the main subject, are you okay with my recent changes so we can proceed with merge?

@plbossart
Copy link
Member

plbossart commented Jun 17, 2020

We do support resume, but the flow requires a prepare and trig-start.

What flow? The resume flow doesn't need this. This prepare and trig_start is a side action performed by "confused" application.
Anyway, we drifted away from the main subject, are you okay with my recent changes so we can proceed with merge?

No I am not ok. You are ignoring ALSA precedent, making statements that are blatantly incorrect so why should I even trust the rest? Why are you focused on application behavior and why is this even related to S0i1 entry?

@mrajwa
Copy link
Author

mrajwa commented Jun 17, 2020

@plbossart, ok you are right - this is only hypothetical discussion therefore I agree lets end this.
Now lets focus on the main subject of this PR - the change in suspend/resume from hardware perspective. Here the flow has to be changed to enbale s0ix entry. Are you ok with changes this patch introduces?

Copy link
Member

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lack of symmetry between the suspend and resume sequences is confusing. Either it's intentional and you have to document it, or it's not and it should be fixed.

return ret;
}
#endif

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify if there is any specific reason why the first thing you do is power-up the link and enable RIRB, before you've changed the DSP power and restored the state?

If it was symmetrical with the suspend routine, you would first restore PCI state and disable wakes before re-initializing the links?

Either it's symmetrical by default, or it's not on purpose and you need to document why.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@plbossart Could be just that the surrounding existing is not symmetrical either. I.e. we change the i915_display_codec power state before changing DSP state, but order is same for suspend/resume. For i915 this is ok, but for the HDA shutdown in this patch, I think the ordering is right.

snd_sof_dsp_write(sdev, HDA_DSP_HDA_BAR,
sd_offset +
SOF_HDA_ADSP_REG_CL_SD_STS,
SOF_HDA_CL_DMA_SD_INT_MASK);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this also not symmetrical, on resume you don't touch the stream status. Is this intentional and if yes please document it in the resume sequence.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I don't clear it on resume as there were no activity of the stream in between suspend and resume. As for documenting it - do you want me to write a comment on a resume saying that we don't clear stream status because of lack of activity during suspend?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was suggesting a comment such as:

"on a suspend, the stream status is cleared but the DMA is not stopped. During the resume operation, the TRIGGER_START will take care of reprogramming the stream status"

that said, I no longer understand this patch.

int hda_dsp_stream_trigger(struct snd_sof_dev *sdev

	case SNDRV_PCM_TRIGGER_SUSPEND:
	case SNDRV_PCM_TRIGGER_PAUSE_PUSH:
	case SNDRV_PCM_TRIGGER_STOP:
		snd_sof_dsp_update_bits(sdev, HDA_DSP_HDA_BAR,
					sd_offset,
					SOF_HDA_SD_CTL_DMA_START |
					SOF_HDA_CL_DMA_SD_INT_MASK, 0x0);

		ret = snd_sof_dsp_read_poll_timeout(sdev, HDA_DSP_HDA_BAR,
						sd_offset, run,
						!(run &	dma_start),
						HDA_DSP_REG_POLL_INTERVAL_US,
						HDA_DSP_STREAM_RUN_TIMEOUT);

		if (ret < 0) {
			dev_err(sdev->dev,
				"error: %s: cmd %d: timeout on STREAM_SD_OFFSET read\n",
				__func__, cmd);
			return ret;
		}

		snd_sof_dsp_write(sdev, HDA_DSP_HDA_BAR, sd_offset +
				  SOF_HDA_ADSP_REG_CL_SD_STS,
				  SOF_HDA_CL_DMA_SD_INT_MASK);

The last part is the same as what you are doing in hda-dsp.c, so why do we need to do this in the suspend routine?

Copy link
Author

@mrajwa mrajwa Jun 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What you miss is that we have done steps you quoted above only for streams that are not d0i3 capable. For those d0i3 capable, like WoV pipelines we don't call hda_dsp_stream_trigger() at all. Check this out

if (sdev->system_suspend_target == SOF_SUSPEND_S0IX &&
    spcm->stream[substream->stream].d0i3_compatible) {
			/*
			 * trap the event, not sending trigger stop to
			 * prevent the FW pipelines from being stopped,
			 * and mark the flag to ignore the upcoming DAPM
			 * PM events.
			 */
			spcm->stream[substream->stream].suspend_ignored = true;
			return 0;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well then the TRIGGER_START does not reprogram the stream either, so how can this possibly work?

Put differently, with your sequence, where is the stream status restored?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, as there is nothing to restore - it's a 'status' register right? What would you like to restore?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I give up.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@plbossart I think the missing piece here might be the missing stream DMA stop and restart. The sequence should have a stop DMA followed by clearing the stream status during suspend and then restarting the DMA during resume.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ranj063 this really begs the question if we should not deal with all suspend/resume in platform-specific code, and not have any references to S0ix in the core. That way we could really refactor what is common and what's not, without re-inventing an S0ix code that looks the same except where different.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@plbossart but it would be hard to not have any reference to S0ix in the core. The PCM trigger for example needs to know whether the suspend trigger is for S3 or S0ix to do the right thing. I just talked with marcin about changing the sequence here a bit to call the platform-specific trigger for S0ix to stop/start the DMA. We'll get back with the change in a bit.

Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm OOO tomorrow, so I won't leave a RC pending, but in short, with the recent improvements, I'm good to merge current version. I would like to see some changes to git commit message (see inline).

return ret;
}
#endif

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@plbossart Could be just that the surrounding existing is not symmetrical either. I.e. we change the i915_display_codec power state before changing DSP state, but order is same for suspend/resume. For i915 this is ok, but for the HDA shutdown in this patch, I think the ordering is right.


/* Power down s0ix capable streams */
hda_trigger_ignored_streams(sdev, SNDRV_PCM_TRIGGER_SUSPEND);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrajwa Hmm, it seems stop_chip() would need some rework now. So I think the cleanest way would be to group individual steps like clearing interrupt status and disabling interrupts into common helpers functions that could be called from stop_chip() as well as here. This is something hdac_controller.c already does and is cleaner. I think this goes out of this PR though, so I'm good to go ahead with the current one.

struct sof_intel_hda_dev *hda = sdev->pdata->hw_pdata;
struct pci_dev *pci = to_pci_dev(sdev->dev);
const struct sof_dsp_power_state target_state = {
.state = SOF_DSP_PM_D0,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit message:

typo: "powered down"

And please add to commit description:
""The new flow is required on Intel Tiger Lake and newer. Powering down the links is the recommended flow also for older platforms, so adjust implementation accordingly and implement only one flow.""

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kv2019i you are quoting old code version - after the recent update hda_trigger_ignored_streams is no longer there.

As for the commit message, well its not that this flow is needed only for newer platforms it is needed for older ones too but here's the thing - some of them disable s0ix entry criteria in the PMC (assuming this feature is supported for that platform) and therefore can enter low power state even when i.e DMA is running.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrajwa For commit msg. Yes, I know that now and thus we change the implementation for all platforms. But we should have at least a minimal explanation in commit message why this change is needed now, and why the old code was passing tests before.

Copy link
Author

@mrajwa mrajwa Jun 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kv2019i, good idea will update commit message to be more descriptive.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrajwa please also fix the several typos in the commit message. Also, it is just not descriptive of what you're altering here and why?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for spotting typos, I have just corrected and updated commit message.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrajwa I'd have opted for a shorter version and not go into details of PMC disqualifications, but this is good enough and ok for me. If you do another respin of the series, maybe explain what PMC means... (Power Management Controller) it's not a standard term in Linux kernel.

Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The discussion is getting a bit complex, but upon rereading the comments so far, my take is that there are a few clear issues left that need to be addressed. See my notes inline.

struct sof_intel_hda_dev *hda = sdev->pdata->hw_pdata;
struct pci_dev *pci = to_pci_dev(sdev->dev);
const struct sof_dsp_power_state target_state = {
.state = SOF_DSP_PM_D0,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrajwa I'd have opted for a shorter version and not go into details of PMC disqualifications, but this is good enough and ok for me. If you do another respin of the series, maybe explain what PMC means... (Power Management Controller) it's not a standard term in Linux kernel.

snd_sof_dsp_write(sdev, HDA_DSP_HDA_BAR,
sd_offset +
SOF_HDA_ADSP_REG_CL_SD_STS,
SOF_HDA_CL_DMA_SD_INT_MASK);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrajwa Hmm, I think @ranj063 raises a valid point. stop_cmd_io() only stops the CORB/RIRB transfers. Cleaning up stream status SDxSTS without stopping the stream DMA (SDxCTL) does not look correct. Either we need to stop the stream engines like in hda_dsp_ctrl_stop_chip(), or it's not needed and we don't even clear SDxSTS.

hda_codec_i915_display_power(sdev, true);

snd_hdac_bus_init_cmd_io(bus);
ret = snd_hdac_ext_bus_link_power_up_all(bus);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the logic is right, the links needs to be powered down via LCTL.SPA at S0ix suspend. But @mrajwa , I agree with @ranj063 , we need to mimic the logic in hda_resume() and shut down any links that were not active when we entered suspend. Otherwise S0ix resume will result in more links being powered on than what were powered on at S0ix suspend, and this is not right. But copying the logic in hda_resume() should be sufficient.

@mrajwa mrajwa force-pushed the s0ix_entry_issue branch 2 times, most recently from cedeb5e to 66a018f Compare June 23, 2020 23:26
Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, this version with pcm_platfrom_trigger for both stop and start seems cleaner. There's one issue in the new link power down code (see inline). And @mrajwa , if you make new versions, can you make the fix to commit message w.r.t PMC as well?

list_for_each_entry(hlink, &bus->hlink_list, list) {
if (hlink->ref_count)
ret = snd_hdac_ext_bus_link_power_up(hlink);
if (ret < 0) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrajwa "ret" might be used uninitialized here. Maybe better to break from the loop on error and have the dev_dbg on error printed out outside list_for_each_entry.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kv2019i I have an alternate proposal to call the platform trigger from the component driver (pcm.c) instead. Unfortunately, I cant test it out yet because apparently S0ix with WoV is still broken on the TGL board (even with this PR). Will update soon

Copy link
Author

@mrajwa mrajwa Jun 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ranj063 its not WoV what is broken but a platform setup (coreboot in this case)
When it comes to your alternate version which hopes to resume and suspend from pcm.c, as I said during recent meeting - suspending this way may work but resume not. As far as I know we don't resume in WoV case by a pcm event. The idea to suspend/resume from pcm.c is not a news for me, in fact I have also tried it a the very beginning as it is natural first choice.

@mrajwa mrajwa force-pushed the s0ix_entry_issue branch from 66a018f to cd60539 Compare June 26, 2020 16:07
This patch fixes the suspend & resume procedure to allow entry into the
low power states with some streams being active as a wake source - wake on
voice is a perfect example. The current implementation lacks the stoppage
of DMA engines as well as streams and links and therefore platform may not
enter s0ix state. The low power state sometimes can be achieved even
without power gating certain required resources thanks to PMC
qualifications disable.

Signed-off-by: Marcin Rajwa <marcin.rajwa@linux.intel.com>
@mrajwa mrajwa force-pushed the s0ix_entry_issue branch from cd60539 to 9a3a825 Compare June 26, 2020 22:35
Copy link
Member

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks much better @mrajwa, much less confusing with spurious settings removed.
Still quite a few cosmetic changes needed, see below.
I would also change the commit title to 'ASoC: SOF: Intel: fix S0ix suspend/resume sequences', and if you add a 'fix' you need a 'Fixes: ('title') for backports.
Thanks

}
if (ret < 0) {
dev_dbg(sdev->dev,
"error %x in %s: failed to power up links",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit-pick: we have a convention that dev_err() log start with error: and dev_dbg() don't.
pick one of the two solutions, thanks - same comment for all logs.

return 0;
}

/* Trigger streams which are ignored / stay active during suspend.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment remains valid...

return 0;
}

/* Trigger streams which are ignored / stay active during suspend.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the commit message is too convoluted.

"The current implementation does not power down links, stop command IO (CORB/RIRB) and stop the DMA streams, which on some platforms prevents actual use of S0ix. This patch modifies the suspend sequence to correct these three points, and applies a symmetric change for the resume sequence"

int ret;
struct snd_sof_pcm *spcm;
struct snd_sof_pcm_stream *stream;
struct snd_pcm_substream *substream;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please fix as asked by @kv2019i

int ret;
#if IS_ENABLED(CONFIG_SND_SOC_SOF_HDA)
struct hdac_bus *bus = sof_to_bus(sdev);
struct hdac_ext_link *hlink = NULL;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

initialization is not needed?

This patch disables traces in D0i3 no matter if CONFIG_SND_SOC_SOF_DEBUG
is enabled or not.

Signed-off-by: Marcin Rajwa <marcin.rajwa@linux.intel.com>
*/
if (hda_enable_trace_D0I3_S0 &&
sdev->system_suspend_target != SOF_SUSPEND_NONE)
if (sdev->system_suspend_target != SOF_SUSPEND_NONE)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NAK. it's a kernel parameter added on purpose.
module_param_named(enable_trace_D0I3_S0, hda_enable_trace_D0I3_S0, bool, 0444);
MODULE_PARM_DESC(enable_trace_D0I3_S0,
"SOF HDA enable trace when the DSP is in D0I3 in S0"

why do you need to remove this?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@plbossart it is causing premature WoV wakeup

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrajwa I think we should just change it to if (!hda_enable_trace_D0I3_S0 ||
sdev->system_suspend_target != SOF_SUSPEND_NONE)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree there's likely a logical confusion here, but if the traces were enabled I am not sure how we ever entered S0ix on other platforms. Something else must have changed?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my DUT, I see this:
[ 44.128231] hda_dsp_set_D0_state 415 hda_enable_trace_D0I3_S0=0

So, how will this affect trace issue?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my DUT, I see this:
[ 44.128231] hda_dsp_set_D0_state 415 hda_enable_trace_D0I3_S0=0

So, how will this affect trace issue?

Ignore my question please, I see we are setting HDA_PM_NO_DMA_TRACE flag.. got it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree there's likely a logical confusion here, but if the traces were enabled I am not sure how we ever entered S0ix on other platforms. Something else must have changed?

@plbossart previous platforms had a bit more leniency in terms of S0ix entry qualifications. But with TGL, we dont have that anymore. But yes this logic was flawed here.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ranj063 @plbossart yes, the logic here is incorrect, it should be something like this:

		/*
		 * Trace DMA is disabled by default when the DSP enters D0I3.
		 * But it can be kept enabled when the DSP enters D0I3 while the
		 * system is in S0 (SUSPEND_NONE) for debug.
		 */
		if (!(hda_enable_trace_D0I3_S0 &&
		      sdev->system_suspend_target == SOF_SUSPEND_NONE))
			flags = HDA_PM_NO_DMA_TRACE;

@ranj063
Copy link
Collaborator

ranj063 commented Jul 7, 2020

@mrajwa @keyonjie can we update this patchset based on recent experiments and comments? As far as the suspend sequence goes, neither powering down the links nor stopping the DMA has any effect on S0ix entry. The only thing that is aboslutely required is that the CORB/RIRB DMA engines are stopped.

@keyonjie
Copy link

keyonjie commented Jul 8, 2020

@mrajwa @keyonjie can we update this patchset based on recent experiments and comments? As far as the suspend sequence goes, neither powering down the links nor stopping the DMA has any effect on S0ix entry. The only thing that is aboslutely required is that the CORB/RIRB DMA engines are stopped.

Yes, @mrajwa please update the commits, as the RC7 release is waiting for it also.

@keyonjie
Copy link

Hi @mrajwa @kv2019i @plbossart @ranj063 , I will submit another PR to replace this, which is a simple version and verified works on TGL.

@keyonjie keyonjie mentioned this pull request Jul 15, 2020
@keyonjie
Copy link

keyonjie commented Jul 15, 2020

The new PR is here:
#2276

@mrajwa
Copy link
Author

mrajwa commented Jul 20, 2020

This PR has been cleaned and merged here #2276. Therefore I am closing this one.

@mrajwa mrajwa closed this Jul 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants