Skip to content

Conversation

@keyonjie
Copy link

@keyonjie keyonjie commented Mar 8, 2020

There was some minor mismatch between the commit message and the code
change in the commit "ASoC: Intel: Work around to fix HW D3 potential
crash issue", the gap basically is:

  1. DSP trunck clock gating (DTCGE) can cause FW crashes, disable it in
    D0.
  2. Keep D3 power gating disabled (D3PGD = 1) in D3.

Fixes: 0d2135e ("ASoC: Intel: Work around to fix HW D3 potential crash issue")
Signed-off-by: Keyon Jie yang.jie@linux.intel.com

Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If tests are good on the affected devices, this is a better fix than reverting the whole change. I had one note on the commit message w.r.t. additional of core stalling that is in the patch.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@keyonjie This seems extra with regards to commit message. Probably good to mention why this is added.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kv2019i yes, let me add it, thanks.

@ranj063 ranj063 requested a review from cujomalainey March 9, 2020 16:49
Copy link
Member

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@keyonjie the sequences still diverge from the recommendations?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@keyonjie

in the D0->D3 transition, you are missing the "Enable DTCG to save power"

    4. Shutdown PLL, disable MCLK(clkctl.smos = 0), Enable DTCG to save power

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@plbossart the DTCG was enabled at the end of the function (line 288), I thought it can meet the power saving requirement to enable it a little bit later here and didn't change it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the D3->D0 transition, you are setting the D0 state before powering up the memories

4. Power up necessary SRAM and wait at least for 18 clock cycles for every
    bank you have powered up

You are also combining steps 6 and 7

 6. Restore MCLK (clkctl.smos, disabled in D3 entry point 4)
    7. Stall and reset core, set CSR

You are missing the last step

    11.Unstall core

also the hw_sleep() makes no sense, you are doing this:

/* put DSP into reset and stall */
	sst_dsp_shim_update_bits(sst, SST_CSR,
		SST_CSR_24MHZ_LPCS | SST_CSR_RST | SST_CSR_STALL,
		SST_CSR_RST | SST_CSR_STALL | SST_CSR_24MHZ_LPCS);

	hsw_set_dsp_D3(sst);

but withing hw_set_dsp_D3, you also play with the stall:

static void hsw_set_dsp_D3(struct sst_dsp *sst)
{
	u32 val;
	u32 reg;

	/* Disable core clock gating (VDRTCTL2.DCLCGE = 0) */
	reg = readl(sst->addr.pci_cfg + SST_VDRTCTL2);
	reg &= ~SST_VDRTCL2_DCLCGE;
	writel(reg, sst->addr.pci_cfg + SST_VDRTCTL2);

	/* Stall and reset core */
	sst_dsp_shim_update_bits_unlocked(sst, SST_CSR,
					  SST_CSR_RST | SST_CSR_STALL,
					  SST_CSR_RST | SST_CSR_STALL);

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the D3->D0 transition, you are setting the D0 state before powering up the memories

4. Power up necessary SRAM and wait at least for 18 clock cycles for every
    bank you have powered up

The necessary SRAM is enabled dynamically later when the FW module is reloaded, at this point here, we don't have information about which SRAM blocks are necessary. You are right, we not are following up the sequence exactly. But if we say none block is necessary here, then it's fine?

You are also combining steps 6 and 7

They are not combined, but maybe the fisrt CSR_STALL is superfluous?

	/* stall DSP core, set clk to 192/96Mhz */ //Keyon: The CSR_STALL here might be superfluous? 
	sst_dsp_shim_update_bits_unlocked(sst,
		SST_CSR, SST_CSR_STALL | SST_CSR_DCS_MASK,
		SST_CSR_STALL | SST_CSR_DCS(4));

	/* Set 24MHz MCLK, prevent local clock gating, enable SSP0 clock */ //Keyon: 6.Restore MCLK (clkctl.smos)
	sst_dsp_shim_update_bits_unlocked(sst, SST_CLKCTL,
		SST_CLKCTL_MASK | SST_CLKCTL_DCPLCG | SST_CLKCTL_SCOE0,
		SST_CLKCTL_MASK | SST_CLKCTL_DCPLCG | SST_CLKCTL_SCOE0);

	/* Stall and reset core, set CSR */ //Keyon: 7. Stall and reset core, set CSR
	sst_dsp_shim_update_bits_unlocked(sst, SST_CSR,
					  SST_CSR_RST | SST_CSR_STALL,
					  SST_CSR_RST | SST_CSR_STALL);
 6. Restore MCLK (clkctl.smos, disabled in D3 entry point 4)
    7. Stall and reset core, set CSR

You are missing the last step

    11.Unstall core

This is left and will do in hsw_boot(), we need unstall DSP core until FW is downloaded.

also the hw_sleep() makes no sense, you are doing this:

/* put DSP into reset and stall */
	sst_dsp_shim_update_bits(sst, SST_CSR,
		SST_CSR_24MHZ_LPCS | SST_CSR_RST | SST_CSR_STALL,
		SST_CSR_RST | SST_CSR_STALL | SST_CSR_24MHZ_LPCS);

	hsw_set_dsp_D3(sst);

but withing hw_set_dsp_D3, you also play with the stall:

static void hsw_set_dsp_D3(struct sst_dsp *sst)
{
	u32 val;
	u32 reg;

	/* Disable core clock gating (VDRTCTL2.DCLCGE = 0) */
	reg = readl(sst->addr.pci_cfg + SST_VDRTCTL2);
	reg &= ~SST_VDRTCL2_DCLCGE;
	writel(reg, sst->addr.pci_cfg + SST_VDRTCTL2);

	/* Stall and reset core */
	sst_dsp_shim_update_bits_unlocked(sst, SST_CSR,
					  SST_CSR_RST | SST_CSR_STALL,
					  SST_CSR_RST | SST_CSR_STALL);

Thanks, so adding stall and reset core here in my RFC commit looks not wise, let me keep it as it was.

Copy link
Author

@keyonjie keyonjie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reviewing @plbossart .
There was some historic reason that why it was not exactly following the recommendations:

  1. the code is for both HSW and BDW, and following the dsp ops(.boot(), .stall(). .wake(), .sleep()) from Baytrail, to avoid big framework change, I did some innovation as you mentioned for those not so important sequence(per my understanding) change, comparing with the recommendations.
  2. except the recommended steps, we still some other items to cover, e.g. FW downloading and unstall DSP core after that, CSR2.SDFD and HMDC for SSP DMA, IPC interrupts enable/disable, etc.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@plbossart the DTCG was enabled at the end of the function (line 288), I thought it can meet the power saving requirement to enable it a little bit later here and didn't change it.

@keyonjie
Copy link
Author

@keyonjie the sequences still diverge from the recommendations?

@plbossart as I stated below, there was some historic reasons that why it was not exactly following the recommendations.

The try to following the recommendations to spit DSP core reset/unreset from hsw_reset() is not a simple change, let me remove that change here.

There was some minor mismatch between the commit message and the code
change in the commit "ASoC: Intel: Work around to fix HW D3 potential
crash issue", the gap basically is:
1. DSP trunck clock gating (DTCGE) can cause FW crashes, disable it in
D0.
2. Keep D3 power gating disabled (D3PGD = 1) in D3.

Fixes: 0d2135e ("ASoC: Intel: Work around to fix HW D3 potential crash issue")
Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
@keyonjie
Copy link
Author

@plbossart @kv2019i @cujomalainey the commit is updated, please help review, thanks.

@cujomalainey
Copy link

Hi Keyon, we are running repro testing here

@keyonjie
Copy link
Author

Hi Keyon, we are running repro testing here

Thanks @cujomalainey, please help let us know once you got any results.

@cujomalainey
Copy link

We are having trouble reproducing the bug without the fix currently (at least Pierre can) but our testing is also a bit slowed by changing to WFH

@keyonjie
Copy link
Author

@cujomalainey got it, stay safe.

@cujomalainey
Copy link

cujomalainey commented Mar 12, 2020

Thanks, Keyon. 3k boots in and no luck, waiting to hear back from our BDW uprev team.

@cujomalainey
Copy link

Correction, I may have been producing the bug, but I had setup my test incorrectly. I was checking for no cards, but it looks like the HDMI driver is fine in a failure case. I will fix my test.

@plbossart
Copy link
Member

solution provided upstream. Thanks @keyonjie

@plbossart plbossart closed this Mar 30, 2020
@cujomalainey
Copy link

Thanks @keyonjie for helping get this thing sorted out

@keyonjie
Copy link
Author

Good that the issue is fixed, though I feel somewhat dizzy with those sequence changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants