@kv2019i kv2019i commented May 24, 2019

The DSP HDA controller cannot handle an HDA codec being power cycled
while the controller is active. After commit b5a236c
("ALSA: hda - Enforces runtime_resume after S3 and S4 for each codec"),
frequent IPC timeouts are observed in suspend/resume stress testing.

Address the issue by blocking runtime PM in connected codecs
while the controller is active.

Fixes #944

Fixes: b5a236c ("ALSA: hda - Enforces runtime_resume after S3 and S4 for each codec")
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
* power cycling of HDA codecs causes failures in HDA
* controller logic -> force codecs to be powered
*/
snd_hda_set_power_save(hbus, -1);
Collaborator

This seems a bit heavy-weight to me. In principle it would be enough to do snd_hda_set_power_save(hbus, SND_SOF_SUSPEND_DELAY_MS); once, and here just do some pm_runtime_get*() / put*(), similar to what is done in azx_vs_set_state():

			list_for_each_codec(codec, &chip->bus) {
				pm_runtime_suspend(hda_codec_dev(codec));
				pm_runtime_disable(hda_codec_dev(codec));
			}
...
			list_for_each_codec(codec, &chip->bus) {
				pm_runtime_enable(hda_codec_dev(codec));
				pm_runtime_resume(hda_codec_dev(codec));
			}

?

Collaborator Author

@lyakh Thanks. That's a good comment, and in fact this is probably needed. The current iteration of the patch does not prevent resumes (only suspends), and if I run the stress test long enough, I can still trigger problems. Let me work on an updated patch.

Collaborator Author

@lyakh This proved harder. I tried pm_runtime_get(), pm_runtime_forbid(), pm_runtime_disable() and various combinations, but as this is called from within suspend/resume, we hit issues.
I'll continue working on Monday. For now, the revert patch is the best cure.

Collaborator

@kv2019i the order of runtime suspend/resume is dictated by the parent-child relationship of the devices. In our case, the sof-audio-pci dev is the parent of the hda codec device. Therefore, the sof device will be resumed prior to the codec and will be suspended after the codec. I'm not sure it is a good idea to mess with it.
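
The ordering described above can be illustrated with a small userspace toy model. This is not kernel code and does not use the real runtime PM API: `toy_dev`, `toy_get()`, `toy_put()` and `toy_demo()` are hypothetical names invented for this sketch, and the only assumption taken from the comment is that sof-audio-pci is the parent of the HDA codec device. Each device holds a reference on its parent while it is active, so the parent resumes first and suspends last.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy userspace model, NOT the kernel runtime PM implementation. */
struct toy_dev {
	const char *name;
	struct toy_dev *parent;	/* NULL for a top-level device */
	int usage;		/* active users, including active children */
	int active;		/* 1 while the device is resumed */
};

static char toy_log[256];

/* Append "<what>:<device name>" to the event log, space separated. */
static void toy_note(const char *what, const struct toy_dev *d)
{
	size_t used = strlen(toy_log);

	snprintf(toy_log + used, sizeof(toy_log) - used, "%s%s:%s",
		 used ? " " : "", what, d->name);
}

/* Take a reference; the first user resumes the parent, then the device. */
static void toy_get(struct toy_dev *d)
{
	if (d->usage++ == 0) {
		if (d->parent)
			toy_get(d->parent);
		d->active = 1;
		toy_note("resume", d);
	}
}

/* Drop a reference; the last user suspends the device, then releases the parent. */
static void toy_put(struct toy_dev *d)
{
	assert(d->usage > 0);
	if (--d->usage == 0) {
		d->active = 0;
		toy_note("suspend", d);
		if (d->parent)
			toy_put(d->parent);
	}
}

/* Resume and then suspend the codec, returning the recorded event order. */
static const char *toy_demo(void)
{
	struct toy_dev sof = { "sof-audio-pci", NULL, 0, 0 };
	struct toy_dev codec = { "hda-codec", &sof, 0, 0 };

	toy_log[0] = '\0';
	toy_get(&codec);
	toy_put(&codec);
	return toy_log;
}
```

Calling `toy_demo()` records the parent resuming before the codec and suspending after it, which is the ordering a patch would be fighting against if it tried to hold the codec active from inside the controller's own suspend/resume path.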

I tested your PR; the system hung.

Member

@plbossart plbossart left a comment

Sorry @kv2019i this doesn't look quite right. See comments below.

/* disable hda bus irq and i/o */
snd_hdac_bus_stop_chip(bus);

snd_hda_set_power_save(hbus, SND_SOF_SUSPEND_DELAY_MS);
Member

We shouldn't muck with runtime_pm within a runtime_pm routine; this should be done on probe, as in Ranjani's patch.
Also, this re-enables runtime_pm in the suspend case and disables it in resume, which is very odd.

Collaborator Author

@plbossart This is trying to enforce more ordering by preventing the codec from being runtime-suspended while the controller is running. But for many reasons, this patch is a no-go.

* power cycling of HDA codecs causes failures in HDA
* controller logic -> force codecs to be powered
*/
snd_hda_set_power_save(hbus, -1);
Member

This boils down to calling:

pm_runtime_dont_use_autosuspend(dev);
pm_runtime_forbid(dev);

which doesn't seem like a very good idea to me, and invalidates what you do on suspend.

kv2019i commented May 29, 2019

Closing the pull request. This patch misuses runtime-pm and does not cover all cases. Other options to address #944 are under investigation.


Development

Successfully merging this pull request may close these issues.

[Bug][WHL] ipc timed out after S3 after '0x40010000: GLB_PM_MSG: CTX_SAVE'