Skip to content

Conversation

@ranj063
Copy link
Collaborator

@ranj063 ranj063 commented May 24, 2019

Set the autosuspend delay for hda bus device. This prevents the hda codec device from entering runtime suspend when running stress tests involving opening/closing streams one after the other.

Fixes thesofproject/sof#1422

Likely fixes other isseus involving pulse audio as well. Needs to be checked.

@ranj063 ranj063 requested review from keyonjie and kv2019i May 24, 2019 07:02
@lyakh
Copy link
Collaborator

lyakh commented May 24, 2019

I'll copy my comment to this PR from my own tree:

But why is suspending and resuming bad? In principle too frequent suspend / resume should only result in some performance loss, but shouldn't break the functionality. I think we do want this patch eventually, but if we apply it now, I think it has a high potential of hiding the problem...

@keyonjie
Copy link

@ranj063 without your PR(setting timeout to 2s), what is the timeout value for runtime suspend of the codec? @RanderWang has some finding that we might need add PM ops to hdac_hda.c and call the hda codec PM via hdac_hda.c PM ops.

It's better if we can figure out more generic solution.

@lyakh
Copy link
Collaborator

lyakh commented May 24, 2019

@ranj063 without your PR(setting timeout to 2s), what is the timeout value for runtime suspend of the codec? @RanderWang has some finding that we might need add PM ops to hdac_hda.c and call the hda codec PM via hdac_hda.c PM ops.

It's better if we can figure out more generic solution.

@keyonjie without this patch autosuspend wouldn't be enabled at all, so, I think as soon as the ref-count reaches 0 the device(s) would be suspended with no delay.

@kv2019i
Copy link
Collaborator

kv2019i commented May 24, 2019

This might be related to same root problem as #978 is trying to address. If hda codec reset indeed causes unexpected side-effects to cAVS, then it would make sense fine-tuning the autosuspend delay will have impact (as codec suspend will kick less often).

I do agree with @lyakh we should find the rootcause. Once rootcause is clear, then workaround patches may be considered still, but at least then we know better the pros and cons. Otherwise we risk just making bugs harder to reproduce but still there.

@plbossart
Copy link
Member

The tests show that this PR does NOT fix the issue fully?

@ranj063
Copy link
Collaborator Author

ranj063 commented May 24, 2019

The tests show that this PR does NOT fix the issue fully?

@plbossart yes enabling autosuspend avoids 1422 from occuring with aplay stress test. But it still shows up in other stress test cases such as reboot/s3 stress test when the codec reset happens right before we send an IPC

Copy link

@keyonjie keyonjie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, how about move it to the bottom of hda_codec_probe_bus(), we need to call it only one time for the bus.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, how about move it to the bottom of hda_codec_probe_bus(), we need to call it only one time for the bus.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is misplaced. The function returns in both preprocessor branches above

lgirdwood
lgirdwood previously approved these changes May 27, 2019
Copy link
Member

@lgirdwood lgirdwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will also means less chance of pops and clicks.

@lgirdwood
Copy link
Member

@ranj063 CI is showing issue, but its false. Conflicts too.

@RanderWang
Copy link

@ranj063 without your PR(setting timeout to 2s), what is the timeout value for runtime suspend of the codec? @RanderWang has some finding that we might need add PM ops to hdac_hda.c and call the hda codec PM via hdac_hda.c PM ops.

It's better if we can figure out more generic solution.

we don't need to add PM ops to hdac_hda.c. The codec driver in legacy hda has this PM ops.

Copy link
Collaborator

@lyakh lyakh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a simple fix is needed

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is misplaced. The function returns in both preprocessor branches above

@keyonjie
Copy link

How about put it here:

@@ -95,6 +98,7 @@ static int hda_codec_probe(struct snd_sof_dev *sdev, int address)
 int hda_codec_probe_bus(struct snd_sof_dev *sdev)
 {
        struct hdac_bus *bus = sof_to_bus(sdev);
+       struct hda_bus *hbus = sof_to_hbus(sdev);
        int i, ret;

        /* probe codecs in avail slots */
@@ -111,6 +115,7 @@ int hda_codec_probe_bus(struct snd_sof_dev *sdev)
                }
        }

+       snd_hda_set_power_save(hbus, SND_SOF_SUSPEND_DELAY_MS);
        return 0;
 }
 EXPORT_SYMBOL(hda_codec_probe_bus);

@ranj063
Copy link
Collaborator Author

ranj063 commented May 28, 2019

@keyonjie @lyakh PR updated now

Copy link
Member

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea but known issues with this solution already handled in the legacy HDaudio driver.

@wenqingfu
Copy link

SOFCI TEST

Set the autosuspend delay to 2s for hda bus device.
This is needed to prevent the codec driver from entering
runtime PM during stress tests involving back-to-back
stream open/close.

The legacy driver forbids autosuspend for some devices
where a non-zero autosuspend delay causes pops/clicks.
This needs to be addressed in the SOF driver when those
devices are supported.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
@ranj063 ranj063 force-pushed the fix/hda_bus_autosuspend branch from 5b9b769 to 56f58aa Compare May 29, 2019 01:05
@ranj063
Copy link
Collaborator Author

ranj063 commented May 29, 2019

@xiulipan why has jenkins failed? I get 404 page not found error when I click on it?

@xiulipan
Copy link

@ranj063
There are some network issue to sync data onto 01.org. So there are maybe some 404.
Now it is working and I see some errors.

 dpkg-source --before-build sofkernel_prs
 debian/rules build
ERROR: "snd_hda_set_power_save" [sound/soc/sof/intel/snd-sof-intel-hda.ko] undefined!
make[4]: *** [__modpost] Error 1
make[3]: *** [modules] Error 2
make[3]: *** Waiting for unfinished jobs....
make[2]: *** [build] Error 2
dpkg-buildpackage: error: debian/rules build subprocess returned exit status 2
make[1]: *** [bindeb-pkg] Error 2
make: *** [bindeb-pkg] Error 2

Copy link
Member

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the commit message isn't very clear.
We should explain this is a work-around for now. I don't think we have a root cause and I don't see how/when we are going to deal with the power_save quirks either.

@plbossart plbossart added the Unclear No agreement on problem statement and resolution label Jun 11, 2019
@plbossart
Copy link
Member

@ranj063 is this time to close this one?

@ranj063
Copy link
Collaborator Author

ranj063 commented Jun 11, 2019

@ranj063 is this time to close this one?

@plbossart yes I will close this one. Even though this is the right thing to do I have not seen any value with it so far. So maybe we can revisit later.

@ranj063 ranj063 closed this Jun 11, 2019
@lyakh
Copy link
Collaborator

lyakh commented Jun 12, 2019

@ranj063 @plbossart may I humbly disagree? I think this is an important feature. It's a real improvement over threshing the DSP immediately upon each close. I'm not sure how this translated to real-life use-cases, maybe a typical use-case is anyway you use a GUI audio player and it either keeps the driver awake anyway all the time, or it allows it to suspend, but then the user anyway doesn't restart the next playback within 2 seconds after finishing the previous one. Still, there are things like short system beeps and blips, notifications etc. I think we should merge this!

@plbossart
Copy link
Member

I have no objection but let's add this later when we are done with the functional parts.

@lyakh
Copy link
Collaborator

lyakh commented Jun 12, 2019

@plbossart can we keep it open until then then? It would at least help me, otherwise it'd would completely slip under my radar.

@lyakh
Copy link
Collaborator

lyakh commented Aug 16, 2019

@ranj063 @plbossart any argument against merging this? I really think it's a bug not setting the idle timeout and this patch trivially fixes it.

@plbossart
Copy link
Member

So it's not clear what this PR actually solves (what stress tests are broken?), and it's known to be problematic with legacy devices which don't work well with a non-zero autosuspend value.
Not sure if there's any value here, it feels like asking for trouble without much benefit.

@lyakh
Copy link
Collaborator

lyakh commented Aug 20, 2019

So it's not clear what this PR actually solves (what stress tests are broken?), and it's known to be problematic with legacy devices which don't work well with a non-zero autosuspend value.
Not sure if there's any value here, it feels like asking for trouble without much benefit.

@plbossart I don't think any tests should break without that delay, a "perfect" driver should work with both 0 or any length delay. It's rather a statistical improvement - lower latencies when resume cycles are avoided, especially as with SOF where they involve loading and booting the firmware, restoring pipelines etc.
Besides it's a matter of correctness, without this the use of autosuspend in SOF is inconsistent IMHO. See section 9 of Documentation/power/runtime_pm.txt for more details. What exactly failures is this patch causing (on legacy devices)?

@RanderWang
Copy link

@plbossart @lyakh
Actually, this patch was reverted on sof-v5.0 for DELL hda project for jack detection doesn't work when system booting up. And there is no such delay for legacy hda driver also.

@ranj063
Copy link
Collaborator Author

ranj063 commented Oct 25, 2019

Closing this one as it doesnt seem like it is needed anymore.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Unclear No agreement on problem statement and resolution

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG][CML]ipc timed out for 0x80010000: GLB_DAI_MSG: CONFIG after system boot up.(intermittent)

9 participants