@keyonjie
@keyonjie keyonjie commented Apr 28, 2019

This fixes #724 and #887

We should not power up a DSP core when it is already on and, conversely,
should not power it down when it is already off. Add checks to fix this.

Also add a ref_cnt for each DSP core to manage the power status of those
cores.

@ranj063
Collaborator

ranj063 commented Apr 28, 2019

@keyonjie we don't do anything if the core is already enabled. Look at

/* return if core is already enabled */

As for powering down the core, yes, we do try to power it down multiple times if multiple pipelines are scheduled on the same core.
But this hasn't had any adverse effect on suspend/resume so far. Widget unload will never be called during suspend; it's only called during module unload.

But having said that, I do think we should add refcounts for the core and power it down only if the refcount is 0.
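
For illustration only, the refcount idea can be sketched as a small user-space model. This is not the driver code: `struct dsp_model`, `core_get()`, `core_put()`, and `NUM_CORES` are hypothetical names, and the power-up/power-down work is reduced to counters so the transitions are easy to observe.

```c
#include <assert.h>

#define NUM_CORES 4 /* hypothetical core count for this model */

struct dsp_model {
	int core_refs[NUM_CORES];        /* number of users per core */
	unsigned int enabled_cores_mask; /* bit n set => core n is powered */
	int power_up_calls;              /* bookkeeping for the example */
	int power_down_calls;
};

/* Take a reference; power the core up only on the 0 -> 1 transition. */
static int core_get(struct dsp_model *d, int core)
{
	assert(core >= 0 && core < NUM_CORES);
	if (d->core_refs[core]++ == 0) {
		d->enabled_cores_mask |= 1u << core;
		d->power_up_calls++;
	}
	return d->core_refs[core];
}

/* Drop a reference; power the core down only on the 1 -> 0 transition. */
static int core_put(struct dsp_model *d, int core)
{
	assert(core >= 0 && core < NUM_CORES);
	if (d->core_refs[core] == 0)
		return 0; /* already off: nothing to do */
	if (--d->core_refs[core] == 0) {
		d->enabled_cores_mask &= ~(1u << core);
		d->power_down_calls++;
	}
	return d->core_refs[core];
}
```

With two pipelines on the same core, the first unload only drops the count; the power-down happens once, on the last unload.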

@plbossart
Member

@keyonjie it's not clear to me what 'chaos' you are referring to and what practical problem you are trying to fix? We shouldn't put too much logic in the core anyways, different platforms may have different ways of dealing with multiple cores, it's best to do the required housekeeping in platform-related stuff.

@keyonjie
Author

keyonjie commented Apr 29, 2019

@plbossart The issue I want to address here is something like the dmesg output below: the DSP core is powered down many, many times (you can refer to #887 for details):

[ 1506.427349] input: sof-skl_hda_card HDMI/DP, pcm=13 Jack as /devices/pci0000:00/0000:00:1f.3/skl_hda_dsp_generic/sound/card0/input573
[ 1506.427556] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0x1010f0f successful
[ 1506.427565] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0x1000f0f successful
[ 1506.427569] sof-audio-pci 0000:00:1f.3: DSP core(s) enabled? 0 : core_mask 1
[ 1506.427584] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0xf0f successful
[ 1506.427592] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0xf0f successful
[ 1506.427596] sof-audio-pci 0000:00:1f.3: DSP core(s) enabled? 0 : core_mask 1
[ 1506.427608] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0xf0f successful
[ 1506.427615] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0xf0f successful
[ 1506.427620] sof-audio-pci 0000:00:1f.3: DSP core(s) enabled? 0 : core_mask 1
[ 1506.427634] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0xf0f successful
[ 1506.427649] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0xf0f successful
[ 1506.427653] sof-audio-pci 0000:00:1f.3: DSP core(s) enabled? 0 : core_mask 1
[ 1506.427664] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0xf0f successful
[ 1506.427727] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0xf0f successful
[ 1506.427733] sof-audio-pci 0000:00:1f.3: DSP core(s) enabled? 0 : core_mask 1
[ 1506.427745] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0xf0f successful
[ 1506.427753] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0xf0f successful
[ 1506.427757] sof-audio-pci 0000:00:1f.3: DSP core(s) enabled? 0 : core_mask 1
[ 1506.427767] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0xf0f successful
[ 1506.427774] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0xf0f successful
[ 1506.427777] sof-audio-pci 0000:00:1f.3: DSP core(s) enabled? 0 : core_mask 1
[ 1506.427788] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0xf0f successful
[ 1506.427795] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0xf0f successful
[ 1506.427799] sof-audio-pci 0000:00:1f.3: DSP core(s) enabled? 0 : core_mask 1
[ 1506.427809] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0xf0f successful
[ 1506.427816] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0xf0f successful
[ 1506.427819] sof-audio-pci 0000:00:1f.3: DSP core(s) enabled? 0 : core_mask 1

@keyonjie
Author

keyonjie commented Apr 29, 2019

@keyonjie we don't do anything if the core is already enabled. Look at
linux/sound/soc/sof/intel/hda-dsp.c

Line 210 in af71673

/* return if core is already enabled */
As for powering down the core, yes, we do try to power it down multiple times if multiple pipelines are scheduled on the same core.
But this hasn't had any adverse effect on suspend/resume so far. Widget unload will never be called during suspend; it's only called during module unload.

But having said that, I do think we should add refcounts for the core and power it down only if the refcount is 0.

Yes, I am concerned that we might power down core 0 too early in our case, on module unload/reload.

@xiulipan

@keyonjie @ranj063
A good point here: the core power on/off needs some new function to make sure it won't cause any problems when we enable multiple cores.

@plbossart
Member

@keyonjie @ranj063
A good point here: the core power on/off needs some new function to make sure it won't cause any problems when we enable multiple cores.

There is already a mask to indicate what cores need to be on or off, so what are you asking for? 'some new function' does not describe what the problem is and how it needs to be fixed.

@tlauda

tlauda commented Apr 30, 2019

@keyonjie FW is already keeping the count of enabled cores, so this probably won't fix anything, but at least it'll spare us the additional IPC.

* TODO: add ref counts for each core, only power it off when
* the ref count drops to 0.
*/
if (!(sdev->enabled_cores_mask & (1 << pipeline->core)) ||

Member

Do you need locking here around the mask? What happens if a pipeline is started while another topology is being loaded?

Author

@lgirdwood that's a good point, we might need locking at both load and unload, though we have never had that so far.
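
As a rough sketch of what locking at both load and unload could look like, here is a user-space model that uses a pthread mutex in place of a kernel mutex. The names `struct core_state`, `pipeline_load()`, `pipeline_unload()`, and `cores_enabled()` are assumptions made for this example, not functions from the driver.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_CORES 4 /* hypothetical core count for this model */

struct core_state {
	pthread_mutex_t lock;            /* stands in for a kernel mutex */
	unsigned int enabled_cores_mask; /* protected by lock */
	int core_refs[NUM_CORES];        /* protected by lock */
};

static struct core_state state = { .lock = PTHREAD_MUTEX_INITIALIZER };

/* Load path: the count update and the mask update happen under one lock. */
static void pipeline_load(struct core_state *s, int core)
{
	assert(core >= 0 && core < NUM_CORES);
	pthread_mutex_lock(&s->lock);
	if (s->core_refs[core]++ == 0)
		s->enabled_cores_mask |= 1u << core;
	pthread_mutex_unlock(&s->lock);
}

/* Unload path: same lock, so it cannot interleave with a load. */
static void pipeline_unload(struct core_state *s, int core)
{
	assert(core >= 0 && core < NUM_CORES);
	pthread_mutex_lock(&s->lock);
	if (s->core_refs[core] > 0 && --s->core_refs[core] == 0)
		s->enabled_cores_mask &= ~(1u << core);
	pthread_mutex_unlock(&s->lock);
}

/* Readers take the lock too, so they never see a half-updated state. */
static unsigned int cores_enabled(struct core_state *s)
{
	unsigned int mask;

	pthread_mutex_lock(&s->lock);
	mask = s->enabled_cores_mask;
	pthread_mutex_unlock(&s->lock);
	return mask;
}
```

The point of the sketch is that the check of the count and the change to the mask form one critical section; locking only the mask write would still leave a window between the two.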

Author

@lgirdwood comments addressed, please help review.

@RanderWang

This patch fixes #724
The error message is not cached before the DSP is powered off. Simply caching the error message before the DSP is powered off does not work, because the DSP would be powered off several times and only the error message from before the first power-off is valid. The approach in this PR is a good way to keep the error message available.

@plbossart plbossart added the Unclear No agreement on problem statement and resolution label May 10, 2019
@keyonjie keyonjie changed the title [RFC]ASoC: SOF: topology: refine multi-core power up/down to avoid core status chaos ASoC: SOF: topology: refine multi-core power up/down to avoid core status chaos May 13, 2019
@keyonjie
Author

@plbossart The management of DSP cores could be generic; implementing a ref_cnt for each DSP core to manage the power status in the SOF core part, as in #887, should be the next step. This PR is only a bug fix for the existing solution. Please help review.

@keyonjie keyonjie requested a review from lgirdwood May 13, 2019 07:32
@plbossart
Member

plbossart commented May 13, 2019 via email

@keyonjie
Author

I will refine this later.

@keyonjie
Author

Hi all @plbossart @tlauda @ranj063 @RanderWang @xiulipan @lgirdwood the PR is updated to support core ref counts. It has been verified on boot, FW runtime PM, and module unloading/reloading. Please help review.

@RanderWang RanderWang self-requested a review May 21, 2019 09:07
RanderWang previously approved these changes May 21, 2019
@RanderWang RanderWang left a comment

LGTM

@keyonjie
Author

keyonjie commented May 24, 2019

I really would rather use kref, if possible

I previously used that. @plbossart reminded me, and I agreed, that we don't need the heavier kref here because we already have a mutex protecting the same resource. The kref itself doesn't help me much here:

  1. kref_init() initializes the refcount to 1, not the 0 that we want.
  2. kref_get() doesn't support a callback, so I would still need to use kref_read(), check the value, and power up manually.

kref_put_mutex() can help somewhat with its release callback for power down, but this is the only benefit I can think of.

So the lightweight int ref count looks better to me here.

@lyakh @plbossart thoughts?
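
To make the two kref points concrete, here is a small user-space model of the documented kref behaviour (init to 1, get without callback, put with a release callback). It is not the kernel implementation; `struct model_kref`, the `model_kref_*()` helpers, and `core_release()` are hypothetical names for this sketch only.

```c
#include <assert.h>

/* Simplified stand-in for struct kref: just a plain counter. */
struct model_kref {
	int refcount;
};

/* Like kref_init(): the count starts at 1, not 0. */
static void model_kref_init(struct model_kref *k)
{
	k->refcount = 1;
}

/* Like kref_get(): increments only, no hook for "first user" work. */
static void model_kref_get(struct model_kref *k)
{
	k->refcount++;
}

/* Like kref_put(): calls release() when the count reaches 0, returns 1 then. */
static int model_kref_put(struct model_kref *k,
			  void (*release)(struct model_kref *k))
{
	if (--k->refcount == 0) {
		release(k);
		return 1;
	}
	return 0;
}

static int power_down_calls; /* bookkeeping for the example */

/* Release callback: this is where a power-down would be triggered. */
static void core_release(struct model_kref *k)
{
	(void)k;
	power_down_calls++;
}
```

In this model a freshly initialized counter already reads 1 with no pipeline attached, and nothing in the get path can power the core up, so the "off" state and the power-up step would still have to be handled by hand.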

@keyonjie keyonjie requested a review from lyakh May 24, 2019 09:26
@xiulipan

@keyonjie
Some error logs here. Not sure if we have a regression on the PCM512X tplg or something else.
See the same at #983

[    4.456739 <    0.000010>] sof-audio-pci 0000:00:0e.0: tplg: 1 hw_configs found, default id: 0!
[    4.456743 <    0.000004>] sof-audio-pci 0000:00:0e.0: tplg: config SSP5 fmt 0x4001 mclk 24576000 bclk 3072000 fclk 48000 width (24)32 slots 2 mclk id 0 quirks 0
[    4.456748 <    0.000005>] sof-audio-pci 0000:00:0e.0: ipc tx: 0x80010000: GLB_DAI_MSG: CONFIG
[    4.456833 <    0.000085>] sof-audio-pci 0000:00:0e.0: ipc tx succeeded: 0x80010000: GLB_DAI_MSG: CONFIG
[    4.456837 <    0.000004>] sof-audio-pci 0000:00:0e.0: tplg: 1 hw_configs found, default id: 1!
[    4.456849 <    0.000012>] sof-audio-pci 0000:00:0e.0: ipc tx: 0x80010000: GLB_DAI_MSG: CONFIG
[    4.456913 <    0.000064>] sof-audio-pci 0000:00:0e.0: error: ipc error for 0x80010000 size 12
[    4.456971 <    0.000058>] sof-audio-pci 0000:00:0e.0: error: failed to set DAI config for direction:0 of HDA dai 0
[    4.457019 <    0.000048>] sof-audio-pci 0000:00:0e.0: error: failed to process hda dai link iDisp1
[    4.457061 <    0.000042>] sof-audio-pci 0000:00:0e.0: ASoC: physical link loading failed
[    4.457157 <    0.000096>] sof-audio-pci 0000:00:0e.0: error: tplg component load failed -5
[    4.457204 <    0.000047>] sof-audio-pci 0000:00:0e.0: error: failed to load DSP topology -22
[    4.457244 <    0.000040>] sof-audio-pci 0000:00:0e.0: ASoC: failed to probe component -22

@keyonjie
Author

@keyonjie
Some error logs here. Not sure if we have a regression on the PCM512X tplg or something else.
See the same at #983

[    4.456739 <    0.000010>] sof-audio-pci 0000:00:0e.0: tplg: 1 hw_configs found, default id: 0!
[    4.456743 <    0.000004>] sof-audio-pci 0000:00:0e.0: tplg: config SSP5 fmt 0x4001 mclk 24576000 bclk 3072000 fclk 48000 width (24)32 slots 2 mclk id 0 quirks 0
[    4.456748 <    0.000005>] sof-audio-pci 0000:00:0e.0: ipc tx: 0x80010000: GLB_DAI_MSG: CONFIG
[    4.456833 <    0.000085>] sof-audio-pci 0000:00:0e.0: ipc tx succeeded: 0x80010000: GLB_DAI_MSG: CONFIG
[    4.456837 <    0.000004>] sof-audio-pci 0000:00:0e.0: tplg: 1 hw_configs found, default id: 1!
[    4.456849 <    0.000012>] sof-audio-pci 0000:00:0e.0: ipc tx: 0x80010000: GLB_DAI_MSG: CONFIG
[    4.456913 <    0.000064>] sof-audio-pci 0000:00:0e.0: error: ipc error for 0x80010000 size 12
[    4.456971 <    0.000058>] sof-audio-pci 0000:00:0e.0: error: failed to set DAI config for direction:0 of HDA dai 0
[    4.457019 <    0.000048>] sof-audio-pci 0000:00:0e.0: error: failed to process hda dai link iDisp1
[    4.457061 <    0.000042>] sof-audio-pci 0000:00:0e.0: ASoC: physical link loading failed
[    4.457157 <    0.000096>] sof-audio-pci 0000:00:0e.0: error: tplg component load failed -5
[    4.457204 <    0.000047>] sof-audio-pci 0000:00:0e.0: error: failed to load DSP topology -22
[    4.457244 <    0.000040>] sof-audio-pci 0000:00:0e.0: ASoC: failed to probe component -22

This should be related to FW PR#1394. The kernel part is already merged; we need to merge the FW part as well, otherwise this issue will always happen.
thesofproject/sof#1354

@keyonjie
Author

Hi @plbossart, is anything still unclear to you about this?


return ret;
/* increase ref count of the DSP core */
return snd_sof_dsp_core_get(sdev, pipeline->core);

Member

This doesn't look like a very sensible power management idea. You power up all the cores when the topology is loaded and release them when we unload the topology. They should be powered up when they are used!

Author

That's true, powering cores up as late as possible and powering them down as soon as possible sounds fantastic, but I am not sure we can support this inside the FW at the moment. We haven't even verified that configuring a pipeline to run on a non-0 core via topology works as expected (@tlauda please correct me if I am wrong).

To me, changing where we call snd_sof_dsp_core_get() is relatively simple; we can change that once it is aligned between FW and driver and verified to work.

Member

@keyonjie your write-up suggests that you have not tested this patch in a multi-core configuration?
Also I don't see why we should first do something and then change it because it's a bad design. If you've tested multi-core then you can do the change directly. If you haven't tested and this is an enabler patch then test it further before we merge it.

Author

@keyonjie your write-up suggests that you have not tested this patch in a multi-core configuration?
Also I don't see why we should first do something and then change it because it's a bad design. If you've tested multi-core then you can do the change directly. If you haven't tested and this is an enabler patch then test it further before we merge it.

Agreed, let me close this PR and revisit it when the multi-core feature is required and verified to work on the FW side.

Member

@plbossart plbossart left a comment

Need to address comments, and powering up cores on startup does not look like a smart idea. It should happen when DAPM signals that the pipeline is in use.
Still "Unclear" status, sorry.

@plbossart
Member

also the branch has topic/sof-dev merged for some reason, please fix this as well.

@keyonjie
Author

keyonjie commented Jun 3, 2019

also the branch has topic/sof-dev merged for some reason, please fix this as well.

If what you meant is that it needs a rebase onto the latest topic/sof-dev, that is now done.

@kv2019i
Collaborator

kv2019i commented Jun 17, 2019

This is very long, but I'll add a quick note that this PR (at least the part that keeps core0 powered following driver runtime-PM status) would be very useful for common debugging tasks. See #887 (comment)

@keyonjie
Author

This is very long, but I'll add a quick note that this PR (at least the part that keeps core0 powered following driver runtime-PM status) would be very useful for common debugging tasks. See #887 (comment)

Thanks for acking @kv2019i . Actually, I always run on my platforms with the PR applied. :-)

@plbossart
Member

@keyonjie can you please address the comments from @lyakh
If you don't reply I don't know if you disagree or if you didn't look

@singalsu

@keyonjie Thanks, this work helped me to trace a topology load time issue in FW.

@keyonjie
Author

@keyonjie can you please address the comments from @lyakh
If you don't reply I don't know if you disagree or if you didn't look

@plbossart updated above.

Add atomic ref counts (core_refs) for each audio DSP core, so that each
core's power state can be managed according to its usage. This is
preparation for audio DSP multi-core support.

Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>

We are changing enabled_cores_mask dynamically during pipeline
load/unload, so there may be a race on enabled_cores_mask when one
pipeline is loading while another pipeline (running on the same core)
is unloading.

Introduce a mutex to protect enabled_cores_mask and core_refs, and take
it whenever we need to read or write them.

Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>

Initialize the ref counts of the DSP cores to 0 at the start of probing.

Once the FW has booted, update the ref counts of the DSP cores: for each
powered-on core, set its ref count to 1.

Reset the ref counts of the DSP cores to 0 at suspend, to match the
state before the next FW boot at resume.

Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
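
A minimal sketch of the lifecycle described in the commit message above, written as a user-space model. The names `struct dsp_model`, `on_probe()`, `sync_refs_after_boot()`, and `on_suspend()` are hypothetical and only stand in for the corresponding driver stages.

```c
#include <assert.h>
#include <string.h>

#define NUM_CORES 4 /* hypothetical core count for this model */

struct dsp_model {
	int core_refs[NUM_CORES]; /* number of users per core */
};

/* Probe: every core starts with a ref count of 0. */
static void on_probe(struct dsp_model *d)
{
	memset(d->core_refs, 0, sizeof(d->core_refs));
}

/* After FW boot: each core that the boot powered on gets a count of 1. */
static void sync_refs_after_boot(struct dsp_model *d, unsigned int booted_mask)
{
	int i;

	assert((booted_mask >> NUM_CORES) == 0); /* only known cores */
	for (i = 0; i < NUM_CORES; i++)
		d->core_refs[i] = (booted_mask >> i) & 1u ? 1 : 0;
}

/* Suspend: back to all zeros, matching the state before the next boot. */
static void on_suspend(struct dsp_model *d)
{
	memset(d->core_refs, 0, sizeof(d->core_refs));
}
```

Resetting at suspend keeps the counts consistent with the hardware: on resume the FW boots again and the counts are rebuilt from whichever cores that boot powered on.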

Add a general DSP core enable/disable IPC interface. It is currently
aligned with the FW, which expects the whole enable_mask to be sent. We
may consider sending an IPC for one specific core at a time in the
future, once the FW change is ready (ABI bump needed).

Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
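
As a sketch of the "send the whole enable_mask" approach, the mask can be derived from the per-core ref counts each time the power state changes. `struct core_enable_msg` and `build_core_enable_msg()` are hypothetical names for this example; the real IPC message layout is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical payload: one bit per core, bit n set => core n enabled. */
struct core_enable_msg {
	uint32_t enable_mask;
};

/* Build the full mask from the ref counts: any count above 0 means "on". */
static struct core_enable_msg build_core_enable_msg(const int *core_refs,
						    size_t num_cores)
{
	struct core_enable_msg msg = { .enable_mask = 0 };
	size_t i;

	assert(num_cores <= 32); /* the mask has 32 bits */
	for (i = 0; i < num_cores; i++) {
		if (core_refs[i] > 0)
			msg.enable_mask |= UINT32_C(1) << i;
	}
	return msg;
}
```

Because the whole mask is rebuilt every time, the message always reflects the complete desired state, which is what allows a later per-core IPC to be introduced as a separate ABI change.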

We need get/put interfaces to manage the ref counts of the DSP cores,
perform the actual core power up/down, and send an IPC to keep the core
power status aligned with the FW. These interfaces are typically used
when new modules/pipelines are assigned to run on a specific DSP core.

Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>

Refine DSP multi-core support to use the newly defined APIs directly.

For pipeline loading, call snd_sof_dsp_core_get(): it increases the ref
count of the specific DSP core and powers the core on when necessary
(that is, when it is currently powered off).

For pipeline (scheduler widget) unloading, call snd_sof_dsp_core_put():
it decreases the ref count of the specific DSP core and powers it off
when necessary (that is, when it is no longer in use).

Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
@plbossart
Member

closing since there's been no activity since August 5

Labels

Unclear No agreement on problem statement and resolution

Development

Successfully merging this pull request may close these issues.

can't get any FW error logs once dsp is powered off

10 participants