Skip to content

Conversation

@ranj063
Copy link
Collaborator

@ranj063 ranj063 commented Sep 1, 2021

No description provided.

Copy link
Member

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am pretty sure I missed the big picture here, this isn't straightforward to review...

* audio component buffer - connects 2 audio components together in pipeline.
*
* The buffer is a hot structure that must be shared on certain cache
* incoherent architectures.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't there be some property that actually checks if the buffer will be used between two cores? What happens in the typical case where the entire pipeline runs on one core? then all this cache coherency is not really needed except for DMA buffers, is it?

Put differently, is the multi-core support going to impact or possible cripple performance when chaining processing on a single core?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minimal impact for non shared objects - we still check the c->shared flag, so lock and cache ops are only performed on objects used by multiple cores.

__must_check static inline struct coherent *coherent_acquire(struct coherent *c,
							     const size_t size)
{
	/* assert if someone passes a cache/local address in here. */
	ADDR_IS_COHERENT(c);

	/* this flavour should not be used in IRQ context */
	CHECK_COHERENT_IRQ(c);

	/* access the shared coherent object */
	if (c->shared) {
		CHECK_COHERENT_CORE(c);

		spin_lock(&c->lock);

		/* invalidate local copy */
		dcache_invalidate_region(uncache_to_cache(c), size);
	}

	/* client can now use cached object safely */
	return uncache_to_cache(c);
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

concening @plbossart's remark on DMA use-cases: is this API actually supposed to be used for DMA memory too? Don't think it's well suited for that, since spinlocks don't protect against DMA. And I'm not sure why we'd need the Kconfig option then?

if (!sinks[i])
continue;
buffer_writeback(sinks[i], sinks_bytes[i]);
sinks[i] = buffer_release(sinks[i]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's rather unclear to me when a writeback is explicitly required and when it's handled by the release?

sink_bytes = frames * audio_stream_frame_bytes(&sink->stream);

sink = buffer_release(sink);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so that one has no invalidate for the sources?


audio_stream_set_zero(&sink->stream, sink_bytes);
buffer_writeback(sink, sink_bytes);
comp_update_buffer_produce(sink, sink_bytes);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that one is super odd, you release before writing to tthe sink buffer?

@ranj063 ranj063 changed the title Implement APIs for coherent access of shared objects [DO NOT REVIEW YET] Implement APIs for coherent access of shared objects Sep 1, 2021
Copy link
Member

@lgirdwood lgirdwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've fixed build on all platforms - needs squashed and kconfig fix in lrg/topic/coherent_api

@lgirdwood
Copy link
Member

I am pretty sure I missed the big picture here, this isn't straightforward to review...

Big picture is that we create a coherent object API so that

  1. All coherent objects embed struct coherent (used for access, iteration and debug).
  2. Objects must be acquired() before access (this means we get the lock and invalidate when lock is acquired).
  3. We can now use cache ptr to object with no loss of performance.
  4. Object is released() by user (this writes back and releases lock).

Further work would enhance this to support

  1. separate coherent buffer R + W locking.
  2. Opaque object types to enforce all object access through coherent APIs.

@singalsu
Copy link
Collaborator

singalsu commented Sep 2, 2021

"[DO NOT REVIEW YET]" ?

@ranj063
Copy link
Collaborator Author

ranj063 commented Sep 2, 2021

"[DO NOT REVIEW YET]" ?

Yes, please. still WIP and testing. I will remove the tag once it is ready for review

@ranj063 ranj063 changed the title [DO NOT REVIEW YET] Implement APIs for coherent access of shared objects [RFC] Implement APIs for coherent access of shared objects Sep 7, 2021
@ranj063 ranj063 force-pushed the topic/coherent_api branch 2 times, most recently from f865cbf to fdcc060 Compare September 7, 2021 20:28
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like you have to make function parameters to struct comp_buffer **sink and struct comp_buffer **source to make sure the caller has valid pointers after function return, although in practice this has no effect

src/audio/host.c Outdated
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...or maybe buffer_release() shouldn't be __must_check or maybe even return void. Wouldn't it be cleaner to use a separate handle variable for an "acquired buffer" and keep buffer as a generic object reference. But if they're both of type struct buffer * this creates danger of mistakenly accessing the object via the buffer pointer. Or you could make a separate struct buffer_handle type for it, but then you'd have to change more... Thinking out loud

src/audio/host.c Outdated
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left-over

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also probably has to be **buffer and in all other similar cases

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about making sure we don't have nested buffer_acquire() calls. So, maybe everywhere where a buffer is passed as a parameter it must be acquired by the caller?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nested will assert since the ptr alias being passed in will change..

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't component buffer lists have to be protected? I still don't understand our design of component, buffer and other element sharing well enough, but it seems like it can be or become a problem? I see buffers are added to the lists in pipeline_connect() with just local interrupts disabled, but probably even with dynamic pipelines it cannot run simultaneously on different cores? Say, if you have a mixer, where one core is running a pipeline. Then an IPC is arriving to start a pipeline on another core with the same mixer, so that core tries to add a buffer to its list?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The coherent list iterator is needed and is a subsequent PR.

@ranj063
Copy link
Collaborator Author

ranj063 commented Sep 8, 2021

SOFCI TEST

@lgirdwood
Copy link
Member

@ranj063 fwiw, the internal CI failures are all on TplgPipeComplete IPC after the first iteration with the multicore enabled in CI (so this may mean the multicore is not enabled in the Kconfig - and hence the failure)

@lgirdwood
Copy link
Member

@ranj063 what I Mean is the KConfig say multicore and teh topology says multicore, but coherent APi probably says unicore.

@lgirdwood
Copy link
Member

So the outstanding review comments are minor styling and could be fixed in a subsequent PR.

Certain data structures such as buffers, pipelines, comp_dev's
are used across cores and cache boundaries. Introduce APIs
to handle cached access to such objects by stipulating that
the lock is held during access and releasing the lock after.

For example:

/* uncached work */
if (buffer->flag == somevalue)
    do this;

/* cached work */
buffer = coherent_acquire(buffer);

buffer->state = true;

/* back to uncached */
buffer = coherent_release(buffer);

Also, add a new kconfig INCOHERENT which will need to ibe set for platforms that
need software-managed coherency.

Signed-off-by: Liam Girdwood <liam.r.girdwood@linux.intel.com>
Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Buffer are shared hot data structures that are used across core and cache
boundaries. Manage cached buffer access by stipulating that cached access
is granted by holding the buffer lock (which invalidates) and releasing
the lock (invalidates and writeback).

Signed-off-by: Liam Girdwood <liam.r.girdwood@linux.intel.com>
Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
instead of buffer_lock/unlock() which will be deprecated.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
instead of buffer_lock/unlock() which will be deprecated.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
instead of buffer_lock/unlock() which will be deprecated.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
instead of buffer_lock/unlock() which will be deprecated.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
instead of buffer_lock/unlock() which will be deprecated.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
instead of buffer_lock/unlock() which will be deprecated.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
instead of buffer_lock/unlock() which will be deprecated.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
instead of buffer_lock/unlock() which will be deprecated.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
instead of buffer_lock/unlock() which will be deprecated.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
instead of buffer_lock/unlock which will be deprecated.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
instead of buffer lock/unlock which will be deprecated.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
instead of buffer_lock/unlock which will be deprecated

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
instead of buffer_lock/unlock which will be deprecated

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
instead of buffer_lock/unlock() which will be deprecated.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
instead of buffer_lock/unlock which will be deprecated.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
instead of buffer_lock/unlock which will be deprecated

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
instead of buffer lock/unlock which will be deprecated

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
all users replaced with buffer_acquire/release

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
With the addition of struct coherent member in struct buffer,
the size of the struct exceeds 128 bytes. Increase the
256 block heap size and reduce the 128 block size to fix
out of memory issues when running multiple-pipeline-all test.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
@ranj063 ranj063 force-pushed the topic/coherent_api branch from ca2a468 to 6b82d00 Compare November 1, 2021 15:00
@lgirdwood
Copy link
Member

@kkarask @wszypelt looks like the CI is stuck ? The logs are not loading, can you refresh it ? Thanks !

@kkarask
Copy link

kkarask commented Nov 2, 2021

PR passed after rerun.

@lgirdwood
Copy link
Member

SOFCI TEST

@lgirdwood
Copy link
Member

CI showing a timeout on CML where the script was killed before it completed.
https://sof-ci.01.org/sofpr/PR4702/build10972/devicetest/?model=CML_RVP_SDW&testcase=check-suspend-resume-with-playback-5

@lgirdwood lgirdwood merged commit b332dfa into thesofproject:main Nov 2, 2021
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ehm, not sure __coherent is a good name for it. It does nothing for coherency. I understand that it's used for coherency, but to me it sounds as if it could provide coherency. Also why packed?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry for late comments... Thinking about the name of coherent_acquire() - we are returning a cached alias, so it is incoherent actually. It's protected in software, yes, but that doesn't make it coherent...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at some later time when we use lockdep, we might want to define __acquires() similar to the kernel...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want to still use mixed cached / uncached access if CONFIG_INCOHERENT is disabled? Wouldn't it be possible and desirable to always use cached addresses in such cases?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to assign

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

* audio component buffer - connects 2 audio components together in pipeline.
*
* The buffer is a hot structure that must be shared on certain cache
* incoherent architectures.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

concening @plbossart's remark on DMA use-cases: is this API actually supposed to be used for DMA memory too? Don't think it's well suited for that, since spinlocks don't protect against DMA. And I'm not sure why we'd need the Kconfig option then?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also here - it looks like all these methods aren't used for DMA handling

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait, above you just asserted, that object is an uncached alias. And now you're using it for cache manipulation? Besides, if someone passes you an uncached object, doesn't it mean that there can be no stale cached data for it? Then why writing back its cache? Same in other similar locations

@marc-hb
Copy link
Collaborator

marc-hb commented Aug 22, 2023

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants