
Conversation


@lyakh lyakh commented Jan 28, 2022

This patch adds documentation to the coherent API header and cleans up several potentially harmful operations. Specifically, it isn't a good idea to modify memory via uncached addresses and then write back caches, because the latter can overwrite uncached writes. It works because there is no data in cache at that point, which also makes that call redundant. Similarly, we are sure not to have any data in caches when obtaining access to the memory, because otherwise our spinlock-acquisition operation could be overwritten by a speculative write back.

Member

Can be used with DATA and BSS too. The API header can be embedded.

Member

Why are you removing this? Our local cache may not be up to date with any changes other cores have written back.

Collaborator Author

That's the assumption documented above: when a core acquires access to a memory area via this API, it has no cache lines associated with it. If it has never accessed the area, it obviously has no cached data for it. If it used the area before and released it, then it invalidated all its cache lines on release. So, while a core is not holding access to such a memory area, it is guaranteed to have no cached data for it.
Moreover, when a core gains access to such an area, it is guaranteed that no other core holds cache lines associated with that memory.

Member

@lgirdwood lgirdwood Jan 31, 2022

We still need to invalidate local L1, as it has no way of knowing uncache has changed since our last usage, e.g.
core 0 -> core 1 -> core 0.
In the above, core 0's local L1 needs to be invalidated so it can be coherent with whatever core 1 wrote back to uncache.

Collaborator Author

@lgirdwood I don't think so. We invalidate on release. When core 0 in your example last released the memory, it invalidated its cache, so it has no cached data for this memory now.

Member

OK, you're right - but I'm not going to merge this as it requires a ton of validation. I'm happy to accept a comment here with a TODO, though.

Collaborator Author

I think it would be better to apply this, to make the API as water-tight as possible and to emphasise that it can only work if its rules are followed - but then it will work reliably. That said, in principle this doesn't hurt here, and it shouldn't even hurt performance: we should have no cached data at this point, so this will be a NOP. So, I can make this a comment if you prefer.

Member

ditto

Member

The WB/INV is still needed here. The only valid change would be to remove uncache_to_cache(), since the object being passed into release() here will already be the cache alias.

Collaborator Author

Why is it needed? We are just initialising the coherent API, so we are guaranteed to have no cached data for this memory: either we haven't accessed it yet, or we accessed it, released it, and dropped all cache lines for it. Note that we might or might not have used the coherent API for it before, e.g. in the heap case, but rfree() drops caches anyway. And we must not write back caches here: we have just written data to that memory bypassing the cache, using the uncached ("coherent") alias. So, if we really did happen to have cached data for it and wrote it back here, we would overwrite our own initialisation.

Member

We don't know who used the memory, and how, before we use it. This fixed issues that were validated.

Collaborator Author

@lgirdwood I think this is actually wrong. Again: we wrote to SRAM bypassing the cache at lines 170-173 above, and now we are (potentially) overwriting what we have just written with stale data from the cache - this is a bug. It only doesn't bite us because we actually have no dirty cache lines at this point, so this is a NOP. But (1) it creates the impression that we might have dirty caches here, and (2) if we ever did, it would corrupt our data.
Again, I think this API cannot be used in a "we don't know" mode. It can only work if absolutely enforced: any memory designated for use with this API must only be accessed via it. If that is ever violated, data can be corrupted. This cannot be made to work without certainty that the memory is only accessed via this API.
One seeming exception is heap memory, where of course the same memory could have been allocated before and used without this API. And that can indeed be a problem. If such memory was only allocated, used and freed on the same core, we're fine: rfree() will invalidate caches. But if that memory was accessed via its cached aliases on different cores, we have a problem. E.g. it could well happen that memory gets allocated on core 0 or 1, then used only by core 1 via cache - so the data is safe - but then freed by core 0. Then we have a problem: core 1 can still hold dirty cache lines for that memory, and when we then allocate it for coherent API use on core 0, nothing prevents core 1's cache from being written back at any moment, overwriting our data.
So, we must always guarantee that dynamically allocated (heap) memory is either (1) only allocated, used and freed by the same core, or (2) shared via this API, or (3) shared but only accessed via uncached aliases.

Member

Needed here.

Collaborator Author

@lgirdwood why? This is a free(). We have already finished using the memory and released it, and when we released it, all caches were synchronised and dropped. We cannot free such memory while it is still in use / not released.

Member

The next user may not use the coherent API.

Collaborator Author

@lgirdwood sorry, what I was trying to say is that we have no cached data at this point, so there is nothing to write back or invalidate. This function - coherent_free() - can only be called either immediately after coherent_init(), if we didn't use the memory at all, or after we used it, in which case we called coherent_acquire() / coherent_release() for every such use and the last coherent_release() already invalidated caches. Either way, we have no cached data for this memory and nothing to write back or invalidate.
In fact, the next user must either continue using the coherent API for as long as this memory is kept, or the memory must be freed back to the heap, at which point caches will be invalidated again.

lyakh added 2 commits February 2, 2022 11:57
This patch adds documentation to the coherent API header and cleans
up several potentially harmful operations. Specifically, it isn't a
good idea to modify memory via uncached addresses and then write back
caches, because the latter can overwrite uncached writes. It works
because there is no data in cache at that point, which also makes
that call redundant. Similarly, we are sure not to have any data in
caches when obtaining access to the memory, because otherwise our
spinlock-acquisition operation could be overwritten by a speculative
write back.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Align several line-continuation backslashes in macro definitions.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Collaborator Author

lyakh commented Feb 2, 2022

The new version only removes write-back operations that definitely would overwrite data with stale contents if cache lines happened to be associated with the memory.

@lgirdwood
Member

CI is showing the "sof logger already dead" failure on some Zephyr tests - a known Zephyr issue.

@lgirdwood lgirdwood merged commit 2b4d559 into thesofproject:main Feb 2, 2022
@lyakh lyakh deleted the coherent branch February 2, 2022 14:33