Heap refinement Part 4 -- zones merged #4747
src/lib/alloc.c
Outdated
If SOF_MEM_FLAG_COHERENT == 1, then we prefer the uncached address? The flag name is confusing. Could it be named SOF_MEM_FLAG_UNCACHE instead?
Kconfig.xtos-dbg
Outdated
Does this mean an uncached buffer? I know cache coherence, but what is a coherent buffer?
Yes, coherent (a generic term) means uncached (coherent) memory on Intel HW.
b7ad7cf to
00633db
Compare
Can one of the admins verify this patch?
00633db to
8739299
Compare
359dd45 to
029ad51
Compare
a4fdaec to
bf05d38
Compare
@lgirdwood the CI results are looking better now.
Update to initialize newly allocated buffers to all 0s in the rballoc_align() helper. Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
The rballoc_align() helper will initialize newly allocated buffers with 0s; update the comments to document this. Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
Fuzzy IPC testing reports a memory leak if the dma_trace_config IPC is sent multiple times. We should just return if the DMA trace buffer is already allocated, to avoid allocating new buffers and leaking the old ones. Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
When freeing blocks, we need to perform writeback and invalidate on the dirty cache lines; otherwise they will be evicted from the cache at some point in the future, which will break use of the same memory region from another DSP core. Introduce a free_ptr to make sure the original 'ptr' is not changed, so we can use it for this wb/inv operation. Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
Change usage of the runtime_shared zone to the buffer zone, and change calls to rmalloc()/rzalloc() from the runtime_shared zone to rballoc() with the SOF_MEM_ZONE_RUNTIME_SHARED flag. Also change rballoc() to zero allocated buffers, to cover the rzalloc() usage. Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
Change to use rballoc(SOF_MEM_FLAG_COHERENT,) now. Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
The mux cd could be bigger than 1KB; change to use rballoc() for allocating the buffer, to help ensure the allocation succeeds. Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
We are allocating more buffers from the BUFFER zone now; change the sizes of the zones to reflect that. Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
Change usage of the system_shared zone to the buffer zone, and change calls to rmalloc()/rzalloc() from the system_shared zone to rballoc() with the SOF_MEM_ZONE_RUNTIME_SHARED flag. Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
We are allocating more buffers from the BUFFER zone now; change the sizes of the zones to reflect that. Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
Since the runtime_shared zone is no longer used anywhere, we can remove it completely. Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
Since the system_shared zone is no longer used anywhere, we can remove it completely. Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
Align with the new heap memory map and allocator: the SYSTEM_SHARED and RUNTIME_SHARED zones are removed. Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
The buffer zone should now be bigger, and the runtime zone smaller. Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
af098ac to
acbbfa5
Compare
lyakh
left a comment
I think the general idea of reducing the number of zones is good, but I don't think using rballoc() for all allocations is good. Could we do the opposite - remove rballoc() and use rmalloc() for all allocations?
                      uint32_t alignment)
     {
    -	return malloc(bytes);
    +	return calloc(bytes, 1);
sorry, why? I'm against this. The user can decide if they need to initialise the buffer. Often enough the whole of the buffer will be overwritten anyway. This would just waste cycles.
This is a good point. The change is only there to align the testbench with the existing allocator. In a subsequent cleanup series, I will unify the allocation helpers, e.g. remove rballoc() completely and use rmalloc()/rzalloc() as appropriate.
Can we align this with Zephyr? I.e. if Zephyr clears then we should too; if Zephyr does not clear then we clear in the caller code.
@lgirdwood are you planning to keep wrappers around the Zephyr heap API, or do you want to use it directly? I think we should have wrappers. Firstly, Zephyr only has two kinds of allocations: aligned and unaligned. Secondly, its allocators take a heap instance as an argument, which applications commonly have no idea about. So, no zeroing (calloc()), no cache-coherency. And if we do have wrappers, let's design them and move towards that design.
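To make the "clear in the caller" option concrete, here is a minimal host-side sketch. It assumes an allocator wrapper that does not zero memory, and both function names (`alloc_raw()` and `alloc_zeroed()`) are hypothetical, not part of SOF or Zephyr. The point is only that zeroing becomes an explicit, opt-in step, so callers that overwrite the whole buffer anyway do not pay for it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical non-clearing wrapper: contents are undefined on return. */
static void *alloc_raw(size_t bytes)
{
	return malloc(bytes);
}

/* Hypothetical helper for callers that rely on zero-initialized state. */
static void *alloc_zeroed(size_t bytes)
{
	void *ptr = alloc_raw(bytes);

	/* Clear in the caller path only, not inside the allocator. */
	if (ptr)
		memset(ptr, 0, bytes);

	return ptr;
}
```

A caller that fills the whole buffer itself would use `alloc_raw()` and skip the `memset()` cost.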
    	/* return if already initialized */
    	if (buffer->addr)
    		return 0;
shouldn't dma_trace_enable() return immediately in this case too?
Tried that, but it failed a lot according to the CI results. This is only to fix the reported fuzzy test failure; maybe it should be split out as a separate PR.
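For reference, the guard under discussion follows the usual "allocate once" pattern. The sketch below is a standalone illustration, not the SOF code: `struct trace_buffer` and `trace_buffer_init()` are hypothetical names, and `calloc()` stands in for the firmware allocator.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the DMA trace buffer descriptor. */
struct trace_buffer {
	void *addr;	/* NULL until the buffer has been allocated */
	size_t size;
};

/* Allocate the buffer once; a repeated call is a no-op instead of a leak. */
static int trace_buffer_init(struct trace_buffer *buffer, size_t size)
{
	/* return if already initialized */
	if (buffer->addr)
		return 0;

	buffer->addr = calloc(1, size);
	if (!buffer->addr)
		return -1;

	buffer->size = size;
	return 0;
}
```

Note that a second call with a different size keeps the original buffer and size, which is the behaviour the commit message describes for a repeated dma_trace_config IPC.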
    	 * future, on top of the memory region now being used for
    	 * different purposes on another core.
    	 */
    	dcache_writeback_invalidate_region(ptr, block_map->block_size * hdr->size);
why writing it back? If memory is freed, its contents aren't needed any more? But anyway I guess this is superseded by #4851
Yes, it was inspired by Andy's #4851, but this is for XTOS. Doing DHWBI (data-cache hit writeback invalidate) looks like a better operation for freeing memory to me: it follows the intent that the last changed content is flushed back to memory.
sorry, not following. "Better" must have an explanation. Why write it back? Nobody needs it.
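The disagreement can be illustrated with a toy model, which is an assumption-laden sketch and not how the Xtensa cache or the SOF allocator is implemented. It models one memory word shadowed by one cache line on the freeing core; all `toy_*` names are hypothetical. In this model a dirty line left in the cache after free can later be evicted over data written by the new owner, while either invalidating or writing back and invalidating at free time avoids that.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model: one memory word and the freeing core's cached copy of it. */
struct toy_line {
	uint32_t mem;		/* backing memory, as seen by another core */
	uint32_t cached;	/* this core's cached value */
	bool valid;
	bool dirty;
};

/* A cached store: memory is not updated until the line is written back. */
static void toy_write(struct toy_line *line, uint32_t value)
{
	line->cached = value;
	line->valid = true;
	line->dirty = true;
}

/* Drop the line without touching memory. */
static void toy_invalidate(struct toy_line *line)
{
	line->valid = false;
	line->dirty = false;
}

/* Flush dirty data to memory, then drop the line. */
static void toy_writeback_invalidate(struct toy_line *line)
{
	if (line->valid && line->dirty)
		line->mem = line->cached;
	toy_invalidate(line);
}

/* An eviction at some arbitrary later time behaves like a writeback. */
static void toy_evict(struct toy_line *line)
{
	toy_writeback_invalidate(line);
}
```

Within this model both cache operations protect the next owner of the block; the writeback only changes what sits in the freed memory before it is reused.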
     	uint8_t *dynamic_vectors;

    -	dynamic_vectors = rzalloc(SOF_MEM_ZONE_RUNTIME_SHARED, 0, 0, SOF_DYNAMIC_VECTORS_SIZE);
    +	dynamic_vectors = rballoc(SOF_MEM_FLAG_COHERENT, 0, SOF_DYNAMIC_VECTORS_SIZE);
would be good to explain why. Just because it's a relatively large allocation of 1K?
     	/* allocate new buffer */
    -	buffer = rzalloc(SOF_MEM_ZONE_RUNTIME_SHARED, 0, SOF_MEM_CAPS_RAM,
    +	buffer = rballoc(SOF_MEM_FLAG_COHERENT, SOF_MEM_CAPS_RAM,
sorry, I'm totally confused now. I thought rballoc() was only for audio buffers. Then some other uses cropped in. Now you're suggesting even more.
It is just a replacement here, as we will remove the RUNTIME_SHARED zone completely in the subsequent commits.
The naming is not so important to me; the allocator doesn't know what the requested buffer will be used for. @lgirdwood asked me to do a "Part 5" to unify the rxxalloc() helpers. At that point we will have only e.g. rmalloc() and rzalloc(), and the allocator internals will decide where (in which zone) to allocate the requested buffer.
@lyakh direction of travel is simplification of allocator to align with Zephyr
lgirdwood
left a comment
Let's align the clear/no-clear behaviour with Zephyr.
                      uint32_t alignment)
     {
    -	return malloc(bytes);
    +	return calloc(bytes, 1);
Can we align this with Zephyr? I.e. if Zephyr clears then we should too; if Zephyr does not clear then we clear in the caller code.
@keyonjie @lyakh let's simplify further in alloc.c/h and in wrapper.c and take us closer to POSIX here (which will also align with Zephyr as updates are merged). Can we consolidate our memory APIs to:

    /* reuse current caps */
    #define SOF_MEM_CAPS_RAM	(1 << 0)
    #define SOF_MEM_CAPS_ROM	(1 << 1)
    #define SOF_MEM_CAPS_EXT	(1 << 2) /**< external */
    #define SOF_MEM_CAPS_LP	(1 << 3) /**< low power */
    #define SOF_MEM_CAPS_HP	(1 << 4) /**< high performance */
    #define SOF_MEM_CAPS_DMA	(1 << 5) /**< DMA'able */
    #define SOF_MEM_CAPS_CACHE	(1 << 6) /**< cacheable */
    #define SOF_MEM_CAPS_EXEC	(1 << 7) /**< executable */

    void *rmalloc(size_t size, uint32_t caps);
    void rfree(void *ptr);
    void *rcalloc(size_t nmemb, size_t size, uint32_t caps);
    void *rrealloc(void *ptr, size_t size, uint32_t caps);

This means we can delete
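As a rough illustration of how thin a host-side (testbench-style) implementation of the four proposed calls could be, here is a sketch that simply forwards to libc. It is an assumption, not the SOF implementation: `caps` is accepted but ignored, which would only be acceptable on a host build where there is a single kind of memory.

```c
#include <stdint.h>
#include <stdlib.h>

/* Two of the capability bits from the proposal, repeated for completeness. */
#define SOF_MEM_CAPS_RAM	(1 << 0)
#define SOF_MEM_CAPS_CACHE	(1 << 6) /* cacheable */

/* Host-side sketch: caps is ignored and libc does the work. */
void *rmalloc(size_t size, uint32_t caps)
{
	(void)caps;
	return malloc(size);
}

/* Zero-initialized allocation, mirroring POSIX calloc(). */
void *rcalloc(size_t nmemb, size_t size, uint32_t caps)
{
	(void)caps;
	return calloc(nmemb, size);
}

/* Resize while preserving contents, mirroring POSIX realloc(). */
void *rrealloc(void *ptr, size_t size, uint32_t caps)
{
	(void)caps;
	return realloc(ptr, size);
}

void rfree(void *ptr)
{
	free(ptr);
}
```

With this shape, the clear/no-clear question reduces to the caller choosing `rcalloc()` or `rmalloc()`, as with POSIX.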
Sure, maybe "rcalloc" -> "rzalloc". By the way, should the allocator always return uncached address as we don't have flag param for those APIs?
Yep, that's fine.
Look here:

    #define SOF_MEM_CAPS_CACHE	(1 << 6) /**< cacheable */

We already have a cached flag, so we should use it. Btw, @kv2019i has now aligned Zephyr to use the same cache/uncache mapping as xtos, so your updates should align on both xtos and zephyr :)
Hi @lgirdwood, let me try to clarify. The CAPS_CACHE flag is currently somewhat confusing: it doesn't mean anything, as it is listed in all zones in memory.c. And its usage from the topology doesn't make any difference wrt this flag: So maybe we can add a new flag, e.g. MEM_CAP_DIRECT_ADDRESS, if we want to use it to denote that a zone is used with uncached addresses only. But for this kind of zone, do we intend to return a cached address/ptr to the user in the allocation APIs?
The point here is that API callers who pass
That's doable. If we are aligned on that, we might need to revisit all API callers, as most of them are not following that at the moment: almost none of the callers uses SOF_MEM_CAPS_CACHE explicitly when asking for a cached address, except for the audio buffers from the topologies:
Great, that's what we need to do. Thanks!
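One way to read the agreement above is that a caller gets the cached alias only when it explicitly passes SOF_MEM_CAPS_CACHE, and the uncached alias otherwise. The sketch below illustrates that reading under loud assumptions: the single alias bit, its position, and the `select_alias()` helper are all hypothetical and do not describe any real platform memory map.

```c
#include <stdint.h>

#define SOF_MEM_CAPS_CACHE	(1 << 6) /* cacheable */

/*
 * Hypothetical: assume the cached and uncached views of the same memory
 * differ in exactly one address bit. Real platforms define their own map.
 */
#define HYPOTHETICAL_CACHED_ALIAS_BIT	(UINT32_C(1) << 29)

/* Pick the address alias that matches the caller's caps. */
static uint32_t select_alias(uint32_t addr, uint32_t caps)
{
	if (caps & SOF_MEM_CAPS_CACHE)
		return addr | HYPOTHETICAL_CACHED_ALIAS_BIT;

	return addr & ~HYPOTHETICAL_CACHED_ALIAS_BIT;
}
```

Under this reading, existing callers that expect a cached pointer without passing the flag would need to be updated, which matches the concern raised in the thread.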
This is only an improvement, not mandatory. I have no more time to work on this, so I am closing it.
Merge zones SYSTEM_SHARED and RUNTIME_SHARED to the BUFFER zone, and change the corresponding callers.