-
Notifications
You must be signed in to change notification settings - Fork 349
memory profiling: heap memory profiling at runtime #6525
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
952504e to
0ac416f
Compare
|
Btw, one more existing mechanism (that is arguably not so widely used) is the IPC3 src/ipc/ipc3/handler.c:ipc_glb_test_mem_usage() . This relies on OS implementing (heap_info()), which we don't do for Zephyr SOF builds. The nice thing with this was that this can be queries from kernel with an IPC message, so no need to parse the data from FW logs, the info is available even if FW logging is not, and so forth. I think this goes beyond this PR though, as this approach will be IPC dependent, but in the long term, we do want to expose Zephyr memory runtime data in this model. FYI to @lgirdwood @mwasko and @marcinszkudlinski |
app/perf_overlay.conf
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be a debug overlay @kv2019i ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense if put is as default debug overlay, since its functionality is for stack overflow detection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry not following how the heap state helps with stack overflow detection, is this a different Zephyr option ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
they are separate functionality, no relation between heap and stack overflow detection.
CONFIG_SYS_HEAP_RUNTIME_STATS is used to collect heap allocated/free space, total is around 1.4MB.
CONFIG_STACK_SENTINEL this is used to enable stack overflow detection by check a magic number in the stack bottom, after enable this, once stack overflow happened, Zephyr logging will print "Stack overflow" with other information, so move it to debug build config is also good, please let me know, if you want to move it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@btian1 see above, we need these 2 Kconfigs to allow the developers to tune the output depending on what they are debugging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lgirdwood , I did not find in current SOF code base, add it from new and then add code to make these config work?
I also want to add this at the beginning, however, due to wrapped too much layer, I did not try with this.
It may need introduce new parameters for each function and code change may be big, any better ideas?
rmalloc/rzalloc...., need add pipeline and component info for each memory alloc and free.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
config SOF_HEAP_RUNTIME_ALLOC
@lgirdwood Maybe add "DEBUG" or "PRINT" somewhere? Otherwise this option looks to me like it enables heap allocations and without it rmalloc() won't work. Same for SOF_HEAP_RUNTIME_PIPELINE perhaps
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also suspect that those options would have quite a large run-time / logging impact, so I'd really rather have them off by default and only enable them for debugging. The former would supposedly "only" print one line for each allocations, but we allocate quite a lot. We have many instances where we initialise a single functionality but use several allocations for it. Hopefully we don't allocate any memory for each .copy() / .process() iteration, but even when one pipeline is running and another one processes IPCs which will print excessive additional information for memory allocations - that can affect performance of the running pipeline.
Same holds for heap state dumps for each pipeline creation and destruction. Especially since we now have multiple pipelines for most or all streams. I know we have those dumps with XTOS and I find them quite intrusive and excessively polluting logs too (see #4744 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@RanderWang , Rander today introduced that close fw alloc memory based on pipeline, this is much more easy to do data collection and analysis.
For SOF, allocation based on rmalloc, any code call this module can get a buffer with specific size, the difficult part is we have to add all called place with pipe and component information.
Ack, I think we do need to extend IPC for development. i.e. debug, memory, status info, thread info, and a Zephyr shell. This could all come under a debug Kconfig menu. |
lgirdwood
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kv2019i any comments ?
app/perf_overlay.conf
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry not following how the heap state helps with stack overflow detection, is this a different Zephyr option ?
app/perf_overlay.conf
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please enable this config option in a separate patch as this is totally different logical change.
Also, please imperative mood in the commit message.
e.g: memory: Add support for heap memory profiling
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is enabled in another patch:#6530, so I will drop here, and when profiling build, just need add -d option is fine
7f5665e to
8ce06dc
Compare
Yes I am fully agree, IPC implementation will be more convenient, no need print is great, I also feel difficult for this adding, without adding, we don't know the heap, if added, print a lot of fw logs, take a 48k as example, it finally allocated 0x9698 37k bytes. Please suggest next step, if we want to add ipc in near future, I am ok with that solution. Thanks |
|
@kv2019i , your test remind me whether we need add one more functionality with this PR. Let me know what do you think? Thanks |
zephyr/wrapper.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for "normal" byte-counts decimal sizes are easier to read, i.e. I'd use %u. Or at least put 0x in front or use %#x
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change to which format both ok, some developer have a brain can quick know real size with hex.
Ok, I can also change it to decimal, it is more straight forward to most people
Heap memory profiling is based on runtime zephyr API. It will print out each memory allocation with allocated bytes and free bytes, display as below: zephyr: heap allocatd: 22c0 free: 152a90 max allocated: 1000 Signed-off-by: Baofeng Tian <baofeng.tian@intel.com>
Heap memory profiling is based on runtime zephyr API.
It will print out each memory allocation with allocated bytes
and free bytes, display as below:
zephyr: heap allocatd: 22c0 free: 152a90
Signed-off-by: Baofeng Tian baofeng.tian@intel.com