Skip to content

Conversation

@ktrzcinx
Copy link
Member

@ktrzcinx ktrzcinx commented Sep 4, 2020

Such an mechanism may be used to monitor memory utilization and memory leaks.

Possible memory leaks tracking was main motivation for this mechanism. To scan them, memory usage should be scanned before topology creation, then some functional test should be done, topology deletion and iterate whole process a few times - when memory usage grows, then some memory leak is present. To efficiently track memory leaks, memory utilization must be scanned at strictly defined moments in FW life-cycle, it's why scan should be done on host request instead regular memory utilization printing (eq. in trace system).

kernel part: thesofproject/linux#2459

@ktrzcinx ktrzcinx added the ABI ABI change involved label Sep 4, 2020
@ktrzcinx ktrzcinx requested review from mmaka1 and zrombel September 4, 2020 06:51
Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: SW review as per "RFC" step process in https://thesofproject.github.io/latest/contribute/process/abiprocess.html#sof-abi-changes

Looks like a welcome change, but some comments on the naming and general question whether tihs exposes too much specific implementation information.

@ktrzcinx ktrzcinx force-pushed the mem-leak branch 4 times, most recently from 7abc2f0 to 6df7310 Compare September 4, 2020 11:14
@marc-hb
Copy link
Collaborator

marc-hb commented Sep 4, 2020

SOFCI TEST

EDIT:

  • CML devices were broken because of an internal script change merged 11 hours ago. Fixed by @fredoh9 just now.
  • Internal Intel CI System/merge/build is down for the WE for maintenance.

@lgirdwood
Copy link
Member

Jenkins CML CI passing now.

@ktrzcinx ktrzcinx force-pushed the mem-leak branch 2 times, most recently from b53f045 to 655595f Compare September 7, 2020 10:43
@ktrzcinx ktrzcinx changed the title [RFC] Add memory usage scan possibility Add memory usage scan possibility Sep 21, 2020
Such a function may be used to monitor memory leaks and system
utilization.

Signed-off-by: Karol Trzcinski <karolx.trzcinski@linux.intel.com>
Function name should correspond IPC name.

Signed-off-by: Karol Trzcinski <karolx.trzcinski@linux.intel.com>
@lgirdwood lgirdwood added this to the ABI-3.18 milestone Sep 23, 2020
@lgirdwood
Copy link
Member

@ktrzcinx @kv2019i any updates on how kernel will expose this info to userspace ?

@plbossart
Copy link
Member

@ktrzcinx @kv2019i any updates on how kernel will expose this info to userspace ?

thesofproject/linux#2459

Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussion on the kernel PR is continuing, but it's around the userspace<>kernel interface. It would seem we have no open concerns on the IPC interface change itself. Thus with my ABI hat on, +1 for the change.

Copy link
Member

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good at the firmware level, for the kernel I have concerns that we need to prevent pm_runtime from kicking in, otherwise during tests we'll lose all the information. It's more a usability concern than with retrieving information from firmware.

Copy link
Member

@lgirdwood lgirdwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make this a Kconfig menu entry with a flag in the manifest so kernel knows whether feature is built in.

@ktrzcinx
Copy link
Member Author

Added Kconfig option and flag in ext_man_config_data.

@lgirdwood
Copy link
Member

Seeing a build error

 27%] Building ASM object src/arch/xtensa/hal/CMakeFiles/hal.dir/syscache_asm.S.o
/home/test/jenkins/workspace/sof_prs/src/init/ext_manifest.c:95: error: ‘CONFIG_DEBUG_MEMORY_USAGE_SCAN’ undeclared here (not in a function)
src/init/CMakeFiles/ext_manifest.dir/build.make:62: recipe for target 'src/init/CMakeFiles/ext_manifest.dir/ext_manifest.c.o' failed
make[3]: *** [src/init/CMakeFiles/ext_manifest.dir/ext_manifest.c.o] Error 2
CMakeFiles/Makefile2:2229: recipe for target 'src/init/CMakeFiles/ext_manifest.dir/all' failed

@lgirdwood
Copy link
Member

@ktrzcinx can you check UT on CI.

This feature will be needed to monitor memory utilization and
memory leaks. It may be usable also in realese builds, so removed
conditional ipc_glb_test_message compilation.

Signed-off-by: Karol Trzcinski <karolx.trzcinski@linux.intel.com>
This feature is not needed to get functional firmware,
so may be disabled for platforms with low memory space,
like baytrail and cherrytrail.

Signed-off-by: Karol Trzcinski <karolx.trzcinski@linux.intel.com>
This component is conditionally compiled, so passing such an
information to driver, allows to check possible scanning failure
reason.

Signed-off-by: Karol Trzcinski <karolx.trzcinski@linux.intel.com>
@ktrzcinx
Copy link
Member Author

ktrzcinx commented Oct 2, 2020

Reduced number of reserved bytes, as suggested here: thesofproject/linux#2459 (comment)

@lgirdwood
Copy link
Member

CI is known issue boot issues on ICL. Could be a script race where we may need to cleanup SHM or message Qs.

The command "./scripts/docker-run.sh ./scripts/xtensa-build-all.sh -r $PLATFORM" exited with 0.
4.85s$ ./scripts/docker-qemu.sh ../sof.git/scripts/qemu-check.sh $PLATFORM
./xtensa-softmmu/qemu-system-xtensa -cpu cannonlake -M adsp_cnl -nographic -kernel /home/sof/sof.git/build_cnl_gcc/src/arch/xtensa/sof-cnl.ri -rom /home/sof/sof.git/build_cnl_gcc/src/arch/xtensa/rom-cnl.bin
bridge-io: msg send failed -9
qemu-system-xtensa: terminating on signal 15 from pid 14 (timeout)
ipc reg dump:
00000010  00 00 00 f0 00 00 00 00  80 00 00 00 00 00 00 00  |................|
ipc message dump:
00005000  6c 00 00 00 00 00 00 70  00 00 00 00 00 00 00 00  |l......p........|
00005010  00 00 00 00 00 00 00 00  3c 00 00 00 01 00 05 00  |........<.......|
00005020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00005030  00 00 00 00 00 00 00 00  00 00 35 37 34 64 32 00  |..........574d2.|
00005040  00 20 01 03 53 08 2f 28  00 00 00 00 00 00 00 00  |. ..S./(........|
Boot success
./xtensa-softmmu/qemu-system-xtensa -cpu icelake -M adsp_icl -nographic -kernel /home/sof/sof.git/build_icl_gcc/src/arch/xtensa/sof-icl.ri -rom /home/sof/sof.git/build_icl_gcc/src/arch/xtensa/rom-icl.bin
qemu-system-xtensa: terminating on signal 15 from pid 40 (timeout)
Error ipc reg failed
Error boot failed

@lgirdwood lgirdwood merged commit b119fe2 into thesofproject:master Oct 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ABI ABI change involved

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants