alloc: fix the general one-buffer memory allocation case #3646
|
SOFCI TEST |
|
@zrombel what about this one? Also device issues? |
|
@lyakh It looks like a real issue. On all platforms where Keyword Detection tests apply there is a DSP panic. |
|
@lyakh it's worth checking the KWD here, as it may be relying on the current behaviour, even if it's non-optimal. |
lgirdwood
left a comment
@lyakh can you check KWD before and after this PR? It may be obvious from the trace output.
|
This PR is causing failures on test platforms. I've temporarily blacklisted it. Please let me know when the PR is ready for testing and merging. |
|
Can one of the admins verify this patch? |
@lgirdwood I was waiting for the latest CI run to complete, is it in "expected" state because @zrombel blocked it? If so, yes, please unblock, all the other tests look good. |
|
@zrombel any comment here, it appears that CI has been in expected state for 19hrs ? |
|
I've removed this PR from blacklist and scheduled it for build and testing. Results should be available during the day. |
@zrombel thanks. Could you also explain a bit how that blacklisting works? PRs get re-tested by the CI only when they are updated, when a new revision is committed, right? So why is there a need to blacklist PRs? |
|
Yes, CI is triggered only when PRs are updated. This PR was updated a couple of times, and each time it got tested on CI platforms it caused platform failures, so they needed manual restarts, and that caused CI failures of other PRs. I couldn't guess if this PR would be frequently updated or not, so I put it on the blacklist for the sake of other PRs. Hopefully we won't be needing the blacklist again :) |
@zrombel I see, thanks. Maybe it would be possible and make sense to find out why devices were crashing hard and maybe implement automatic recovery / restart? |
|
@zrombel sorry, confused again. Are tests still running or have they completed? They seem to report completion, but the output is very small and there are no failures there although the complete test reports failure. Could it be that you unblocked this PR only partially? |
|
This PR must be cursed, first it has been crashing our platforms and now logs were not uploaded to server ;) I've rerun it and now everything went as it should and you can see logs. There are two FAILS, both caused by DSP Panic in KdDmicD0ix test. |
|
Regarding recovery/restart procedure - CI infrastructure does not support DUT hard reset at this point. The changes needed to achieve this functionality would result in many other issues which we are intentionally trying to avoid. Since issues like the ones caused by this PR happen rather rarely, I would keep CI as it is. |
|
Let's see if we can reduce the curse :) @lyakh can you split this PR into 3 PRs - one for each patch? It should then be simpler to see what's blocking the CI, and we will be able to merge the other 2. One of these patches could be uncovering a bug in the code ..... |
|
@lgirdwood sorry, it is a good method in general, but in this case patches 2 and 3 are functional dummies, they are purely cosmetic / theoretical. |
|
@zrombel You also know contents of individual quickbuild tests, right? How are those keyword detection tests performed? There is also a KWD test in sof-test. I wanted to run it, but it requires a USB audio card. Is this also how respective quickbuild tests work? Can I reproduce somehow or at least get a DSP log? |
|
@lyakh PR was fully tested. The reason why FW trace logs are so short is that the 07_05_TestKdDmicD0ix16000Hz24b32b2ch test puts the DSP in D0ix state, and to achieve that trace logs have to be disabled. And that is what makes KD D0ix issues so hard to debug. But from what I can see the problem here isn't KD itself. The test fails when the stream is created, so there is probably some problem with memory allocation. I can run KD tests without D0ix transition manually on Monday and provide you trace logs. Regarding KD D0ix tests: |
|
@aiChaoSONG are you able to help @lyakh here since you know the test ? @mengdonglin fyi - needed for v1.7 |
To reproduce this issue I'm also trying to run the keyword detection sof-test script, and that doesn't seem to run at all in my setup (see my today's internal mails) |
|
Found and fixed one bug. The result improved, but still not 100%...
I cannot directly relate these failures to the PR, but I cannot exclude causation either, especially in the latter case... More debugging needed, but I don't have access to BSW hardware. |
|
@lyakh looks good on internal CI - can you check the build CI |
@lgirdwood Yes, I don't know what to do with sporadic failures. The previous version was the same only without one debug print. Now the internal CI had no failures. The fw-build failure is ".text segment too large." Presumably, my debug print crossed the border... Waiting for the device-test now. |
|
Build failure on BYT, due to .text section too big. @lyakh @lgirdwood , on-device-test will not run if there is build failure currently. |
@aiChaoSONG building only failed for one platform and this will block testing on all devices?.. |
|
SOFCI TEST |
|
@aiChaoSONG seems the same happened again - one compilation failed and all further testing is blocked |
|
@lyakh I triggered a test on this PR, check our internal report, 1062 1063, except the build failure on byt, no issue found. |
The condition "size + alignment <= block_size" for allocating memory from a single buffer is sufficient but not precise enough. For example if we want to allocate 20 bytes with 64-byte alignment, a 32-byte buffer *might* be sufficient if it's suitably aligned. Fix the algorithm to account for such cases. Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
temp_bytes is only used if CONFIG_DEBUG_BLOCK_FREE is defined. Limit its scope to only such configurations. Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
alloc_block() will call platform_shared_commit() on the map too, no need to do that twice. Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
|
Current CI failures are: (1) device-test on ZGL - last 3 tests timed out, (2) travis - 2 builds failed because of docker rate-limits |
Note this has to wait for #3642 to be merged first to then get rebased on top of it.
The condition "size + alignment <= block_size" for allocating memory from a single buffer is sufficient but not precise enough. For example if we want to allocate 20 bytes with 64-byte alignment, a 32-byte buffer might be sufficient if it's suitably aligned. Fix the algorithm to account for such cases.
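
The difference between the two conditions can be illustrated with a minimal C sketch. This is not the code from the patch: the helper names (`align_padding`, `fits_conservative`, `fits_precise`) and the example addresses are hypothetical, and the sketch assumes the alignment is a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes to skip so that addr becomes a multiple of alignment.
 * Assumption: alignment is a power of two.
 */
static size_t align_padding(uintptr_t addr, size_t alignment)
{
	return (size_t)(-addr & (uintptr_t)(alignment - 1));
}

/* Conservative check: sufficient for any block address, but it can
 * reject blocks that would in fact fit.
 */
static bool fits_conservative(size_t size, size_t alignment,
			      size_t block_size)
{
	return size + alignment <= block_size;
}

/* Address-aware check: only the padding actually needed for this
 * particular block is subtracted from the usable space.
 */
static bool fits_precise(uintptr_t block_addr, size_t size,
			 size_t alignment, size_t block_size)
{
	size_t pad = align_padding(block_addr, alignment);

	return pad <= block_size && size <= block_size - pad;
}

/* The case from the description: 20 bytes, 64-byte alignment,
 * 32-byte block. Addresses are made up for illustration.
 */
static void commit_message_example(void)
{
	/* 20 + 64 > 32, so the conservative check rejects the block */
	assert(!fits_conservative(20, 64, 32));
	/* 0x1040 is already 64-byte aligned, no padding is needed */
	assert(fits_precise(0x1040, 20, 64, 32));
	/* 0x1050 needs 48 bytes of padding, more than the block holds */
	assert(!fits_precise(0x1050, 20, 64, 32));
}
```

With the address-aware check a suitably aligned 32-byte block is accepted for the 20-byte request, while a misaligned block of the same size is still rejected.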