SOF: ABI: fix implicit type overflow #2524

cujomalainey · 2020-10-22T18:15:35Z

Implicit values have a length of 15bits (s16) so we need to declare the
proper size so we we don't get undefined behaviour. Ubsan discovered
this bug in the firmware.

Signed-off-by: Curtis Malainey cujomalainey@chromium.org

include/sound/sof/header.h

lyakh

Sorry, I'm not getting this. Which compiler under which circumstances has 16-bit (short) integers as default? Which calculations fail? I would be surprised if we really had this problem in the kernel, a couple of things would be failing.

cujomalainey · 2020-10-23T07:25:18Z

@lyakh this is clang's default (for at least the fuzzer build). I think the compiler is smart enough to promote the type or it is optimized out when running optimized builds, but the fuzzer is spotting this and is having none of it.

lyakh · 2020-10-26T09:45:15Z

@lyakh this is clang's default (for at least the fuzzer build). I think the compiler is smart enough to promote the type or it is optimized out when running optimized builds, but the fuzzer is spotting this and is having none of it.

@cujomalainey I don't want to block this if everybody else finds this correct, but I'm still puzzled, sorry. First of all I think it would be good to provide the details, that you mentioned in your reply to me in either the PR description or the commit message or both. And even actually more details than that. What is a fuzzer build? Can this be a bug in the fuzzer? Could you describe what exactly is a possible scenario when that can be a bug? Do you mean, that when the fuzzer is trying to created fuzzed data for IPC testing it's creating 16-bit fields from that or something similar? Besides, long is 64 bits in my kernel builds for x86_64 (I think it was 32 bits before, but probably my memory is failing me on this). So, setting this to long seems making it even more wrong. It shouldn't hurt once this gets assigned to a 32-bit variable, but it really doesn't seem like a correct fix to me. And I'd also assume the kernel has MANY similar cases, don't think all of them should be fixed in the same way?

kv2019i

I'm ok with the change. @cujomalainey updating the version on the kernel side is a bit iffy and we have no precedent how to do it for PATCH level (in kernel side), so I'd just leave that out from this one as there are many PRs in-flight which will update the MINOR revision in kernel and will conflict with this one.

include/uapi/sound/sof/abi.h

kv2019i · 2020-10-26T16:41:36Z

include/sound/sof/header.h

 /* Global Message - Generic */
 #define SOF_GLB_TYPE_SHIFT			28
-#define SOF_GLB_TYPE_MASK			(0xf << SOF_GLB_TYPE_SHIFT)
+#define SOF_GLB_TYPE_MASK			(0xfL << SOF_GLB_TYPE_SHIFT)


While it's a bit uncommon on kernel side (for long types) -- there are a few instances of this (just in include/linux/sched.h for SCHED_FIXEDPOINT_SCALE). And if this allows to minimize delta in the header versus FW file (thesofproject/sof#3543 which is already merged), I think this is worth it.

cujomalainey · 2020-10-26T18:23:54Z

@lyakh this is clang's default (for at least the fuzzer build). I think the compiler is smart enough to promote the type or it is optimized out when running optimized builds, but the fuzzer is spotting this and is having none of it.

@cujomalainey I don't want to block this if everybody else finds this correct, but I'm still puzzled, sorry. First of all I think it would be good to provide the details, that you mentioned in your reply to me in either the PR description or the commit message or both.

Fair enough, I included the following output in the firmware patch /src/sof/src/ipc/handler.c:1385:9: runtime error: left shift of 15 by 28 places cannot be represented in type 'int'

And even actually more details than that. What is a fuzzer build?

One of the topics that has been discussed at the steering committee since early this year is fuzzing (the idea of using special software to feed bad inputs to try and crash your program and find bad edge cases.) @ranj063 and I hosted a GSOC intern to integrate some fuzzing for us using the testbench. Later this year I got us accepted into oss-fuzz (a continuous fuzzing platform by Google.) Right now I am in the middle of integrating the firmware to the stub for oss-fuzz. The fuzzer feeds bad ipcs into the firmware in an attempt to find bad inputs that cause the firmware to crash or handle events badly. The fuzzer is build with various sanitizers (address, undefined, etc) to help detect various incorrect behaviours. This case was discovered by the undefined behaviour sanitizer.

Can this be a bug in the fuzzer?

Unlikely, the sanitizer found this bug and the call stack is in the firmware and the error has nothing to do with the fuzzer stub.

Could you describe what exactly is a possible scenario when that can be a bug?

As mentioned this is likely optimized by the preprocessor in a regular build to a pre-calculated value therefore it works, but it is semantically incorrect. That being said, given that the firmware is 32bit this may be restricted to the platform and the change strictly on the firmware side will bring it into alignment.

Do you mean, that when the fuzzer is trying to created fuzzed data for IPC testing it's creating 16-bit fields from that or something similar?

No this has nothing to do with the fuzzed data since the error is originating from macros/defines that are all static the to the ABI.

Besides, long is 64 bits in my kernel builds for x86_64 (I think it was 32 bits before, but probably my memory is failing me on this). So, setting this to long seems making it even more wrong.

See comment above about this might only be needed on firmware due to 32bit platform

It shouldn't hurt once this gets assigned to a 32-bit variable, but it really doesn't seem like a correct fix to me. And I'd also assume the kernel has MANY similar cases, don't think all of them should be fixed in the same way?

The issue is due to the brackets of the macros.

The failing line in the ipc handler
type = iGS(hdr->cmd);

The following defines are relevant

#define iGS(x) ((x) & SOF_GLB_TYPE_MASK)
#define SOF_GLB_TYPE_SHIFT                      28
#define SOF_GLB_TYPE_MASK                       (0xf << SOF_GLB_TYPE_SHIFT)

So if we expand this it becomes
type = ((hdr->cmd) & (0xf << 28))

The issue comes from calculating (0xf << 28) since it has no type it defaults to int16 (on the firmware at least) which results in the undefined behaviour sanitizer failing. As mentioned, given that the kernel is not on the same arch this change may not be relevant.

lyakh · 2020-10-27T11:02:02Z

@lyakh this is clang's default (for at least the fuzzer build). I think the compiler is smart enough to promote the type or it is optimized out when running optimized builds, but the fuzzer is spotting this and is having none of it.

@cujomalainey I don't want to block this if everybody else finds this correct, but I'm still puzzled, sorry. First of all I think it would be good to provide the details, that you mentioned in your reply to me in either the PR description or the commit message or both.

Fair enough, I included the following output in the firmware patch /src/sof/src/ipc/handler.c:1385:9: runtime error: left shift of 15 by 28 places cannot be represented in type 'int'

@cujomalainey ok, the thing is, that indeed in C the minimum size of the type int is 16 bits. So, from that PoV it is possible, that some C compilers on some platforms will generate a 16-bit 0 constant from that. But (1) gcc on all platforms has sizeof(int) == 4. (2) quoting Al Viro in https://lore.kernel.org/lkml/20050403230551.GX8859@parcelfarce.linux.theplanet.co.uk/

pretty much everything in the kernel assumes that 4 = sizeof(int) <= sizeof(long)...

So, I'd say that we're safe there. I think the amount of code in the kernel, that takes for granted that sizeof(int) == 4 is so huge, that we don't need to worry. And making it long really doesn't look right.

kv2019i · 2020-10-27T15:23:21Z

@lyakh wrote:

So, I'd say that we're safe there. I think the amount of code in the kernel, that takes for granted that sizeof(int) == 4 is so huge, that we don't need to worry. And making it long really doesn't look right.

I think we agree that kernel side doesn't need this.

But as we just discussed on #2459 (comment) , our current policy is to keep the FW/kernel headers as close as possible, and any deviation needs to have solid grounds:

In Add memory usage scan possibility #2459 we decided to allow an exception as XCC is producing incorrect code (flexible arrays)
Earlier we've chosen to use different notation for packed structs, to silence kernel-side checkpatch error.

In this case, I'm not sure there is enough merit to deviate in the header definitions. Adding the "L" suffix won't really create new problems on kernel side -- it's just not strictly needed. On FW side, this does silence an error that is currently preventing integration.

So I'd go with same definition in this case and use same code on both sides (and merge this PR).

kv2019i

As per rationale given in my last comment, approving this. @lyakh raises a valid argument, and may be something we need to revisit in our ABI process, but for this specific change, my judgement goes towards keeping the definitions the same.

lyakh · 2020-10-28T07:39:42Z

@kv2019i the "firmware header uniformity" is the only valid reason to apply this change, yes, although that uniformity is broken in multiple other more essential instances (flexible arrays). And we have to make sure nothing comes up with the idea to take sizeof() of this macros, because that would be wrong now, I think. Maybe you could do something like ((u32)0xf << 28). Another possibility could be (0x8000000f << 28) which probably would tell any checkers / compilers, that that's a 32-bit value, but none of those options is particularly pretty.
Also I really don't think the commit message "Implicit values have a length of 15bits (s16)" is correct

kv2019i

Oops, few more noted in commit message.

include/sound/sof/header.h

kv2019i · 2020-10-28T14:44:24Z

@plbossart @lgirdwood @ranj063 @bardliao ping, ok for you? this is one PR we could push in before the 5.10-rc1 rebase.

bardliao

Agree that FW/kernel headers should be as close as possible.

ranj063 · 2020-10-28T18:07:09Z

@plbossart @lgirdwood @ranj063 @bardliao ping, ok for you? this is one PR we could push in before the 5.10-rc1 rebase.

@cujomalainey @kv2019i there's a change id in the commit message

cujomalainey · 2020-10-28T18:45:50Z

@plbossart @lgirdwood @ranj063 @bardliao ping, ok for you? this is one PR we could push in before the 5.10-rc1 rebase.

@cujomalainey @kv2019i there's a change id in the commit message

gah, chromium commit hooks got me, thanks, fixed now

Implicit may values have a length of 15bits (s16) so we need to declare the proper size so we don't get undefined behaviour. The appears to be arch and compiler dependent. This commit is to keep the headers aligned between the firmware and kernel. UBSan discovered this bug in the firmware. Signed-off-by: Curtis Malainey <cujomalainey@chromium.org>

kv2019i

Thanks. There are two more typos ("implicit may value" and "The appears") but I can squish those manually.

lyakh

yes, please, let's fix commit message typos

kv2019i · 2020-10-29T09:38:02Z

Merged manually (with typos fixed). Thanks @lyakh for quick ack.

cujomalainey requested review from dbaluta, juimonen, keyonjie, lyakh and ranj063 as code owners October 22, 2020 18:15

cujomalainey mentioned this pull request Oct 22, 2020

abi: fix macro size overflow thesofproject/sof#3543

Merged

ranj063 reviewed Oct 22, 2020

View reviewed changes

include/sound/sof/header.h Show resolved Hide resolved

cujomalainey force-pushed the fuzz-v1 branch from bc918b7 to 5187524 Compare October 22, 2020 18:37

cujomalainey requested a review from ranj063 October 22, 2020 18:37

cujomalainey added the ABI involves ABI changes, need extra attention label Oct 22, 2020

lyakh requested changes Oct 23, 2020

View reviewed changes

kv2019i requested review from bardliao and plbossart October 26, 2020 14:12

kv2019i requested changes Oct 26, 2020

View reviewed changes

cujomalainey force-pushed the fuzz-v1 branch from 5187524 to f964618 Compare October 26, 2020 19:02

kv2019i previously approved these changes Oct 27, 2020

View reviewed changes

lyakh previously approved these changes Oct 28, 2020

View reviewed changes

kv2019i reviewed Oct 28, 2020

View reviewed changes

include/sound/sof/header.h Show resolved Hide resolved

bardliao previously approved these changes Oct 28, 2020

View reviewed changes

cujomalainey dismissed stale reviews from bardliao, lyakh, and kv2019i via e0ccc4b October 28, 2020 18:45

cujomalainey force-pushed the fuzz-v1 branch from f964618 to e0ccc4b Compare October 28, 2020 18:45

cujomalainey force-pushed the fuzz-v1 branch from e0ccc4b to a9d64bf Compare October 28, 2020 19:15

cujomalainey requested review from bardliao, kv2019i and lyakh October 28, 2020 19:21

kv2019i approved these changes Oct 29, 2020

View reviewed changes

lyakh approved these changes Oct 29, 2020

View reviewed changes

kv2019i closed this Oct 29, 2020

cujomalainey deleted the fuzz-v1 branch November 28, 2022 23:28

SOF: ABI: fix implicit type overflow #2524

SOF: ABI: fix implicit type overflow #2524

Uh oh!

Conversation

cujomalainey commented Oct 22, 2020

Uh oh!

Uh oh!

lyakh left a comment

Choose a reason for hiding this comment

Uh oh!

cujomalainey commented Oct 23, 2020

Uh oh!

lyakh commented Oct 26, 2020

Uh oh!

kv2019i left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

kv2019i Oct 26, 2020

Choose a reason for hiding this comment

Uh oh!

cujomalainey commented Oct 26, 2020

Uh oh!

lyakh commented Oct 27, 2020

Uh oh!

kv2019i commented Oct 27, 2020

Uh oh!

kv2019i left a comment

Choose a reason for hiding this comment

Uh oh!

lyakh commented Oct 28, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kv2019i left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

kv2019i commented Oct 28, 2020

Uh oh!

bardliao left a comment

Choose a reason for hiding this comment

Uh oh!

ranj063 commented Oct 28, 2020

Uh oh!

cujomalainey commented Oct 28, 2020

Uh oh!

kv2019i left a comment

Choose a reason for hiding this comment

Uh oh!

lyakh left a comment

Choose a reason for hiding this comment

Uh oh!

kv2019i commented Oct 29, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

lyakh commented Oct 28, 2020 •

edited

Loading