Skip to content

Conversation

@andyross
Copy link
Contributor

In IPC4, the component ID is entirely under the control of the input protocol, and as components of multiple types exist in the list, it's possible for protocol messages to fetch an incorrectly typed object, leading to crashes at runtime.

Force all usage to go through a get_comp() utility that does type checking. There was already a similar utility for the special case of pipelines (which have a separate ID to query).

@andyross
Copy link
Contributor Author

crash case found via fuzzing, IPC4 fuzz updating coming in an hour or two once I'm sure it's shaken out

@andyross
Copy link
Contributor Author

Seems like the build failures are continuing fallout of the cache API mishap fixed in zephyrproject-rtos/zephyr#57298 I think? That's merged now in Zephyr, not sure how it propagates to the SOF CI.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andyross assuming there are 2 different types of components with the same ID, wouldnt this always fail if the first component in the list is not for the specified type even if the other one exists in the list?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would, but by my reading that would be a bug in the design? The list as currently used is already populated with polymorphic components, and already assuming that the first entry with a matching ID is the one desired. Are there circumstances where it would be possible to have duplicate IDs?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With IPC3, the kernel generated unique comp_id's for each component as the list of components is being parsed from topology. And I think IPC4 follows the same rule. So in real devices, this is unlikely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah... OK, so if the component ID is input to the protocol, this actually is something subject to injected bugs that we'll need to address, I think. I'll take a look, could be as simple as traversing the list and erroring out at creation time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, I see it now. You're absolutely right, the initial component ID is set by the kernel, not the firmware. So there needs to be protection in the firmware against duplicates, I'll see what I can work up and attach it to this PR, as it's related. The fix here is protecting us against the results of the duplication (and even if there aren't duplicates, this remains an externally triggerable crash which we have to fix), but there is a rooter-cause that needs to be addressed too.

@marc-hb
Copy link
Collaborator

marc-hb commented Apr 27, 2023

Seems like the build failures are continuing fallout of the cache API mishap fixed in zephyrproject-rtos/zephyr#57298 I think?

Yes https://github.com/thesofproject/sof/actions/runs/4822123309/jobs/8588936882?pr=7528

That's merged now in Zephyr, not sure how it propagates to the SOF CI.

Please review Zephyr uprev in sof/west.yml:

Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While should not happen with the actual kernel driver, agreed this needs to be checked.

@ranj063
Copy link
Collaborator

ranj063 commented Apr 28, 2023

While should not happen with the actual kernel driver, agreed this needs to be checked.

yeah we can deal with the check in a follow up PR.

andyross added 2 commits May 2, 2023 10:43
In IPC4, the component ID is entirely under the control of the input
protocol, and as components of multiple types exist in the list, it's
possible for protocol messages to fetch an incorrectly typed object,
leading to crashes at runtime.

Force all usage to go through a get_comp() utility that does type
checking.  There was already a similar utility for the special case of
pipelines (which have a separate ID to query).

Signed-off-by: Andy Ross <andyross@google.com>
In the SOF scheme, component IDs are allocated by the host driver.  So
they need to be validated before use when modifying the global
component list to prevent duplicate entries from being inserted.

IPC3 was doing this already.  IPC4 had copied the form of the code in
one spot, but had missed that the ID is global across types and was
only checking for duplicate pipelines.

Signed-off-by: Andy Ross <andyross@google.com>
@andyross
Copy link
Contributor Author

andyross commented May 2, 2023

Update as described to add a fix for the creation of duplicate components too. IPC3 didn't need it, it already had checking.

(Also: sorry for the churn -- pushed the whole fuzz branch on top of this one by mistake)

@cujomalainey cujomalainey merged commit 1236f9d into thesofproject:main May 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants