Skip to content

Conversation

@keyonjie
Copy link

@keyonjie keyonjie commented May 9, 2020

This is the series for UUID support -- Kernel part, it should be used together with the tplg part(thesofproject/sof#2919) and Firmware part(thesofproject/sof#2920).

Todo: wait for the readiness of UUID dictionary from extended manifest.

@ranj063
Copy link
Collaborator

ranj063 commented May 10, 2020

@keyonjie should there be an ABI update for this?

@keyonjie
Copy link
Author

@keyonjie should there be an ABI update for this?

Yep, the UUID support should happen after Karol's extended manifest for UUID dictionary, so we might need wait a little bit to know what the exact version it should be.

Copy link
Member

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the cAVS world modules are handled with a UUID. Here the UUID is used to deal with flavors/subtype, that seems to be a very large semantic change? see also comments below

@ranj063
Copy link
Collaborator

ranj063 commented May 12, 2020

@keyonjie should there be an ABI update for this?

Yep, the UUID support should happen after Karol's extended manifest for UUID dictionary, so we might need wait a little bit to know what the exact version it should be.

@keyonjie can u elaborate on what the dictionary is supposed to do and how it would work? It will help folks reviewing this PR understand the need and it's completeness.

@keyonjie
Copy link
Author

@keyonjie should there be an ABI update for this?

Yep, the UUID support should happen after Karol's extended manifest for UUID dictionary, so we might need wait a little bit to know what the exact version it should be.

@keyonjie can u elaborate on what the dictionary is supposed to do and how it would work? It will help folks reviewing this PR understand the need and it's completeness.

Sure the uuid dictionary is actually an array of [16Bytes UUID string] --> [32bit uuid entry FW internal address] mapping, which will be added and packed in the extended manifest by Karol.
See details here:
thesofproject/sof#2914

Karol will add kernel PR to parse these items, before they can be used as the look up table in sof_get_uid_entry() here.

@keyonjie
Copy link
Author

In the cAVS world modules are handled with a UUID. Here the UUID is used to deal with flavors/subtype, that seems to be a very large semantic change? see also comments below

Yes, we are not following cAVS here.

@keyonjie keyonjie force-pushed the uuid branch 2 times, most recently from 0b12b74 to 655b1f0 Compare May 12, 2020 10:57
Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @keyonjie for pushing this forward. I do have reservations about using the dictionary for component creation. Let's exhaust this discussion whether this is really, really mandatory -> @mrajwa

If we end up doing the uuid->key lookup in kernel, we need to use some other name than "uid", this is very confusing. Also this means we will have a hard-dependency to ext-manifest feature.

@keyonjie
Copy link
Author

Thanks @keyonjie for pushing this forward. I do have reservations about using the dictionary for component creation. Let's exhaust this discussion whether this is really, really mandatory -> @mrajwa

If we end up doing the uuid->key lookup in kernel, we need to use some other name than "uid", this is very confusing. Also this means we will have a hard-dependency to ext-manifest feature.

How about rename it to "uuid_entry" on both side? Yes, if we want to transfer the "entry" instead of the UUID 16Bytes itself, we will have to add the dictionary into the extended manifest.

@mmaka1
Copy link

mmaka1 commented May 13, 2020

[32bit uuid entry FW internal address]

That 32bit value should not be perceived as "(internal) address". It is outside any valid address space accessible by the DSP. It is just short 32bit representation used internally by the FW instead of handling full 16bytes values. Note that 32bit value is unique for any given build and may change from build to build (important when analyzing logs, should be resolved via dictionary present in the build). Similar mechanism is used in cAVS where uuid is translated to 16bit module id used in communication with fw.

@keyonjie
Copy link
Author

Thank you for your feedback @paulstelian97 and @mmaka1.

Hi @kv2019i I think this is consistent with what we discussed today.

@keyonjie keyonjie force-pushed the uuid branch 2 times, most recently from 1b2c7fb to 28d5a8f Compare May 29, 2020 08:28
@keyonjie keyonjie changed the title [RFC]UUID Kernel support UUID Kernel support May 29, 2020
@keyonjie
Copy link
Author

I had a look at the logs of a few of the numerous "not tested" failures in https://sof-ci.01.org/linuxpr/PR2090/build3952/devicetest/

In the ones I looked at the device was not responsive and timed out after trying to boot the kernel modified by this PR. So these failures are very likely caused by this PR.

Thanks, which platform does it fail on? Basically it should only impact the topology loading stage, FW loading and booting should not be impacted. sof-logger logs may help to analyze that kind of errors.

@keyonjie
Copy link
Author

one minor open to me is that maybe we should not append UUID for buffer, it is in buffer_new() not comp_new() in FW, though the buffer is using sof_ipc_comp struct also.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is private->size truly no longer relevant?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually can't understand why we put this private->size here previously, @juimonen @singalsu do you recall it?

Copy link
Author

@keyonjie keyonjie Jun 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @paulstelian97 I think the private->size was added by mistake, I just test with my removal, the process components works fine, including the sof-ctl to get/set tlv data for it.

I think the reason why no any issue reported for this is that though we are passing bigger ipc_size, but we only fill data to sof_ipc_comp_process and .data[], and the fw takes only "sizeof(struct sof_ipc_comp_process) + ipc_data_size", so this error here is not fatal and nobody have found it. :)

@paulstelian97
Copy link

Barring those few questions this does look good, so fix the obviously incorrect kzalloc so I can approve.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually can't understand why we put this private->size here previously, @juimonen @singalsu do you recall it?

keyonjie added 9 commits June 17, 2020 22:34
Append the extended data to the end of the struct sof_ipc_comp_mixer,
construct the ipc for COMP_NEW during the topology load stage.

Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
Append the extended data to the end of the struct sof_ipc_comp_volume,
construct the ipc for COMP_NEW during the topology load stage.

Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
Append the extended data to the end of the struct sof_ipc_buffer,
construct the ipc for BUFFER_NEW during the topology load stage.

Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
Append the extended data to the end of the struct sof_ipc_comp_host,
construct the ipc for COMP_NEW during the topology load stage.

Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
Append the extended data to the end of the struct sof_ipc_comp_src,
construct the ipc for COMP_NEW during the topology load stage.

Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
Append the extended data to the end of the struct sof_ipc_comp_asrc,
construct the ipc for COMP_NEW during the topology load stage.

Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
Append the extended data to the end of the struct sof_ipc_comp_tone,
construct the ipc for COMP_NEW during the topology load stage.

Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
Append the extended data to the end of the struct sof_ipc_comp_process,
construct the ipc for COMP_NEW during the topology load stage.

Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
Append the extended data to the end of the struct sof_ipc_comp_mux,
construct the ipc for COMP_NEW during the topology load stage.

Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
@keyonjie
Copy link
Author

Barring those few questions this does look good, so fix the obviously incorrect kzalloc so I can approve.

Thanks @paulstelian97 , updated now, for the last private->size one, I can't recall why it was there but it looks wrong with it, let me try to confirm about that tomorrow.

@paulstelian97
Copy link

Barring those few questions this does look good, so fix the obviously incorrect kzalloc so I can approve.

Thanks @paulstelian97 , updated now, for the last private->size one, I can't recall why it was there but it looks wrong with it, let me try to confirm about that tomorrow.

Alright, I'll just wait for that.

@marc-hb
Copy link
Collaborator

marc-hb commented Jun 17, 2020

github tip: to reduce the spa... number of email notifications by one order of magnitude, can you prefer the "Start Review" green button? It's available below any comment in the "Files Changed" tab.
The "Start Review" button is not available in the "Conversation" tab, not even below the same comments, sorry :-(

@keyonjie
Copy link
Author

github tip: to reduce the spa... number of email notifications by one order of magnitude, can you prefer the "Start Review" green button? It's available below any comment in the "Files Changed" tab.
The "Start Review" button is not available in the "Conversation" tab, not even below the same comments, sorry :-(

Good point, will follow that, thanks @marc-hb

Copy link
Member

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it looks like a duck, behaves as a duck, sounds like duck, it's probably a duck.

s/duck/variable_array/g

This feels like re-inventing the wheel, without benefiting from tools such as struct_size.

Also this doesn't look at all like the previous versions of the PR? I remember seeing changes to merge core and UUID, which are no longer present. If I am right, it'd be better to close the previous PR and restart a new one with better context. Reviewer fatigue is very real I am afraid.

/* reserved for future use */
uint32_t reserved[1];
/* extended data offset, 0 if no extended data */
uint32_t ext_data_offset;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems to me like you've re-invented the concept of variable array?

this is exactly like we've done in other places e.g. with
uint32_t data_size;
uint8_t data[];

In both cases, this is a final change to the structure, you cannot add anything after the variable array, so it'd be wise indeed to keep some reserved space before the variable array.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uint8_t data[] is an incomplete array type.

A variable array is something like int n; uint8_t data[n] instead.

When it's the last member of a structure, an incomplete array type is more specifically called a flexible array member.

Sorry to be pedantic but the right terms are needed when searching documentation and more generally looking for information. For instance information about best initialization practices (see previous sizeof/memset discussion) or that the kernel is now VLA-free
https://www.phoronix.com/scan.php?page=news_item&px=Linux-Kills-The-VLA
https://lwn.net/Articles/749064/

Plus memory (mis)management is where most security issues lie so some pedantism can't hurt there :-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for being pedantic @marc-hb, my point remains that it's probably better to be explicit about data structures than implicit. There are tools that will detect that a flexible array is the last in a structure, if it's not even declared we'd be writing them off.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems to me like you've re-invented the concept of variable array?

this is exactly like we've done in other places e.g. with
uint32_t data_size;
uint8_t data[];

In both cases, this is a final change to the structure, you cannot add anything after the variable array, so it'd be wise indeed to keep some reserved space before the variable array.

No, it is not the flexible array here, it means explicit one uint32_t member. It used to be reserved[2] and is reserved[1] after .core is introduced.

I think the reason that keeping these 'reserved' member is to avoid bumping MAJOR frequently, once they are used up, we have to extend it with a MAJOR bump when any new member being added.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uint8_t data[] is an incomplete array type.

A variable array is something like int n; uint8_t data[n] instead.

When it's the last member of a structure, an incomplete array type is more specifically called a flexible array member.

Sorry to be pedantic but the right terms are needed when searching documentation and more generally looking for information. For instance information about best initialization practices (see previous sizeof/memset discussion) or that the kernel is now VLA-free
https://www.phoronix.com/scan.php?page=news_item&px=Linux-Kills-The-VLA
https://lwn.net/Articles/749064/

Plus memory (mis)management is where most security issues lie so some pedantism can't hurt there :-)

@marc-hb thanks for the details! I somehow completely missed the "get rid of VLAs" movement (shame). I'm wondering though: those articles use the expression "VLAs within structures" - are they referring to structures declared within functions like

void foo(int x)
{
	struct {
		int a[x];
	} b;

? or do they mean what you call incomplete / flexible arrays?

Copy link
Collaborator

@marc-hb marc-hb Jun 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering though: those articles use the expression "VLAs within structures" - are they referring to structures declared within functions like

Yes, the former (gcc abomination).

There are a number of differences between Variable Length Arrays and Flexible Array Members. I believe the simplest to remember and also most relevant is:

  1. The variable memory of all VLAs is automatically allocated an de-allocated at runtime by code automatically added by the compiler.
  2. You must manage the changing memory of Flexible Array Members all by yourself.

I didn't know about gcc VLAs in struct, I read about them today. While plain VLAs were bad, gcc VLAs in structs raise Linus' language to yet another level:
http://thelinuxjedi.blogspot.com/2014/02/why-vlais-is-bad.html
So gcc VLAs in struct are a much worse form of standard VLAs with no relation to standard Flexible Array Members. The terminology is consistent, let's stick to it.

As opposed to all VLAs, Flexible Array Members (= what Pierre actually suggested) are totally welcome in the kernel and a great replacement for 0-sized arrays (bad) or 1-sized arrays (a catastrophe)
https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.8-Flexible-Array-Member

PS: maybe not so pedantic after all? ;-)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible that flexible arrays in IPC can be moved by also having an offset member which says where the array is beginning? This way you don't need to add reserved fields before the array itself, but IPC itself must be aware of this (and made slightly more complex due to it) so it can fill in those intermediate fields with zeros if necessary.

@marc-hb marc-hb self-requested a review June 18, 2020 23:03
Copy link
Collaborator

@marc-hb marc-hb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed there are 15 commits, one per struct. Are these commits really independent from each other? I mean can I checkout any of these commits somewhere in the middle, build it and use it for real in some use case or not totally artificial test case? If not please reduce the number of commits down to that point.

Also this doesn't look at all like the previous versions of the PR? I remember seeing changes to merge core and UUID, which are no longer present

While Github makes this very hard when force-pushing, you often still can look at older versions. Here are some painfully handcrafted URLs

Looking at these two pages side by side, they seem to have very few lines in common.

(Open these links ASAP before github garbage-collects these old commits, you never know)

If I am right, it'd be better to close the previous PR and restart a new one with better context. Reviewer fatigue is very real I am afraid.

Based on the links above I think you are right.

This PR now has 130 comments, that seems totally disproportionate and very tedious to proof-read for a relatively small number of lines.

@keyonjie Please close and re-submit. Make sure you carefuly review all these 130 comments one by one, extract all the ones still relevant and address them before re-submitting to avoid repeating any past discussion and prevent additional "review fatigue".

If it's not obvious how to resolve a significant number of these previous comments, for instance because some unresolved disagreements, then don't resubmit yet but discuss in the github issue and/or meetings first.

@keyonjie
Copy link
Author

I noticed there are 15 commits, one per struct. Are these commits really independent from each other? I mean can I checkout any of these commits somewhere in the middle, build it and use it for real in some use case or not totally artificial test case? If not please reduce the number of commits down to that point.

Yes, we can build and use them one by one, e.g. if we eventually decide the commit for the buffer one (ASoC: SOF: append extended data to sof_ipc_buffer) is actually not needed, we can drop it with other ones remained.

Also this doesn't look at all like the previous versions of the PR? I remember seeing changes to merge core and UUID, which are no longer present

While Github makes this very hard when force-pushing, you often still can look at older versions. Here are some painfully handcrafted URLs

Looking at these two pages side by side, they seem to have very few lines in common.

(Open these links ASAP before github garbage-collects these old commits, you never know)

If I am right, it'd be better to close the previous PR and restart a new one with better context. Reviewer fatigue is very real I am afraid.

Based on the links above I think you are right.

This PR now has 130 comments, that seems totally disproportionate and very tedious to proof-read for a relatively small number of lines.

Yes, let me open a new PR to help the review.

@keyonjie Please close and re-submit. Make sure you carefuly review all these 130 comments one by one, extract all the ones still relevant and address them before re-submitting to avoid repeating any past discussion and prevent additional "review fatigue".

If it's not obvious how to resolve a significant number of these previous comments, for instance because some unresolved disagreements, then don't resubmit yet but discuss in the github issue and/or meetings first.

This is good point, let me address all of them before open the new PR.

@keyonjie
Copy link
Author

The log comments here make the reviewing difficult and tedious, and the latest solution has been changed a lot, closing it and will open a new one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ABI involves ABI changes, need extra attention

Projects

None yet

Development

Successfully merging this pull request may close these issues.