Skip to content

Conversation

@ujfalusi
Copy link
Collaborator

@ujfalusi ujfalusi commented Apr 9, 2021

Hi,

This is the first part of the reworked pull last sent by @fredoh9.
I have decided to split up the patches into at least two batch to make it easier to review and then fix up based on the comments.

The first part adds the basic support for the auxbus and introduces the term 'sof client'
The Kconfig has been changed so the the client support is selected by the user to keep things easier for the users.
It also includes the conversion of the first client: IPC flood test.
This is a generic test client so instead of doing the registration per architecture as it was in the first series we do it in the generic code. This will allow ARM to keep using it (it was available or them as well).
The Kconfig remained the same, but we now have option to have more than one IPC flood running at the same time. I have also made sure that we don't break testing scripts via debugfs.

Looking forward: the probe test will be converted along the same line as the IPC flood test as that one is also a generic test and it was (well, it is) available on ARM also.
On the Intel side we might drop most of the code for now until we can introduce the clients (HDMI, etc).
I believe the gdb and firmware debugging support might be also platform independent, but I need to check that and plan accordingly.

After addressing the surely coming comments I will include the probe test client into v2.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe create a function snd_sof_register_flood_clients and make it empty if SOF_DEBUG_IPC_FLOOD_TEST is not enabled. Otherwise, snd_sof_register_clients will be very hard to read.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dbaluta , yes that's an option. I'm not sure how it will look like if we pile up more platform independent clients like this.
The error handling might be much simpler with a separate function for each selectable client.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dbaluta, you are right, it is going to make it much cleaner, I'll update for v2.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably also worth a comment stating the future flow i.e. that capabilities will be read from FW manifest and devices enumerated as found in the manifest.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add the comment when we have the code which does that

Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The one checkpatch CHECK issue plus I noticed one weird code comment (probably has been there for long), but otherwise, this looks good to go!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ranj063 @fredoh9 @ujfalusi Not sure I noticed this in original review, but what does this comment mean? What is a "certain client" ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need for this comment, imho. I'll remove it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a truncated version of an original comment, should be fixed, yes

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont remember adding this either :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have just removed the old comment instead, the original comment was:
/* If registering certain clients fails, unregister the previously registered clients. */
and it is no longer valid.

Copy link
Collaborator

@lyakh lyakh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some minor stuff, looks rather good to me otherwise

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a truncated version of an original comment, should be fixed, yes

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably also worth a comment stating the future flow i.e. that capabilities will be read from FW manifest and devices enumerated as found in the manifest.

ranj063 added 3 commits April 13, 2021 08:33
A client in the SOF (Sound Open Firmware) context is a driver that needs
to communicate with the DSP via IPC messages. The SOF core is responsible
for serializing the IPC messages to the DSP from the different clients.

One example of an SOF client would be an IPC test client that floods the
DSP with test IPC messages to validate if the serialization works as
expected.

Multi-client support will also add the ability to split the existing audio
cards into multiple ones, so as to e.g. to deal with HDMI with a dedicated
client instead of adding HDMI to all cards.

This patch introduces descriptors for SOF client driver and SOF client
device along with APIs for registering and unregistering a SOF client
driver, sending IPCs from a client device and accessing the SOF core
debugfs root entry.

Along with this, add a couple of new members to struct snd_sof_dev that
will be used for maintaining the list of clients.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Co-developed-by: Fred Oh <fred.oh@linux.intel.com>
Signed-off-by: Fred Oh <fred.oh@linux.intel.com>
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Add new ops for registering/unregistering clients based
on DSP capabilities and/or DT information.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Move the IPC flood test code out from the debug file as separate SOF client
driver.

Based on the kernel configuration, the device registration for the new IPC
flood test is going to happen in the core.
With the separate client driver it is going to be possible to run multiple
flood tests in parallel to increase the stress, the new Kconfig option can
be used to select this (defaults to 1).
In order to preserve backward compatibility with existing SW/scripts, the
first IPC flood test's debugfs files have been linked to the old files.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Co-developed-by: Fred Oh <fred.oh@linux.intel.com>
Signed-off-by: Fred Oh <fred.oh@linux.intel.com>
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
@ujfalusi ujfalusi force-pushed the peter/sof/auxiliary_bus_part1 branch from 092ed2f to d5e064b Compare April 13, 2021 08:04
Copy link
Member

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bear with me, some comments are based on my flaky memory of code from last Summer, I've forgotten quite a bit of context there.

If unsure, select "N".

config SND_SOC_SOF_CLIENT
bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason not to make this a tristate? This will force the auxiliary bus to be built-in, it's not really necessary, is it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use it as a flag to include the sof-client support or not.

ifneq ($(CONFIG_SND_SOC_SOF_CLIENT),)
snd-sof-objs += sof-client.o
endif

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to do this, or can we have a cleaner cut and have sof-client be a separate module?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with a separate .ko is that we will have circular dependency due to the fact that the generic, no platform clients are registered from the core.
snd-sof-client.ko uses symbols from snd-sof.ko and snd-sof.ko uses symbols from snd-sof-client.ko
This was not an issue with the original series as there the generic, platform independent sof clients were registered from platform code (intel code and intel code only).
ipc flood and probes are generic clients and can be used by other architectures.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's not completely true. I believe the probes are Intel HDaudio only, or at least they cannot be enabled on legacy platforms so don't belong in the core.

As for the circular dependency for ipc, I am usure why 'snd-sof.ko uses symbols from snd-sof-client.ko'. that seems wrong.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, we only registered the probe for HDA, true, but afaik the probe code on the firmware is not tied to HDA or is it?

Well, the sof-core is the one which manages and branches out the auxbus for the SOF clients.
One option is to relocate the auxbus device registration calls to the file where most of the IPC core code is, I think that might solve the circular dependency issue and it might be even better place.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One option is to relocate the auxbus device registration calls to the file where most of the IPC core code is, I think that might solve the circular dependency issue and it might be even better place.

good suggestion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, we only registered the probe for HDA, true, but afaik the probe code on the firmware is not tied to HDA or is it?

I think the probes only exist on platforms that have HDA. The DMA also assumes a circular list (BDL) that doesn't exist with the GP-DMA on legacy platforms.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, in the draft pull I create the aux device for probe only in case of HDA.
I hope.

The code for the probe was in the generic, core part of SOF, that's why I have kept it there and the original series did the same as well.

/*
* Copyright(c) 2021 Intel Corporation. All rights reserved.
* Author: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mark Brown says "The whole block in C++ style please"

we'll have to adjust older files but let's follow his convention.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, If Mark said, then let's do that. Other subsystems require this style....

return cdev;
}

int sof_client_dev_register(struct snd_sof_dev *sdev, const char *name, u32 id)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must admit I don't recall what the id is needed for.
A comment wouldn't hurt or better some kernel-doc to make sure there's no ambiguity on the parameters.

Edit: now I think I remember, each id is handled by the parent but is only valid within the scope of the 'name', e.g the name can be ipc-test, and there can be multiple instances identified by an id. the uniqueness of the id is handled by the parent. Right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the ID is to have unique dev name for the devices. The name is auxbus_name + device_name + id.

struct auxiliary_device auxdev;
struct snd_sof_dev *sdev;
struct list_head list; /* item in SOF core client dev list */
void *data;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cdata?

There should be some comment here, or kernel-doc to explain what this is used for.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or priv. I think most of the struct lacks the proper documentation.


/* client-specific ops, all optional */
struct sof_client_ops {
int (*client_ipc_rx)(struct sof_client_dev *cdev, u32 msg_cmd);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doesn't really look optional if the tx ipc function is used?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the TX is sync and the client_ipc_rx is not used, if I'm not mistaken.
I'll check and fix the comment if needed.

struct list_head client_list;

/* mutex to protect client list */
struct mutex client_mutex;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am start to wonder if we should start using stronger naming conventions with 'ipc_client' as a prefix, e.g. sof_ipc_client, ipc_client_mutex, etc.

Can the client do anything other than IPCs?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It must not do anything else than IPC as we say the a client in SOF is a functionality on the DSP side or a feature over the IPC.
I agree that we should aim for easily recognizable naming convention. This is why I renamed the ipc test to sof-client-ipc-flood.

config SND_SOC_SOF_DEBUG_IPC_FLOOD_TEST_NUM
int "Number of IPC flood test clients"
range 1 32
default 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't recall why we said 1 is the default. If we want to test IPC concurrency 2 seems like a natural number, no?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to introduce any functionality change. The original code provided one IPC flood test, we do the same by default but later we can change the default.

if (sof_ops(sdev) && sof_ops(sdev)->unregister_clients)
sof_ops(sdev)->unregister_clients(sdev);

snd_sof_unregister_ipcflood_test(sdev);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this all in core.c? This doesn't seem right to me.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a generic client and does not belong to platform code, it has to be in the core.
We had it in the debug.c

We have introduced a callback for client registration and I think it is better to keep all client device creation, destruction in one place to minimize the chances of random ordering issues.

We could move the generic debug feature registrations to debug.c, if that makes more sense?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...or to sof-client.c, maybe even move snd_sof_register_clients() and snd_sof_unregister_clients() there. Maybe even move all generic IPC-flood (de)registration to sof-client-ipc-test.c

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, let me see how that will look like, thanks.

@ujfalusi
Copy link
Collaborator Author

A new take on the auxiliary bus support (SOF client drivers) will be pushed soon, closing this one, thank you for the feedbacks, most if not all will be addressed by the take2

@ujfalusi ujfalusi closed this Jun 23, 2021
ranj063 pushed a commit to ranj063/linux that referenced this pull request Jun 24, 2021
The device can be requested to be attached despite being not probed.
This situation is possible if devlink reload races with module removal,
and the following kernel panic is an outcome of such race.

 mlx5_core 0000:00:09.0: firmware version: 4.7.9999
 mlx5_core 0000:00:09.0: 0.000 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x255 link)
 BUG: unable to handle page fault for address: fffffffffffffff0
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 3218067 P4D 3218067 PUD 321a067 PMD 0
 Oops: 0000 [#1] SMP KASAN NOPTI
 CPU: 7 PID: 250 Comm: devlink Not tainted 5.12.0-rc2+ thesofproject#2836
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 RIP: 0010:mlx5_attach_device+0x80/0x280 [mlx5_core]
 Code: f8 48 c1 e8 03 42 80 3c 38 00 0f 85 80 01 00 00 48 8b 45 68 48 8d 78 f0 48 89 fe 48 c1 ee 03 42 80 3c 3e 00 0f 85 70 01 00 00 <48> 8b 40 f0 48 85 c0 74 0d 48 89 ef ff d0 85 c0 0f 85 84 05 0e 00
 RSP: 0018:ffff8880129675f0 EFLAGS: 00010246
 RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffffff827407f1
 RDX: 1ffff110011336cf RSI: 1ffffffffffffffe RDI: fffffffffffffff0
 RBP: ffff888008e0c000 R08: 0000000000000008 R09: ffffffffa0662ee7
 R10: fffffbfff40cc5dc R11: 0000000000000000 R12: ffff88800ea002e0
 R13: ffffed1001d459f7 R14: ffffffffa05ef4f8 R15: dffffc0000000000
 FS:  00007f51dfeaf740(0000) GS:ffff88806d5c0000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: fffffffffffffff0 CR3: 000000000bc82006 CR4: 0000000000370ea0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  mlx5_load_one+0x117/0x1d0 [mlx5_core]
  devlink_reload+0x2d5/0x520
  ? devlink_remote_reload_actions_performed+0x30/0x30
  ? mutex_trylock+0x24b/0x2d0
  ? devlink_nl_cmd_reload+0x62b/0x1070
  devlink_nl_cmd_reload+0x66d/0x1070
  ? devlink_reload+0x520/0x520
  ? devlink_nl_pre_doit+0x64/0x4d0
  genl_family_rcv_msg_doit+0x1e9/0x2f0
  ? mutex_lock_io_nested+0x1130/0x1130
  ? genl_family_rcv_msg_attrs_parse.constprop.0+0x240/0x240
  ? security_capable+0x51/0x90
  genl_rcv_msg+0x27f/0x4a0
  ? genl_get_cmd+0x3c0/0x3c0
  ? lock_acquire+0x1a9/0x6d0
  ? devlink_reload+0x520/0x520
  ? lock_release+0x6c0/0x6c0
  netlink_rcv_skb+0x11d/0x340
  ? genl_get_cmd+0x3c0/0x3c0
  ? netlink_ack+0x9f0/0x9f0
  ? lock_release+0x1f9/0x6c0
  genl_rcv+0x24/0x40
  netlink_unicast+0x433/0x700
  ? netlink_attachskb+0x730/0x730
  ? _copy_from_iter_full+0x178/0x650
  ? __alloc_skb+0x113/0x2b0
  netlink_sendmsg+0x6f1/0xbd0
  ? netlink_unicast+0x700/0x700
  ? netlink_unicast+0x700/0x700
  sock_sendmsg+0xb0/0xe0
  __sys_sendto+0x193/0x240
  ? __x64_sys_getpeername+0xb0/0xb0
  ? copy_page_range+0x2300/0x2300
  ? __up_read+0x1a1/0x7b0
  ? do_user_addr_fault+0x219/0xdc0
  __x64_sys_sendto+0xdd/0x1b0
  ? syscall_enter_from_user_mode+0x1d/0x50
  do_syscall_64+0x2d/0x40
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7f51dffb514a
 Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 76 c3 0f 1f 44 00 00 55 48 83 ec 30 44 89 4c
 RSP: 002b:00007ffcaef22e78 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f51dffb514a
 RDX: 0000000000000030 RSI: 000055750daf2440 RDI: 0000000000000003
 RBP: 000055750daf2410 R08: 00007f51e0081200 R09: 000000000000000c
 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 Modules linked in: mlx5_core(-) ptp pps_core ib_ipoib rdma_ucm rdma_cm iw_cm ib_cm ib_umad ib_uverbs ib_core [last unloaded: mlx5_ib]
 CR2: fffffffffffffff0
 ---[ end trace 7789831bfe74fa42 ]---

Fixes: a925b5e ("net/mlx5: Register mlx5 devices to auxiliary virtual bus")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants