-
Notifications
You must be signed in to change notification settings - Fork 140
DSP Ops tester - Take 3 #4874
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DSP Ops tester - Take 3 #4874
Conversation
61b5f1f to
e15cf2c
Compare
plbossart
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not on-board with the directions, the simple "echo 1 > boot" or echo 3 > dsp_state is too simple for scripting. These transitions take time, so there's no way for a script to know how long it needs to wait and if the transition was successful.
I would recommend writing a 2-page summary of what we want to achieve, requirements included, otherwise we'll have lots of code review without first agreeing on what we want to achieve.
sound/soc/sof/core.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would put this DSP_OPS_TEST as a new bitfield in sof_debug
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Document the DSPless VS ops_test mode and print if the DSPless mode is ignored due to the ops_test taking precedence.
Or should DSPless mode be the one which should have the higher priority?
sound/soc/sof/debug-dsp-ops.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this doesn't seem related to the boot but more a shutdown...
sound/soc/sof/debug-dsp-ops.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yikes. power up the DSP and dsp_runtime_resume doesn't seem like the same thing at all to me.
sound/soc/sof/debug-dsp-ops.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No control on whether we boot from IMR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Always a fresh boot with the boot option. I will need to add another option to do IMR boot later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The IMR boot is the target, I don't think anyone is interested in a fresh boot, historically this created all kinds of issues in stress tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, fixed now to do IMR boot as default
sound/soc/sof/debug-dsp-ops.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem I have with this API is that it's not quite clear how a script would know that the dsp_power_state has been reached.
How long should the script wait and does it get any error codes?
This seems to simple to be useful IMHO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@plbossart i have exposed both the current fw boot state and the dsp power state that the script can poll to see if the desired state has been reached. How long to poll something that we can set based on the current kernel poll wait times. As for error codes, Im afraid I have no way to pass that on to userspace but we have everything we need in the dmesg isnt it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it's rather problematic if you don't have an error code for an API, and rely on something else to find if the routine worked or not... If you start fuzzing or doing all kinds of stress tests, you may have difficulties finding out when exactly the problem occured. it's just very hard to parse such messages. Exhibit A is the SOF CI, where we need to have a start marker and we stop on dev_err but that leaves lots of issues undetected...
It's also problematic if we have to duplicate in scripts the timeouts we already use for the kernel.
I do think we ought to work first on the design/architecture before going into the code review space. It's a very large effort we are starting with scripts that will have to be maintained. We need to do this right from the beginning.
plbossart
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are multiple comments not addressed, not sure if they were missed or github messes with the comments.
The ordering of the patches is weird, we should first have code moves then the addition of new functionality.
sound/soc/sof/Kconfig
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should make it a SND_SOC_SOF_DEBUG_DSP_OPS_TEST_ENABLE, to let distros opt-out, and then set the actual boolean with a kernel parameter.
it's too much work for everyone to have yet another build.
sound/soc/sof/debug-dsp-ops.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it's rather problematic if you don't have an error code for an API, and rely on something else to find if the routine worked or not... If you start fuzzing or doing all kinds of stress tests, you may have difficulties finding out when exactly the problem occured. it's just very hard to parse such messages. Exhibit A is the SOF CI, where we need to have a start marker and we stop on dev_err but that leaves lots of issues undetected...
It's also problematic if we have to duplicate in scripts the timeouts we already use for the kernel.
I do think we ought to work first on the design/architecture before going into the code review space. It's a very large effort we are starting with scripts that will have to be maintained. We need to do this right from the beginning.
to note:
|
ujfalusi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ranj063, I'm still not sure about the use of this.
We will introduce a new flow which will be tested and it is different than what we normally use, and we draw conclusion based on the test results?
I also think that if this mode is selected then we should not load topology, not create sound card. I see this as a really dangerous feature which can introduce unexpected side-effects on the normal use of the stack.
sound/soc/sof/core.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Document the DSPless VS ops_test mode and print if the DSPless mode is ignored due to the ops_test taking precedence.
Or should DSPless mode be the one which should have the higher priority?
sound/soc/sof/amd/acp-loader.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about the fw_data_bin? acp needs two files to be loaded...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahh I missed this. I don't know how to handle this for ACP with 2 files. Let me fix the ACP loader to ignore the pass argument for now
sound/soc/sof/debug-dsp-ops.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how do you start the trace after powering on the DSP? Usually we need to send IPC messages. Does the sof_resume() will be called to take care of it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our regular sequence for starting trace after boot is: run_firmware() -> trace_init().
So, in the test script, we'd do the same. boot the firmware and then call init_trace. And then do whatever else we want to do say for example load libaries, topologies etc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you share the test script? It is hard to visualize how this is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oooohh, so that's why you haven't been reviewing the test script @ujfalusi ? Mere lack of marketing? The PR is here:
Yes, agreed. Thats what the mode does. There's no audio functionality when this mode is selected |
sound/soc/sof/intel/hda-dsp.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the 'during suspend' is redundant, no?
Is it even correct in case we remove the drivers and the power off fails?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry not following the "is it correct?" comment? Are you asking if this is correct to call this during driver remove or are you asking is it OK to fail?
sound/soc/sof/debug-dsp-ops.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cannot put the 'dsp_ops' to anywhere to be honest, strange name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what would you suggest I call it instead of "dsp_ops"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"dsp_test" | "dsp_test_mode" | "test_mode" ?
Probably the "test_mode" is better as it is not really testing the DSP as such.
But I'm bad at naming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will hopefully be used to find more bugs in the firmware/software than bugs in the DSP hardware! So how about fw_debug_ops? Then you can all the now redundant fw_ prefixes in fw_debug_ops/fw_* below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok renamed to fw_debug_ops. But I will leave the fw_ prefix for now because there will also be tplg_filename/ path etc.
sound/soc/sof/ipc4-loader.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On module remove you are going to leak these, we rely on devm to handle these (see sof_ipc4_exit())
sound/soc/sof/ipc4-loader.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for IPC3, I was merely asking if you have checked how this works with ipc_type=0.
sound/soc/sof/debug-dsp-ops.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't get the meaning of this comment....
ujfalusi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ranj063, it looks much cleaner now for sure.
I still have reservation on the usefulness of such interface. We test isolated and narrowed down cases which will never going to be possible to use in real use.
Probably it is beneficial, and I just cannot see that. The added code and complexity is minimal, which is a good sign.
sound/soc/sof/debug-dsp-ops.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"dsp_test" | "dsp_test_mode" | "test_mode" ?
Probably the "test_mode" is better as it is not really testing the DSP as such.
But I'm bad at naming.
sound/soc/sof/debug-dsp-ops.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we tell the user that what is tried is going to be a NOP and the previously loaded kernel is going to remain?
sound/soc/sof/sof-priv.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and overrides the DSPless mode.
sound/soc/sof/debug-dsp-ops.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is "In progress"?
sound/soc/sof/debug-dsp-ops.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kind of like the numeric values as well (got used to them in the kernel log)...
Not sure if it makes sense to include it in the print (6: BOOT_COMPLETE)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
sound/soc/sof/debug-dsp-ops.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, what I mean is that we have this information already, but it is just returning the raw u32.
I don't know if it is used by anyone.
sound/soc/sof/debug-dsp-ops.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you share the test script? It is hard to visualize how this is used.
sound/soc/sof/core.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i would really flip the logic and only take the DSP test mode when the user didn't explicitly bypass the DSP.
sound/soc/sof/sof-priv.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really believe this is only relevant IF the DSPless mode is NOT enabled. It's more logical to me...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok I've flipped the logic to have DSPLESS mode take priority over ops testing
sound/soc/sof/debug-dsp-ops.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to be clear on what we mean by 'new firmware'. if someone overrides an existing firmware in /lib/firmware, they will face all kinds of caching issues. We cannot really add a new firmware with SSH and hope that it's used immediately, that's really asking for trouble.
The assumption should be that you test the firmware that was available at boot.
sound/soc/sof/debug-dsp-ops.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The IMR boot is the target, I don't think anyone is interested in a fresh boot, historically this created all kinds of issues in stress tests.
sound/soc/sof/debug-dsp-ops.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was also unable to understand the comment.
Modify the signature of the load_firmware op to take the firmware filename as an argument. The firmware filename should be include the full path of the firmware file to be loaded. This is in preparation to add a debug op that will be used to boot a firmware with the filename passed from userspace. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Before the first firmware boot, set the initial state of the DSP to D3. The DSP D3 state is not the same as the SOF PCI device D3. Instead, it represents that the DSP is in the powered off state. After firmware boot is completed successfully, set the state to D0 to indicate that the DSP has been powered on. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
This will allow for the set_power_state op to be used for switching the DSP to the D3 state. The logic for powering up the DSP when setting the state to D0 also needs to be moved and will be done in a follow up. But for now, this will allow stress testing firmware boot. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
|
(can't reply to the original anymore because Github does not like force pushes)
I don't understand this. AFAIK, the purpose of this PR is to debug boot issues. We know booting from IMR versus not have had different sets of issues in the past. So shouldn't this PR allow testing both? I don't known which default would be best but I feel like both should be possible. |
plbossart
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks ok to me, I am just not sure what the next steps are?
@plbossart the first tiny next step is to have the fw boot script run with our CI to do the stress test. I also want to keep building the test script library to add more things like PM flows, DMA flows, topology loading, IPC etc |
Introduce a new debug option to enable testing of DSP operations. With this ioption enabled, normal audio operations will be disabled. The API for the testing interface will be introduced in the follow up patches. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Add debugfs entries for testing DSP ops. This patch only adds the firmware filename and path entries to pass them from userspace. These will be saved in the newly added field for saving the test profile in struct snd_sof_dev. These will be used for testing DSP ops such as booting firmware and setting the DSP power state in the following patches. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Add a new op to test firmware boot. Make sure to the DSP is powered off before attempting to boot a new firmware. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Add a new op to test DSP power state changes. For now only D3 is supported. More states will be added in follow up patches. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Add a read-only debugfs entry to read the current FW boot state. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Unload any existing firmware and set the fw_state to PREPARE in preparation for a subsequent firmware boot. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
These will be used to debugging purposes when testing the different DSP ops. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
|
@plbossart just a minor change to add the fw filename and fw_path string lengths |
|
@ujfalusi any objection to merging this? This will help me build upon it and add more test cases. |
This PR adds the API for enabling testing the DSP ops in the driver for testing the FW boot flow, PM flow etc. It introduces a new kernel config to enable the testing debugfs interface. When enabled, normal audio operations will be disabled.
For now the following ops are supported for testing: