@singalsu
Collaborator

@singalsu singalsu commented Feb 9, 2022

This patch replaces the frag-based audio stream read/write
access to source and sink with block processing based on the
audio_stream_bytes_without_wrap() byte count.

Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
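
For readers unfamiliar with the technique, here is a minimal sketch of block processing over circular buffers. It is not the code from this patch: `struct ring`, `samples_without_wrap()`, `ring_wrap()` and `process_blocks()` are hypothetical names, and the halving loop is only a placeholder for the FIR. The idea is to compute how many samples can be handled before either buffer wraps, run a plain inner loop over that block, and do the wrap check once per block instead of once per sample.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified circular buffer; not the SOF struct audio_stream. */
struct ring {
	int16_t *addr;      /* first sample of the buffer */
	int16_t *end_addr;  /* one past the last sample */
};

/* Samples from ptr to the end of the buffer, i.e. usable without a wrap. */
static size_t samples_without_wrap(const struct ring *r, const int16_t *ptr)
{
	return (size_t)(r->end_addr - ptr);
}

/* Move a pointer that has reached the buffer end back to the start. */
static int16_t *ring_wrap(const struct ring *r, int16_t *ptr)
{
	return ptr >= r->end_addr ? r->addr + (ptr - r->end_addr) : ptr;
}

static size_t min_size(size_t a, size_t b)
{
	return a < b ? a : b;
}

/* Process n samples; x points into source, y points into sink. */
static void process_blocks(const struct ring *source, int16_t *x,
			   const struct ring *sink, int16_t *y, size_t n)
{
	while (n > 0) {
		/* Largest block that wraps in neither source nor sink. */
		size_t block = min_size(n, samples_without_wrap(source, x));

		block = min_size(block, samples_without_wrap(sink, y));

		/* Placeholder for the filter: no wrap checks in this loop. */
		for (size_t i = 0; i < block; i++)
			y[i] = x[i] / 2;

		/* One wrap check per block. */
		x = ring_wrap(source, x + block);
		y = ring_wrap(sink, y + block);
		n -= block;
	}
}
```

The sketch assumes `x` and `y` start inside their buffers, so every block is at least one sample long and the loop always makes progress.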

@singalsu
Collaborator Author

singalsu commented Feb 9, 2022

Here's a test run with the forced generic C version and a 250-tap stereo FIR for the xtensa build. The gcc build was not fast enough for real time.

[Image: Screenshot from 2022-02-09 20-17-52]

The processing load was 324 MCPS in the original and 309 MCPS in the optimized version, a saving of 15 MCPS. The load is nearly the same for all formats (s16/s24/s32).

Note: There's some strange variation in processing times at the end of s16 playback, with 10 us and 1500 us processing times seen instead of a steady 770 us. I need to find out why it happens in the optimized version. I tried twice and got the same result.

@singalsu singalsu force-pushed the fir_read_write_frag_optimize branch from e5f588e to b279e79 Compare February 9, 2022 18:30
@lgirdwood
Member

@singalsu The GCC generic C build will likely need a reduced tap count, and hence reduced filter performance. This can be in Kconfig, though, so it's spelled out, i.e. a Kconfig setting with the tap count and comments directing users to a low tap count for GCC.

@singalsu
Collaborator Author

singalsu commented Feb 10, 2022

Adding a limit of 10% over nominal fixed the issue that happened with 16-bit data. Now the maximum, 875 us, is hit a few times when 52 frames are processed. The plot looks similar to the earlier one, but there is no more 1500 us processing. The 10 us processing times occur when the FIR is scheduled with 0 frames available. This is still with xtos.

[Image: Screenshot from 2022-02-10 15-14-26]
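
As an illustration of the arithmetic behind a limit of 10% over nominal, here is a small sketch. It is not the code from the commit: the helper names are hypothetical, and the 1 ms scheduling period is an assumption. With a 48 kHz stream and a 1 ms period the nominal count is 48 frames, and adding 10% with integer division gives 52 frames, which matches the 52 frames mentioned above.

```c
#include <stdint.h>

/* Hypothetical helper: nominal frames per scheduling period plus 10%. */
static uint32_t max_frames_per_period(uint32_t rate_hz, uint32_t period_us)
{
	/* 64-bit intermediate avoids overflow of rate_hz * period_us. */
	uint32_t nominal = (uint32_t)((uint64_t)rate_hz * period_us / 1000000);

	return nominal + nominal / 10;
}

/* Hypothetical helper: clamp the frames available in one run to the limit. */
static uint32_t frames_to_process(uint32_t available, uint32_t max_frames)
{
	return available < max_frames ? available : max_frames;
}
```

Capping the frame count per run bounds the worst-case processing time when data arrives unevenly, at the cost of leaving the excess frames in the buffer for the next run.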

@singalsu singalsu force-pushed the fir_read_write_frag_optimize branch from b279e79 to fbd8146 Compare February 10, 2022 13:23
@singalsu singalsu marked this pull request as ready for review February 10, 2022 13:37
Member

This would best come from topology and be generic. The first patch is fine.

Collaborator Author

Sorry, I'm not aware of such. How?

Collaborator Author

OK, I'll remove the 2nd commit; a configuration that results in high load is unusual. The uneven data flow must originate from the host PCM. Other high-load components also use this way of limiting processing time: SRC, ASRC, TDFB.

Collaborator Author

Done, but this needs a fix somewhere in the framework.

Member

yep, it's a framework fix.

This patch replaces the frag-based audio stream read/write
access to source and sink with block processing based on the
audio_stream_bytes_without_wrap() byte count.

In a test with the generic C version forced for the xtensa build, the
processing load was 324 MCPS in the original and 309 MCPS in the
optimized version, a saving of 15 MCPS. The load is nearly the same for
all formats s16/s24/s32. The base load was very high in the test due to
the very long FIR filter used. The MCPS saving should be the same for
all stereo 48k streams.

Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
@singalsu singalsu force-pushed the fir_read_write_frag_optimize branch from fbd8146 to 4ad3e1f Compare February 11, 2022 08:08
@singalsu singalsu requested a review from lgirdwood February 11, 2022 08:10
@lgirdwood
Member

@marc-hb fyi the test is reporting PASS in the console but a TIMEOUT in the dashboard. https://sof-ci.01.org/sofpr/PR5339/build11990/devicetest/?model=CML_SKU0955_HDA&testcase=check-suspend-resume-with-playback-5

@lgirdwood lgirdwood added this to the v2.1 milestone Feb 11, 2022
@lgirdwood lgirdwood merged commit 2b6001e into thesofproject:main Feb 11, 2022
Contributor

@jsarha jsarha left a comment

I don't understand the context too well, but the optimizations themselves make sense and look correct.

@singalsu singalsu deleted the fir_read_write_frag_optimize branch September 15, 2022 13:16