Audio: Google RTC Audio Proc: Process copy() without read/write frag #5579

singalsu · 2022-03-21T17:44:14Z

This patch optimizes the copy() function by avoiding the circular
wrap check that was done for every sample. The source, sink, and
reference buffers are read and written in blocks of the maximum size
returned by the audio stream functions
audio_stream_samples_without_wrap_s16() and
audio_stream_frames_without_wrap().

The saving with the mock-up algorithm version is 16 MCPS, from 29 to 13 MCPS.

Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
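
For readers unfamiliar with the technique, here is a minimal sketch of block-wise reading from a circular buffer, where the wrap is handled once per block instead of once per sample. The `ring_s16` struct and the `ring_*` helpers are hypothetical stand-ins written for this illustration; they are not the SOF `audio_stream` API and this is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a circular S16 stream; not the SOF struct. */
struct ring_s16 {
	int16_t *addr;     /* first sample of the buffer */
	int16_t *end_addr; /* one past the last sample */
};

/* Number of samples from ptr to the end of the buffer. */
static int ring_samples_without_wrap(const struct ring_s16 *r, const int16_t *ptr)
{
	return (int)(r->end_addr - ptr);
}

/* Wrap ptr back into the buffer if it has reached the end. */
static int16_t *ring_wrap(const struct ring_s16 *r, int16_t *ptr)
{
	return ptr >= r->end_addr ? r->addr + (ptr - r->end_addr) : ptr;
}

/*
 * Copy `samples` samples from the circular buffer, starting at src,
 * into the linear buffer dst. The wrap check runs once per contiguous
 * block. Returns the updated (wrapped) read pointer.
 */
static int16_t *ring_read_blocks(const struct ring_s16 *r, int16_t *src,
				 int16_t *dst, int samples)
{
	while (samples > 0) {
		int nmax = ring_samples_without_wrap(r, src);
		int n = samples < nmax ? samples : nmax;
		int i;

		for (i = 0; i < n; i++)
			dst[i] = src[i];

		dst += n;
		src = ring_wrap(r, src + n);
		samples -= n;
	}

	return src;
}
```

The inner loop contains no wrap logic, which is where a saving of the kind reported above would come from; the actual figure depends on the platform and the algorithm.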

singalsu · 2022-03-21T17:46:05Z

@lkoenig This draft will need a rebase after merge of your #5576.

cujomalainey · 2022-03-21T18:41:13Z

src/audio/google_rtc_audio_processing.c

Isn't this just samples?

Yes, it is. I was wondering how to name this. In the first loop it's samples, in the second loop it's frames. It seems that the optimizer makes variables more local, so having two variables, samples and frames, should be the same without any increase in stack usage.

lkoenig

This is a very nice pull request. Thanks a lot @singalsu for putting it up.

lkoenig · 2022-03-22T09:01:49Z

src/audio/google_rtc_audio_processing.c

Can you rename it to num_sample_remaining ?

I was rethinking it, and maybe we should stick to frames and handle the number of samples in the inner loop. The reason is that we can have a different number of channels for:

- AEC reference
- capture input
- capture output

But the number of frames should match.
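
A small sketch of the idea above, assuming one shared frame count and a separate channel count per stream. The `stream_channels` struct and `frames_to_samples()` are hypothetical names used only for illustration; they do not appear in the component.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-stream channel counts; names are illustrative only. */
struct stream_channels {
	uint32_t aec_reference;
	uint32_t capture_input;
	uint32_t capture_output;
};

/* Samples to process for one stream, given the shared frame count. */
static uint32_t frames_to_samples(uint32_t frames, uint32_t channels)
{
	return frames * channels;
}
```

With the frame count as the common unit, each loop converts to samples only where it needs a sample count for its own stream.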

Sure, I'll rename it if there are no issues with too long lines. Yep, I kept the possibility of a different number of channels in ref, src, and snk.

Counting samples is more efficient, so I used it in the reference loop. The frames-without-wrap function uses a division, which is slow on xtensa.

lkoenig · 2022-03-22T09:05:16Z

src/audio/google_rtc_audio_processing.c

We could also write: remain = cl.frames * cd->num_capture_channels, where we initialize cd->num_capture_channels = 1 in google_rtc_audio_processing_create.
I think that would make more sense.

Technically we could have a different number of channels for capture input and capture output (that might be for a follow-up PR).

I leave that up to you. I'm OK to change it, but counting frames here preserves the possibility of a different number of source and sink channels. What is your preference?

I would prefer counting frames everywhere and converting to samples when needed.

OK, here the frame counting is needed. The implementation is not the most efficient: there's a divide here and a switch case in audio_stream_frame_bytes().

    static inline uint32_t audio_stream_frames_without_wrap(const struct audio_stream *source,
                                                            const void *ptr)
    {
        uint32_t bytes = audio_stream_bytes_without_wrap(source, ptr);
        uint32_t frame_bytes = audio_stream_frame_bytes(source);

        return bytes / frame_bytes;
    }

This function, by contrast, is a lot lighter, so I used it for the reference where it was possible.

    static inline int audio_stream_samples_without_wrap_s16(const struct audio_stream *source,
                                                            const void *ptr)
    {
        int to_end = (int16_t *)source->end_addr - (int16_t *)ptr;

        assert((intptr_t)source->end_addr >= (intptr_t)ptr);
        return to_end;
    }
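
To make the cost difference concrete, here is a self-contained sketch of the two approaches on a mock stream. The `mock_stream` struct and both `mock_*` functions are assumptions made for this example; they only mirror the shape of the helpers quoted above and are not the SOF implementation (in particular, the frame size here is a plain multiplication rather than a switch on the sample format).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical minimal stream description; not struct audio_stream. */
struct mock_stream {
	void *end_addr;        /* one past the last byte of the buffer */
	uint32_t channels;     /* interleaved channels per frame */
	uint32_t sample_bytes; /* bytes per sample, 2 for S16 */
};

/* Frame-based variant: needs the frame size and an integer division. */
static uint32_t mock_frames_without_wrap(const struct mock_stream *s, const void *ptr)
{
	uint32_t bytes = (uint32_t)((const char *)s->end_addr - (const char *)ptr);
	uint32_t frame_bytes = s->channels * s->sample_bytes;

	return bytes / frame_bytes;
}

/* Sample-based variant for S16: a single pointer subtraction. */
static int mock_samples_without_wrap_s16(const struct mock_stream *s, const void *ptr)
{
	return (int)((const int16_t *)s->end_addr - (const int16_t *)ptr);
}
```

For a frame-aligned pointer the two results differ only by the channel count, so a loop that can count in samples gets the same block boundary without the division.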

andrula-song · 2022-03-23T06:46:36Z

looks good to me.

lgirdwood

@cujomalainey @lkoenig won't merge until Google approves

lkoenig

I tested the code as it is and it did not work, due to a few small details.

Once I corrected those details, it worked as intended.
Thanks a lot for putting the work into that.

lkoenig · 2022-03-24T10:26:37Z

src/audio/google_rtc_audio_processing.c

Note that the number of AEC reference channels passed to the AEC might be lower than the number of channels in the stream. See the comment in prepare about that.
It should be like:
num_samples_remaining = num_aec_reference_frames * cd->aec_reference->stream.channels;

lkoenig · 2022-03-24T10:37:43Z

src/audio/google_rtc_audio_processing.c

Same here: as cd->num_aec_reference_channels can be lower than or equal to cd->aec_reference->stream.channels, one should make sure ref is incremented by cd->aec_reference->stream.channels at the end of the loop.

In this implementation, it is only incremented by cd->num_aec_reference_channels.
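
The stride issue described in this comment can be sketched in isolation as follows. The function name and parameters are hypothetical, and the sketch assumes that the buffer length is a multiple of the stream channel count and that `ref` starts on a frame boundary, so a frame never straddles the wrap point.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Copy the first `used_ch` channels of each interleaved frame from a
 * circular S16 buffer [base, end) into the linear buffer `out`.
 * `stream_ch` is the channel count of the stream, used_ch <= stream_ch.
 * Returns the updated (wrapped) read pointer.
 */
static const int16_t *copy_first_channels(const int16_t *base, const int16_t *end,
					  const int16_t *ref, int frames,
					  int stream_ch, int used_ch, int16_t *out)
{
	int remaining = frames * stream_ch; /* counted in stream samples */

	while (remaining > 0) {
		int nmax = (int)(end - ref);
		int n = remaining < nmax ? remaining : nmax;
		int i, ch;

		for (i = 0; i < n; i += stream_ch) {
			for (ch = 0; ch < used_ch; ch++)
				*out++ = ref[ch];

			/* Step over the whole frame, unused channels included. */
			ref += stream_ch;
		}

		remaining -= n;
		if (ref >= end)
			ref = base + (ref - end);
	}

	return ref;
}
```

In this sketch both the loop counter and the pointer advance by the stream channel count, so they stay in step; only the inner channel loop is limited to the channels that are actually consumed.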

lkoenig · 2022-03-24T14:53:34Z

@singalsu Here is the diff I used to make it work:

@@ -372,16 +377,16 @@ static int google_rtc_audio_processing_copy(struct comp_dev *dev)
 
        buffer_stream_invalidate(cd->aec_reference, num_aec_reference_bytes);
 
-       num_samples_remaining = num_aec_reference_frames * cd->num_aec_reference_channels;
+       num_samples_remaining = num_aec_reference_frames * cd->aec_reference->stream.channels;
        while (num_samples_remaining) {
                nmax = audio_stream_samples_without_wrap_s16(&cd->aec_reference->stream, ref);
                n  = MIN(num_samples_remaining, nmax);
                for (i = 0; i < n; i += cd->num_aec_reference_channels) {
                        j = cd->num_aec_reference_channels * cd->aec_reference_frame_index;
                        for (channel = 0; channel < cd->num_aec_reference_channels; ++channel) {
-                               cd->aec_reference_buffer[j++] = *ref;
-                               ref++;
+                               cd->aec_reference_buffer[j++] = ref[channel];
                        }
+                       ref += cd->aec_reference->stream.channels;
                        ++cd->aec_reference_frame_index;
 
                        if (cd->aec_reference_frame_index == cd->num_frames) {

cujomalainey · 2022-04-05T21:16:55Z

ping to @singalsu

singalsu · 2022-04-10T14:11:20Z

I'm sorry @lkoenig @cujomalainey @lgirdwood! I was ill twice this spring, which impacted my work, and next week I'm on vacation. I plan to start with this afterwards. If you feel this is needed earlier, please take over this patch.

cujomalainey · 2022-04-10T16:01:20Z

No worries, just didn't want it to drop to the bottom of your inbox. Hope you are taking care of yourself.

lkoenig · 2022-04-11T14:19:00Z

@singalsu Please do take care of yourself, and if you are on vacation, do enjoy it.
Have a look at this one once you are rested and recovered.
Take care!

sys-pt1s · 2022-04-20T06:27:42Z

Can one of the admins verify this patch?

singalsu · 2022-04-21T16:33:53Z

@lkoenig Thanks! I've now included your changes and will test tomorrow that it works here before updating the PR.

This patch optimizes the copy() function by avoiding the circular wrap check that was done for every sample. The source, sink, and reference buffers are read and written in blocks of the maximum size returned by the audio stream functions audio_stream_samples_without_wrap_s16() and audio_stream_frames_without_wrap(). The saving with the mock-up algorithm version is 16 MCPS, from 29 to 13 MCPS, in a test with stereo 48 kHz 16-bit playback and capture.

Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>

singalsu · 2022-04-22T08:33:57Z

@lkoenig This version seemed to work here with the mock-up algorithm; playback was mixed to the first capture channel. Can you please check that it's OK with the real version?

cujomalainey · 2022-05-13T16:03:11Z

Ping to @lkoenig

cujomalainey

Logic looks good, but I definitely think we can improve the code with some macros/helpers across all of the codebase, especially if we are getting C99 support.

singalsu requested a review from lkoenig March 21, 2022 17:46

cujomalainey reviewed Mar 21, 2022

lkoenig reviewed Mar 22, 2022

singalsu force-pushed the grtcproc_read_write_frag_optimize branch 2 times, most recently from e44c18a to 08045a4 Compare March 22, 2022 13:13

singalsu marked this pull request as ready for review March 22, 2022 13:31

singalsu requested review from dbaluta, lbetlej, lgirdwood, mmaka1 and plbossart as code owners March 22, 2022 13:31

singalsu requested review from andrula-song, cujomalainey and lkoenig March 22, 2022 13:31

singalsu mentioned this pull request Mar 22, 2022

[FEATURE] Optimize audio processing components' source/sink buffer access, deprecate read/write frag #4967

Closed

andrula-song approved these changes Mar 23, 2022

lgirdwood approved these changes Mar 24, 2022

lkoenig requested changes Mar 24, 2022

lgirdwood added this to the v2.2 milestone Apr 19, 2022

singalsu force-pushed the grtcproc_read_write_frag_optimize branch from 08045a4 to 5da6554 Compare April 22, 2022 08:31

singalsu requested a review from lkoenig April 22, 2022 08:34

cujomalainey approved these changes May 13, 2022

lgirdwood merged commit 89e50e0 into thesofproject:main May 18, 2022

singalsu deleted the grtcproc_read_write_frag_optimize branch August 26, 2022 11:39

6 participants