Skip to content

Conversation

@btian1
Copy link
Contributor

@btian1 btian1 commented Sep 29, 2022

Current cycle is based on 38400000, and perf cnt module did not consider
the clock wrap case, i.e the later cycle maybe small than the previous one.
so add code to handle this situation.

Signed-off-by: Baofeng Tian baofeng.tian@intel.com

@btian1 btian1 force-pushed the perfcnt branch 2 times, most recently from dcd2dfb to 8c4a1d5 Compare September 29, 2022 13:39
@btian1 btian1 marked this pull request as ready for review September 29, 2022 13:52
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this 64-bit arithmetics? If it's 32-bit then that will overflow: UINT32_MAX + plat_ts will be calculated first and it doesn't fit in 32 bits. Just change that to UINT32_MAX - (pcd)->plat_ts) + plat_ts. And I'm wondering why one would decide to use parentheses in

x = (a + b);

...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, changed based on your comments, please review again, if no other issues, please +1 for merge.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto. Please remove all external parentheses

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@btian1 btian1 force-pushed the perfcnt branch 3 times, most recently from f1cc90b to 2e058da Compare October 8, 2022 02:43
@btian1 btian1 requested a review from lyakh October 10, 2022 00:57
Copy link
Collaborator

@lyakh lyakh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

possibly for a separate PR: these are getting rather big for macros, wondering if they could be converted to inline functions

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lines 122 and 123 aren't needed? Are .plat_ts and .cpu_ts updated elsewhere now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line 122 and 123 will be updated with perf_cnt_init before each module beginning, then after each module, will calculate the diff and decide the data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this already split from original #6322 , how about keep it, too much prs will make upstream more difficult? it is not complicate actually, regarding change to inline, due to still pass function into the macro, how about keep for now?
I even want to rewrite perf_cnt, but due to I am new, and don't know too much about history, so I change based on current, in future, if there is new request, I may rewrite this module.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, I don't understand. perf_cnt_init() initialises .plat_ts and .cpu_ts but when calculating the current deltas, you have to compare to the previous call, not to the initialisation values? What am I missing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in: perf_cnt_init()
get initial clock for plat_ts and cpu_ts.
then followed by function calling.
....
then get current clock with local plat_ts and cpu_ts
then get the delta and peak for this function calling.
record the delta and accumulated it with 1024 times to get average.

it will loop around per 1 seconds.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I see now, so it looks like it is redundant in the current implementation? Would be good to have that change - just removing those two lines as a separate commit, but well, at least I know now why they are removed, thanks

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, sorry, I don't understand how this works. At every function we calculate "(pcd)->plat_delta_last = plat_ts - (pcd)->plat_ts;" , but if "(pcd)->plat_ts" is not update (as removed in this patch), the delta is not correctly calculated (it's delta to initial value, not delta to previous call). Did I miss where a new place where "(pcd)->plat_ts" is updated?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @btian1 , so the old code really was wrong.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I see now, so it looks like it is redundant in the current implementation? Would be good to have that change - just removing those two lines as a separate commit, but well, at least I know now why they are removed, thanks

Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @btian1 . Splitting PRs into simple separate changes is a good way to get changes incrementally done. But in this case, I believe I'm missing something essential as I don't get how the delta calculation works after this PR. Please see my comment inline.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, sorry, I don't understand how this works. At every function we calculate "(pcd)->plat_delta_last = plat_ts - (pcd)->plat_ts;" , but if "(pcd)->plat_ts" is not update (as removed in this patch), the delta is not correctly calculated (it's delta to initial value, not delta to previous call). Did I miss where a new place where "(pcd)->plat_ts" is updated?

@btian1
Copy link
Contributor Author

btian1 commented Oct 11, 2022

@kv2019i , (pcd)->plat_ts will be updated in perf_cnt_init before each module running.
You can refer to:
https://github.com/thesofproject/sof/pull/6344/files#diff-1986b2a8b4fbd60e9bdf7b2829d35eee6c9150e6794201f7fff09dc8d54dbcacR248

due to required to split to multiple patches, changes may cause some mis-understanding.
Sorry for that, I will submit PRs with more careful next time.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @btian1 , so the old code really was wrong.

Copy link
Collaborator

@RanderWang RanderWang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kv2019i
Copy link
Collaborator

kv2019i commented Oct 13, 2022

And same CI error on this PR as well. FYI @wszypelt

Current hw cycle is based on 38400000, and perf cnt module did not consider
the clock wrap case, i.e the later cycle maybe small than the previous one.
so add code to handle this situation.

Signed-off-by: Baofeng Tian <baofeng.tian@intel.com>
@kv2019i
Copy link
Collaborator

kv2019i commented Oct 18, 2022

Known fail with Intel IPC4 test set, otherwise looks good, merging.

@kv2019i kv2019i merged commit 177b614 into thesofproject:main Oct 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants