-
Notifications
You must be signed in to change notification settings - Fork 349
Add new mtrace_printf() and use it for DMA init error handling and demoting banner to INFO level #4389
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
CC @iuliana-prodan The commit "timer.h: add platform_safe_get_time(timer) and use it in trace.c" is potentially interesting after the discussions yesterday. |
@paulstelian97 I was writing my review/questions while you were writing this :) |
paulstelian97
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than existing comments this looks good. I'll wait for those to be resolved before approve though.
Thanks @lyakh , very useful. I changed the comment and... tried to use panics, which cost me 1+ day because of bug #1346 :-( To avoid this happening again I'm adding some panic.c comments and a workaround suggestion in #4401 |
lgirdwood
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just one question on whether we want a new timer API or just make the current one safe.
src/include/sof/drivers/timer.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it not be better to make the default version safe ? (and return time 0 to indicate no timer running)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe, I don't know. You tell me. So far I've been spending at least 80% of my time trying to fix issues fairly distant from my actual objectives so this time I tried to minimize the changes but maybe I shouldn't have?
There is a variety of platform_timer_get(timer) implementations. Some implementations don't even look at their timer argument: so these platforms would then lose even more timestamps for no real reason.
At least one platform performs an assert(timer), maybe because it was considered valuable not to keep running with broken timers? Tracing is still useful without timestamps but I wouldn't be so sure about other areas: I can easily imagine some threads stuck forever.
Considering such "diversity" I chose not to open that can of worms and to change only the area in this PR. As of now the new "safe" function is meant to be used only in areas with tricky boot dependencies; like tracing.
(and return time 0 to indicate no timer running)
As explained in the comment, I'm not returning zero but on small value on purpose: because zero looks like something is wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0 looks perfectly fine to me. Look at dmesg soon after a system boot - the first messages are marked with time 0. Looks much more logical than any random "small value" to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This non-zero value is useful to make the difference with platform_timer_get(timer) returning zero (for good or bad reasons). When everyone and everything returns zero including buggy code (but not just), then there's no way to tell where it came from.
|
I just pushed v3 5c9f2e8 that adds mtrace_printf() for Zephyr too, I had forgotten to extract that from my stack or other WIP. EDIT: https://sof-ci.01.org/sofpr/PR4389/build9504/devicetest is suspiciously all green |
lyakh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a couple of minor suggestions, should be good after that
src/include/sof/drivers/timer.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0 looks perfectly fine to me. Look at dmesg soon after a system boot - the first messages are marked with time 0. Looks much more logical than any random "small value" to me.
src/trace/trace.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why (void *)? the first argument of put_header() is of type uint32_t *
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because packet is a char []. I'll make the first argument of put_header() a void *, that requires no other change anywhere, not even in the put_header() code.
src/trace/dma-trace.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this needed? It's true, that memory cannot be freed, but the allocated buffer is still there and it's valid... Any specific need for this change? dma_trace_initialized() would return false without it too anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dma_trace_initialized() would return false without it too anyway?
Yes but some part of the code not using dma_trace_initialized() would not and would crash. So this is safer and costs practically nothing.
src/trace/dma-trace.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's NULL here, why not just print a meaningful description like "not available" or something?
marc-hb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the very thorough review! I will apply all the other, unanswered suggestions.
src/include/sof/drivers/timer.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This non-zero value is useful to make the difference with platform_timer_get(timer) returning zero (for good or bad reasons). When everyone and everything returns zero including buggy code (but not just), then there's no way to tell where it came from.
src/trace/dma-trace.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dma_trace_initialized() would return false without it too anyway?
Yes but some part of the code not using dma_trace_initialized() would not and would crash. So this is safer and costs practically nothing.
src/trace/trace.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because packet is a char []. I'll make the first argument of put_header() a void *, that requires no other change anywhere, not even in the put_header() code.
Zephyr's printk does not work like Linuxs printk Signed-off-by: Marc Herbert <marc.herbert@intel.com>
Found by chance while looking for something else. This makes no difference right now because we checked the "length" in the line immediately before but let's do the right thing anyway, at the very least it's more readable. Signed-off-by: Marc Herbert <marc.herbert@intel.com>
Returns a small constant instead of crashing platform_timer_get() when timer is NULL. This is required to restore early tracing before sof->timer is initialized. Signed-off-by: Marc Herbert <marc.herbert@intel.com>
Direct access to mbox shared memory logging when DMA tracing is either not initialized yet or disabled or found broken for any reason. Signed-off-by: Marc Herbert <marc.herbert@intel.com>
Replace some tr_err() too. Signed-off-by: Marc Herbert <marc.herbert@intel.com>
... down from ERROR. It was never an error, this was a hack to be duplicated to both trace and etrace. This will make it easier to search for actual errors. Also make the banners slightly different between etrace and trace thanks to new SHM: and DMA: prefixes. This makes identification of log files easier. Signed-off-by: Marc Herbert <marc.herbert@intel.com>
marc-hb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://sof-ci.01.org/sofpr/PR4389/build9518/devicetest and everything else for that commit is green.
|
The addition of |
5 commits, per-commit review is recommended.
This is a re-submission/recycling of 85% of the earlier "restore early tracing" PR previously reviewed in #4334: everything except the actual and final restoration of early tracing which is currently failing on BDW and BYT which I have no time to debug right now. Maybe later.