Skip to content

Conversation

@kv2019i
Copy link
Collaborator

@kv2019i kv2019i commented Apr 3, 2019

Current trace implementation gets out of sync when sof device is
put to suspend. The debugfs file handle is kept open, but
firmware will reset its state. After resume, debugfs client's read
offset will not be synchronized to firmware and this may result
in traces read in incorrect order and/or stale data being read
after resume.

Changelog:

  • V5 7th May: new bug introduced in V4, still more work to improve patch and description understandability
  • V4 26th Apr: Split into multiple patches, dropped renames.
  • V3 24th Apr: Updated patchset with simplified code. Reworked to signal EOF only once so that this works with old and new sof-loggers.
  • V2 11th Apr: Revised patchset to address comments so far, no longer RFC
  • V1 3th Aprh: initial patchset (RFC)

Related patches:

~~ Issues handled in previous iterations~~:

  • this is not full alternative but has overlap with issue SOF DMA Trace: avoid logs clearance at runtime suspend/resume #306 (and related PR) -> this can be merged as incremental update
  • locking needs to be revisited still (I'll go through this if we move forward) -> handled in PR800
  • having multiple u32 entries for various single-bit pieces of dtrace info in sof-priv.h is ugly, I'd like to combine these -> sorted in 11th April push

@ranj063
Copy link
Collaborator

ranj063 commented Apr 3, 2019

@singalsu can you please give this a try and see if it fixes the problem you faced with missing traces with runtime PM enabled?

@singalsu
Copy link

singalsu commented Apr 4, 2019

@ranj063 I just tried this with the sof patch rebuilt tools but now I get empty trace output. I'll check tomorrow with Kai what I've missed.

@singalsu
Copy link

singalsu commented Apr 4, 2019

@ranj063 @kv2019i Correction, I didn't realize that despite the patched trace didn't show the prior content from boot until this moment when start it shows it later!

Now when I had left sof-logger on and started music playback it traced out normally and I saw now the params() output that got dropped previously when started. Also the traces from boot are seen (component new() stuff) is seen when I scroll back the screen.

So, this looks usable to me now and improvement from non-patched version. Though the start of showing the events should be tested with some lower amount of output test case. Maybe pressing volume up/down from alsamixer and see if it activates the trace.

@ranj063
Copy link
Collaborator

ranj063 commented Apr 4, 2019

@singalsu thanks for confirming. I think the missing piece is @keyonjie 's work to copy trace starting from host_offset instead of 0 upon resume.

Copy link

@keyonjie keyonjie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, it's good pre-work before we can go further to support retrieving logs belonged to stage before suspend, I have concern about reset sdev->host_offset at file open, we need figure out that.

@kv2019i kv2019i force-pushed the topic/trace-suspend-fix branch 2 times, most recently from b9ec601 to bda0e5c Compare April 11, 2019 13:42
@kv2019i
Copy link
Collaborator Author

kv2019i commented Apr 11, 2019

@keyonjie wrote:

before suspend, I have concern about reset sdev->host_offset at file open, we need figure out that.

Ok, I figured out a way to capture the early logs if sof-logger is started after FW has been active. Please see my comments above I wrote today, and the new patch I uploaded today.

@kv2019i kv2019i marked this pull request as ready for review April 11, 2019 13:59
@kv2019i kv2019i changed the title [RFC] ASoC: sof: force end-of-file for debugfs trace at suspend ASoC: SOF: force end-of-file for debugfs trace at suspend Apr 11, 2019
@kv2019i kv2019i force-pushed the topic/trace-suspend-fix branch from bda0e5c to d5897a3 Compare April 24, 2019 16:13
@kv2019i
Copy link
Collaborator Author

kv2019i commented Apr 24, 2019

* 24th Apr: Updated patchset with simplified code. Reworked to signal EOF only once so that this works with old and new sof-loggers.

New patchset available. This tries to simplify the logic while still keeping the main feature of resyncing position at suspend/resume cycles. A new sof-logger patch was needed (see PR1312).

Copy link
Member

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No real comments on the functionality proper but I find this PR a bit messy. It mixes renames, splitting code and addition of new things. It'd be nicer with separate patches.

@kv2019i kv2019i force-pushed the topic/trace-suspend-fix branch from d5897a3 to 45f0248 Compare April 26, 2019 11:03
@kv2019i kv2019i requested a review from plbossart April 26, 2019 11:05
Copy link
Member

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kv2019i, much simpler than the previous version but I am still missing a couple of points, see below.

Copy link
Member

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @kv2019i still confused about the expected transitions for the dtrace_eof boolean. this might be a missing set of comments to explain how this might be used.

Move duplicated code in sof_wait_trace_avail() to a helper function.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
@kv2019i kv2019i force-pushed the topic/trace-suspend-fix branch from 0e97ea4 to 54dacdc Compare May 7, 2019 12:10
Current trace implementation gets out of sync when sof device
is put to suspend. The debugfs file handle is kept open, but
firmware will reset its state. After resume, debugfs client's
read offset will not be synchronized to firmware and this may
result in traces read in incorrect order and/or stale data being
read after resume.

Add logic to signal end-of-file to read() when firmware tracing
has ended, and all trace data has been read. This allows debugfs
client to capture all trace data, and reopen the trace file to
ensure proper synchronization with firmware after reopening
the node.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
@kv2019i kv2019i force-pushed the topic/trace-suspend-fix branch from 54dacdc to 891b86b Compare May 7, 2019 12:26
@kv2019i
Copy link
Collaborator Author

kv2019i commented May 7, 2019

@plbossart wrote:

Sorry @kv2019i still confused about the expected transitions for the dtrace_eof boolean. this might be a missing set of comments to explain how this might be used.

Eh, sorry, this round was waste of your time, I slipped a bug in the V4 patchset that further obfuscated intended usage of dtrace_eof variable (although functionally still ok and passed my tests). No excuses, should have been more careful.

Even without my mistake, I agree 'dtrace_eof' is still not ideal, too many comments are needed to explain it. I went back and forth with different names and ended up to rename the variable to 'dtrace_draining' in V5 patchset and further iterated the comments (shorter, just focusing on WHY).

This is still a kludge, I agree, but has a lot of practical utility for sof-logger usage, so I'll still update this PR. If you think this has no chance in upstream, then we might consider dropping altogether and wait until we have time to implement a bigger rework (current WA is to disable runtime PM when taking traces).

@kv2019i kv2019i requested a review from plbossart May 7, 2019 12:48
Copy link
Member

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kv2019i, I think I get it now and it looks self-explanatory enough.
I'll need a second approver to merge - yes I know I sound like a broken record but this is annoying to see so few reviewers.

@ranj063
Copy link
Collaborator

ranj063 commented May 8, 2019

@kv2019i something is not right with this methodology suggested in this PR. This is what I tried.

  1. After boot: ./sof-logger -l sof-apl.ldc -t
  2. stop logger and start again -> no output
  3. resume DSP with aplay
  4. run sof-logger and now I can see the output again.
  5. stop logger and start it again - > no output

So everytime I stop the logger and start it again its output is empty. This is likely because you reset the host_offset to 0 in trace_release. I find this behaviour odd though. I think sof-logger should always be able to read the trace irrespective of whether it is opened for the first time after the DSP is suspended or closed & opened again.

@keyonjie
Copy link

keyonjie commented May 8, 2019

Acked on @ranj063, please make sure the PR really works as we expected before we can merge it, the debugfs trace is used to share FW logs for debugging, so the criteria here should be if it really helps for the debugging, we need show the logs as much as possible, resetting the host_offset and clearing the log output at each logger open sounds really bad idea to me.

@kv2019i
Copy link
Collaborator Author

kv2019i commented May 8, 2019

Ranjani wrote:

1. After boot: ./sof-logger -l sof-apl.ldc -t
2. stop logger and start again -> no output 
3. resume DSP with aplay
4. run sof-logger and now I can see the output again. 
5. stop logger and start it again - > no output

[...]

So everytime I stop the logger and start it again its output is empty. This is likely because
you reset the host_offset to 0 in trace_release. I find this behaviour odd though. I think
sof-logger should always be able to read the trace irrespective of whether it is opened for
the first time after the DSP is suspended or closed & opened again.

@ranj063 and @keyonjie , yes, this is exactly the expected behaviour. I sent an RFC about the change of behaviour to the SOF mailing list in April when I started to work on this:
https://mailman.alsa-project.org/pipermail/sound-open-firmware/2019-April/003277.html

Note you will not lose traces. You'll always get the last (upto 64K) traces FW sent even if you did not have logger running.

So with the patch, the DMA trace works like "dmesg -c" in kernel. You can get logs out reliably, but you can only read them out once. Until we have implemented offset trace (or trace double-buffering), we must reset the host_offset so SW/FW are resycnhronized. Currently you will often get duplicated traces, or not traces at all -> it's very undeterministic now.

@kv2019i
Copy link
Collaborator Author

kv2019i commented May 8, 2019

@keyonjie wrote:

the debugging, we need show the logs as much as possible, resetting the host_offset and clearing the
log output at each logger open sounds really bad idea to me.

Ack, but this is not done in the patch. We clear offset at close, specifically to allow user-space to read out the traces before we sync on the position.

@ranj063
Copy link
Collaborator

ranj063 commented May 8, 2019

So with the patch, the DMA trace works like "dmesg -c" in kernel. You can get logs out reliably, but you can only read them out once. Until we have implemented offset trace (or trace double-buffering), we must reset the host_offset so SW/FW are resycnhronized. Currently you will often get duplicated traces, or not traces at all -> it's very undeterministic now.

@kv2019i but this is quite inconvenient if you can only read the trace once after the DSP is suspended. We can live with it for a short while but we need the bigger change to implement trace offset and should be done soon. Users will start complaining about not being able to read the trace the second time around.

@kv2019i
Copy link
Collaborator Author

kv2019i commented May 9, 2019

@ranj063 wrote:

@kv2019i but this is quite inconvenient if you can only read the trace once after the DSP is suspended. > We can live with it for a short while but we need the bigger change to implement trace offset and
should be done soon. Users will start complaining about not being able to read the trace the second
time around.

But in context of this PR, the question is which is now a bigger problem:

  1. getting duplicate or no traces in normal case unless you disable runtime PM, or
  2. inability to to read the trace node multiple times

That's the call we need to make for this PR -> @ranj063 @keyonjie @plbossart
I'm arguing (1) is the bigger problem and am requesting merging this.

Longer term, I'll reiterate my stance that the offset patch does not solve all issues. The trace buffer remains a scratch buffer and even with offset support implemented, user-space has no way to discover where the DSP is writing to. So only reliable way to use DMA trace is to run sof-logger all the time and store traces to user-space continuously. Offset support will not fix this.

@ranj063
Copy link
Collaborator

ranj063 commented May 9, 2019

I'm arguing (1) is the bigger problem and am requesting merging this.

@kv2019i I personally feel we can't deal with either of them. A few months ago I'd have agreed with you and said (1) is the bigger problem but now we've got external folks using SOF and it is not easy to explain to them that the logger can only read once with PM enabled.

@ranj063
Copy link
Collaborator

ranj063 commented May 9, 2019

@kv2019i I have a simple suggestion. I agree this PR is necessary but its annoying to not be able to read the data the second time around. Can we possibly change the logger application to read and save the data at first read (possibly in a file) and read from there instead?

@plbossart plbossart added the Unclear No agreement on problem statement and resolution label May 10, 2019
@kv2019i
Copy link
Collaborator Author

kv2019i commented May 10, 2019

@ranj063 wrote:

you and said (1) is the bigger problem but now we've got external folks using SOF and it is not easy to > explain to them that the logger can only read once with PM enabled.

But why the expectation of a "trace store"? Can you elaborate the usage model a bit. For debugging, you should be storing traces continuously to a file. If you want to debug an error after something strange happens, you should cp out the files (trace, exception, memory dumps) and analyze offlien (and attach to bug issue). What is the usage where you read the trace may times?

I think the interface should be viewed as a "trace port" with some intermediate buffering, but it's not a trace store and you'll end with issues if you expect it to be a store (might work sometimes, but not reliable).

@kv2019i
Copy link
Collaborator Author

kv2019i commented May 10, 2019

@ranj063 wrote:

read the data the second time around. Can we possibly change the logger application to read and
save the data at first read (possibly in a file) and read from there instead?

But this is already supported, right. You 'cp /sys/kernel/debug/sof/trace logfile' and run sof-logger with '-i logfile'. Having sof-logger cache data implicitly sounds a complicated. Where to write, how many traces to keep, potential confusion of combining invalid data from state file cache and valid data from a target data, etc, etc. If really, really need, I'd add a carbon-copy option where sof-logger copies the verbatim binary data all the data to a file, in addition to showing out the parsed log.

@ranj063
Copy link
Collaborator

ranj063 commented May 10, 2019

@ranj063 wrote:

read the data the second time around. Can we possibly change the logger application to read and
save the data at first read (possibly in a file) and read from there instead?

But this is already supported, right. You 'cp /sys/kernel/debug/sof/trace logfile' and run sof-logger with '-i logfile'. Having sof-logger cache data implicitly sounds a complicated. Where to write, how many traces to keep, potential confusion of combining invalid data from state file cache and valid data from a target data, etc, etc. If really, really need, I'd add a carbon-copy option where sof-logger copies the verbatim binary data all the data to a file, in addition to showing out the parsed log.

@kv2019i sorry didnt realize we had the option of reading from a file with the logger. I am OK with it as long we shout out loud and clear that this is a handicap with the logger at the moment.
@keyonjie what do you think?

@kv2019i
Copy link
Collaborator Author

kv2019i commented May 16, 2019

@keyonjie what do you think?

@keyonjie ping, see above

@keyonjie
Copy link

Sorry @kv2019i didn't have much time to go to details of your commits and explaining, I am seeing DMA trace logs lost issue(sometimes no any DMA trace logs show even the FW is running and theoretically we should get logs e.g. trigger commands, hw_params commands) when doing KWD debugging these days.

Just went through your commits, my understand to it is something like this:

  1. snd_sof_init_trace() // reset host_offset, set dtrace_is_enabled, clear dtrace_draining.
  2. file->open() // nothing special we do for trace.
  3. file->read()
    sof_wait_trace_avail() // if has enough data, return; otherwise, wait for event.
  4. file->release() // e.g. Ctrl+c to exit sof-logger.
  5. snd_sof_release_trace() // clear dtrace_is_enabled, set dtrace_draining.

Without suspend/resume, the typical flow is 1->2->3->4(first sof-logger run)->2->3->4...->2->3->4 (the Nth sof-logger run), this works fine as dtrace_draining is never set and we do nothing in 4) actually. dtrace_is_enabled is true, dtrace_draining is false, and host_offset is keep aligned with FW position update.

When suspend happens, let's consider about 2 scenarios:

a. suspend without sof-logger run, that is 1->[2->3->4]->5, after 5), dtrace_is_enabled is false, dtrace_draining is true, and host_offset is aligned with the last update from FW position update.

Here, when we run sof-logger, 2->3(read out all logs, and then clear dtrace_draining, works fine here)->4(reset host_offset, that means if we try open sof-logger again, no logs will be showed at the 2nd iteration, this is also fine IMO).

Then 1->[2->3->4] again, everything same with the first boot, we won't get logs that belong to the previous active stage, this also looks legit(so for first boot fail issues e.g. tplg IPC failure, we can run sof-logger before we turn the driver back to resume via any audio commands, to debug those issues).

So in general, this case a is fine for me.

b. suspend with sof-logger run, that is 1->2->3->5, after 5), dtrace_is_enabled is false, dtrace_draining is true, and host_offset is aligned with the last update from FW position update.

Here, the last logs before suspend is already showed, so if
b_1. 4) is run(Ctrl+c) at suspended, we will reset host_offset(No problem, we have already read them).
b_2. go back to 1) with resume(sof-logger keep running), the next trace position update from FW will start from 0(might be >0 actually), then the sof-logger will output staled(or not existed) [host_offset, buffer_size) logs before it is aligned to the new logs data, this is a little bit confused here, but more logs/noise is better than no logs, I can be tolerant to this actually.

So this case b is acceptable to me.

So, after long analyzing and chatter here, I think this is good PR to be merged now, thank you so much.

Copy link

@keyonjie keyonjie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After long analyzing for different cases, and the test on my side shows this is good PR to be merged now, thank you so much.

@kv2019i
Copy link
Collaborator Author

kv2019i commented May 16, 2019

Thank you @keyonjie for the very detailed response and review. Your analysis is correct. I only have one note for this:

b. suspend with sof-logger run, that is 1->2->3->5, after 5), dtrace_is_enabled is false, dtrace_draining is true, and host_offset is aligned with the last update from FW position update.
[...]
b_2. go back to 1) with resume(sof-logger keep running), the next trace position update from FW will start from 0(might be >0 actually), then the sof-logger will output staled(or not existed) [host_offset, buffer_size) logs before it is aligned to the new logs data, this is a little bit confused here, but more logs/noise is better than no logs, I can be tolerant to this actually.

In fact we can cover this case as well. With recent patches to sof-logger, when you run sof-logger in trace mode (-t) and this b_2 case happens, sof-logger will notice the end of draining phase (read returns once with 0 bytes available) and reopens the trace node. So when FW starts again, sof-logger will continue reading from 0.

If you use an older sof-logger binary, then the result will be as per your analysis above and some stale traces will be output.

So, after long analyzing and chatter here, I think this is good PR to be merged now, thank you so much.

Great, thank you! @plbossart This would seem to be good to go!

@keyonjie
Copy link

b_2. go back to 1) with resume(sof-logger keep running), the next trace position update from FW will start from 0(might be >0 actually), then the sof-logger will output staled(or not existed) [host_offset, buffer_size) logs before it is aligned to the new logs data, this is a little bit confused here, but more logs/noise is better than no logs, I can be tolerant to this actually.

In fact we can cover this case as well. With recent patches to sof-logger, when you run sof-logger in trace mode (-t) and this b_2 case happens, sof-logger will notice the end of draining phase (read returns once with 0 bytes available) and reopens the trace node. So when FW starts again, sof-logger will

Great to know about this, thanks.

continue reading from 0.

@plbossart
Copy link
Member

Thanks @keyonjie let's merge and test

@plbossart plbossart merged commit 2291b8c into thesofproject:topic/sof-dev May 16, 2019
plbossart pushed a commit that referenced this pull request May 1, 2020
Running with KASAN on a VIM3L systems leads to the following splat
when probing the Ethernet device:

==================================================================
BUG: KASAN: global-out-of-bounds in _get_maxdiv+0x74/0xd8
Read of size 4 at addr ffffa000090615f4 by task systemd-udevd/139
CPU: 1 PID: 139 Comm: systemd-udevd Tainted: G            E     5.7.0-rc1-00101-g8624b7577b9c #781
Hardware name: amlogic w400/w400, BIOS 2020.01-rc5 03/12/2020
Call trace:
 dump_backtrace+0x0/0x2a0
 show_stack+0x20/0x30
 dump_stack+0xec/0x148
 print_address_description.isra.12+0x70/0x35c
 __kasan_report+0xfc/0x1d4
 kasan_report+0x4c/0x68
 __asan_load4+0x9c/0xd8
 _get_maxdiv+0x74/0xd8
 clk_divider_bestdiv+0x74/0x5e0
 clk_divider_round_rate+0x80/0x1a8
 clk_core_determine_round_nolock.part.9+0x9c/0xd0
 clk_core_round_rate_nolock+0xf0/0x108
 clk_hw_round_rate+0xac/0xf0
 clk_factor_round_rate+0xb8/0xd0
 clk_core_determine_round_nolock.part.9+0x9c/0xd0
 clk_core_round_rate_nolock+0xf0/0x108
 clk_core_round_rate_nolock+0xbc/0x108
 clk_core_set_rate_nolock+0xc4/0x2e8
 clk_set_rate+0x58/0xe0
 meson8b_dwmac_probe+0x588/0x72c [dwmac_meson8b]
 platform_drv_probe+0x78/0xd8
 really_probe+0x158/0x610
 driver_probe_device+0x140/0x1b0
 device_driver_attach+0xa4/0xb0
 __driver_attach+0xcc/0x1c8
 bus_for_each_dev+0xf4/0x168
 driver_attach+0x3c/0x50
 bus_add_driver+0x238/0x2e8
 driver_register+0xc8/0x1e8
 __platform_driver_register+0x88/0x98
 meson8b_dwmac_driver_init+0x28/0x1000 [dwmac_meson8b]
 do_one_initcall+0xa8/0x328
 do_init_module+0xe8/0x368
 load_module+0x3300/0x36b0
 __do_sys_finit_module+0x120/0x1a8
 __arm64_sys_finit_module+0x4c/0x60
 el0_svc_common.constprop.2+0xe4/0x268
 do_el0_svc+0x98/0xa8
 el0_svc+0x24/0x68
 el0_sync_handler+0x12c/0x318
 el0_sync+0x158/0x180

The buggy address belongs to the variable:
 div_table.63646+0x34/0xfffffffffffffa40 [dwmac_meson8b]

Memory state around the buggy address:
 ffffa00009061480: fa fa fa fa 00 00 00 01 fa fa fa fa 00 00 00 00
 ffffa00009061500: 05 fa fa fa fa fa fa fa 00 04 fa fa fa fa fa fa
>ffffa00009061580: 00 03 fa fa fa fa fa fa 00 00 00 00 00 00 fa fa
                                                             ^
 ffffa00009061600: fa fa fa fa 00 01 fa fa fa fa fa fa 01 fa fa fa
 ffffa00009061680: fa fa fa fa 00 01 fa fa fa fa fa fa 04 fa fa fa
==================================================================

Digging into this indeed shows that the clock divider array is
lacking a final fence, and that the clock subsystems goes in the
weeds. Oh well.

Let's add the empty structure that indicates the end of the array.

Fixes: bd6f485 ("net: stmmac: dwmac-meson8b: Fix the RGMII TX delay on Meson8b/8m2 SoCs")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
naveen-manohar pushed a commit to naveen-manohar/linux that referenced this pull request May 14, 2020
[ Upstream commit f0212a5 ]

Running with KASAN on a VIM3L systems leads to the following splat
when probing the Ethernet device:

==================================================================
BUG: KASAN: global-out-of-bounds in _get_maxdiv+0x74/0xd8
Read of size 4 at addr ffffa000090615f4 by task systemd-udevd/139
CPU: 1 PID: 139 Comm: systemd-udevd Tainted: G            E     5.7.0-rc1-00101-g8624b7577b9c thesofproject#781
Hardware name: amlogic w400/w400, BIOS 2020.01-rc5 03/12/2020
Call trace:
 dump_backtrace+0x0/0x2a0
 show_stack+0x20/0x30
 dump_stack+0xec/0x148
 print_address_description.isra.12+0x70/0x35c
 __kasan_report+0xfc/0x1d4
 kasan_report+0x4c/0x68
 __asan_load4+0x9c/0xd8
 _get_maxdiv+0x74/0xd8
 clk_divider_bestdiv+0x74/0x5e0
 clk_divider_round_rate+0x80/0x1a8
 clk_core_determine_round_nolock.part.9+0x9c/0xd0
 clk_core_round_rate_nolock+0xf0/0x108
 clk_hw_round_rate+0xac/0xf0
 clk_factor_round_rate+0xb8/0xd0
 clk_core_determine_round_nolock.part.9+0x9c/0xd0
 clk_core_round_rate_nolock+0xf0/0x108
 clk_core_round_rate_nolock+0xbc/0x108
 clk_core_set_rate_nolock+0xc4/0x2e8
 clk_set_rate+0x58/0xe0
 meson8b_dwmac_probe+0x588/0x72c [dwmac_meson8b]
 platform_drv_probe+0x78/0xd8
 really_probe+0x158/0x610
 driver_probe_device+0x140/0x1b0
 device_driver_attach+0xa4/0xb0
 __driver_attach+0xcc/0x1c8
 bus_for_each_dev+0xf4/0x168
 driver_attach+0x3c/0x50
 bus_add_driver+0x238/0x2e8
 driver_register+0xc8/0x1e8
 __platform_driver_register+0x88/0x98
 meson8b_dwmac_driver_init+0x28/0x1000 [dwmac_meson8b]
 do_one_initcall+0xa8/0x328
 do_init_module+0xe8/0x368
 load_module+0x3300/0x36b0
 __do_sys_finit_module+0x120/0x1a8
 __arm64_sys_finit_module+0x4c/0x60
 el0_svc_common.constprop.2+0xe4/0x268
 do_el0_svc+0x98/0xa8
 el0_svc+0x24/0x68
 el0_sync_handler+0x12c/0x318
 el0_sync+0x158/0x180

The buggy address belongs to the variable:
 div_table.63646+0x34/0xfffffffffffffa40 [dwmac_meson8b]

Memory state around the buggy address:
 ffffa00009061480: fa fa fa fa 00 00 00 01 fa fa fa fa 00 00 00 00
 ffffa00009061500: 05 fa fa fa fa fa fa fa 00 04 fa fa fa fa fa fa
>ffffa00009061580: 00 03 fa fa fa fa fa fa 00 00 00 00 00 00 fa fa
                                                             ^
 ffffa00009061600: fa fa fa fa 00 01 fa fa fa fa fa fa 01 fa fa fa
 ffffa00009061680: fa fa fa fa 00 01 fa fa fa fa fa fa 04 fa fa fa
==================================================================

Digging into this indeed shows that the clock divider array is
lacking a final fence, and that the clock subsystems goes in the
weeds. Oh well.

Let's add the empty structure that indicates the end of the array.

Fixes: bd6f485 ("net: stmmac: dwmac-meson8b: Fix the RGMII TX delay on Meson8b/8m2 SoCs")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
bardliao pushed a commit to bardliao/linux that referenced this pull request Jul 14, 2021
Fixes checkpatch issues:

  CHECK: Alignment should match open parenthesis
  thesofproject#87: FILE: drivers/reset/core.c:87:
  +static int of_reset_simple_xlate(struct reset_controller_dev *rcdev,
  +			  const struct of_phandle_args *reset_spec)

  CHECK: Lines should not end with a '('
  thesofproject#540: FILE: drivers/reset/core.c:540:
  +static struct reset_control *__reset_control_get_internal(

  CHECK: Alignment should match open parenthesis
  thesofproject#603: FILE: drivers/reset/core.c:603:
  +struct reset_control *__of_reset_control_get(struct device_node *node,
  +				     const char *id, int index, bool shared,

  CHECK: Alignment should match open parenthesis
  thesofproject#781: FILE: drivers/reset/core.c:781:
  +struct reset_control *__devm_reset_control_get(struct device *dev,
  +				     const char *id, int index, bool shared,

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Unclear No agreement on problem statement and resolution

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants