Skip to content

Rename DTS Files#4

Merged
arm000 merged 1 commit intoarm000:linux-next/mboxicfrom
Fishwaldo:dts-rename
Jan 20, 2023
Merged

Rename DTS Files#4
arm000 merged 1 commit intoarm000:linux-next/mboxicfrom
Fishwaldo:dts-rename

Conversation

@Fishwaldo
Copy link

No description provided.


uart0: serial@30002000 {
compatible = "bouffalolab,uart";
compatible = "bflb,bl808-uart";
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this incompatible with the upstream bindings the Bouffalo kernel maintainer posted?

torvalds@d54c88d

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand they never made it to mainline. Plus I'm trying to integrate the latest opensbi with smaeul's patches here where he uses bl808-uart.

smaeul/opensbi@1767f7f

@arm000 arm000 merged commit f6b59e1 into arm000:linux-next/mboxic Jan 20, 2023
@Fishwaldo Fishwaldo deleted the dts-rename branch January 22, 2023 16:39
arm000 pushed a commit that referenced this pull request Jan 25, 2023
When unloading the SCMI core stack module, configured to use the virtio
SCMI transport, LOCKDEP reports the splat down below about unsafe locks
dependencies.

In order to avoid this possible unsafe locking scenario call upfront
virtio_break_device() before getting hold of vioch->lock.

=====================================================
 WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
 6.1.0-00067-g6b934395ba07-dirty #4 Not tainted
 -----------------------------------------------------
 rmmod/307 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
 ffff000080c510e0 (&dev->vqs_list_lock){+.+.}-{3:3}, at: virtio_break_device+0x28/0x68

 and this task is already holding:
 ffff00008288ada0 (&channels[i].lock){-.-.}-{3:3}, at: virtio_chan_free+0x60/0x168 [scmi_module]

 which would create a new lock dependency:
  (&channels[i].lock){-.-.}-{3:3} -> (&dev->vqs_list_lock){+.+.}-{3:3}

 but this new dependency connects a HARDIRQ-irq-safe lock:
  (&channels[i].lock){-.-.}-{3:3}

 ... which became HARDIRQ-irq-safe at:
   lock_acquire+0x128/0x398
   _raw_spin_lock_irqsave+0x78/0x140
   scmi_vio_complete_cb+0xb4/0x3b8 [scmi_module]
   vring_interrupt+0x84/0x120
   vm_interrupt+0x94/0xe8
   __handle_irq_event_percpu+0xb4/0x3d8
   handle_irq_event_percpu+0x20/0x68
   handle_irq_event+0x50/0xb0
   handle_fasteoi_irq+0xac/0x138
   generic_handle_domain_irq+0x34/0x50
   gic_handle_irq+0xa0/0xd8
   call_on_irq_stack+0x2c/0x54
   do_interrupt_handler+0x8c/0x90
   el1_interrupt+0x40/0x78
   el1h_64_irq_handler+0x18/0x28
   el1h_64_irq+0x64/0x68
   _raw_write_unlock_irq+0x48/0x80
   ep_start_scan+0xf0/0x128
   do_epoll_wait+0x390/0x858
   do_compat_epoll_pwait.part.34+0x1c/0xb8
   __arm64_sys_epoll_pwait+0x80/0xd0
   invoke_syscall+0x4c/0x110
   el0_svc_common.constprop.3+0x98/0x120
   do_el0_svc+0x34/0xd0
   el0_svc+0x40/0x98
   el0t_64_sync_handler+0x98/0xc0
   el0t_64_sync+0x170/0x174

 to a HARDIRQ-irq-unsafe lock:
  (&dev->vqs_list_lock){+.+.}-{3:3}

 ... which became HARDIRQ-irq-unsafe at:
 ...
   lock_acquire+0x128/0x398
   _raw_spin_lock+0x58/0x70
   __vring_new_virtqueue+0x130/0x1c0
   vring_create_virtqueue+0xc4/0x2b8
   vm_find_vqs+0x20c/0x430
   init_vq+0x308/0x390
   virtblk_probe+0x114/0x9b0
   virtio_dev_probe+0x1a4/0x248
   really_probe+0xc8/0x3a8
   __driver_probe_device+0x84/0x190
   driver_probe_device+0x44/0x110
   __driver_attach+0x104/0x1e8
   bus_for_each_dev+0x7c/0xd0
   driver_attach+0x2c/0x38
   bus_add_driver+0x1e4/0x258
   driver_register+0x6c/0x128
   register_virtio_driver+0x2c/0x48
   virtio_blk_init+0x70/0xac
   do_one_initcall+0x84/0x420
   kernel_init_freeable+0x2d0/0x340
   kernel_init+0x2c/0x138
   ret_from_fork+0x10/0x20

 other info that might help us debug this:

  Possible interrupt unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&dev->vqs_list_lock);
                                local_irq_disable();
                                lock(&channels[i].lock);
                                lock(&dev->vqs_list_lock);
   <Interrupt>
     lock(&channels[i].lock);

  *** DEADLOCK ***
================

Fixes: 42e90eb ("firmware: arm_scmi: Add a virtio channel refcount")
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20221222183823.518856-6-cristian.marussi@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
arm000 pushed a commit that referenced this pull request Jan 30, 2023
During EEH error injection testing, a deadlock was encountered in the tg3
driver when tg3_io_error_detected() was attempting to cancel outstanding
reset tasks:

crash> foreach UN bt
...
PID: 159    TASK: c0000000067c6000  CPU: 8   COMMAND: "eehd"
...
 #5 [c00000000681f990] __cancel_work_timer at c00000000019fd18
 #6 [c00000000681fa30] tg3_io_error_detected at c00800000295f098 [tg3]
 #7 [c00000000681faf0] eeh_report_error at c00000000004e25c
...

PID: 290    TASK: c000000036e5f800  CPU: 6   COMMAND: "kworker/6:1"
...
 #4 [c00000003721fbc0] rtnl_lock at c000000000c940d8
 #5 [c00000003721fbe0] tg3_reset_task at c008000002969358 [tg3]
 #6 [c00000003721fc60] process_one_work at c00000000019e5c4
...

PID: 296    TASK: c000000037a65800  CPU: 21  COMMAND: "kworker/21:1"
...
 #4 [c000000037247bc0] rtnl_lock at c000000000c940d8
 #5 [c000000037247be0] tg3_reset_task at c008000002969358 [tg3]
 #6 [c000000037247c60] process_one_work at c00000000019e5c4
...

PID: 655    TASK: c000000036f49000  CPU: 16  COMMAND: "kworker/16:2"
...:1

 #4 [c0000000373ebbc0] rtnl_lock at c000000000c940d8
 #5 [c0000000373ebbe0] tg3_reset_task at c008000002969358 [tg3]
 #6 [c0000000373ebc60] process_one_work at c00000000019e5c4
...

Code inspection shows that both tg3_io_error_detected() and
tg3_reset_task() attempt to acquire the RTNL lock at the beginning of
their code blocks.  If tg3_reset_task() should happen to execute between
the times when tg3_io_error_deteced() acquires the RTNL lock and
tg3_reset_task_cancel() is called, a deadlock will occur.

Moving tg3_reset_task_cancel() call earlier within the code block, prior
to acquiring RTNL, prevents this from happening, but also exposes another
deadlock issue where tg3_reset_task() may execute AFTER
tg3_io_error_detected() has executed:

crash> foreach UN bt
PID: 159    TASK: c0000000067d2000  CPU: 9   COMMAND: "eehd"
...
 #4 [c000000006867a60] rtnl_lock at c000000000c940d8
 #5 [c000000006867a80] tg3_io_slot_reset at c0080000026c2ea8 [tg3]
 #6 [c000000006867b00] eeh_report_reset at c00000000004de88
...
PID: 363    TASK: c000000037564000  CPU: 6   COMMAND: "kworker/6:1"
...
 #3 [c000000036c1bb70] msleep at c000000000259e6c
 #4 [c000000036c1bba0] napi_disable at c000000000c6b848
 #5 [c000000036c1bbe0] tg3_reset_task at c0080000026d942c [tg3]
 #6 [c000000036c1bc60] process_one_work at c00000000019e5c4
...

This issue can be avoided by aborting tg3_reset_task() if EEH error
recovery is already in progress.

Fixes: db84bf4 ("tg3: tg3_reset_task() needs to use rtnl_lock to synchronize")
Signed-off-by: David Christensen <drc@linux.vnet.ibm.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://lore.kernel.org/r/20230124185339.225806-1-drc@linux.vnet.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
arm000 pushed a commit that referenced this pull request Feb 5, 2023
Jakub Sitnicki says:

====================

This patch set addresses the syzbot report in [1].

Patch #1 has been suggested by Eric [2]. I extended it to cover the rest of
sock_map proto callbacks. Otherwise we would still overflow the stack.

Patch #2 contains the actual fix and bug analysis.
Patches #3 & #4 add coverage to selftests to trigger the bug.

[1] https://lore.kernel.org/all/00000000000073b14905ef2e7401@google.com/
[2] https://lore.kernel.org/all/CANn89iK2UN1FmdUcH12fv_xiZkv2G+Nskvmq7fG6aA_6VKRf6g@mail.gmail.com/
---
v1 -> v2:
v1: https://lore.kernel.org/r/20230113-sockmap-fix-v1-0-d3cad092ee10@cloudflare.com
[v1 didn't hit bpf@ ML by mistake]

 * pull in Eric's patch to protect against recursion loop bugs (Eric)
 * add a macro helper to check if pointer is inside a memory range (Eric)
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

Comments