Skip to content

Conversation

@ranj063
Copy link
Collaborator

@ranj063 ranj063 commented Feb 7, 2019

Set platform name for dai links.

Signed-off-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com

@ranj063 ranj063 requested a review from plbossart February 7, 2019 02:27
@ranj063 ranj063 force-pushed the fix/sof_rt5682 branch 2 times, most recently from 94a8fd4 to 17d286d Compare February 7, 2019 04:42
Copy link
Member

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only minor nitpicks to align with other drivers

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try to follow the other machine drivers and put this in the code block, not in declarations.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

separate line for ret

Set platform name for dai links.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
@ranj063
Copy link
Collaborator Author

ranj063 commented Feb 7, 2019

@plbossart updated now

Copy link
Member

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Ranjani

@plbossart plbossart merged commit 216cb93 into thesofproject:topic/sof-dev Feb 7, 2019
@ranj063 ranj063 deleted the fix/sof_rt5682 branch February 8, 2019 06:36
bardliao pushed a commit to bardliao/linux that referenced this pull request Feb 25, 2020
Damien Le Moal reports a lockdep splat with the acpi_power_meter,
observed with Linux v5.5 and later.

======================================================
WARNING: possible circular locking dependency detected
5.6.0-rc2+ thesofproject#629 Not tainted
------------------------------------------------------
python/1397 is trying to acquire lock:
ffff888619080070 (&resource->lock){+.+.}, at: show_power+0x3c/0xa0 [acpi_power_meter]

               but task is already holding lock:
ffff88881643f188 (kn->count#119){++++}, at: kernfs_seq_start+0x6a/0x160

               which lock already depends on the new lock.

               the existing dependency chain (in reverse order) is:

               -> #1 (kn->count#119){++++}:
       __kernfs_remove+0x626/0x7e0
       kernfs_remove_by_name_ns+0x41/0x80
       remove_attrs+0xcb/0x3c0 [acpi_power_meter]
       acpi_power_meter_notify+0x1f7/0x310 [acpi_power_meter]
       acpi_ev_notify_dispatch+0x198/0x1f3
       acpi_os_execute_deferred+0x4d/0x70
       process_one_work+0x7c8/0x1340
       worker_thread+0x94/0xc70
       kthread+0x2ed/0x3f0
       ret_from_fork+0x24/0x30

               -> #0 (&resource->lock){+.+.}:
       __lock_acquire+0x20be/0x49b0
       lock_acquire+0x127/0x340
       __mutex_lock+0x15b/0x1350
       show_power+0x3c/0xa0 [acpi_power_meter]
       dev_attr_show+0x3f/0x80
       sysfs_kf_seq_show+0x216/0x410
       seq_read+0x407/0xf90
       vfs_read+0x152/0x2c0
       ksys_read+0xf3/0x1d0
       do_syscall_64+0x95/0x1010
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

               other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(kn->count#119);
                               lock(&resource->lock);
                               lock(kn->count#119);
  lock(&resource->lock);

                *** DEADLOCK ***
4 locks held by python/1397:
 #0: ffff8890242d64e0 (&f->f_pos_lock){+.+.}, at: __fdget_pos+0x9b/0xb0
 #1: ffff889040be74e0 (&p->lock){+.+.}, at: seq_read+0x6b/0xf90
 #2: ffff8890448eb880 (&of->mutex){+.+.}, at: kernfs_seq_start+0x47/0x160
 #3: ffff88881643f188 (kn->count#119){++++}, at: kernfs_seq_start+0x6a/0x160

               stack backtrace:
CPU: 10 PID: 1397 Comm: python Not tainted 5.6.0-rc2+ thesofproject#629
Hardware name: Supermicro Super Server/X11DPL-i, BIOS 3.1 05/21/2019
Call Trace:
 dump_stack+0x97/0xe0
 check_noncircular+0x32e/0x3e0
 ? print_circular_bug.isra.0+0x1e0/0x1e0
 ? unwind_next_frame+0xb9a/0x1890
 ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
 ? graph_lock+0x79/0x170
 ? __lockdep_reset_lock+0x3c0/0x3c0
 ? mark_lock+0xbc/0x1150
 __lock_acquire+0x20be/0x49b0
 ? mark_held_locks+0xe0/0xe0
 ? stack_trace_save+0x91/0xc0
 lock_acquire+0x127/0x340
 ? show_power+0x3c/0xa0 [acpi_power_meter]
 ? device_remove_bin_file+0x10/0x10
 ? device_remove_bin_file+0x10/0x10
 __mutex_lock+0x15b/0x1350
 ? show_power+0x3c/0xa0 [acpi_power_meter]
 ? show_power+0x3c/0xa0 [acpi_power_meter]
 ? mutex_lock_io_nested+0x11f0/0x11f0
 ? lock_downgrade+0x6a0/0x6a0
 ? kernfs_seq_start+0x47/0x160
 ? lock_acquire+0x127/0x340
 ? kernfs_seq_start+0x6a/0x160
 ? device_remove_bin_file+0x10/0x10
 ? show_power+0x3c/0xa0 [acpi_power_meter]
 show_power+0x3c/0xa0 [acpi_power_meter]
 dev_attr_show+0x3f/0x80
 ? memset+0x20/0x40
 sysfs_kf_seq_show+0x216/0x410
 seq_read+0x407/0xf90
 ? security_file_permission+0x16f/0x2c0
 vfs_read+0x152/0x2c0

Problem is that reading an attribute takes the kernfs lock in the kernfs
code, then resource->lock in the driver. During an ACPI notification, the
opposite happens: The resource lock is taken first, followed by the kernfs
lock when sysfs attributes are removed and re-created. Presumably this is
now seen due to some locking related changes in kernfs after v5.4, but it
was likely always a problem.

Fix the problem by not blindly acquiring the lock in the notification
function. It is only needed to protect the various update functions.
However, those update functions are called anyway when sysfs attributes
are read. This means that we can just stop calling those functions from
the notifier, and the resource lock in the notifier function is no longer
needed.

That leaves two situations:

First, METER_NOTIFY_CONFIG removes and re-allocates capability strings.
While it did so under the resource lock, _displaying_ those strings was not
protected, creating a race condition. To solve this problem, selectively
protect both removal/creation and reporting of capability attributes with
the resource lock.

Second, removing and re-creating the attribute files is no longer protected
by the resource lock. That doesn't matter since access to each individual
attribute is protected by the kernfs lock. Userspace may get messed up if
attributes disappear and reappear under its nose, but that is not different
than today, and there is nothing we can do about it without major driver
restructuring.

Last but not least, when removing the driver, remove attribute functions
first, then release capability strings. This avoids yet another race
condition.

Reported-by: Damien Le Moal <Damien.LeMoal@wdc.com>
Cc: Damien Le Moal <Damien.LeMoal@wdc.com>
Cc: stable@vger.kernel.org # v5.5+
Tested-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
plbossart pushed a commit that referenced this pull request Jun 16, 2023
Fixes the following splat on a6xx gen2+ (a640, a650, a660 families),
a6xx gen1 has smaller GMU allocations so they fit under the default
64K max segment size.

   ------------[ cut here ]------------
   DMA-API: msm_dpu ae01000.display-controller: mapping sg segment longer than device claims to support [len=126976] [max=65536]
   WARNING: CPU: 5 PID: 9 at kernel/dma/debug.c:1160 debug_dma_map_sg+0x288/0x314
   Modules linked in:
   CPU: 5 PID: 9 Comm: kworker/u16:0 Not tainted 6.3.0-rc2-debug+ #629
   Hardware name: Google Villager (rev1+) with LTE (DT)
   Workqueue: events_unbound deferred_probe_work_func
   pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
   pc : debug_dma_map_sg+0x288/0x314
   lr : debug_dma_map_sg+0x288/0x314
   sp : ffffffc00809b560
   x29: ffffffc00809b560 x28: 0000000000000060 x27: 0000000000000000
   x26: 0000000000010000 x25: 0000000000000004 x24: 0000000000000004
   x23: ffffffffffffffff x22: ffffffdb31693cc0 x21: ffffff8080935800
   x20: ffffff8087417400 x19: ffffff8087a45010 x18: 0000000000000000
   x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000010000
   x14: 0000000000000001 x13: ffffffffffffffff x12: ffffffffffffffff
   x11: 0000000000000000 x10: 000000000000000a x9 : ffffffdb2ff05e14
   x8 : ffffffdb31275000 x7 : ffffffdb2ff08908 x6 : 0000000000000000
   x5 : 0000000000000001 x4 : ffffffdb2ff08a74 x3 : ffffffdb31275008
   x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffff80803a9a80
   Call trace:
    debug_dma_map_sg+0x288/0x314
    __dma_map_sg_attrs+0x80/0xe4
    dma_map_sgtable+0x30/0x4c
    get_pages+0x1d4/0x1e4
    msm_gem_pin_pages_locked+0xbc/0xf8
    msm_gem_pin_vma_locked+0x58/0xa0
    msm_gem_get_and_pin_iova_range+0x98/0xac
    a6xx_gmu_memory_alloc+0x7c/0x128
    a6xx_gmu_init+0x16c/0x9b0
    a6xx_gpu_init+0x38c/0x3e4
    adreno_bind+0x214/0x264
    component_bind_all+0x128/0x1f8
    msm_drm_bind+0x2b8/0x608
    try_to_bring_up_aggregate_device+0x88/0x1a4
    __component_add+0xec/0x13c
    component_add+0x1c/0x28
    dp_display_probe+0x3f8/0x43c
    platform_probe+0x70/0xc4
    really_probe+0x148/0x280
    __driver_probe_device+0xc8/0xe0
    driver_probe_device+0x44/0x100
    __device_attach_driver+0x64/0xdc
    bus_for_each_drv+0xb0/0xd8
    __device_attach+0xd8/0x168
    device_initial_probe+0x1c/0x28
    bus_probe_device+0x44/0xb0
    deferred_probe_work_func+0xc8/0xe0
    process_one_work+0x2e0/0x488
    process_scheduled_works+0x4c/0x50
    worker_thread+0x218/0x274
    kthread+0xf0/0x100
    ret_from_fork+0x10/0x20
   irq event stamp: 293712
   hardirqs last  enabled at (293711): [<ffffffdb2ff0893c>] vprintk_emit+0x160/0x25c
   hardirqs last disabled at (293712): [<ffffffdb30b48130>] el1_dbg+0x24/0x80
   softirqs last  enabled at (279520): [<ffffffdb2fe10420>] __do_softirq+0x21c/0x4bc
   softirqs last disabled at (279515): [<ffffffdb2fe16708>] ____do_softirq+0x18/0x24
   ---[ end trace 0000000000000000 ]---

Signed-off-by: Rob Clark <robdclark@chromium.org>
Fixes: db735fc ("drm/msm: Set dma maximum segment size for mdss")
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/534892/
Link: https://lore.kernel.org/r/20230501204441.1642741-1-robdclark@gmail.com
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants