Skip to content

Conversation

@JustAMan
Copy link

@JustAMan JustAMan commented Sep 7, 2014

Might fix some other multi-button headsets as well

@M1cha
Copy link
Owner

M1cha commented Sep 8, 2014

Merged to official CM repos

@M1cha M1cha closed this Sep 8, 2014
M1cha pushed a commit that referenced this pull request Nov 16, 2014
An inactive timer's base can refer to a offline cpu's base.

In the current code, cpu_base's lock is blindly reinitialized
each time a CPU is brought up. If a CPU is brought online
during the period that another thread is trying to modify an
inactive timer on that CPU with holding its timer base lock,
then the lock will be reinitialized under its feet. This leads
to following SPIN_BUG().

<0> BUG: spinlock already unlocked on CPU#3, kworker/u:3/1466
<0> lock: 0xe3ebe000, .magic: dead4ead, .owner: kworker/u:3/1466,
 .owner_cpu: 1
<4> [<c0013dc4>] (unwind_backtrace+0x0/0x11c) from [<c026e794>]
(do_raw_spin_unlock+0x40/0xcc)
<4> [<c026e794>] (do_raw_spin_unlock+0x40/0xcc) from [<c076c160>]
(_raw_spin_unlock+0x8/0x30)
<4> [<c076c160>] (_raw_spin_unlock+0x8/0x30) from [<c009b858>]
(mod_timer+0x294/0x310)
<4> [<c009b858>] (mod_timer+0x294/0x310) from [<c00a5e04>]
(queue_delayed_work_on+0x104/0x120)
<4> [<c00a5e04>] (queue_delayed_work_on+0x104/0x120) from [<c04eae00>]
(sdhci_msm_bus_voting+0x88/0x9c)
<4> [<c04eae00>] (sdhci_msm_bus_voting+0x88/0x9c) from [<c04d8780>]
(sdhci_disable+0x40/0x48)
<4> [<c04d8780>] (sdhci_disable+0x40/0x48) from [<c04bf300>]
(mmc_release_host+0x4c/0xb0)
<4> [<c04bf300>] (mmc_release_host+0x4c/0xb0) from [<c04c7aac>]
(mmc_sd_detect+0x90/0xfc)
<4> [<c04c7aac>] (mmc_sd_detect+0x90/0xfc) from [<c04c2504>]
(mmc_rescan+0x7c/0x2c4)
<4> [<c04c2504>] (mmc_rescan+0x7c/0x2c4) from [<c00a6a7c>]
(process_one_work+0x27c/0x484)
<4> [<c00a6a7c>] (process_one_work+0x27c/0x484) from [<c00a6e94>]
(worker_thread+0x210/0x3b0)
<4> [<c00a6e94>] (worker_thread+0x210/0x3b0) from [<c00aad9c>]
(kthread+0x80/0x8c)
<4> [<c00aad9c>] (kthread+0x80/0x8c) from [<c000ea80>]
(kernel_thread_exit+0x0/0x8)

As an example, this particular crash occurred when CPU #3 is executing
mod_timer() on an inactive timer whose base is refered to offlined CPU #2.
The code locked the timer_base corresponding to CPU #2. Before it could
proceed, CPU #2 came online and reinitialized the spinlock corresponding
to its base. Thus now CPU #3 held a lock which was reinitialized. When
CPU #3 finally ended up unlocking the old cpu_base corresponding to CPU #2,
we hit the above SPIN_BUG().

CPU #0			CPU #3				       CPU #2
------			-------				       -------
.....			 ......				      <Offline>
			mod_timer()
			 lock_timer_base
			  spin_lock_irqsave(&base->lock)

cpu_up(2)		 .....				        ......
							 init_timers_cpu()
.....		 	 spin_unlock_irqrestore(&base->lock)     ......
			   <spin_bug>

Allocation of per_cpu timer vector bases is done only once under
"tvec_base_done[]" check. In the current code, spinlock_initialization
of base->lock isn't under this check. When a CPU is up each time the base
lock is reinitialized. Move base spinlock initialization under the check.

CRs-Fixed: 471127
Change-Id: I73b48440fffb227a60af9180e318c851048530dd
Signed-off-by: Tirupathi Reddy <tirupath@codeaurora.org>
Signed-off-by: Ed Tam <etam@google.com>
ivan19871002 referenced this pull request Dec 8, 2014
* Sync to android-msm-3.4-flo-lollipop-release

Change-Id: I6595aadd8aca4e2cd724266e42879ca1244eb2fa
M1cha pushed a commit that referenced this pull request Feb 16, 2015
This patch makes clearer the ambiguous f2fs_gc flow as follows.

1. Remove intermediate checkpoint condition during f2fs_gc
 (i.e., should_do_checkpoint() and GC_BLOCKED)

2. Remove unnecessary return values of f2fs_gc because of #1.
 (i.e., GC_NODE, GC_OK, etc)

3. Simplify write_checkpoint() because of #2.

4. Clarify the main f2fs_gc flow.
 o monitor how many freed sections during one iteration of do_garbage_collect().
 o do GC more without checkpoints if we can't get enough free sections.
 o do checkpoint once we've got enough free sections through forground GCs.

5. Adopt thread-logging (Slack-Space-Recycle) scheme more aggressively on data
  log types. See. get_ssr_segement()

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
(cherry picked from commit 242008244cbadcaec57520df690060eee0def654)
M1cha pushed a commit that referenced this pull request Feb 16, 2015
The get_node_page_ra tries to:
1. grab or read a target node page for the given nid,
2. then, call ra_node_page to read other adjacent node pages in advance.

So, when we try to read a target node page by #1, we should submit bio with
READ_SYNC instead of READA.
And, in #2, READA should be used.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
M1cha pushed a commit that referenced this pull request Feb 16, 2015
o Deadlock case #1

Thread 1:
- writeback_sb_inodes
 - do_writepages
  - f2fs_write_data_pages
   - write_cache_pages
    - f2fs_write_data_page
     - f2fs_balance_fs
      - wait mutex_lock(gc_mutex)

Thread 2:
- f2fs_balance_fs
 - mutex_lock(gc_mutex)
 - f2fs_gc
  - f2fs_iget
   - wait iget_locked(inode->i_lock)

Thread 3:
- do_unlinkat
 - iput
  - lock(inode->i_lock)
   - evict
    - inode_wait_for_writeback

o Deadlock case #2

Thread 1:
- __writeback_single_inode
 : set I_SYNC
  - do_writepages
   - f2fs_write_data_page
    - f2fs_balance_fs
     - f2fs_gc
      - iput
       - evict
        - inode_wait_for_writeback(I_SYNC)

In order to avoid this, even though iput is called with the zero-reference
count, we need to stop the eviction procedure if the inode is on writeback.
So this patch links f2fs_drop_inode which checks the I_SYNC flag.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
sndnvaps pushed a commit to multirom-aries/android_kernel_xiaomi_aries that referenced this pull request May 20, 2016
commit 3fd7b60f2c7418239d586e359e0c6d8503e10646 upstream.

This patch drops legacy active_ts_list usage within iscsi_target_tq.c
code.  It was originally used to track the active thread sets during
iscsi-target shutdown, and is no longer used by modern upstream code.

Two people have reported list corruption using traditional iscsi-target
and iser-target with the following backtrace, that appears to be related
to iscsi_thread_set->ts_list being used across both active_ts_list and
inactive_ts_list.

[   60.782534] ------------[ cut here ]------------
[   60.782543] WARNING: CPU: 0 PID: 9430 at lib/list_debug.c:53 __list_del_entry+0x63/0xd0()
[   60.782545] list_del corruption, ffff88045b00d180->next is LIST_POISON1 (dead000000100100)
[   60.782546] Modules linked in: ib_srpt tcm_qla2xxx qla2xxx tcm_loop tcm_fc libfc scsi_transport_fc scsi_tgt ib_isert rdma_cm iw_cm ib_addr iscsi_target_mod target_core_pscsi target_core_file target_core_iblock target_core_mod configfs ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter ip_tables bridge stp llc autofs4 sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 ib_ipoib ib_cm ib_uverbs ib_umad mlx4_en mlx4_ib ib_sa ib_mad ib_core mlx4_core dm_mirror dm_region_hash dm_log dm_mod vhost_net macvtap macvlan vhost tun kvm_intel kvm uinput iTCO_wdt iTCO_vendor_support microcode serio_raw pcspkr sb_edac edac_core sg i2c_i801 lpc_ich mfd_core mtip32xx igb i2c_algo_bit i2c_core ptp pps_core ioatdma dca wmi ext3(F) jbd(F) mbcache(F) sd_mod(F) crc_t10dif(F) crct10dif_common(F) ahci(F) libahci(F) isci(F) libsas(F) scsi_transport_sas(F) [last unloaded: speedstep_lib]
[   60.782597] CPU: 0 PID: 9430 Comm: iscsi_ttx Tainted: GF 3.12.19+ M1cha#2
[   60.782598] Hardware name: Supermicro X9DRX+-F/X9DRX+-F, BIOS 3.00 07/09/2013
[   60.782599]  0000000000000035 ffff88044de31d08 ffffffff81553ae7 0000000000000035
[   60.782602]  ffff88044de31d58 ffff88044de31d48 ffffffff8104d1cc 0000000000000002
[   60.782605]  ffff88045b00d180 ffff88045b00d0c0 ffff88045b00d0c0 ffff88044de31e58
[   60.782607] Call Trace:
[   60.782611]  [<ffffffff81553ae7>] dump_stack+0x49/0x62
[   60.782615]  [<ffffffff8104d1cc>] warn_slowpath_common+0x8c/0xc0
[   60.782618]  [<ffffffff8104d2b6>] warn_slowpath_fmt+0x46/0x50
[   60.782620]  [<ffffffff81280933>] __list_del_entry+0x63/0xd0
[   60.782622]  [<ffffffff812809b1>] list_del+0x11/0x40
[   60.782630]  [<ffffffffa06e7cf9>] iscsi_del_ts_from_active_list+0x29/0x50 [iscsi_target_mod]
[   60.782635]  [<ffffffffa06e87b1>] iscsi_tx_thread_pre_handler+0xa1/0x180 [iscsi_target_mod]
[   60.782642]  [<ffffffffa06fb9ae>] iscsi_target_tx_thread+0x4e/0x220 [iscsi_target_mod]
[   60.782647]  [<ffffffffa06fb960>] ? iscsit_handle_snack+0x190/0x190 [iscsi_target_mod]
[   60.782652]  [<ffffffffa06fb960>] ? iscsit_handle_snack+0x190/0x190 [iscsi_target_mod]
[   60.782655]  [<ffffffff8106f99e>] kthread+0xce/0xe0
[   60.782657]  [<ffffffff8106f8d0>] ? kthread_freezable_should_stop+0x70/0x70
[   60.782660]  [<ffffffff8156026c>] ret_from_fork+0x7c/0xb0
[   60.782662]  [<ffffffff8106f8d0>] ? kthread_freezable_should_stop+0x70/0x70
[   60.782663] ---[ end trace 9662f4a661d33965 ]---

Since this code is no longer used, go ahead and drop the problematic usage
all-together.

Reported-by: Gavin Guo <gavin.guo@canonical.com>
Reported-by: Moussa Ba <moussaba@micron.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <lizefan@huawei.com>
sndnvaps pushed a commit to multirom-aries/android_kernel_xiaomi_aries that referenced this pull request May 20, 2016
An inactive timer's base can refer to a offline cpu's base.

In the current code, cpu_base's lock is blindly reinitialized
each time a CPU is brought up. If a CPU is brought online
during the period that another thread is trying to modify an
inactive timer on that CPU with holding its timer base lock,
then the lock will be reinitialized under its feet. This leads
to following SPIN_BUG().

<0> BUG: spinlock already unlocked on CPU#3, kworker/u:3/1466
<0> lock: 0xe3ebe000, .magic: dead4ead, .owner: kworker/u:3/1466,
 .owner_cpu: 1
<4> [<c0013dc4>] (unwind_backtrace+0x0/0x11c) from [<c026e794>]
(do_raw_spin_unlock+0x40/0xcc)
<4> [<c026e794>] (do_raw_spin_unlock+0x40/0xcc) from [<c076c160>]
(_raw_spin_unlock+0x8/0x30)
<4> [<c076c160>] (_raw_spin_unlock+0x8/0x30) from [<c009b858>]
(mod_timer+0x294/0x310)
<4> [<c009b858>] (mod_timer+0x294/0x310) from [<c00a5e04>]
(queue_delayed_work_on+0x104/0x120)
<4> [<c00a5e04>] (queue_delayed_work_on+0x104/0x120) from [<c04eae00>]
(sdhci_msm_bus_voting+0x88/0x9c)
<4> [<c04eae00>] (sdhci_msm_bus_voting+0x88/0x9c) from [<c04d8780>]
(sdhci_disable+0x40/0x48)
<4> [<c04d8780>] (sdhci_disable+0x40/0x48) from [<c04bf300>]
(mmc_release_host+0x4c/0xb0)
<4> [<c04bf300>] (mmc_release_host+0x4c/0xb0) from [<c04c7aac>]
(mmc_sd_detect+0x90/0xfc)
<4> [<c04c7aac>] (mmc_sd_detect+0x90/0xfc) from [<c04c2504>]
(mmc_rescan+0x7c/0x2c4)
<4> [<c04c2504>] (mmc_rescan+0x7c/0x2c4) from [<c00a6a7c>]
(process_one_work+0x27c/0x484)
<4> [<c00a6a7c>] (process_one_work+0x27c/0x484) from [<c00a6e94>]
(worker_thread+0x210/0x3b0)
<4> [<c00a6e94>] (worker_thread+0x210/0x3b0) from [<c00aad9c>]
(kthread+0x80/0x8c)
<4> [<c00aad9c>] (kthread+0x80/0x8c) from [<c000ea80>]
(kernel_thread_exit+0x0/0x8)

As an example, this particular crash occurred when CPU #3 is executing
mod_timer() on an inactive timer whose base is refered to offlined CPU M1cha#2.
The code locked the timer_base corresponding to CPU M1cha#2. Before it could
proceed, CPU M1cha#2 came online and reinitialized the spinlock corresponding
to its base. Thus now CPU #3 held a lock which was reinitialized. When
CPU #3 finally ended up unlocking the old cpu_base corresponding to CPU M1cha#2,
we hit the above SPIN_BUG().

CPU #0			CPU #3				       CPU M1cha#2
------			-------				       -------
.....			 ......				      <Offline>
			mod_timer()
			 lock_timer_base
			  spin_lock_irqsave(&base->lock)

cpu_up(2)		 .....				        ......
							 init_timers_cpu()
.....		 	 spin_unlock_irqrestore(&base->lock)     ......
			   <spin_bug>

Allocation of per_cpu timer vector bases is done only once under
"tvec_base_done[]" check. In the current code, spinlock_initialization
of base->lock isn't under this check. When a CPU is up each time the base
lock is reinitialized. Move base spinlock initialization under the check.

CRs-Fixed: 471127
Change-Id: I73b48440fffb227a60af9180e318c851048530dd
Signed-off-by: Tirupathi Reddy <tirupath@codeaurora.org>
Signed-off-by: Ed Tam <etam@google.com>
sndnvaps pushed a commit to multirom-aries/android_kernel_xiaomi_aries that referenced this pull request Jun 9, 2016
commit eddd3826a1a0190e5235703d1e666affa4d13b96 upstream.

Dmitry Vyukov reported the following using trinity and the memory
error detector AddressSanitizer
(https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel).

[ 124.575597] ERROR: AddressSanitizer: heap-buffer-overflow on
address ffff88002e280000
[ 124.576801] ffff88002e280000 is located 131938492886538 bytes to
the left of 28857600-byte region [ffffffff81282e0a, ffffffff82e0830a)
[ 124.578633] Accessed by thread T10915:
[ 124.579295] inlined in describe_heap_address
./arch/x86/mm/asan/report.c:164
[ 124.579295] #0 ffffffff810dd277 in asan_report_error
./arch/x86/mm/asan/report.c:278
[ 124.580137] M1cha#1 ffffffff810dc6a0 in asan_check_region
./arch/x86/mm/asan/asan.c:37
[ 124.581050] M1cha#2 ffffffff810dd423 in __tsan_read8 ??:0
[ 124.581893] #3 ffffffff8107c093 in get_wchan
./arch/x86/kernel/process_64.c:444

The address checks in the 64bit implementation of get_wchan() are
wrong in several ways:

 - The lower bound of the stack is not the start of the stack
   page. It's the start of the stack page plus sizeof (struct
   thread_info)

 - The upper bound must be:

       top_of_stack - TOP_OF_KERNEL_STACK_PADDING - 2 * sizeof(unsigned long).

   The 2 * sizeof(unsigned long) is required because the stack pointer
   points at the frame pointer. The layout on the stack is: ... IP FP
   ... IP FP. So we need to make sure that both IP and FP are in the
   bounds.

Fix the bound checks and get rid of the mix of numeric constants, u64
and unsigned long. Making all unsigned long allows us to use the same
function for 32bit as well.

Use READ_ONCE() when accessing the stack. This does not prevent a
concurrent wakeup of the task and the stack changing, but at least it
avoids TOCTOU.

Also check task state at the end of the loop. Again that does not
prevent concurrent changes, but it avoids walking for nothing.

Add proper comments while at it.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Based-on-patch-from: Wolfram Gloger <wmglo@dent.med.uni-muenchen.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: kasan-dev <kasan-dev@googlegroups.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Wolfram Gloger <wmglo@dent.med.uni-muenchen.de>
Link: http://lkml.kernel.org/r/20150930083302.694788319@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[lizf: Backported to 3.4:
 - s/READ_ONCE/ACCESS_ONCE
 - remove TOP_OF_KERNEL_STACK_PADDING]
Signed-off-by: Zefan Li <lizefan@huawei.com>
sndnvaps pushed a commit to multirom-aries/android_kernel_xiaomi_aries that referenced this pull request Jun 9, 2016
commit e81107d4c6bd098878af9796b24edc8d4a9524fd upstream.

My colleague ran into a program stall on a x86_64 server, where
n_tty_read() was waiting for data even if there was data in the buffer
in the pty.  kernel stack for the stuck process looks like below.
 #0 [ffff88303d107b58] __schedule at ffffffff815c4b20
 M1cha#1 [ffff88303d107bd0] schedule at ffffffff815c513e
 M1cha#2 [ffff88303d107bf0] schedule_timeout at ffffffff815c7818
 #3 [ffff88303d107ca0] wait_woken at ffffffff81096bd2
 #4 [ffff88303d107ce0] n_tty_read at ffffffff8136fa23
 #5 [ffff88303d107dd0] tty_read at ffffffff81368013
 #6 [ffff88303d107e20] __vfs_read at ffffffff811a3704
 #7 [ffff88303d107ec0] vfs_read at ffffffff811a3a57
 #8 [ffff88303d107f00] sys_read at ffffffff811a4306
 #9 [ffff88303d107f50] entry_SYSCALL_64_fastpath at ffffffff815c86d7

There seems to be two problems causing this issue.

First, in drivers/tty/n_tty.c, __receive_buf() stores the data and
updates ldata->commit_head using smp_store_release() and then checks
the wait queue using waitqueue_active().  However, since there is no
memory barrier, __receive_buf() could return without calling
wake_up_interactive_poll(), and at the same time, n_tty_read() could
start to wait in wait_woken() as in the following chart.

        __receive_buf()                         n_tty_read()
------------------------------------------------------------------------
if (waitqueue_active(&tty->read_wait))
/* Memory operations issued after the
   RELEASE may be completed before the
   RELEASE operation has completed */
                                        add_wait_queue(&tty->read_wait, &wait);
                                        ...
                                        if (!input_available_p(tty, 0)) {
smp_store_release(&ldata->commit_head,
                  ldata->read_head);
                                        ...
                                        timeout = wait_woken(&wait,
                                          TASK_INTERRUPTIBLE, timeout);
------------------------------------------------------------------------

The second problem is that n_tty_read() also lacks a memory barrier
call and could also cause __receive_buf() to return without calling
wake_up_interactive_poll(), and n_tty_read() to wait in wait_woken()
as in the chart below.

        __receive_buf()                         n_tty_read()
------------------------------------------------------------------------
                                        spin_lock_irqsave(&q->lock, flags);
                                        /* from add_wait_queue() */
                                        ...
                                        if (!input_available_p(tty, 0)) {
                                        /* Memory operations issued after the
                                           RELEASE may be completed before the
                                           RELEASE operation has completed */
smp_store_release(&ldata->commit_head,
                  ldata->read_head);
if (waitqueue_active(&tty->read_wait))
                                        __add_wait_queue(q, wait);
                                        spin_unlock_irqrestore(&q->lock,flags);
                                        /* from add_wait_queue() */
                                        ...
                                        timeout = wait_woken(&wait,
                                          TASK_INTERRUPTIBLE, timeout);
------------------------------------------------------------------------

There are also other places in drivers/tty/n_tty.c which have similar
calls to waitqueue_active(), so instead of adding many memory barrier
calls, this patch simply removes the call to waitqueue_active(),
leaving just wake_up*() behind.

This fixes both problems because, even though the memory access before
or after the spinlocks in both wake_up*() and add_wait_queue() can
sneak into the critical section, it cannot go past it and the critical
section assures that they will be serialized (please see "INTER-CPU
ACQUIRING BARRIER EFFECTS" in Documentation/memory-barriers.txt for a
better explanation).  Moreover, the resulting code is much simpler.

Latency measurement using a ping-pong test over a pty doesn't show any
visible performance drop.

Signed-off-by: Kosuke Tatsukawa <tatsu@ab.jp.nec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lizf: Backported to 3.4:
 - adjust context
 - s/wake_up_interruptible_poll/wake_up_interruptible/
 - drop changes to __receive_buf() and n_tty_set_termios()]
Signed-off-by: Zefan Li <lizefan@huawei.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants