Skip to content

Conversation

@Pillar1989
Copy link

No description provided.

RobertCNelson and others added 30 commits April 11, 2016 13:26
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
… is defined

'owner' member of 'struct mutex' is defined as below
in 'include/linux/mutex.h':

struct mutex {
...
if defined(CONFIG_DEBUG_MUTEXES) || defined(CONFIG_MUTEX_SPIN_ON_OWNER)
        struct task_struct      *owner;
endif
...

But function au_pin_hdir_set_owner() called owner as below:

 void au_pin_hdir_set_owner(struct au_pin *p, struct task_struct *task)
 {
if defined(CONFIG_DEBUG_MUTEXES) || defined(CONFIG_SMP)
        p->hdir->hi_inode->i_mutex.owner = task;
endif
 }

So if Kernel doesn't define 'DEBUG_MUTEXES' and 'MUTEX_SPIN_ON_OWNER',
but defines SMP, compiler will report the below error:

fs/aufs/i_op.c: In function 'au_pin_hdir_set_owner':
fs/aufs/i_op.c:593:28: error: 'struct mutex' has no member named 'owner'
  p->hdir->hi_inode->i_mutex.owner = task;
                            ^

Signed-off-by: Yanjiang Jin <yanjiang.jin@windriver.com>
Signed-off-by: Bruce Ashfield <bruce.ashfield@windriver.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Beyond the warning:

 drivers/tty/serial/8250/8250.c:1613:6: warning: unused variable ‘pass_counter’ [-Wunused-variable]

the solution of just looping infinitely was ugly - up it to 1 million to
give it a chance to continue in some really ugly situation.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Adds support for using a OMAP dual-mode timer with PWM capability
as a Linux PWM device. The driver controls the timer by using the
dmtimer API.

Add a platform_data structure for each pwm-omap-dmtimer nodes containing
the dmtimers functions in order to get driver not rely on platform
specific functions.

Cc: Grant Erickson <marathon96@gmail.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Joachim Eastwood <manabian@gmail.com>
Suggested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Acked-by: Tony Lindgren <tony@atomide.com>
[thierry.reding@gmail.com: coding style bikeshed, fix timer leak]
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
"omap" is NULL so we can't dereference it.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
In order to set the currently platform dependent dmtimer
functions pointers as platform data for the pwm-omap-dmtimer
platform driver, add it to plat-omap auxdata_lookup table.

Suggested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Fix the calculation of load_value and match_value. Currently they
are slightly too low, which produces a noticeably wrong PWM rate with
sufficiently short periods (i.e. when 1/period approaches clk_rate/2).

Example:
 clk_rate=32768Hz, period=122070ns, duty_cycle=61035ns (8192Hz/50% PWM)
 Correct values: load = 0xfffffffc, match = 0xfffffffd
 Current values: load = 0xfffffffa, match = 0xfffffffc
 effective PWM: period=183105ns, duty_cycle=91553ns (5461Hz/50% PWM)

Fixes: 6604c65 ("pwm: Add PWM driver for OMAP using dual-mode timers")
Signed-off-by: David Rivshin <drivshin@allworx.com>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Tested-by: Adam Ford <aford173@gmail.com>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Add sanity checking to ensure that we do not program load or match values
that are out of range if a user requests period or duty_cycle values which
are not achievable. The match value cannot be less than the load value (but
can be equal), and neither can be 0xffffffff. This means that there must be
at least one fclk cycle between load and match, and another between match
and overflow.

Fixes: 6604c65 ("pwm: Add PWM driver for OMAP using dual-mode timers")
Signed-off-by: David Rivshin <drivshin@allworx.com>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
[thierry.reding@gmail.com: minor coding style cleanups]
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
When converting period and duty_cycle from nanoseconds to fclk cycles,
the error introduced by the integer division can be appreciable, especially
in the case of slow fclk or short period. Use DIV_ROUND_CLOSEST_ULL() so
that the error is kept to +/- 0.5 clock cycles.

Fixes: 6604c65 ("pwm: Add PWM driver for OMAP using dual-mode timers")
Signed-off-by: David Rivshin <drivshin@allworx.com>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
After going through the math and constraints checking to compute load
and match values, it is helpful to know what the resultant period and
duty cycle are.

Signed-off-by: David Rivshin <drivshin@allworx.com>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
This reverts commit 1234e3f.

Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
<zmatt> rcn-ee: btw, I got a small patch that makes 4.1-ti compile in thumb mode
again, you want it?  (this means TI still compiles in ARM mode?  why oh why?)
<zmatt> http://pastebin.com/3S3sy8U4

Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
… there is also a hidden SSID in the scan list

Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Use kobj_to_dev() instead of open-coding it.

Signed-off-by: Geliang Tang <geliangtang@163.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Compatible at93xx46 devices from both Microchip and Atmel expect a
word-based address, regardless of whether the device is strapped for 8-
or 16-bit operation.  However, the offset parameter passed in when
reading or writing at a specific location is always specified in terms
of bytes.

This commit fixes 16-bit read and write accesses by shifting the offset
parameter to account for this difference between a byte offset and a
word-based address.

Signed-off-by: Cory Tusar <cory.tusar@pid1solutions.com>
Tested-by: Chris Healy <chris.healy@zii.aero>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
vinifr and others added 24 commits April 11, 2016 13:27
Add a dts file for Olimex AM3352-SOM board. The board does not use the PMIC
tps65217 and does not have many peripherals present in beaglebone. Thus, a
specific dtsi file (am335x-som-common.dtsi) is needed.

rcn-ee:
drop ti,am335x-bone due to:
davinci_mdio: dt: updated phy_id[0] from phy_mask[fffffffc]
davinci_mdio: dt: updated phy_id[1] from phy_mask[fffffffc]
#4.5.0-rc0
Use: olimex,am335x-olimex-som
tps65217.dtsi gone

Signed-off-by: Dimitar Gamishev <hehopmajieh@debian.bg>
Signed-off-by: Vinicius Maciel <viniciusfre@gmail.com>
Good news, BBGW born out.
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Rewritten using includes, v3.16.1

Signed-off-by: Dave Lambert <dave@lambsys.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Add support for the WL1835MOD cape.
This cape conflicts with the eMMC and HDMI on board the BeagleBone Black.

This change requires that the board be booted from the SD card slot by
holding the user/boot button down at power on and reset.

Signed-off-by: Eyal Reizer <eyalr@ti.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
When in half-duplex mode RX will be disabled before TX, but not
enabled after deactivating transmitter. This patch enables
UART_IER_RLSI and UART_IER_RDI interrupts after TX is over.

Cc: Matwey V. Kornilov <matwey@sai.msu.ru>
Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com>
Introduce serial8250_out_MCR() and serial8250_in_MCR() routines, that
replace following calls:

serial_out(port, UART_MCR, val)
serial_port_out(up, UART_MCR, val)
serial_in(port, UART_MCR)

This patch is needed in order to integrate reading/writing of MCR
signals via SERIAL_MCTRL_GPIO infrastructure later.

CC: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com>
mctrl_gpio_get_outputs() returns the state of following signals:

RTS, DTR, OUT1, OUT2

Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com>
uart_handle_cts_change() should be called in IRQ locked state, hence
use port->lock to disable interrupts.

CC: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com>
This patch permits the usage fo GPIOs to control the CTS/RTS/DTR/DSR/DCD/RI
signals.

Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
I have encountered the same issue(s) on A6A boards.

I couldn't find a patch,  so I wrote this patch to update the device tree
in the davinci_mdio driver in the 3.15.1 tree, it seems to correct it. I
would welcome any input on a different approach.

https://groups.google.com/d/msg/beagleboard/9mctrG26Mc8/SRlnumt0LoMJ

v4.1-rcX: added hack around CONFIG_OF_OVERLAY
v4.2-rc3+: added if (of_machine_is_compatible("ti,am335x-bone")) so we do
not break dual ethernet am335x devices

Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
@Pillar1989 Pillar1989 closed this May 7, 2016
@Pillar1989 Pillar1989 deleted the 4.4 branch May 7, 2016 09:02
@Pillar1989
Copy link
Author

Sorry, wrong branch

RobertCNelson pushed a commit that referenced this pull request Nov 4, 2019
Scheduled policy update work may end up racing with the freeing of the
policy and unregistering the driver.

One possible race is as below, where the cpufreq_driver is unregistered,
but the scheduled work gets executed at later stage when, cpufreq_driver
is NULL (i.e. after freeing the policy and driver).

Unable to handle kernel NULL pointer dereference at virtual address 0000001c
pgd = (ptrval)
[0000001c] *pgd=80000080204003, *pmd=00000000
Internal error: Oops: 206 [#1] SMP THUMB2
Modules linked in:
CPU: 0 PID: 34 Comm: kworker/0:1 Not tainted 5.4.0-rc3-00006-g67f5a8081a4b #86
Hardware name: ARM-Versatile Express
Workqueue: events handle_update
PC is at cpufreq_set_policy+0x58/0x228
LR is at dev_pm_qos_read_value+0x77/0xac
Control: 70c5387d  Table: 80203000  DAC: fffffffd
Process kworker/0:1 (pid: 34, stack limit = 0x(ptrval))
	(cpufreq_set_policy) from (refresh_frequency_limits.part.24+0x37/0x48)
	(refresh_frequency_limits.part.24) from (handle_update+0x2f/0x38)
	(handle_update) from (process_one_work+0x16d/0x3cc)
	(process_one_work) from (worker_thread+0xff/0x414)
	(worker_thread) from (kthread+0xff/0x100)
	(kthread) from (ret_from_fork+0x11/0x28)

Fixes: 67d874c ("cpufreq: Register notifiers with the PM QoS framework")
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
[ rjw: Cancel the work before dropping the QoS requests ]
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
nmenon pushed a commit to nmenon/nm-beagle-linux that referenced this pull request May 22, 2023
[ Upstream commit 031af50 ]

The inline assembly for arm64's cmpxchg_double*() implementations use a
+Q constraint to hazard against other accesses to the memory location
being exchanged. However, the pointer passed to the constraint is a
pointer to unsigned long, and thus the hazard only applies to the first
8 bytes of the location.

GCC can take advantage of this, assuming that other portions of the
location are unchanged, leading to a number of potential problems.

This is similar to what we fixed back in commit:

  fee960b ("arm64: xchg: hazard against entire exchange variable")

... but we forgot to adjust cmpxchg_double*() similarly at the same
time.

The same problem applies, as demonstrated with the following test:

| struct big {
|         u64 lo, hi;
| } __aligned(128);
|
| unsigned long foo(struct big *b)
| {
|         u64 hi_old, hi_new;
|
|         hi_old = b->hi;
|         cmpxchg_double_local(&b->lo, &b->hi, 0x12, 0x34, 0x56, 0x78);
|         hi_new = b->hi;
|
|         return hi_old ^ hi_new;
| }

... which GCC 12.1.0 compiles as:

| 0000000000000000 <foo>:
|    0:   d503233f        paciasp
|    4:   aa0003e4        mov     x4, x0
|    8:   1400000e        b       40 <foo+0x40>
|    c:   d2800240        mov     x0, #0x12                       // beagleboard#18
|   10:   d2800681        mov     x1, #0x34                       // beagleboard#52
|   14:   aa0003e5        mov     x5, x0
|   18:   aa0103e6        mov     x6, x1
|   1c:   d2800ac2        mov     x2, #0x56                       // beagleboard#86
|   20:   d2800f03        mov     x3, #0x78                       // beagleboard#120
|   24:   48207c82        casp    x0, x1, x2, x3, [x4]
|   28:   ca050000        eor     x0, x0, x5
|   2c:   ca060021        eor     x1, x1, x6
|   30:   aa010000        orr     x0, x0, x1
|   34:   d2800000        mov     x0, #0x0                        // #0    <--- BANG
|   38:   d50323bf        autiasp
|   3c:   d65f03c0        ret
|   40:   d2800240        mov     x0, #0x12                       // beagleboard#18
|   44:   d2800681        mov     x1, #0x34                       // beagleboard#52
|   48:   d2800ac2        mov     x2, #0x56                       // beagleboard#86
|   4c:   d2800f03        mov     x3, #0x78                       // beagleboard#120
|   50:   f9800091        prfm    pstl1strm, [x4]
|   54:   c87f1885        ldxp    x5, x6, [x4]
|   58:   ca0000a5        eor     x5, x5, x0
|   5c:   ca0100c6        eor     x6, x6, x1
|   60:   aa0600a6        orr     x6, x5, x6
|   64:   b5000066        cbnz    x6, 70 <foo+0x70>
|   68:   c8250c82        stxp    w5, x2, x3, [x4]
|   6c:   35ffff45        cbnz    w5, 54 <foo+0x54>
|   70:   d2800000        mov     x0, #0x0                        // #0     <--- BANG
|   74:   d50323bf        autiasp
|   78:   d65f03c0        ret

Notice that at the lines with "BANG" comments, GCC has assumed that the
higher 8 bytes are unchanged by the cmpxchg_double() call, and that
`hi_old ^ hi_new` can be reduced to a constant zero, for both LSE and
LL/SC versions of cmpxchg_double().

This patch fixes the issue by passing a pointer to __uint128_t into the
+Q constraint, ensuring that the compiler hazards against the entire 16
bytes being modified.

With this change, GCC 12.1.0 compiles the above test as:

| 0000000000000000 <foo>:
|    0:   f9400407        ldr     x7, [x0, beagleboard#8]
|    4:   d503233f        paciasp
|    8:   aa0003e4        mov     x4, x0
|    c:   1400000f        b       48 <foo+0x48>
|   10:   d2800240        mov     x0, #0x12                       // beagleboard#18
|   14:   d2800681        mov     x1, #0x34                       // beagleboard#52
|   18:   aa0003e5        mov     x5, x0
|   1c:   aa0103e6        mov     x6, x1
|   20:   d2800ac2        mov     x2, #0x56                       // beagleboard#86
|   24:   d2800f03        mov     x3, #0x78                       // beagleboard#120
|   28:   48207c82        casp    x0, x1, x2, x3, [x4]
|   2c:   ca050000        eor     x0, x0, x5
|   30:   ca060021        eor     x1, x1, x6
|   34:   aa010000        orr     x0, x0, x1
|   38:   f9400480        ldr     x0, [x4, beagleboard#8]
|   3c:   d50323bf        autiasp
|   40:   ca0000e0        eor     x0, x7, x0
|   44:   d65f03c0        ret
|   48:   d2800240        mov     x0, #0x12                       // beagleboard#18
|   4c:   d2800681        mov     x1, #0x34                       // beagleboard#52
|   50:   d2800ac2        mov     x2, #0x56                       // beagleboard#86
|   54:   d2800f03        mov     x3, #0x78                       // beagleboard#120
|   58:   f9800091        prfm    pstl1strm, [x4]
|   5c:   c87f1885        ldxp    x5, x6, [x4]
|   60:   ca0000a5        eor     x5, x5, x0
|   64:   ca0100c6        eor     x6, x6, x1
|   68:   aa0600a6        orr     x6, x5, x6
|   6c:   b5000066        cbnz    x6, 78 <foo+0x78>
|   70:   c8250c82        stxp    w5, x2, x3, [x4]
|   74:   35ffff45        cbnz    w5, 5c <foo+0x5c>
|   78:   f9400480        ldr     x0, [x4, beagleboard#8]
|   7c:   d50323bf        autiasp
|   80:   ca0000e0        eor     x0, x7, x0
|   84:   d65f03c0        ret

... sampling the high 8 bytes before and after the cmpxchg, and
performing an EOR, as we'd expect.

For backporting, I've tested this atop linux-4.9.y with GCC 5.5.0. Note
that linux-4.9.y is oldest currently supported stable release, and
mandates GCC 5.1+. Unfortunately I couldn't get a GCC 5.1 binary to run
on my machines due to library incompatibilities.

I've also used a standalone test to check that we can use a __uint128_t
pointer in a +Q constraint at least as far back as GCC 4.8.5 and LLVM
3.9.1.

Fixes: 5284e1b ("arm64: xchg: Implement cmpxchg_double")
Fixes: e9a4b79 ("arm64: cmpxchg_dbl: patch in lse instructions when supported by the CPU")
Reported-by: Boqun Feng <boqun.feng@gmail.com>
Link: https://lore.kernel.org/lkml/Y6DEfQXymYVgL3oJ@boqun-archlinux/
Reported-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/Y6GXoO4qmH9OIZ5Q@hirez.programming.kicks-ass.net/
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: stable@vger.kernel.org
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230104151626.3262137-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
RobertCNelson pushed a commit that referenced this pull request Aug 27, 2025
[ Upstream commit 06f3642 ]

[BUG]
When testing with COW fixup marked as BUG_ON() (this is involved with the
new pin_user_pages*() change, which should not result new out-of-band
dirty pages), I hit a crash triggered by the BUG_ON() from hitting COW
fixup path.

This BUG_ON() happens just after a failed btrfs_run_delalloc_range():

  BTRFS error (device dm-2): failed to run delalloc range, root 348 ino 405 folio 65536 submit_bitmap 6-15 start 90112 len 106496: -28
  ------------[ cut here ]------------
  kernel BUG at fs/btrfs/extent_io.c:1444!
  Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
  CPU: 0 UID: 0 PID: 434621 Comm: kworker/u24:8 Tainted: G           OE      6.12.0-rc7-custom+ #86
  Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022
  Workqueue: events_unbound btrfs_async_reclaim_data_space [btrfs]
  pc : extent_writepage_io+0x2d4/0x308 [btrfs]
  lr : extent_writepage_io+0x2d4/0x308 [btrfs]
  Call trace:
   extent_writepage_io+0x2d4/0x308 [btrfs]
   extent_writepage+0x218/0x330 [btrfs]
   extent_write_cache_pages+0x1d4/0x4b0 [btrfs]
   btrfs_writepages+0x94/0x150 [btrfs]
   do_writepages+0x74/0x190
   filemap_fdatawrite_wbc+0x88/0xc8
   start_delalloc_inodes+0x180/0x3b0 [btrfs]
   btrfs_start_delalloc_roots+0x174/0x280 [btrfs]
   shrink_delalloc+0x114/0x280 [btrfs]
   flush_space+0x250/0x2f8 [btrfs]
   btrfs_async_reclaim_data_space+0x180/0x228 [btrfs]
   process_one_work+0x164/0x408
   worker_thread+0x25c/0x388
   kthread+0x100/0x118
   ret_from_fork+0x10/0x20
  Code: aa1403e1 9402f3ef aa1403e0 9402f36f (d4210000)
  ---[ end trace 0000000000000000 ]---

[CAUSE]
That failure is mostly from cow_file_range(), where we can hit -ENOSPC.

Although the -ENOSPC is already a bug related to our space reservation
code, let's just focus on the error handling.

For example, we have the following dirty range [0, 64K) of an inode,
with 4K sector size and 4K page size:

   0        16K        32K       48K       64K
   |///////////////////////////////////////|
   |#######################################|

Where |///| means page are still dirty, and |###| means the extent io
tree has EXTENT_DELALLOC flag.

- Enter extent_writepage() for page 0

- Enter btrfs_run_delalloc_range() for range [0, 64K)

- Enter cow_file_range() for range [0, 64K)

- Function btrfs_reserve_extent() only reserved one 16K extent
  So we created extent map and ordered extent for range [0, 16K)

   0        16K        32K       48K       64K
   |////////|//////////////////////////////|
   |<- OE ->|##############################|

   And range [0, 16K) has its delalloc flag cleared.
   But since we haven't yet submit any bio, involved 4 pages are still
   dirty.

- Function btrfs_reserve_extent() returns with -ENOSPC
  Now we have to run error cleanup, which will clear all
  EXTENT_DELALLOC* flags and clear the dirty flags for the remaining
  ranges:

   0        16K        32K       48K       64K
   |////////|                              |
   |        |                              |

  Note that range [0, 16K) still has its pages dirty.

- Some time later, writeback is triggered again for the range [0, 16K)
  since the page range still has dirty flags.

- btrfs_run_delalloc_range() will do nothing because there is no
  EXTENT_DELALLOC flag.

- extent_writepage_io() finds page 0 has no ordered flag
  Which falls into the COW fixup path, triggering the BUG_ON().

Unfortunately this error handling bug dates back to the introduction of
btrfs.  Thankfully with the abuse of COW fixup, at least it won't crash
the kernel.

[FIX]
Instead of immediately unlocking the extent and folios, we keep the extent
and folios locked until either erroring out or the whole delalloc range
finished.

When the whole delalloc range finished without error, we just unlock the
whole range with PAGE_SET_ORDERED (and PAGE_UNLOCK for !keep_locked
cases), with EXTENT_DELALLOC and EXTENT_LOCKED cleared.
And the involved folios will be properly submitted, with their dirty
flags cleared during submission.

For the error path, it will be a little more complex:

- The range with ordered extent allocated (range (1))
  We only clear the EXTENT_DELALLOC and EXTENT_LOCKED, as the remaining
  flags are cleaned up by
  btrfs_mark_ordered_io_finished()->btrfs_finish_one_ordered().

  For folios we finish the IO (clear dirty, start writeback and
  immediately finish the writeback) and unlock the folios.

- The range with reserved extent but no ordered extent (range(2))
- The range we never touched (range(3))
  For both range (2) and range(3) the behavior is not changed.

Now even if cow_file_range() failed halfway with some successfully
reserved extents/ordered extents, we will keep all folios clean, so
there will be no future writeback triggered on them.

CC: stable@vger.kernel.org
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.