Commit 2464cc4
powerpc/tm: Fix clearing MSR[TS] in current when reclaiming on signal delivery
After a treclaim, we expect to be in non-transactional state. If we
don't clear the current thread's MSR[TS] before we get preempted, then
tm_recheckpoint_new_task() will recheckpoint and we get rescheduled in
suspended transaction state.
When handling a signal caught in transactional state,
handle_rt_signal64() calls get_tm_stackpointer() that treclaims the
transaction using tm_reclaim_current() but without clearing the
thread's MSR[TS]. This can cause the TM Bad Thing exception below if
later we pagefault and get preempted trying to access the user's
sigframe, using __put_user(). Afterwards, when we are rescheduled back
into do_page_fault() (but now in suspended state since the thread's
MSR[TS] was not cleared), upon executing 'rfid' after completion of
the page fault handling, the exception is raised because a transition
from suspended to non-transactional state is invalid.
Unexpected TM Bad Thing exception at c00000000000de44 (msr 0x8000000302a03031) tm_scratch=800000010280b033
Oops: Unrecoverable exception, sig: 6 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
CPU: 25 PID: 15547 Comm: a.out Not tainted 5.4.0-rc2 #32
NIP: c00000000000de44 LR: c000000000034728 CTR: 0000000000000000
REGS: c00000003fe7bd70 TRAP: 0700 Not tainted (5.4.0-rc2)
MSR: 8000000302a03031 <SF,VEC,VSX,FP,ME,IR,DR,LE,TM[SE]> CR: 44000884 XER: 00000000
CFAR: c00000000000dda4 IRQMASK: 0
PACATMSCRATCH: 800000010280b033
GPR00: c000000000034728 c000000f65a17c80 c000000001662800 00007fffacf3fd78
GPR04: 0000000000001000 0000000000001000 0000000000000000 c000000f611f8af0
GPR08: 0000000000000000 0000000078006001 0000000000000000 000c000000000000
GPR12: c000000f611f84b0 c00000003ffcb200 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 c000000f611f8140
GPR24: 0000000000000000 00007fffacf3fd68 c000000f65a17d90 c000000f611f7800
GPR28: c000000f65a17e90 c000000f65a17e90 c000000001685e18 00007fffacf3f000
NIP [c00000000000de44] fast_exception_return+0xf4/0x1b0
LR [c000000000034728] handle_rt_signal64+0x78/0xc50
Call Trace:
[c000000f65a17c80] [c000000000034710] handle_rt_signal64+0x60/0xc50 (unreliable)
[c000000f65a17d30] [c000000000023640] do_notify_resume+0x330/0x460
[c000000f65a17e20] [c00000000000dcc4] ret_from_except_lite+0x70/0x74
Instruction dump:
7c4ff120 e8410170 7c5a03a6 38400000 f8410060 e8010070 e8410080 e8610088
60000000 60000000 e8810090 e8210078 <4c000024> 48000000 e8610178 88ed0989
---[ end trace 93094aa44b442f87 ]---
The simplified sequence of events that triggers the above exception is:
... # userspace in NON-TRANSACTIONAL state
tbegin # userspace in TRANSACTIONAL state
signal delivery # kernelspace in SUSPENDED state
handle_rt_signal64()
get_tm_stackpointer()
treclaim # kernelspace in NON-TRANSACTIONAL state
__put_user()
page fault happens. We will never get back here because of the TM Bad Thing exception.
page fault handling kicks in and we voluntarily preempt ourselves
do_page_fault()
__schedule()
__switch_to(other_task)
our task is rescheduled and we recheckpoint because the thread's MSR[TS] was not cleared
__switch_to(our_task)
switch_to_tm()
tm_recheckpoint_new_task()
trechkpt # kernelspace in SUSPENDED state
The page fault handling resumes, but now we are in suspended transaction state
do_page_fault() completes
rfid <----- trying to get back where the page fault happened (we were non-transactional back then)
TM Bad Thing # illegal transition from suspended to non-transactional
This patch fixes that issue by clearing the current thread's MSR[TS]
just after treclaim in get_tm_stackpointer() so that we stay in
non-transactional state in case we are preempted. In order to make
treclaim and clearing the thread's MSR[TS] atomic from a preemption
perspective when CONFIG_PREEMPT is set, preempt_disable/enable() is
used. It's also necessary to save the previous value of the thread's
MSR before get_tm_stackpointer() is called so that it can be exposed
to the signal handler later in setup_tm_sigcontexts() to inform the
userspace MSR at the moment of the signal delivery.
Found with tm-signal-context-force-tm kernel selftest.
Fixes: 2b0a576 ("powerpc: Add new transactional memory state to the signal context")
Cc: stable@vger.kernel.org # v3.9
Signed-off-by: Gustavo Luiz Duarte <gustavold@linux.ibm.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200211033831.11165-1-gustavold@linux.ibm.com1 parent a4031af commit 2464cc4
3 files changed
+39
-28
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
200 | 200 | | |
201 | 201 | | |
202 | 202 | | |
| 203 | + | |
| 204 | + | |
203 | 205 | | |
204 | 206 | | |
205 | 207 | | |
206 | 208 | | |
| 209 | + | |
207 | 210 | | |
208 | 211 | | |
209 | | - | |
| 212 | + | |
| 213 | + | |
| 214 | + | |
| 215 | + | |
| 216 | + | |
| 217 | + | |
| 218 | + | |
| 219 | + | |
| 220 | + | |
| 221 | + | |
| 222 | + | |
210 | 223 | | |
211 | 224 | | |
212 | | - | |
| 225 | + | |
213 | 226 | | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
489 | 489 | | |
490 | 490 | | |
491 | 491 | | |
492 | | - | |
| 492 | + | |
| 493 | + | |
493 | 494 | | |
494 | | - | |
495 | | - | |
496 | 495 | | |
497 | 496 | | |
498 | | - | |
499 | | - | |
500 | | - | |
501 | | - | |
502 | | - | |
503 | | - | |
504 | | - | |
505 | 497 | | |
506 | 498 | | |
507 | 499 | | |
| |||
912 | 904 | | |
913 | 905 | | |
914 | 906 | | |
| 907 | + | |
| 908 | + | |
| 909 | + | |
| 910 | + | |
915 | 911 | | |
916 | 912 | | |
917 | 913 | | |
| |||
944 | 940 | | |
945 | 941 | | |
946 | 942 | | |
947 | | - | |
| 943 | + | |
948 | 944 | | |
949 | 945 | | |
950 | 946 | | |
951 | 947 | | |
952 | 948 | | |
953 | | - | |
| 949 | + | |
954 | 950 | | |
955 | 951 | | |
956 | 952 | | |
| |||
1369 | 1365 | | |
1370 | 1366 | | |
1371 | 1367 | | |
| 1368 | + | |
| 1369 | + | |
| 1370 | + | |
| 1371 | + | |
1372 | 1372 | | |
1373 | 1373 | | |
1374 | 1374 | | |
| |||
1402 | 1402 | | |
1403 | 1403 | | |
1404 | 1404 | | |
1405 | | - | |
| 1405 | + | |
1406 | 1406 | | |
1407 | | - | |
| 1407 | + | |
1408 | 1408 | | |
1409 | 1409 | | |
1410 | 1410 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
192 | 192 | | |
193 | 193 | | |
194 | 194 | | |
195 | | - | |
| 195 | + | |
| 196 | + | |
196 | 197 | | |
197 | 198 | | |
198 | 199 | | |
| |||
207 | 208 | | |
208 | 209 | | |
209 | 210 | | |
210 | | - | |
211 | 211 | | |
212 | 212 | | |
213 | 213 | | |
214 | 214 | | |
215 | | - | |
| 215 | + | |
216 | 216 | | |
217 | 217 | | |
218 | 218 | | |
| |||
222 | 222 | | |
223 | 223 | | |
224 | 224 | | |
225 | | - | |
226 | | - | |
227 | | - | |
228 | | - | |
229 | | - | |
230 | | - | |
231 | | - | |
232 | 225 | | |
233 | 226 | | |
234 | 227 | | |
| |||
824 | 817 | | |
825 | 818 | | |
826 | 819 | | |
| 820 | + | |
| 821 | + | |
| 822 | + | |
| 823 | + | |
827 | 824 | | |
828 | 825 | | |
829 | 826 | | |
| |||
841 | 838 | | |
842 | 839 | | |
843 | 840 | | |
844 | | - | |
| 841 | + | |
845 | 842 | | |
846 | 843 | | |
847 | 844 | | |
848 | 845 | | |
849 | 846 | | |
850 | 847 | | |
851 | 848 | | |
852 | | - | |
| 849 | + | |
| 850 | + | |
853 | 851 | | |
854 | 852 | | |
855 | 853 | | |
| |||
0 commit comments