Skip to content

Invalidate instruction cache when restoring#115

Open
aoikonomopoulos wants to merge 1 commit into
LibertyGlobal:mainfrom
aoikonomopoulos:i-cache-invalidation
Open

Invalidate instruction cache when restoring#115
aoikonomopoulos wants to merge 1 commit into
LibertyGlobal:mainfrom
aoikonomopoulos:i-cache-invalidation

Conversation

@aoikonomopoulos
Copy link
Copy Markdown

When restoring executable pages, invalidate the instruction cache for that address range. This ensures that a thread executing in a different CPU on restore will execute the newly-restored instructions.

We do that in an arch-independent fashion by using __builtin___clear_cache (which is a no-op on e.g. X86). On e.g. ARMv7, this needs to execute the cacheflush(..., ICACHE) syscall, so we only do that for VMAs that are marked as executable.

For the same reason, we now need to to link the parasite against libgcc.a, so switch to using the compiler driver (gcc) for linking rather than invoking ld.bfd directly, so that it can resolve the path to libgcc.a for us.

When restoring executable pages, invalidate the instruction cache for
that address range. This ensures that a thread executing in a different
CPU on restore will execute the newly-restored instructions.

We do that in an arch-independent fashion by using
__builtin___clear_cache (which is a no-op on e.g. X86). On e.g. ARMv7,
this needs to execute the cacheflush(..., ICACHE) syscall, so we only do that
for VMAs that are marked as executable.

For the same reason, we now need to to link the parasite against
libgcc.a, so switch to using the compiler driver (gcc) for linking
rather than invoking ld.bfd directly, so that it can resolve the path to
libgcc.a for us.
@mkozlowski
Copy link
Copy Markdown
Collaborator

Thanks for the patch - this is an interesting case.

Could you please share a bit more context on how this surfaced? Which CPU(s) did you see this on? Under what conditions? Anything special about the process being frozen?

Are you able to reproduce it? Would be ideal if you could share an isolated test case. I tried several approaches to test this cache invalidation, and it always behaves correctly on all platforms I have.

Comment thread parasite.c

read(cd, (void *)req.u.mem.vmr.addr, req.u.mem.vmr.len);
read(cd, (void *)req.u.mem.vmr.addr, req.u.mem.vmr.len);
/* Invalidate the instruction cache for the region we just overwrote. */
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spaces vs tabs

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, will tabify, thanks.

Comment thread parasite.c
read(cd, (void *)req.u.mem.vmr.addr, req.u.mem.vmr.len);
/* Invalidate the instruction cache for the region we just overwrote. */
if (req.u.mem.vmr.prot & PROT_EXEC)
__builtin___clear_cache((void *)req.u.mem.vmr.addr, (char *)req.u.mem.vmr.addr + req.u.mem.vmr.len);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this fails to link on riscv:

/usr/bin/ld.bfd: ./parasite.o: in function .L206': parasite.c:(.text+0x43a): undefined reference to __riscv_flush_icache'

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yah, just saw that in your CI. Whereas the ARMv7 version of __builtin___clear_cache expands to a call to a helper in libgcc.a (which simply invokes the cacheflush() syscall), it looks like on RISC-V it expands to a call to __riscv_flush_icache which the manpage says is implemented in libc.

So unfortunately fixing this on RISC-V would mean having a dependency on the availability of a static version of libc. It may be more practical to use syscall(2) to invoke cacheflush there then, what do you think?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you're supposed to invoke the riscv_flush_icache syscall on RISC-V: https://www.kernel.org/doc/html/latest/arch/riscv/cmodx.html#cmodx-in-the-user-space.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, on RISC-V syscall is fine - we could wrap this in something like arch_clear_icache() that would hide arch details: syscall on RISC-V, __builtin___clear_cache on ARM, etc.

static libc is a no-go for parasite code that is injected into a target process

@aoikonomopoulos
Copy link
Copy Markdown
Author

aoikonomopoulos commented May 15, 2026

Hi @mkozlowski, I can't confirm I'm seeing this in production yet, but it would be an issue with a process that does JIT compilation and is multi-threaded.

Conceptually, the issue seems clear: the instruction and data caches are not consistent on all architectures. They are on x86{,_64} but not on ARM64/ARMv7/RISC-V. So when writing instructions to a memory region, one has to make sure that the i-cache for that region is invalidated before they allow threads to execute those instructions. This is something that all production JIT engines have code for.

I can cook up a test case to demonstrate this if that would be helpful, but this won't ever trigger on x86 systems.

@mkozlowski
Copy link
Copy Markdown
Collaborator

I can cook up a test case to demonstrate this if that would be helpful

Please do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants