Skip to content
This repository was archived by the owner on Oct 12, 2022. It is now read-only.
/ druntime Public archive

Comments

Add Fiber's guard page in Posix as well#1698

Merged
MartinNowak merged 6 commits intodlang:masterfrom
nemanja-boric-sociomantic:fiber-protection
May 6, 2017
Merged

Add Fiber's guard page in Posix as well#1698
MartinNowak merged 6 commits intodlang:masterfrom
nemanja-boric-sociomantic:fiber-protection

Conversation

@nemanja-boric-sociomantic
Copy link
Contributor

Windows Fiber implementation already uses the fiber stack guard page.
This brings the similar protection for mmaped fiber stacks, using
mprotect.

// protect end of stack
if ( mprotect(guard, PAGESIZE, PROT_NONE) == -1 )
onOutOfMemoryError();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If valloc/malloc cases are not supported, I'd suggest putting explicit empty else clause with a comment about it (to

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(to ... ? :-)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.. to better match previous static if block)

@nemanja-boric-sociomantic
Copy link
Contributor Author

Updated with the else block and the comment saying that results are defined only for memory obtained by mmap

@nemanja-boric-sociomantic
Copy link
Contributor Author

Not sure why DAutoTest failed:

make[1]: Leaving directory '/home/dtest/DAutoTest/work/repo/phobos'
for f in `find "/home/dtest/DAutoTest/work/repo/dlang.org/web/phobos-prerelease-verbatim" -iname '*.html'`; do mv $f `dirname $f`/`basename $f .html`.verbatim; done
mv /home/dtest/DAutoTest/work/repo/dlang.org/web/phobos-prerelease-verbatim/* /home/dtest/DAutoTest/work/repo/dlang.org/web/phobos-prerelease/
rm -r /home/dtest/DAutoTest/work/repo/dlang.org/web/phobos-prerelease-verbatim
fatal: 'refs/ae-sys-d-cache/website-2fa232357638ad439f82243dfc7f9d1c2f237c46-f798e1d288b93f2f15229c70de8009f8' - not a valid ref


// protect end of stack
if ( mprotect(guard, PAGESIZE, PROT_NONE) == -1 )
onOutOfMemoryError();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps core.internal.abort is more appropriate here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but I was just following the module's style, which uses this for anything that can fail (see the same thing in Windows: https://github.com/nemanja-boric-sociomantic/druntime/blob/68bf497033575bdcb41a9392e72bdcf794c66016/src/core/thread.d#L4385-L4386)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the other ones look like they allocate sth., b/c VirtualAlloc is used to change flags, and should be changed to fail instead of using onOutOfMemoryError.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, fair enough, will change to abort.

@PetarKirov
Copy link
Member

The DAutoTest is about testing the documentation build, and so its failure is unrelated to your pull request.

@nemanja-boric-sociomantic nemanja-boric-sociomantic force-pushed the fiber-protection branch 2 times, most recently from 1d17e7c to f734164 Compare December 2, 2016 14:10
@MartinNowak
Copy link
Member

Maybe we should make the guard size configurable, including a size of 0, much as the stack size is configurable.
Not everyone might want to spent his address space and the (tiny) page mapping overhead, and a single page might not be enough for all memory corruption concerns.

{
// Mark end of stack
for ( ubyte* g = cast(ubyte*)guard; g < guard + PAGESIZE; g+= 32)
g[0 .. 32] = cast(ubyte[]) "END OF FIBER -- END OF FIBER -- ";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not acceptable to write to that page.
The point of a guard page is that it's not accessible, not backed by real memory, and causes a (page) fault when being accessed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not a guarded at this point and writing recognizable string is the main purpose if this PR. It makes possible to easily debug stack overflow in a fiber by printing few bytes around rsp in gdb and checking if it resembles a guard string. Similar code has existed in original tango runtime and I have no idea how one is expected to work with heavy usage of fibers without it - pretty much everyone relies on it in daily workflow at Sociomantic :(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As Dicebot said, at this point, it is not guarded at this point, and in the lack of proper exception type, instead of SIGSEGV, this is very useful.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not going to rediscuss why Exceptions for async failures are wrong, unwinding is hardly possible for those, and having the compiler assume that each instruction could throw requires complex unwind information or conservative (slow) code.
In any case, neither an exception nor a segfault for a stack overflow tells you much else that that the stack overflowed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I wasn't referring about the D exceptions, but the operating system's exception. For example, Windows builds will, this same runtime will, raise STATUS_GUARD_PAGE exception - which is very useful - it prevents you from silently corrupting your memory and it will tell you that your fiber's stack overflowed (if you're familiar with the runtime, though).

On POSIX, I'm not sure what else I can do here. The most reliable way to report the cause of the segmentation fault here is to install SIGSEGV handler, and to compare the siginfo_t.si_addr to see if this is done by fiber stack's overflow, eventually print info to stderr and die.

It is true that segmentation fault doesn't tell you why your program failed, but it least prevents you from silently corrupting other pages of the memory, making your program fail in completely unrelated code, just because some other fiber corrupted the memory. Sadly, this is not theoretical concern (and the code is already available on Windows), but this is something that we're seeing in practice every now and then.

However, even with the segmentation fault, if the core file is generated, you can load it in debugger and simply do x/s $sp-64 and see if the contents of the memory is just END OF FIBER. This is, btw, the reason why we're filling this page with data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I forgot to mention @MartinNowak to pass email filters :-) )


// protect end of stack
if ( mprotect(guard, PAGESIZE, PROT_NONE) == -1 )
onOutOfMemoryError();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the other ones look like they allocate sth., b/c VirtualAlloc is used to change flags, and should be changed to fail instead of using onOutOfMemoryError.

@nemanja-boric-sociomantic
Copy link
Contributor Author

Maybe we should make the guard size configurable, including a size of 0, much as the stack size is configurable.
Not everyone might want to spent his address space and the (tiny) page mapping overhead, and a single page might not be enough for all memory corruption concerns.

Ok, sure. Do you prefer to do this via druntime config argument?

@nemanja-boric-sociomantic
Copy link
Contributor Author

Updated with the:

  • Change onOutOfMemoryError to abort in case of mprotect failure
  • Allow configuring the size of the stack guard (on both Windows and Posix systems) with --DRT-stack-guard-size runtime option.

@nemanja-boric-sociomantic nemanja-boric-sociomantic force-pushed the fiber-protection branch 2 times, most recently from f9dad10 to c912db6 Compare December 19, 2016 14:23
@nemanja-boric-sociomantic
Copy link
Contributor Author

Ah, FreeBSD failed because by the time it got to testing, changelog.dd started containing merge confict.

@nemanja-boric-sociomantic
Copy link
Contributor Author

Updated. @wilzbach could you advise if I got the new changelog format right?

@wilzbach
Copy link
Contributor

Ah, FreeBSD failed because by the time it got to testing, changelog.dd started containing merge confict.

The idea is at that you will never have to rebase again due to changelog conflicts ;-)

dlang/phobos#4228

Updated. @wilzbach could you advise if I got the new changelog format right?

Yup - the idea was also that it's less complicated ;-)

@nemanja-boric-sociomantic
Copy link
Contributor Author

Thanks!

@MartinNowak
Copy link
Member

Sorry haven't looked at this for a while.

  • Why can't the guard page be write protected? That would be a lot better. In fact actually using a page is a huge overhead, not to mention the cost of writing to main memory.
  • Druntime config switches are fairly ugly and only intended for runtime "boot" configuration, doesn't seem necessary here, simply passing the guard size to the constructor of Fiber (like the stack size) should work, but might require to save the size somewhere.

@nemanja-boric-sociomantic
Copy link
Contributor Author

Why can't the guard page be write protected? That would be a lot better. In fact actually using a page is a huge overhead, not to mention the cost of writing to main memory.

I'm not sure I follow this - the guard page is both write and read protected (see here: https://github.com/dlang/druntime/pull/1698/files#diff-8bb12ed976acf0a5132e877ec5a01ea8R4488)

I'll take a look later at other issues mentioned. Thanks!

@nemanja-boric-sociomantic
Copy link
Contributor Author

Druntime config switches are fairly ugly and only intended for runtime "boot" configuration, doesn't seem necessary here, simply passing the guard size to the constructor of Fiber (like the stack size) should work, but might require to save the size somewhere.

I've updated the code so that this is configurable using Fiber's contructor's argument (with an ability to turn it off by setting 0). There wasn't need to save the size anywhere, since it's just counted in the stack's size (in POSIX's case), or it was deallocated automatically (like before in Window's case).

Copy link
Member

@MartinNowak MartinNowak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise looks great.

{
// Mark end of stack
for ( ubyte* g = cast(ubyte*)guard; g < guard + guard_page_size; g+= 32)
g[0 .. 32] = cast(ubyte[]) "END OF FIBER -- END OF FIBER -- ";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you're writing to the guard page which will cause it to actually get wired. Seems unnecessary, when the page is write protected below.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the intention is to easily check if the stackoverflow has happened, simply by inspecting the memory pointed by sp in the debugger. I guess one could also check the mappings from /proc to confirm that the page is actually protected. If this writing is a deal breaker I would remove it.

* Params:
* fn = The fiber function.
* sz = The stack size for this fiber.
* guard_page_size = size of the guard page to trap fiber's stack
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change to guardPageSize to match our naming.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do. Thanks for the review!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I sent you an PR @Burgos.

@nemanja-boric-sociomantic
Copy link
Contributor Author

Updated with your PR rebased.

nemanja-boric-sociomantic and others added 5 commits May 6, 2017 12:13
Windows Fiber implementation already uses the fiber stack guard page.
This brings the similar protection for mmaped fiber stacks, using
mprotect.
This allows configuring the size or turning off completely
the fiber's stack's guard page. It can be configured using new
Fiber's constructor's parameter `guard_page_size`.
- no point in creating a dirty write protected page
@nemanja-boric-sociomantic nemanja-boric-sociomantic force-pushed the fiber-protection branch 3 times, most recently from 0ca13da to e6c6ec3 Compare May 6, 2017 10:16
@Burgos
Copy link
Contributor

Burgos commented May 6, 2017

@MartinNowak I've pushed a small commit with a trivial tidying up the test case, if you don't mind.

@Burgos
Copy link
Contributor

Burgos commented May 6, 2017

Green!

@MartinNowak MartinNowak merged commit a3505e7 into dlang:master May 6, 2017
dnadlinger added a commit to dnadlinger/druntime that referenced this pull request May 6, 2017
During the review of GitHub dlang#1698, the "END OF FIBRE" marker data
was removed to avoid dirtying the guard page. Hence, the corresponding
comment in the release notes doesn't apply anymore.
page used for the Fiber's stack, the page is allocate which is protected for
any kind of access. This will cause system to trap immediately on the fiber's
stack overflow. If in debugger session, one can inspect contents of the memory
before or after stack pointer and it can be seen if it contains END OF FIBER
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't true anymore, see #1821 for a fix.

@nemanja-boric-sociomantic nemanja-boric-sociomantic deleted the fiber-protection branch May 8, 2017 13:04
nemanja-boric-sociomantic added a commit to nemanja-boric-sociomantic/dlang.org that referenced this pull request Jun 26, 2017
During the review of GitHub dlang/druntime#1698, the "END OF FIBER" marker data
was removed to avoid dirtying the guard page. Hence, the corresponding
comment in the release notes doesn't apply anymore.

See dlang/druntime#1821
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants