Skip to content

Fix ExceptionPackage memory errors#3088

Merged
tlively merged 3 commits intoWebAssembly:masterfrom
tlively:exn-leak
Sep 1, 2020
Merged

Fix ExceptionPackage memory errors#3088
tlively merged 3 commits intoWebAssembly:masterfrom
tlively:exn-leak

Conversation

@tlively
Copy link
Member

@tlively tlively commented Sep 1, 2020

First, adds an explicit destructor call to fix a memory leak in
Literal::operator= in which existing ExceptionPackages would be
silently dropped. Next, changes Literal::getExceptionPackage to
return the ExceptionPackage by value to avoid a use-after-free bug
in the interpreter that was surfaced by the new destructor call.

Fixes #3087.

First, adds an explicit destructor call to fix a memory leak in
`Literal::operator=` in which existing `ExceptionPackage`s would be
silently dropped. Next, changes `Literal::getExceptionPackage` to
return the `ExceptionPackage` by value to avoid a use-after-free bug
in the interpreter that was surfaced by the new destructor call.

Fixes WebAssembly#3087.
@tlively tlively requested review from aheejin and dcodeIO September 1, 2020 01:09
Comment on lines 60 to 62
// Avoid calling the destructor, which may not be correct
new (&exn) auto(
std::move(std::make_unique<ExceptionPackage>(*other.exn)));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was already not calling the destructor; the addition of std::move is just a drive-by optimization.

It is important to not call the destructor if the previous type was anything besides exnref, so this line was already correct. The only thing missing was that we had to call the destructor if the previous type was exnref.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one of those days where I think longingly of Rust 🦀

@dcodeIO
Copy link
Contributor

dcodeIO commented Sep 1, 2020

Looking at your fix (yay!), this somewhat reminds me of

const Literal& getSingleValue() {
assert(values.size() == 1);
return values[0];
}

which is (pretty much the only occasion not in literal.h/cpp) passing a literal by reference. Just curious if this might be problematic, or is this fine?

@dcodeIO
Copy link
Contributor

dcodeIO commented Sep 1, 2020

(just gave this a shot, and indeed removing just the & in the code snippet above also makes the sanitizer pass, at least in my branch with the destructor call uncommented. I'm fascinated.)

@tlively
Copy link
Member Author

tlively commented Sep 1, 2020

Looking at your fix (yay!), this somewhat reminds me of

const Literal& getSingleValue() {
assert(values.size() == 1);
return values[0];
}

which is (pretty much the only occasion not in literal.h/cpp) passing a literal by reference. Just curious if this might be problematic, or is this fine?

This is fine as long as the reference returned by this function is never used after the underlying Literals is destroyed. So every caller just has to be careful and not mess up. It would be safer to return the Literal by value rather than by reference, but then we would be making unnecessary copies all over the place.

In Rust, we would have to add a lifetime annotation to this function in order to return the reference, but in return the type system would guarantee that the reference would never be dangling.

@dcodeIO
Copy link
Contributor

dcodeIO commented Sep 1, 2020

In modern times an idiom has grown from it meaning "Any source of great and unexpected troubles", or alternatively "A present which seems valuable but which in reality is a curse".

What is _______ ?

@tlively
Copy link
Member Author

tlively commented Sep 1, 2020

Omg, now the issue is that the copy constructor is implemented in terms of the assignment operator, but the assignment operator now reads from the current value of Literal::type, which is of course uninitialized, so this is UB.

This is quite the pandora's box ;)

@dcodeIO
Copy link
Contributor

dcodeIO commented Sep 1, 2020

Not sure if related, but though I check out your branch, compile with -fsanitize=address and run check.py before approving, which found

.. flatten_dfo_O3_enable-threads.wast
executing:  /path/to/binaryen/lbuild/bin/wasm-opt --flatten --dfo -O3 --enable-threads split.wast --print
warning: no output file specified, not emitting output

=================================================================
==4718==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 8208 byte(s) in 18 object(s) allocated from:
    #0 0x7f59a7b48947 in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x10f947)
    #1 0x7f59a6349fff in wasm::DataFlowOpts::optimizeExprToConstant(wasm::DataFlow::Node*) (/path/to/binaryen/lbuild/bin/../lib/libbinaryen.so+0x895fff)
    #2 0x7f59a6350b2c in wasm::DataFlowOpts::doWalkFunction(wasm::Function*) (/path/to/binaryen/lbuild/bin/../lib/libbinaryen.so+0x89cb2c)
    #3 0x7f59a6352ccd in wasm::WalkerPass<wasm::PostWalker<wasm::DataFlowOpts, wasm::Visitor<wasm::DataFlowOpts, void> > >::runOnFunction(wasm::PassRunner*, wasm::Module*, wasm::Function*) (/path/to/binaryen/lbuild/bin/../lib/libbinaryen.so+0x89eccd)
    #4 0x7f59a60a406c in wasm::PassRunner::runPassOnFunction(wasm::Pass*, wasm::Function*) (/path/to/binaryen/lbuild/bin/../lib/libbinaryen.so+0x5f006c)
    #5 0x7f59a60a4f2e in std::_Function_handler<wasm::ThreadWorkState (), wasm::PassRunner::run()::{lambda()#2}::operator()() const::{lambda()#1}>::_M_invoke(std::_Any_data const&) (/path/to/binaryen/lbuild/bin/../lib/libbinaryen.so+0x5f0f2e)
    #6 0x7f59a7245091 in wasm::Thread::mainLoop(void*) (/path/to/binaryen/lbuild/bin/../lib/libbinaryen.so+0x1791091)
    #7 0x7f59a7246849 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)(void*), wasm::Thread*> > >::_M_run() (/path/to/binaryen/lbuild/bin/../lib/libbinaryen.so+0x1792849)
    #8 0x7f59a59a9cb3  (/usr/lib/x86_64-linux-gnu/libstdc++.so.6+0xd6cb3)

SUMMARY: AddressSanitizer: 8208 byte(s) leaked in 18 allocation(s).

Makes me wonder if we should add (more) AddressSanitizer to CI?

@tlively
Copy link
Member Author

tlively commented Sep 1, 2020

We already have ASan on CI, but we disable the leak checker, probably because all our leaks would cause our CI to fail! It would certainly be good to clean up the remaining leaks and enable leak checking on CI.

@tlively tlively requested a review from kripken September 1, 2020 19:47
@tlively
Copy link
Member Author

tlively commented Sep 1, 2020

To be clear, I think this PR is good to land, though. That other leak can be cleaned up separately.

@aheejin
Copy link
Member

aheejin commented Sep 1, 2020

Come to think of it, I'm not sure if we should use unique_ptr in the first place. Literals are copied every time they are passed by value. And while this looks fine at the moment because it returns a reference, does this mean we consider Flow as a canonical storage for Literals?

Can't we just make this shared_ptr and pass Literals by value anywhere, to make matters simple? (Or we can keep using refererences with no problem in places we currently use references), if you are concerned with performance) I'm actually trying this myself, but this gives me another leak, so I was debugging it, but I now have to leave for today.

#3087 says shared_ptr doesn't work because it increases the union size, but in my machine, sizeof(shared_ptr) is 16, and it seems the size is usually 2x the pointer size (I'm not very sure if this is always true). And we have uint8_t v128[16];, whose size is 16, in the union alerady. So I guess it's gonna be OK for 64bit machines..? Anyway I didn't have problems compiling the program.

@tlively
Copy link
Member Author

tlively commented Sep 1, 2020

does this mean we consider Flow as a canonical storage for Literals?

That's how I've been thinking about it.

Can't we just make this shared_ptr and pass Literals by value anywhere, to make matters simple?

It seems slightly simpler to me to use unique_ptr and just copy the payload as necessary. That would have the benefit that a Literal would be the sole owner of its payload, so it would be safe to mutate the payload. But I don't feel strongly about it.

getSingleValue only returns a reference because it is essentially a shortcut for flow.values[0] with an assert, but I agree that Literals should be passed by value in most situtations.

@dcodeIO
Copy link
Contributor

dcodeIO commented Sep 1, 2020

Hmm I don't recall exactly why shared_ptr didn't just work, but I remember that the assert in binaryen-c.cpp comparing both Literal and BinaryenLiteral to have the same size triggered. Might well be that I did something else wrong leading to it, was trying out lots of stuff at that point. In general I think that if a Literal remains the sole owner of its payload anyway, we might just use an ExceptionPackage* that we new and delete, greatly simplifying the code as I don't see what benefit unique_ptr gives us if we have to exn.~unique_ptr() it explicitly anyway.

@tlively
Copy link
Member Author

tlively commented Sep 1, 2020

In general I think that if a Literal remains the sole owner of its payload anyway, we might just use an ExceptionPackage* that we new and delete, greatly simplifying the code as I don't see what benefit unique_ptr gives us if we have to exn.~unique_ptr() it explicitly anyway.

Hmm, good point. Even if it feels dangerous, it's actually the same as using a unique_ptr in this case. Of course one way to actually make this safer would be to upgrade to C++17 and use a std::variant instead of manually managing a union here. Maybe that would be worth doing soonish...

Copy link
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This specific incremental change looks good.

In general moving more to passing by value as @aheejin said sounds good to me. I don't think interpreter perf matters much which is the one place that would be noticed. (However, we should try to avoid increasing the size if possible, but sounds like that might not be an issue.)

assert(type == Type::exnref);
return *exn.get();
}
ExceptionPackage getExceptionPackage() const;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why move this body to the cpp? all the siblings are here it seems.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since ExceptionPackage is now returned by value, it needs to be a complete type at the point of this function definition. But the definition of ExceptionPackage can't move before the definition of Literal for the same reason, so the only choice was to move getExceptionPackage to be defined out of line, which means it needs to be in the .cpp file.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks...

@tlively
Copy link
Member Author

tlively commented Sep 1, 2020

Cool, I'll land this as-is (even though the unique_ptr doesn't buy us much) for now and we can revisit this and probably make Literal a std::variant later.

@tlively tlively merged commit 85234cb into WebAssembly:master Sep 1, 2020
@tlively tlively deleted the exn-leak branch September 1, 2020 22:56
@kripken
Copy link
Member

kripken commented Sep 1, 2020

+1 for doing c++17 sooner than later in order to use std::variant!

@aheejin
Copy link
Member

aheejin commented Sep 2, 2020

I think using unique_ptr is actually dangerous and a source of potential bugs. Literal is not only used in the interpreter, and it is copied every single time it is created and passed. It is pure luck that we only use Literal for exnref in the interpreter. Making Flow as a canonical source of Literal has not been discussed so far, and that doesn't hold true for Literals outside of the interpreter anyway.

But maybe it's not worth putting much time on this given that exnref can be gone in near future.

@aheejin
Copy link
Member

aheejin commented Sep 2, 2020

Hmm I don't recall exactly why shared_ptr didn't just work, but I remember that the assert in binaryen-c.cpp comparing both Literal and BinaryenLiteral to have the same size triggered.

I didn't have any problem with this clause.. In my machine, both the size of uint8_t v128[16] and shared_ptr is 16. Maybe it can be different in your platform? If this is a platform-depending value, using this can be risky. I googled it and it seems the size is twice the size of the pointer size, but I'm not sure if it is guaranteed everywhere..

@kripken
Copy link
Member

kripken commented Sep 2, 2020

@aheejin I agree unique_ptr is not good here. This PR looks like a good incremental step though, as it doesn't add unique_ptr, and improves what we have already. But yes, let's move towards avoiding unique_ptr here entirely.

@aheejin
Copy link
Member

aheejin commented Sep 2, 2020

Yeah, using it was my fault in the first place ;( But anyway, it will likely to be gone at some point.

This was referenced Sep 5, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Memory leak in Literal

4 participants