Skip to content

Conversation

@VSadov
Copy link
Member

@VSadov VSadov commented Jan 6, 2021

Also need to make initialization of OpenSsl shim on-demand.
(see: #46771 and dotnet/sdk#15229).
Otherwise any singlefile app on macOS tries to initialize the shim and aborts if OpenSsl is not available, which is the default situation.

@VSadov
Copy link
Member Author

VSadov commented Jan 6, 2021

The change worked on OSX x64, the tests are enabled, but now there is a test failure.

@VSadov
Copy link
Member Author

VSadov commented Jan 8, 2021

Figured the reasons for the failures - #46771

We load OpenSSL eagerly and it is not generally available on macOS. Even when installed via brew, it not symlinked into /usr/local, because macOS provides LibreSSL

The eagerness needs to be fixed in order to reenable the tests.

@VSadov
Copy link
Member Author

VSadov commented Jan 8, 2021

CC: @janvorli @agocke

@VSadov VSadov force-pushed the hostOSXci branch 2 times, most recently from ef4c4d7 to db47414 Compare January 9, 2021 18:08
@VSadov
Copy link
Member Author

VSadov commented Jan 9, 2021

@janvorli - Could you take a look at the portable Ssl shim initialization change?
The idea is to just delay it until the first use. Thanks!

// If 1.0, call the 1.0 one.
// Otherwise call the 1.1 one.
#ifdef FEATURE_DISTRO_AGNOSTIC_SSL
InitializeOpenSSLShim();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The semantics of the containing method is that it can be called multiple times (thus the name starting with Ensure). The EnsureOpenSsl10Initialized / EnsureOpenSsl11Initialized both work that way:

pthread_mutex_lock(&g_initLock);
if (g_locks != NULL)
{
// Already initialized; nothing more to do.
goto done;
}

So we should make the InitializeOpenSSLShim behave the same way or maybe just put the InitializeOpenSSLShim call into those functions, which would seem better.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I will try moving initialization call into these functions.

}
}

internal static partial class OpenSsl
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not clear to my why the CryptoInitializer is not sufficient and this new initialization invocation is needed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CryptoInitializer is still used to initialize. What was insufficient is to have only Interop.Crypto to trigger the initialization. Interop.OpenSsl is another surface area that should trigger initialization as there are no guarantees that Interop.Crypto methods are always the first to be used.

One fairly common way to hit uninitialized shim was via Interop.OpenSsl.OpenSslVersionNumber, for example.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems we should use the g_initLock around the InitializeOpenSSLShim here (and make its definition unconditional)

Copy link
Member Author

@VSadov VSadov Jan 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems we should use the g_initLock

Is there anything in EnsureOpenSsl10Initialized that is not thread-safe?
I made it conditional in EnsureOpenSsl10Initialized only because there was already code preventing multiple calls.
I do not think it is necessary though. If called twice or concurrently, would it just do exactly the same thing?

make its definition unconditional

Do you mean remove #ifdef FEATURE_DISTRO_AGNOSTIC_SSL ?
Would that compile when we do not have the shim?

Copy link
Member Author

@VSadov VSadov Jan 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it looks like the change just reintroduced the crash back. Tests are failing.

We need to do initialization of the shim in CryptoNative_EnsureOpenSslInitialized. EnsureOpenSsl10Initialized is too late, by then we have crashed already while checking Ssl state.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CryptoNative_EnsureOpenSslInitialized is also convenient, since we know that it will be called exactly once - from static constructor.
Overall I now think the original location of the InitializeOpenSSLShim had fewer problems.

Copy link
Member Author

@VSadov VSadov Jan 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noticed that we also have CryptoNative_EnsureLibSslInitialized, which tries to do similar initialization. Evidently, it is not always called or called early enough, since we have crashes.

The only difference is that it does DetectCiphersuiteConfiguration I think it should be folded into CryptoNative_EnsureOpenSslInitialized.

Then we would know that it is called, and called exactly once and early enough.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that one is for openssl and the other is for other ssl (macOS)

Copy link
Member Author

@VSadov VSadov Jan 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like one calls another. However the calee (EnsureOpenSslInitialized) could be called on its own and would need to ensure that shim is initialized anyways.

I think doing all this initialization through a single entry point may make it easier to ensure that everything gets initialized, in the right order, and only once.

@janvorli
Copy link
Member

cc: @bartonjs - can you please take a look at the changes to the SSL initialization in this change?

@bartonjs
Copy link
Member

We load OpenSSL eagerly and it is not generally available on macOS. Even when installed via brew, it not symlinked into /usr/local, because macOS provides LibreSSL

It should exist on all of our test machines, since we have to test the existing OpenSSL support.

Copy link
Member

@bartonjs bartonjs Jan 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function now needs to ensure mutual exclusion because two different managed libraries could trip calls in parallel.

Specifically, first caller does work, subsequent callers have to block until the first caller finishes (then exit having done no further work).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this function looks relatively idempotent. It might be OK.

Copy link
Member Author

@VSadov VSadov Jan 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the mutual exclusion is the reason why we do not call EnsureOpenSslInitialized directly, but rely on triggering a static constructor, which is guaranteed to execute once.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well... it's guaranteed to execute once per type. But since it's an internal type included in multiple assemblies (System.Security.Cryptography.Algorithms, System.Security.Cryptography.OpenSsl, System.Security.Cryptography.X509Certificates, and System.Net.Security) there are multiple cctors which can call it.

If all cctors execute on a single thread then mutex is guaranteed, but my understanding is they run on the invoking thread, so this has to account for 4-library parallel callers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And yes, I think it is idempotent. Shim init basically just binds pointers to implementing methods. Doing this twice should produce the same results.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since it's an internal type included in multiple assemblies

Oh right. That will call for every clone of the CryptoInitializer, which would be redundant, but ok thing to do.

@VSadov
Copy link
Member Author

VSadov commented Jan 11, 2021

It should exist on all of our test machines

I used to have it on my Air too, but the library is no longer usable as-is. I think installing Big Sur broke it. (It was reported to be reproducible on Catalina too, so it could be broken earlier)
Reacquiring via brew was not sufficient. I managed to fix it locally by manipulating DYLD_LIBRARY_PATH. Perhaps lab machines have the same issue.

Anyways, I think the lab issue should be addressed separately. What user sees is that singlefile HelloWorld crashes with No usable version of libssl was found. It is hard to sell as a desirable behavior :)

@agocke
Copy link
Member

agocke commented Jan 12, 2021

I can't seem to get it to break on one of my machines, so it's certainly available somehow... I have another machine that's pretty much blank so I'll see if I can repro the issue there and narrow down what's required.

@VSadov
Copy link
Member Author

VSadov commented Jan 12, 2021

What worked for me locally was:

  • brew install openssl@1.1
  • export DYLD_LIBRARY_PATH="/usr/local/opt/openssl@1.1/lib"

@VSadov
Copy link
Member Author

VSadov commented Jan 12, 2021

I still think we should initialize the shim lazily, as a part of overall openssl initialization, that is already lazy.

@VSadov
Copy link
Member Author

VSadov commented Jan 12, 2021

@janvorli @bartonjs - are we ok with the fix, or do you have more concerns or need to think more about this?

@bartonjs
Copy link
Member

I'm probably OK with the overall shape, but there's an outstanding comment to call to Initialize is supposed to move into each of the two Ensure functions. (Ensure10 already locks itself for mutual exclusion, Ensure11 doesn't since it only does idempotent things... for Ensure10 let's go ahead and put the call to Initialize after the "oh, I'm already done" bailout)

@VSadov
Copy link
Member Author

VSadov commented Jan 12, 2021

Attempt to move the initialize to Ensure10 and Ensure11 brought the crash back and I reverted it. Initializing there is too late. The shim must be initialized before we know which one of these two to call.

@VSadov
Copy link
Member Author

VSadov commented Jan 12, 2021

It looks like all the comments about how we went that way (moving initialization lower) and back are not presented by GitHub. I am guessing because the code was reverted.

Moving initialize to Ensure10 and Ensure11 brought more problems that it solved. - since one of them uses lock, should the other one use a lock as well, should it be the same shared lock, can that shared lock deadlock. Why all this synchronization is suddenly needed?

And at the end it did not work anyways, since we need to access SSL state before calling these two methods just to decide which one to call and that caused the same crashes as before.

@VSadov
Copy link
Member Author

VSadov commented Jan 12, 2021

I think we need to confirm one thing - is the shim initialization idempotent?
@janvorli - is that correct?

If it is idempotent, we have a lot of freedom where to call it, as long as that is before the first use.

The current location seems convenient. It is early enough and in one place. Also being called from static constructor prevents most redundant calls. We may still see one call per copy of the init class, but that redundancy should not be a big deal.

@VSadov
Copy link
Member Author

VSadov commented Jan 12, 2021

It looks like initialization is basically just a sequence of fnXX_ptr = (TYPEOF(fn))(dlsym(libssl, fnXXname)); for a bunch of functions.
I do not see how this can produce different results if called twice or concurrently.

@bartonjs
Copy link
Member

Moving initialize to Ensure10 and Ensure11 brought more problems that it solved.

OK.

The only difference is that it does DetectCiphersuiteConfiguration I think it should be folded into CryptoNative_EnsureOpenSslInitialized.

If that simplifies the delayed nature, OK. The two are currently separated in that one initializes libcrypto (the cryptographic primitives and such) and the other initializes libssl (the library which knows how to do TLS and uses libcrypto to do it). That got a little blurry already with OpenSSL 1.1 and our initializer there, so if we want to have just one waker-upper that's OK with me. If it still is sensible as two (the libssl configuration one is only called by System.Net.Security so it has never had a "maybe I happen more than once") then we'd need to double check idempotency of that routine.


Assuming Jan V is OK with letting the function pointer binder run more than once I'm OK with the change (unless something comes up in a future commit, of course 😄).

@janvorli
Copy link
Member

If we let the InitializeOpenSSLShim run more than once in debug / checked build, you'd get an assert here:


We could remove the assert, but then we would end up with the library having a reference count bumped possibly more than once there. So besides removing it, I'd suggest to use compare exchange to set the libssl and call dlclose on the handle in case it was already set. That way we would bump the ref count only by one. The rest of the code should just work - if other thread opened the library before us, we would just regenerate the pointers (and we need to do that, since we don't know if the other thread didn't get scheduled out in the middle of filling the pointers)

@VSadov
Copy link
Member Author

VSadov commented Jan 12, 2021

So, we need:

  • take care of libssl ref count (interlock exchange + dlclose suggestion)
  • make sure EnsureLibSslInitialized is idempotent.

@VSadov
Copy link
Member Author

VSadov commented Jan 12, 2021

It looks like EnsureLibSslInitialized may need a init flag/lock to be callable more than once. Or split LibSsl back into a separate initializer.

I hoped that by folding that into CryptoNative_EnsureOpenSslInitialized we would avoid issues of multiple invocations. However, since we duplicate initializer classes that is not the case and that was the main goal for the merge.

I will revert the change that merges LibSsl initializer into OpenSsl initializer.

@VSadov
Copy link
Member Author

VSadov commented Jan 13, 2021

@janvorli @bartonjs - in the last change:

  • use interlocked operation to set libssl static handle in the shim initialization and do dlclose if another thread did it faster.
  • reverted the commit that folds libSsl initialization into CryptoNative_EnsureOpenSslInitialized

Now we are back to the original code where libSsl has its own initializer CryptoNative_EnsureLibSslInitialized called from a static constructor in System.Net.Security to guarantee a single call.
CryptoNative_EnsureLibSslInitialized calls CryptoNative_EnsureOpenSslInitialized as before, and that will initialize libssl shim. And since the shim initialization is idempotent (it needs to be anyways), we are ok with rare cases when that happens to be redundant/concurrent.

Copy link
Member

@janvorli janvorli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

@VSadov
Copy link
Member Author

VSadov commented Jan 13, 2021

Thanks!!!

@ghost
Copy link

ghost commented Jan 13, 2021

Hello @VSadov!

Because this pull request has the auto-merge label, I will be glad to assist with helping to merge this pull request once all check-in policies pass.

p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (@msftbot) and give me an instruction to get started! Learn more here.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants