Skip to content

Comments

Eliminate static shared this from std/parallelism.d#5470

Merged
dlang-bot merged 3 commits intodlang:masterfrom
andralex:nomoresharedzisinparallelism
Oct 27, 2017
Merged

Eliminate static shared this from std/parallelism.d#5470
dlang-bot merged 3 commits intodlang:masterfrom
andralex:nomoresharedzisinparallelism

Conversation

@andralex
Copy link
Member

Wondering how to get rid of the shared static ~this. cc @dsimcha @wilzbach @quickfur @JackStouffer @MartinNowak

@andralex andralex force-pushed the nomoresharedzisinparallelism branch 3 times, most recently from a4b2323 to 8bfa663 Compare June 11, 2017 19:04
@andralex andralex force-pushed the nomoresharedzisinparallelism branch from 8bfa663 to 7f2a8b0 Compare June 29, 2017 15:21
@dlang-bot
Copy link
Contributor

Thanks for your pull request, @andralex!

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

@JackStouffer
Copy link
Contributor

Maybe this is something I don't yet understand about D, but why the cast to @trusted pure instead of marking the impl function as @trusted pure?

@andralex
Copy link
Member Author

Maybe this is something I don't yet understand about D, but why the cast to @trusted pure instead of marking the impl function as @trusted pure?

The cast doesn't only add the necessary attributes, it forces all attributes.

@JackStouffer
Copy link
Contributor

I understand why we're getting rid of module constructors in general, but IMO this seems like one area where they are working as intended. Especially since the code is now more complex, and to get this work without constructors, an unsafe cast has to be used to lie to the user about the purity of the function.

@schveiguy I'm curious about your opinion on the purity question, as you had a lot of insight during the pure malloc discussion.

@andralex andralex force-pushed the nomoresharedzisinparallelism branch from 7f2a8b0 to d712d81 Compare June 30, 2017 13:59
@andralex
Copy link
Member Author

Updated the code with a reusable function that lazily initializes any atomically-loadable value. I plan to use that elsewhere and perhaps expose it publicly in the future.

@JackStouffer module constructors do a fine job but the standard library undergoes extra demands. It's expected of us to wander off the beaten path in the implementation.

@andralex
Copy link
Member Author

cc @WalterBright @MartinNowak @CyberShadow need this reviewed because I can reuse it elsewhere - thx!

import core.sys.posix.unistd : _SC_NPROCESSORS_ONLN, sysconf;
totalCPUs = cast(uint) sysconf(_SC_NPROCESSORS_ONLN);
}
shared T result = outOfBandValue;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static shared?

auto local = atomicLoad(result);
if (local != outOfBandValue) return local;
// Long path
local = (cast(size_t function() @nogc nothrow pure @trusted) &initializer)();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should check the type of initializer, because it may be delegate, a literal value, or any function pointer type pointing to a function taking one more arguments (which should be statically disallowed).

{
return (cast(uint function() @nogc nothrow pure)
&totalCPUsImpl)();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't you use lazilyInitializedConstant for totalCPUs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, good idea brb

@andralex andralex force-pushed the nomoresharedzisinparallelism branch from d712d81 to 9408826 Compare October 25, 2017 21:56
@andralex
Copy link
Member Author

@ZombineDev updated

For that reason, no special precautions are taken so `initializer` may be called
more than one time leading to benign races on the cached value.

In the quiescent state the cost of the function is an atomic load from a global.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need Params: and Returns: sections.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

function is private, but will do

{
static shared T result = outOfBandValue;
// Short path
auto local = atomicLoad(result);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should local be const ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, for generality

auto local = atomicLoad(result);
if (local != outOfBandValue) return local;
// Long path
local = initializer();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use different variable name here, as this use of local is completely independent of the previous one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not getting it... it's the same var

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider the trivial case:

int i = 4;
foo(i);
i = 5;  // really the start of a different variable
foo(i);

By the idea that each variable should have the tightest scope it can, when a variable has multiple disjoint lifetimes it should be split into multiple independent variables. It also often enables a variable to become const.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WalterBright I understood the larger point you made with your first comment, but here local is and stays the local representation of the data meant to be cached throughout. We really are in good shape here.

import core.cpuid : datacache;
size_t lineSize = 0;
foreach (cachelevel; datacache)
foreach (ref cachelevel; datacache)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const ref ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

uint totalCPUsImpl() @nogc nothrow
{
static shared uint result;
auto localResult = atomicLoad(result);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

auto nameStr = "hw.ncpu\0".ptr;
}

localResult = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent use of localResult should not reuse the name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's in the same arc...

}

localResult = 0;
size_t len = uint.sizeof;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use const len instead of size_t len

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@property uint defaultPoolThreads() @trusted
{
return atomicLoad(_defaultPoolThreads);
auto local = atomicLoad(_defaultPoolThreads);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@WalterBright
Copy link
Member

As I recall, there was a correct way to do double checked locking using TLS variables. Perhaps this would be faster than going through atomicLoad().

@John-Colvin
Copy link
Contributor

John-Colvin commented Oct 26, 2017

@WalterBright an easy way to use tls for the fast path

    static nothrow T impl()
    {
        // as Andrei wrote it, reading/writing to global shared ...
    }
    
    static nothrow tlsCache()
    {
        static T tls = outOfBandValue;
        if (tls == outOfBandValue)
            tls = impl();
        return tls;
    }

@andralex andralex force-pushed the nomoresharedzisinparallelism branch 3 times, most recently from cf00ba7 to f0c986f Compare October 26, 2017 15:29
@andralex andralex force-pushed the nomoresharedzisinparallelism branch from f0c986f to 286930a Compare October 26, 2017 16:10
@andralex
Copy link
Member Author

I updated the code to use a thread-local cache, starts to feel a bit overcooked. Anyhow it works nicely.

@dlang-bot dlang-bot merged commit 87bdeae into dlang:master Oct 27, 2017
@jmdavis
Copy link
Member

jmdavis commented Oct 27, 2017

This PR broke dustmite

DustMite/dustmite.d(112): Error: function std.parallelism.lazilyInitializedConstant!(immutable(uint), 4294967295u, totalCPUsImpl).lazilyInitializedConstant is not accessible from module dustmite
../phobos/std/algorithm/iteration.d(1106):        instantiated from here: FilterResult!(__lambda2, string[])
DustMite/dustmite.d(106):        instantiated from here: filter!(string[])
DustMite/dustmite.d(163): Error: function std.parallelism.lazilyInitializedConstant!(immutable(uint), 4294967295u, totalCPUsImpl).lazilyInitializedConstant is not accessible from module dustmite

It looks like totalCPUs is now unusable outside of std.parallelism, because lazilyInitializedConstant is package, and aliases don't change the access level of what they alias.

@PetarKirov
Copy link
Member

PetarKirov commented Oct 27, 2017

@jmdavis I propose to (temporary) revert this change in #5815.

PetarKirov added a commit to PetarKirov/phobos that referenced this pull request Oct 27, 2017
…parallelism"

This reverts commit 87bdeae, reversing
changes made to b5e6365.
@andralex
Copy link
Member Author

Oy what a bug that by definition can't be detected through unittesting :).

dlang-bot added a commit that referenced this pull request Oct 27, 2017
Fix visibility issue for #5470
merged-on-behalf-of: Petar Kirov <ZombineDev@users.noreply.github.com>
@MartinNowak
Copy link
Member

That's the public alias to private symbol thing people mentioned recently. This will soon be fixed and should pass, atm. we still have the old (buggy) access checker running.
dlang/dmd#7241

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants