fix Issue 16513 - slow findExistingInstance #6418

Merged
MartinNowak merged 6 commits into dlang:master from MartinNowak:fix16513 on Jan 13, 2017
Conversation

@MartinNowak
Member

  • construct a proper hash to reduce collisions in the template instance cache
  • in particular, hash all sorts of expressions and combine hashes non-associatively
  • the current hash degrades to a linear search for many template instance arguments; with this
    patch, collisions while compiling Phobos were reduced from ~36K to 376
  • for the reported case (a few thousand alias symbols), compilation is now ~50% faster

- use separate branches for the 4 kinds of template arguments
- count lookups and comparisons
@dlang-bot
Contributor

dlang-bot commented Jan 9, 2017

Fixes Bugzilla 16513: Speed up TemplateInstance.findExistingInstance hash

Member

@UplinkCoder left a comment


This looks fine at a first glance.

- calc hash for all expressions (matching their equals implementation!)
- heavily reduces the number of hash collisions and comparisons
- use mixHash from MurmurHash2 for any order-sensitive hashes
- use ^ instead of + to reduce order-insensitive AA element hashes, as it
  doesn't have the bias towards more significant bits
- fixes Issue 16513
@WalterWaldron
Contributor

Not sure if it's relevant to real-world template parameters, but the mixing function maps two zero hashes to zero, so N zero-hash template parameters will have the same hash as M zero-hash template parameters.

My solution was to use a mixing function like boost::hash_combine and multiply the integer arguments by a large prime.

- avoids multiplication and returns a non-zero hash when combining two
  zero hashes
@MartinNowak
Member Author

My solution was to use a mixing function like boost::hash_combine

Nice, thanks for the tip, this reduced the number of collisions even further, down to 244.
Could be a nice candidate to replace our imul-based mix function for AAs as well.
https://github.com/dlang/druntime/blob/adc254cea657fb7898963b79fd544c5b7bef84f2/src/rt/aaA.d#L300

@WalterWaldron
Contributor

Could be a nice candidate to replace our imul based mix function for AA's as well.

My preference would be removing bad defaults [1] and removing that function altogether.
[1] examples:
dmd/src/clone.d#L741
most of the rt/typeinfo's: druntime/src/rt/typeinfo/ti_uint.d#L27

@MartinNowak
Member Author

Could be a nice candidate to replace our imul based mix function for AA's as well.

My preference would be removing bad defaults[1] and removing that function all together.

I didn't do this back then because we do allow user-defined toHash, and people love using pointers or non-random ids for hashes. But you're right, we shouldn't pay for other people's issues. Then again, the mix op is really cheap.

@WalterWaldron
Contributor

One more thought: it might be profitable to do some additional smearing for pointers to reduce the impact of their alignment.

@MartinNowak
Member Author

One more thought: it might be profitable to do some additional smearing for pointers to reduce the impact of their alignment.

That should be covered by the rotate-like (h << 6) + (h >> 2).

@WalterWaldron
Contributor

That should be covered by the rotate-like (h << 6) + (h >> 2).

I was thinking about the small default (size 4?) tables, and how alignment would impact those cases.

Anyway, I did a little testing myself [1] and found that the std.algorithm.searching unit tests produce a variable number of collisions (~3600 to 11000).
I tried doing the following:

  • using if (h) h = h ^ bsf(h); for pointer hashing
  • using h >>= 4; for pointer hashing (removing min. align)
  • In TemplateInstance.toHash() using mixHash instead of addition.

None of these changes made collisions consistent, so a little more digging is in order.
[1] I was using: make -f posix.mak std/range.test std/traits.test std/typecons.test std/algorithm.test | grep collisions --context=3

@MartinNowak
Member Author

Just debugged this; it seems to mostly happen because of auto ref parameters (of binaryFun). Those are matched to the lvalue-ness of the function template arguments, see

dmd/src/dtemplate.d

Lines 6894 to 6923 in 538a895

if (auto fd = ti.toAlias().isFuncDeclaration())
{
    if (!fd.errors)
    {
        auto fparameters = fd.getParameters(null);
        size_t nfparams = Parameter.dim(fparameters); // Num function parameters
        for (size_t j = 0; j < nfparams; j++)
        {
            Parameter fparam = Parameter.getNth(fparameters, j);
            if (fparam.storageClass & STCautoref) // if "auto ref"
            {
                if (!fargs)
                    goto Lnotequals;
                if (fargs.dim <= j)
                    break;
                Expression farg = (*fargs)[j];
                if (farg.isLvalue())
                {
                    if (!(fparam.storageClass & STCref))
                        goto Lnotequals; // auto ref's don't match
                }
                else
                {
                    if (fparam.storageClass & STCref)
                        goto Lnotequals; // auto ref's don't match
                }
            }
        }
    }
}
So even if the template arguments are the same, the instances don't compare equal in the current implementation. Also, auto ref is resolved during instantiation, so it seems tricky to distinguish that in advance.
Fixing that is somewhat out of scope of this PR.

@UplinkCoder
Member

@MartinNowak go ahead and merge 👍
