Skip to content

Rework #8542 wrt. deterministic IDs via generateIdWithLoc()#8633

Merged
dlang-bot merged 1 commit intodlang:masterfrom
kinke:stableIDs
Apr 15, 2019
Merged

Rework #8542 wrt. deterministic IDs via generateIdWithLoc()#8633
dlang-bot merged 1 commit intodlang:masterfrom
kinke:stableIDs

Conversation

@kinke
Copy link
Contributor

@kinke kinke commented Aug 28, 2018

No description provided.

@dlang-bot dlang-bot added the Review:WIP Work In Progress - not ready for review or pulling label Aug 28, 2018
@dlang-bot
Copy link
Contributor

Thanks for your pull request and interest in making D better, @kinke! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please verify that your PR follows this checklist:

  • My PR is fully covered with tests (you can see the annotated coverage diff directly on GitHub with CodeCov's browser extension
  • My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
  • I have provided a detailed rationale explaining my changes
  • New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.


If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub fetch digger
dub run digger -- build "master + dmd#8633"

@kinke kinke force-pushed the stableIDs branch 2 times, most recently from 9a5b054 to 5c6e203 Compare August 29, 2018 20:34
@kinke
Copy link
Contributor Author

kinke commented Aug 29, 2018

Custom hashing of Loc is necessary because of mixins, where the const(char)* Loc.filename pointers diverge, e.g.:

test/runnable/test18880.d-mixin-7 (0x558d3e709930)
test/runnable/test18880.d-mixin-7 (0x558d3e70a120)
test/runnable/test18880.d-mixin-7 (0x558d3e70a3f0)

Btw, 2.081.1 isn't able to compile master, that's pretty bad (I went back to 2.077).

@kinke kinke changed the title [WIP] Rework #8542 wrt. deterministic IDs via generateIdWithLoc() Rework #8542 wrt. deterministic IDs via generateIdWithLoc() Aug 29, 2018
@kinke
Copy link
Contributor Author

kinke commented Aug 30, 2018

Some remarks:

  1. Hashing Loc is case-sensitive wrt. file path on Windows too, unlike opEquals(). So that's not really correct.
  2. The existing Loc.equals() method ignoring charnum without explicit -vcolumns is pretty weird IMO; it'd be better renamed to make that clear.
  3. Loc.filename should probably be a proper dmd.root.filename.FileName struct (contains a single const(char)* pointer). And FileName should probably get a proper opEquals() and toHash() instead of the Loc. If it contained a D string instead of the C one, it would probably just work out of the box without explicit implementations, at the (memory) cost of an additional size_t per Loc; that is, if we ditched the case-insensitiveness on Windows.
  4. The generated IDs aren't really unique per module, but rather per filename (same as current master; there should be no functional changes wrt. generated IDs with this PR). The mixin example above illustrates that (=> artificial filenames with extra suffix). Not sure if there can thus still be collisions between the actual module (foo.d) and a mixin (foo.d-mixin-<N>); base identifier, line and column would have to match for that to happen.

@ibuclaw
Copy link
Member

ibuclaw commented Aug 30, 2018

If it contained a D string instead of the C one, it would probably just work out of the box without explicit implementations, at the (memory) cost of an additional size_t per Loc.

I can't see this happening, Loc is a critical object used everywhere. Increasing its size is not really an available option.

@Geod24
Copy link
Member

Geod24 commented Aug 31, 2018

I can't see this happening, Loc is a critical object used everywhere. Increasing its size is not really an available option.

Well that's exactly the goal I've been working towards recently...

@ibuclaw
Copy link
Member

ibuclaw commented Aug 31, 2018

Well that's exactly the goal I've been working towards recently...

I'll just leave this review here: #7821 (comment)

@jacob-carlborg
Copy link
Contributor

I can't see this happening, Loc is a critical object used everywhere. Increasing its size is not really an available option.

Isn't Loc mostly passed by const ref? Or is that not your concern?

BTW, in Clan the corresponding type is only 32 bit large. The documentation says:

"Technically, a source location is simply an offset into the manager's view of the input source"

[1] https://clang.llvm.org/doxygen/classclang_1_1SourceLocation.html

@ibuclaw
Copy link
Member

ibuclaw commented Sep 2, 2018

Isn't Loc mostly passed by const ref? Or is that not your concern?

It's not how it's passed around, increasing the size of Loc increases the size of every object in the front end.

If we want memory consumption to go in any direction, we would want to keep it down.

BTW, in Clan the corresponding type is only 32 bit large. The documentation says:

The same is also true for GCC's location_t.

Having a location map might be reasonable to ensure that each location is only allocated once. You could make it more useful however by storing location ranges - start and end of all tokens and expressions.

I digress from the topic of this PR however...

@kinke
Copy link
Contributor Author

kinke commented Sep 3, 2018

toHash() and opEquals() are now consistent (both case-sensitive on Windows). I don't really want to create an uppercase copy each time the hash is requested on Windows, and assume that once we operate with Loc instances, there are no case-insensitive duplicate filenames to worry about. That assumption is probably not valid for generic FileNames, that's why I left it in Loc. equals() should probably be case-sensitive then too actually.

If you think modifying ubiquitous Loc may have too much impact, I can move custom hashing and equivalence to the Key struct.


/* Checks for equivalence,
* a) comparing the filename contents (not the pointer), case-
* insensitively on Windows, and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, macOS also has a case insensitive filesystem by default.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, interesting. The FileName.equals() function used here only has a Windows special case.

// 2.082+
counters.update(Key(loc, prefix),
() => 1u, // insertion
(ref uint counter) // update
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to use ref here, although it makes no sense in this case. To be fair, I haven't tried returning void and incrementing the ref directly.

Copy link
Contributor Author

@kinke kinke Sep 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, thanks @n8sh for this AA.update(), I was looking for it in the docs (didn't want to hash twice for insertion), found it, implemented it, and then figured it's new in 2.082. ;)

@kinke
Copy link
Contributor Author

kinke commented Feb 2, 2019

Rebased + squashed. Can someone remove the WIP label please?

@jacob-carlborg jacob-carlborg removed the Review:WIP Work In Progress - not ready for review or pulling label Feb 2, 2019
@jacob-carlborg
Copy link
Contributor

Can someone remove the WIP label please?

Done.

@kinke
Copy link
Contributor Author

kinke commented Apr 14, 2019

@thewilsonator: Please try to push this. It was only supposed to rectify the IMO horrible code introduced with the important fix in #8542; I don't see why it has to rot here for more than half a year now.

If you think modifying ubiquitous Loc may have too much impact, I can move custom hashing and equivalence to the Key struct.

That 'offer' still stands and is IMO the only question mark; I'll be on vacation though.

@thewilsonator
Copy link
Contributor

I've got it in the AGM at dconf to make Loc an offset into a source manager, that would make the problem of making Loc bigger moot.

@dlang-bot dlang-bot merged commit 07f379f into dlang:master Apr 15, 2019
@kinke kinke deleted the stableIDs branch April 15, 2019 21:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants