Skip to content

Comments

reduce Loc.sizeof by 8 bytes#15199

Merged
WalterBright merged 1 commit intodlang:masterfrom
WalterBright:filenameArray
May 9, 2023
Merged

reduce Loc.sizeof by 8 bytes#15199
WalterBright merged 1 commit intodlang:masterfrom
WalterBright:filenameArray

Conversation

@WalterBright
Copy link
Member

by replacing the pointer to a filename string with a ushort index into an array.

Also reduced the column number to a ushort, because 65,000 character lines is good enough for anyone.

Could probably convert all filename pointers to an index.

Since Loc is used everywhere, this should have a decent impact.

We could get the size to 4 bytes if willing to restrict the number of lines, chars, and files.

@WalterBright WalterBright added the Severity:Refactoring No semantic changes to code label May 8, 2023
@WalterBright WalterBright requested a review from ibuclaw as a code owner May 8, 2023 07:03
@dlang-bot
Copy link
Contributor

Thanks for your pull request, @WalterBright!

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub run digger -- build "master + dmd#15199"

@WalterBright WalterBright force-pushed the filenameArray branch 3 times, most recently from 2cfe6bc to 7d8e4bc Compare May 8, 2023 07:49
@thewilsonator
Copy link
Contributor

FYI @kinke @ibuclaw

@dkorpel
Copy link
Contributor

dkorpel commented May 8, 2023

I have also looked into doing this, but didn't get around to implementing it in a PR yet. Nice work!

Since Loc is used everywhere, this should have a decent impact.

Yes, last time I checked, 40 out of 440 MB RAM was spent on Loc when compiling druntime, so reducing its size is definitely worthwhile.

We could get the size to 4 bytes if willing to restrict the number of lines, chars, and files.

Indeed. One way to do this is to make a Loc a single integer, then have a separate map from this integer to a file index, and also have a map from the remaining file byte offset to line number. Column number can be computed from subtracting the line's file byte offset.

However, that puts a limit of 4GB on the input source code per compiler invocation. That should be enough for anyone today, but I can imagine someone is going to hit that some day.

//printf("setting %s\n", name);
filenames.push(name);
fileIndex = cast(ushort)filenames.length;
assert(fileIndex); // no overflow
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't this result in many duplicates?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ironically, no, for the same reason the lexer doesn't lex the same file multiple times.

@WalterBright WalterBright force-pushed the filenameArray branch 3 times, most recently from 7f3ce4b to f1445c9 Compare May 8, 2023 20:04
@kinke
Copy link
Contributor

kinke commented May 8, 2023

[I think shrinking it further to 4 bytes would have much less of an impact, as we'd need to additionally check for field placements - e.g. Dsymbol.loc currently lies inbetween 2 refs, so would be padded.]

@WalterBright WalterBright merged commit a98aef0 into dlang:master May 9, 2023
@WalterBright WalterBright deleted the filenameArray branch May 9, 2023 00:08
kinke added a commit to symmetryinvestments/dmd that referenced this pull request Sep 13, 2023
* Print an ICE message for Loc.filename overflows
* Increase the Loc size again, raising the file name limit from 64K
  to 4G but keeping the index indirection from dlang#15199. The `Loc`
  size for DMD-as-a-library is 16 bytes then, and 12 bytes otherwise.
kinke added a commit to symmetryinvestments/dmd that referenced this pull request Sep 13, 2023
* Print an ICE message for Loc.filename overflows
* Increase the Loc size again, raising the file name limit from 64K
  to 4G but keeping the index indirection from dlang#15199. The `Loc`
  size for DMD-as-a-library is 16 bytes then, and 12 bytes otherwise.
RazvanN7 pushed a commit that referenced this pull request Sep 14, 2023
* Print an ICE message for Loc.filename overflows
* Increase the Loc size again, raising the file name limit from 64K
  to 4G but keeping the index indirection from #15199. The `Loc`
  size for DMD-as-a-library is 16 bytes then, and 12 bytes otherwise.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Severity:Refactoring No semantic changes to code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants