function argument stringification: `__traits(getCallerSource, symbol)` (variadics work too) by timotheecour · Pull Request #7821 · dlang/dmd

timotheecour · 2018-02-01T00:09:18Z

Stringify caller's function argument, analog to C's # stringification macro (except it's type safe).

A main use case is to allow finally writing nice logging and debug tools.

Purely library based alternatives all have drawbacks, see #7811 (comment) for example.

It also works with UFCS, special tokens FILE etc, see examples in test/runnable/testargnames.d

Eg:

string log(T...)(T a, string[T.length] names = __traits(getCallerSource, a)){
  import std.conv;
  return text(name, ": ", a);
}
log(1+2, "foo") // names = ["1+2", `"foo"`]

EDIT

added variadic support with clean syntax and semantics
I've change from __ARG_STRING__!a to __traits(getCallerSource, a) after feedback from reviewers [2]

NOTES

this is a new implementation that avoids some of the issues in [WIP] function argument stringification: __ARG_STRING__ (replaced by 7821) #7811 ; in this new implementation, parsed D source file contents are kept in memory until after end of semantic analysis [1]
currently only allows binding to a function parameter (variadic or not); other cases such as template parameters could be handled in future PR without breaking backward compatibility
i’ve dfmt-ted test/runnable/testargnames.d but not the rest (option to format only diff-ed lines (like git clang-format) dlang-community/dfmt#159 would help…) ; please ignore whitespace issues for now

[1] I've measured that impact on memory is minimal [3] Optimizations are possible if needed, if we find use cases where the memory increase is more significant. Use new hidden flag -disposesrc to reproduce prexisting behavior of disposing of file contents right after parsing (getSource will in this case resort to using expr.toChars which was similar to the previous implementation, cf #7811); you can measure memory with and without -disposesrc

[2] pros/cons of __traits(getCallerSource, a) vs __ARG_STRING__:

no new keyword
that seems like the 1st use of __trait inside a function declaration
pros/cons of __ARG_STRING__:
syntax is less verbose, which matters if it's used often
ARG_STRING is natural along the line of other tokens that are used for default initialization in caller’s namespace (FILE + friends)
but these other tokens don't use template arguments, unlike __ARG_STRING__

NOTE: using getCallerSource instead of getSource to avoid conflating with a future potential getSource

[3] detailed reproducible memory tests

brew install gnu-time

gtime -v generated/osx/release/64/dmd -unittest -version=StdUnittest -version=import_std -main -o- -i=std test/runnable/imports/import_world.d
#+ same with added `-disposesrc`
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:52.96
Maximum resident set size (kbytes): 7711924

time is always very similar (with +/- 1 second)
Maximum resident set size varies quite a bit between runs (high variance) but the extra memory involved (at most 10M from du -sh phobos/std) is just 10M/7711924k = 1/771 of the total memory usage, so negligible.

On an extreme, unrealistic example that imports all of phobos and does 0 work (ie no unittests or main code), the total memory usage will be less so the fraction from keeping file contents in memory relative to total memory consumption will be higher, but still acceptable:

gtime -v generated/osx/release/64/dmd -version=import_std -main -o- -i=std  test/runnable/imports/import_world.d
#+ same with -disposesrc
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:02.75 # same regardless
Maximum resident set size (kbytes): 663340 # without -disposesrc
Maximum resident set size (kbytes): 655432 # with -disposesrc

this 2nd testing scenario has very low variance, and we get:
(663340-655432)/655432.=1%
which is acceptable (and again, this is the unfavorable case as explained)

dlang-bot · 2018-02-01T00:09:19Z

Thanks for your pull request, @timotheecour! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.

Some tips to help speed things up:

smaller, focused PRs are easier to review than big ones
try not to mix up refactoring or style changes with bug fixes or feature enhancements
provide helpful commit messages explaining the rationale behind each change

Bear in mind that large or tricky changes may require multiple rounds of review and revision.

Please see CONTRIBUTING.md for more information.

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

jacob-carlborg · 2018-02-01T07:39:37Z

test/runnable/testargnames.d

+
+  // NOTE: __ARG_STRING__ will pretty-print the expression so passing `t+t`
+  // or `t + t` won't make a difference:
+  // TODO: how to do formatting ignoring with dfmt?


See the documentation: https://github.com/dlang-community/dfmt#disabling-formatting

thanks, done!

wilzbach · 2018-02-01T10:58:58Z

I'm ok with doing that change if there's consensus [2]

I think we can get there quickly:

Vote here with "UP" if you like __traits
Vote with "DOWN" if you like __ARG_STRING__
Vote with "Confused" if you prefer a third, yet to be named option (or don't like this in general)
Vote with "Heart" if you don't care about __traits vs. __ARG_STRING__, but just want to finally use this in your code

marler8997 · 2018-02-01T17:56:05Z

Note that keeping the files in memory may open up other optimizations and features that weren't available before. This allows strings to be used directly from source. This enables better error messages in the semantic phase since the original source is available to pull context from. The current design which frees source files before semantic analysis may have been holding back superior designs for various solutions. I pose this question to the more seasoned dmd developers who may know of different ways this change could be taken advantage of.

wilzbach · 2018-02-01T18:03:29Z

One thing that comes to my mind is __traits(documentation) - #6872
It was closed because of the slight increase in memory consumption :/

timotheecour · 2018-02-02T00:30:09Z

One thing that comes to my mind is __traits(documentation) - #6872
It was closed because of the slight increase in memory consumption :/

added evaluation (+ how to reproduce) of effect on max memory usage + time, see updated PR description. Seems like 0 effect on time and negligible effect on memory.

ibuclaw · 2018-02-03T10:40:18Z

This enables better error messages in the semantic phase since the original source is available to pull context from.

You have file, line, and column information, you can pull this out of the file at any time during the error diagnostic and read the slice it points to.

ibuclaw · 2018-02-03T10:44:59Z

I think a traits would be better. __ARG_STRING__ doesn't really look natural alongside other tokens.

You seem to have an unrelated change included here also (-vbytes), but I'll give the benefit of the doubt that this is still wip.

d-random-contributor · 2018-02-03T12:05:16Z

__traits(source) - it can be subsequently extended to get source of other things like lambda passed as template argument, IIRC somebody asked for such feature.

d-random-contributor · 2018-02-03T12:14:14Z

@ZombineDev was it you?

timotheecour · 2018-02-04T06:13:57Z

You have file, line, and column information, you can pull this out of the file at any time during the error diagnostic and read the slice it points to.

the file contents (before this PR) is cleared from memory right after parsing, and before semantic analysis, so it was preventing to do that. With this PR, we have access to it;
file,line,col (even with sources still available) doesn't allow O(1) access in a slice; with this PR, we have O(1) access via bytes offset

timotheecour · 2018-02-04T06:17:48Z

You seem to have an unrelated change included here also (-vbytes), but I'll give the benefit of the doubt that this is still wip.

the change is related to this PR: this PR introduces bytes offset which was needed for O(1) access to source content at a given Loc (again, line+col wasn't enough for O(1) access). -vbytes just exposes this bytes offset in compiler messages and is the analog to -vcolumns. Right now it's set as undocumented flag but I can make it a regular documented flag.

ntrel · 2018-02-05T17:23:37Z

Just wanted to mention that a possible enhancement would be to support multiple arguments using , as a separator e.g.:

void f(int a, int b, string args = __traits(getArgument, a, b)) {writeln("Arguments: ", args);}
f(1, n % 3);

args would be 1, n % 3. In fact, using __traits(getArgument) without passing any symbols could be equivalent to passing the corresponding parameter names for each argument provided at the call site - default arguments would not be included. That would be a useful shortcut for logging.

…TRING__

WalterBright · 2018-02-09T07:30:00Z

src/dmd/dmodule.d

        modules = new DsymbolTable();
    }

+    extern (C++) void releaseResources()


Global comment: Add Ddoc comment for all new functions.

WalterBright · 2018-02-09T07:31:09Z

src/dmd/expression.d

    }
+    else
+    {
+        pragma(inline, true)


why force inline this?

WalterBright · 2018-02-09T07:32:37Z

src/dmd/expression.d


+/***********************************************************
+ */
+extern (C++) final class ArgStringInitExp : DefaultInitExp


What is an ArgStringInitExp? There's no documentation here.

WalterBright · 2018-02-09T07:33:49Z

src/dmd/expression.d

+        {
+            version(with_debug){
+                import std.conv:text;
+                writelnL(file, ":", line);


where/what is writelnL()?

WalterBright · 2018-02-09T07:34:16Z

src/dmd/expression.d

+        extern(D)
+        final void error(string file=__FILE__, int line=__LINE__)(const(char)* format, ...) const
+        {
+            version(with_debug){


where/what is with_debug?

WalterBright · 2018-02-09T07:38:08Z

src/dmd/expressionsem.d

+
+                if(auto argexp = arg.isArgStringInitExp())
+                {
+                  if(argexp.ident is null)


Please don't invent a different indentation style. It's 4 columns.

WalterBright · 2018-02-09T07:39:08Z

src/dmd/expressionsem.d

+                    }
+                    if(is_variadic_param){
+                        arg=argexp.resolveArgStrings(loc, sc, values);
+                    } else{


please don't invent new formatting styles.

WalterBright · 2018-02-09T07:41:00Z

src/dmd/expressionsem.d

                arg = inlineCopy(arg, sc);
-                // __FILE__, __LINE__, __MODULE__, __FUNCTION__, and __PRETTY_FUNCTION__
-                arg = arg.resolveLoc(loc, sc);
+


What does the following big block of code do? I don't have a clue, so I have nothing to compare the code to. Also, it's so large consider breaking it out into a separate function.

WalterBright · 2018-02-09T07:41:55Z

src/dmd/expressionsem.d

+                {
+                  if(argexp.ident is null)
+                  {
+                    error(argexp.loc, "argexp.ident null");


This error message refers to compiler internals and will mean nothing to the user.

WalterBright · 2018-02-09T07:42:52Z

src/dmd/expressionsem.d

+                            enum vararg_name = "_param_";
+                            // TODO: define/use startsWith
+                            auto ident2b=ident2.toString;
+                            if(ident2b.length>vararg_name.length && ident2b[0..vararg_name.length]==vararg_name)


Global comment: Please use a space before and after operators. Please format code like in the rest of the front end.

WalterBright · 2018-02-09T07:44:03Z

src/dmd/expressionsem.d

    }
 }
+
+// Get source code for `expr`


Why is this different from the code in hdrgen.d which turns ASTs back into strings?

WalterBright · 2018-02-09T07:45:17Z

src/dmd/globals.d

    uint linnum;
    uint charnum;
+    // 0-based offset from origin in bytes (allows O(1) access)
+    uint bytes;


Adding this field, since Loc is attached to every AST node, will result in a very large increase in memory consumption.

An offset should be called offset not bytes.

WalterBright · 2018-02-09T07:47:13Z

src/dmd/globals.d

+            {
+                // instead of ',' to avoid confusing with charnum if !showColumns
+                buf.writeByte(':');
+                buf.print(bytes);


With line/column, I am not sure what use there would be for printing file offsets.

WalterBright · 2018-02-09T07:48:02Z

src/dmd/globals.d


    extern (C++) bool equals(ref const(Loc) loc) const
    {
+      // TODO: how about: `return (bytes == loc.bytes) && FileName.equals(filename, loc.filename);` ?


This implies that bytes is redundant with line, column

WalterBright · 2018-02-09T07:49:05Z

src/dmd/mars.d

+            else if (arg == "-vbytes") // https://dlang.org/dmd.html#switch-bytes
+                params.showBytes = true;
+            else if (arg == "-disposesrc")
+                params.disposeSrcContent = true;


why is this optional under user control?

WalterBright · 2018-02-09T07:49:35Z

src/dmd/mtype.d

            this.trust = TRUST.trusted;
    }

+    extern(D) Parameter searchByName(const(char)[] name){


Global comment: Need Ddoc comments for all new functions.

WalterBright · 2018-02-09T07:50:47Z

src/dmd/tokens.d

    delegateFunctionPointer,

-    // 54
+    // 54 // TODO: instead of these non robust numbers, use `cast(TOK) number`


The numbers are there because when I print out a TOK value, in order to figure out which one it is, I have to count the tokens.

WalterBright · 2018-02-09T07:53:32Z

src/dmd/traits.d

+        error(e.loc, "TODO");
+        return new ErrorExp();
+    }
+    auto ret = new ArgStringInitExp(e.loc);


This is allocated here, but is not initialized or used until the end of the function. It should be allocated there. Also, convention is to use result as the return value, not ret.

WalterBright · 2018-02-09T07:54:30Z

src/dmd/traits.d

+        return new ErrorExp();
+    }
+    auto ret = new ArgStringInitExp(e.loc);
+    auto ident_expr=(*e.args)[0];


calling it arg0 would make more sense, especially since the next line verifies that it is a type, not an identifier.

WalterBright · 2018-02-09T07:54:56Z

src/dmd/traits.d

+    auto ident_expr=(*e.args)[0];
+    auto t=isType(ident_expr);
+    if(!t){
+        error(e.loc, "TODO.2");


Internal errors should use assert.

WalterBright · 2018-02-09T07:56:50Z

src/dmd/traits.d

+    // TODO: lookup?
+    auto id = Identifier.idPool(name, name.strlen);
+    ret.setIdent(id);
+    ret.exp=e;


Perhaps ArgStringInitExp should have a constructor that takes these two expressions as arguments.

WalterBright · 2018-02-09T07:57:33Z

src/dmd/traits.d


+    if (e.ident == Id.getCallerSource)
+    {
+        e.error("`%s` unexpected in this context", Id.getCallerSource);


Should this be an assert instead?

wilzbach · 2018-03-26T02:31:18Z

changelog/argname.dd

+string logSimple(T...)(T a, string[T.length] names = __traits(getCallerSource, a))
+{
+  import std.conv;
+  return text(names, ": ", a);  // see testargnames for a better log function


Use four spaces for indentation + don't reference to DMD's testsuite in the changelog.

wilzbach · 2018-04-09T16:16:38Z

ping @timotheecour

timotheecour · 2018-04-09T18:45:09Z

I don't have time to work on this at the moment but anyone is welcome to reopen a PR based on this;
a few notes:

a prerequisite for this PR is to add bytes field, if that's a problem it'll block this PR.
if adding that field is unacceptable, an alternative is to store byte offset of each line start by adding an array bytes_of_line[] in class File or Module to allow fast access.

timotheecour requested review from MartinNowak, WalterBright, dnadlinger, ibuclaw, mathias-lang-sociomantic and rainers as code owners February 1, 2018 00:09

dlang-bot added the Review:WIP Work In Progress - not ready for review or pulling label Feb 1, 2018

timotheecour force-pushed the pr_arg_names_bytes branch from d25a83e to e937185 Compare February 1, 2018 00:20

jacob-carlborg reviewed Feb 1, 2018

View reviewed changes

timotheecour force-pushed the pr_arg_names_bytes branch 2 times, most recently from d978ba6 to cad4d05 Compare February 1, 2018 09:11

timotheecour mentioned this pull request Feb 1, 2018

[WIP] function argument stringification: __ARG_STRING__ (replaced by 7821) #7811

Closed

timotheecour changed the title ~~[WIP] function argument stringification: __ARG_STRING__ (alternative implementation)~~ [WIP] function argument stringification: __ARG_STRING__ (new implementation) Feb 1, 2018

timotheecour force-pushed the pr_arg_names_bytes branch from de29def to 77a0589 Compare February 6, 2018 12:00

timotheecour added 2 commits February 6, 2018 04:01

function argument stringification: __ARG_STRING__

b63ed03

support variadics; use __traits(getSource, symbol) instead of __ARG_S…

1fc314e

…TRING__

timotheecour force-pushed the pr_arg_names_bytes branch from 77a0589 to 1fc314e Compare February 6, 2018 12:08

timotheecour changed the title ~~[WIP] function argument stringification: __ARG_STRING__ (new implementation)~~ [WIP] function argument stringification: __traits(getSource, symbol) Feb 6, 2018

WalterBright reviewed Feb 9, 2018

View reviewed changes

src/dmd/expression.d

}

else

{

pragma(inline, true)

Copy link

Member

WalterBright Feb 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why force inline this?

WalterBright reviewed Feb 9, 2018

View reviewed changes

wilzbach added the Review:Needs Work label Mar 26, 2018

wilzbach reviewed Mar 26, 2018

View reviewed changes

timotheecour closed this Apr 9, 2018

wilzbach added Review:Orphaned The author of the PR is no longer available and this PR can be adopted by anyone. Review:Phantom Zone Has value/information for future work, but closed for now labels Apr 9, 2018

ibuclaw mentioned this pull request Aug 31, 2018

Rework #8542 wrt. deterministic IDs via generateIdWithLoc() #8633

Merged

Uh oh!

Conversation

timotheecour commented Feb 1, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

EDIT

NOTES

Uh oh!

dlang-bot commented Feb 1, 2018

Bugzilla references

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

wilzbach commented Feb 1, 2018

Uh oh!

marler8997 commented Feb 1, 2018

Uh oh!

wilzbach commented Feb 1, 2018

Uh oh!

timotheecour commented Feb 2, 2018

Uh oh!

ibuclaw commented Feb 3, 2018

Uh oh!

ibuclaw commented Feb 3, 2018

Uh oh!

d-random-contributor commented Feb 3, 2018

Uh oh!

d-random-contributor commented Feb 3, 2018

Uh oh!

timotheecour commented Feb 4, 2018

Uh oh!

timotheecour commented Feb 4, 2018

Uh oh!

ntrel commented Feb 5, 2018

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

timotheecour commented Feb 1, 2018 •

edited

Loading