Skip to content

function argument stringification: __traits(getCallerSource, symbol) (variadics work too)#7821

Closed
timotheecour wants to merge 4 commits intodlang:masterfrom
timotheecour:pr_arg_names_bytes
Closed

function argument stringification: __traits(getCallerSource, symbol) (variadics work too)#7821
timotheecour wants to merge 4 commits intodlang:masterfrom
timotheecour:pr_arg_names_bytes

Conversation

@timotheecour
Copy link
Contributor

@timotheecour timotheecour commented Feb 1, 2018

Stringify caller's function argument, analog to C's # stringification macro (except it's type safe).

A main use case is to allow finally writing nice logging and debug tools.

Purely library based alternatives all have drawbacks, see #7811 (comment) for example.

It also works with UFCS, special tokens FILE etc, see examples in test/runnable/testargnames.d

Eg:

string log(T...)(T a, string[T.length] names = __traits(getCallerSource, a)){
  import std.conv;
  return text(name, ": ", a);
}
log(1+2, "foo") // names = ["1+2", `"foo"`]

EDIT

  • added variadic support with clean syntax and semantics
  • I've change from __ARG_STRING__!a to __traits(getCallerSource, a) after feedback from reviewers [2]

NOTES

[1] I've measured that impact on memory is minimal [3] Optimizations are possible if needed, if we find use cases where the memory increase is more significant. Use new hidden flag -disposesrc to reproduce prexisting behavior of disposing of file contents right after parsing (getSource will in this case resort to using expr.toChars which was similar to the previous implementation, cf #7811); you can measure memory with and without -disposesrc

[2] pros/cons of __traits(getCallerSource, a) vs __ARG_STRING__:

  • no new keyword
  • that seems like the 1st use of __trait inside a function declaration
    pros/cons of __ARG_STRING__:
  • syntax is less verbose, which matters if it's used often
  • ARG_STRING is natural along the line of other tokens that are used for default initialization in caller’s namespace (FILE + friends)
  • but these other tokens don't use template arguments, unlike __ARG_STRING__

NOTE: using getCallerSource instead of getSource to avoid conflating with a future potential getSource

[3] detailed reproducible memory tests

brew install gnu-time

gtime -v generated/osx/release/64/dmd -unittest -version=StdUnittest -version=import_std -main -o- -i=std test/runnable/imports/import_world.d
#+ same with added `-disposesrc`
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:52.96
Maximum resident set size (kbytes): 7711924

time is always very similar (with +/- 1 second)
Maximum resident set size varies quite a bit between runs (high variance) but the extra memory involved (at most 10M from du -sh phobos/std) is just 10M/7711924k = 1/771 of the total memory usage, so negligible.

On an extreme, unrealistic example that imports all of phobos and does 0 work (ie no unittests or main code), the total memory usage will be less so the fraction from keeping file contents in memory relative to total memory consumption will be higher, but still acceptable:

gtime -v generated/osx/release/64/dmd -version=import_std -main -o- -i=std  test/runnable/imports/import_world.d
#+ same with -disposesrc
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:02.75 # same regardless
Maximum resident set size (kbytes): 663340 # without -disposesrc
Maximum resident set size (kbytes): 655432 # with -disposesrc

this 2nd testing scenario has very low variance, and we get:
(663340-655432)/655432.=1%
which is acceptable (and again, this is the unfavorable case as explained)

@dlang-bot
Copy link
Contributor

Thanks for your pull request, @timotheecour! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.

Some tips to help speed things up:

  • smaller, focused PRs are easier to review than big ones

  • try not to mix up refactoring or style changes with bug fixes or feature enhancements

  • provide helpful commit messages explaining the rationale behind each change

Bear in mind that large or tricky changes may require multiple rounds of review and revision.

Please see CONTRIBUTING.md for more information.

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

@dlang-bot dlang-bot added the Review:WIP Work In Progress - not ready for review or pulling label Feb 1, 2018

// NOTE: __ARG_STRING__ will pretty-print the expression so passing `t+t`
// or `t + t` won't make a difference:
// TODO: how to do formatting ignoring with dfmt?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, done!

@timotheecour timotheecour force-pushed the pr_arg_names_bytes branch 2 times, most recently from d978ba6 to cad4d05 Compare February 1, 2018 09:11
@timotheecour timotheecour changed the title [WIP] function argument stringification: __ARG_STRING__ (alternative implementation) [WIP] function argument stringification: __ARG_STRING__ (new implementation) Feb 1, 2018
@wilzbach
Copy link
Contributor

wilzbach commented Feb 1, 2018

I'm ok with doing that change if there's consensus [2]

I think we can get there quickly:

  • Vote here with "UP" if you like __traits
  • Vote with "DOWN" if you like __ARG_STRING__
  • Vote with "Confused" if you prefer a third, yet to be named option (or don't like this in general)
  • Vote with "Heart" if you don't care about __traits vs. __ARG_STRING__, but just want to finally use this in your code

@marler8997
Copy link
Contributor

Note that keeping the files in memory may open up other optimizations and features that weren't available before. This allows strings to be used directly from source. This enables better error messages in the semantic phase since the original source is available to pull context from. The current design which frees source files before semantic analysis may have been holding back superior designs for various solutions. I pose this question to the more seasoned dmd developers who may know of different ways this change could be taken advantage of.

@wilzbach
Copy link
Contributor

wilzbach commented Feb 1, 2018

One thing that comes to my mind is __traits(documentation) - #6872
It was closed because of the slight increase in memory consumption :/

@timotheecour
Copy link
Contributor Author

One thing that comes to my mind is __traits(documentation) - #6872
It was closed because of the slight increase in memory consumption :/

added evaluation (+ how to reproduce) of effect on max memory usage + time, see updated PR description. Seems like 0 effect on time and negligible effect on memory.

@ibuclaw
Copy link
Member

ibuclaw commented Feb 3, 2018

This enables better error messages in the semantic phase since the original source is available to pull context from.

You have file, line, and column information, you can pull this out of the file at any time during the error diagnostic and read the slice it points to.

@ibuclaw
Copy link
Member

ibuclaw commented Feb 3, 2018

I think a traits would be better. __ARG_STRING__ doesn't really look natural alongside other tokens.

You seem to have an unrelated change included here also (-vbytes), but I'll give the benefit of the doubt that this is still wip.

@d-random-contributor
Copy link

__traits(source) - it can be subsequently extended to get source of other things like lambda passed as template argument, IIRC somebody asked for such feature.

@d-random-contributor
Copy link

@ZombineDev was it you?

@timotheecour
Copy link
Contributor Author

You have file, line, and column information, you can pull this out of the file at any time during the error diagnostic and read the slice it points to.

  • the file contents (before this PR) is cleared from memory right after parsing, and before semantic analysis, so it was preventing to do that. With this PR, we have access to it;

  • file,line,col (even with sources still available) doesn't allow O(1) access in a slice; with this PR, we have O(1) access via bytes offset

@timotheecour
Copy link
Contributor Author

You seem to have an unrelated change included here also (-vbytes), but I'll give the benefit of the doubt that this is still wip.

the change is related to this PR: this PR introduces bytes offset which was needed for O(1) access to source content at a given Loc (again, line+col wasn't enough for O(1) access). -vbytes just exposes this bytes offset in compiler messages and is the analog to -vcolumns. Right now it's set as undocumented flag but I can make it a regular documented flag.

@ntrel
Copy link
Contributor

ntrel commented Feb 5, 2018

Just wanted to mention that a possible enhancement would be to support multiple arguments using , as a separator e.g.:

void f(int a, int b, string args = __traits(getArgument, a, b)) {writeln("Arguments: ", args);}
f(1, n % 3);

args would be 1, n % 3. In fact, using __traits(getArgument) without passing any symbols could be equivalent to passing the corresponding parameter names for each argument provided at the call site - default arguments would not be included. That would be a useful shortcut for logging.

@timotheecour timotheecour changed the title [WIP] function argument stringification: __ARG_STRING__ (new implementation) [WIP] function argument stringification: __traits(getSource, symbol) Feb 6, 2018
modules = new DsymbolTable();
}

extern (C++) void releaseResources()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Global comment: Add Ddoc comment for all new functions.

}
else
{
pragma(inline, true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why force inline this?


/***********************************************************
*/
extern (C++) final class ArgStringInitExp : DefaultInitExp
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is an ArgStringInitExp? There's no documentation here.

{
version(with_debug){
import std.conv:text;
writelnL(file, ":", line);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where/what is writelnL()?

extern(D)
final void error(string file=__FILE__, int line=__LINE__)(const(char)* format, ...) const
{
version(with_debug){
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where/what is with_debug?


if(auto argexp = arg.isArgStringInitExp())
{
if(argexp.ident is null)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't invent a different indentation style. It's 4 columns.

}
if(is_variadic_param){
arg=argexp.resolveArgStrings(loc, sc, values);
} else{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please don't invent new formatting styles.

arg = inlineCopy(arg, sc);
// __FILE__, __LINE__, __MODULE__, __FUNCTION__, and __PRETTY_FUNCTION__
arg = arg.resolveLoc(loc, sc);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does the following big block of code do? I don't have a clue, so I have nothing to compare the code to. Also, it's so large consider breaking it out into a separate function.

{
if(argexp.ident is null)
{
error(argexp.loc, "argexp.ident null");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error message refers to compiler internals and will mean nothing to the user.

enum vararg_name = "_param_";
// TODO: define/use startsWith
auto ident2b=ident2.toString;
if(ident2b.length>vararg_name.length && ident2b[0..vararg_name.length]==vararg_name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Global comment: Please use a space before and after operators. Please format code like in the rest of the front end.

}
}

// Get source code for `expr`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this different from the code in hdrgen.d which turns ASTs back into strings?

uint linnum;
uint charnum;
// 0-based offset from origin in bytes (allows O(1) access)
uint bytes;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding this field, since Loc is attached to every AST node, will result in a very large increase in memory consumption.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An offset should be called offset not bytes.

{
// instead of ',' to avoid confusing with charnum if !showColumns
buf.writeByte(':');
buf.print(bytes);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With line/column, I am not sure what use there would be for printing file offsets.


extern (C++) bool equals(ref const(Loc) loc) const
{
// TODO: how about: `return (bytes == loc.bytes) && FileName.equals(filename, loc.filename);` ?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implies that bytes is redundant with line, column

else if (arg == "-vbytes") // https://dlang.org/dmd.html#switch-bytes
params.showBytes = true;
else if (arg == "-disposesrc")
params.disposeSrcContent = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this optional under user control?

this.trust = TRUST.trusted;
}

extern(D) Parameter searchByName(const(char)[] name){
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Global comment: Need Ddoc comments for all new functions.

delegateFunctionPointer,

// 54
// 54 // TODO: instead of these non robust numbers, use `cast(TOK) number`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The numbers are there because when I print out a TOK value, in order to figure out which one it is, I have to count the tokens.

error(e.loc, "TODO");
return new ErrorExp();
}
auto ret = new ArgStringInitExp(e.loc);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is allocated here, but is not initialized or used until the end of the function. It should be allocated there. Also, convention is to use result as the return value, not ret.

return new ErrorExp();
}
auto ret = new ArgStringInitExp(e.loc);
auto ident_expr=(*e.args)[0];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

calling it arg0 would make more sense, especially since the next line verifies that it is a type, not an identifier.

auto ident_expr=(*e.args)[0];
auto t=isType(ident_expr);
if(!t){
error(e.loc, "TODO.2");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Internal errors should use assert.

// TODO: lookup?
auto id = Identifier.idPool(name, name.strlen);
ret.setIdent(id);
ret.exp=e;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps ArgStringInitExp should have a constructor that takes these two expressions as arguments.


if (e.ident == Id.getCallerSource)
{
e.error("`%s` unexpected in this context", Id.getCallerSource);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be an assert instead?

string logSimple(T...)(T a, string[T.length] names = __traits(getCallerSource, a))
{
import std.conv;
return text(names, ": ", a); // see testargnames for a better log function
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use four spaces for indentation + don't reference to DMD's testsuite in the changelog.

@wilzbach
Copy link
Contributor

wilzbach commented Apr 9, 2018

ping @timotheecour

@timotheecour
Copy link
Contributor Author

I don't have time to work on this at the moment but anyone is welcome to reopen a PR based on this;
a few notes:

  • a prerequisite for this PR is to add bytes field, if that's a problem it'll block this PR.
  • if adding that field is unacceptable, an alternative is to store byte offset of each line start by adding an array bytes_of_line[] in class File or Module to allow fast access.

@wilzbach wilzbach added Review:Orphaned The author of the PR is no longer available and this PR can be adopted by anyone. Review:Phantom Zone Has value/information for future work, but closed for now labels Apr 9, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Review:Needs Work Review:Orphaned The author of the PR is no longer available and this PR can be adopted by anyone. Review:Phantom Zone Has value/information for future work, but closed for now Review:WIP Work In Progress - not ready for review or pulling

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants