Adding support for __traits(documentation, ...) by MasonMcGill · Pull Request #3531 · dlang/dmd

MasonMcGill · 2014-05-07T04:29:24Z

__traits(comment, expr) now evaluates to the documentation comment on expr, if it exists and is available. Comments will only be available if DMD is invoked with the "-D" flag. If no comment is available for expr, __traits(comment, expr) evaluates to the empty string.

Unlike __traits(identifier, expr), __traits(comment, expr) will not report an error if expr is not a symbol or anonymous function. It will instead evaluate to the empty string. This is in the interest of ease of use.

This feature is intended to aide automatic wrapper generation, metaprogramming (e.g. forwarding documentation from a template argument), and simplifying documentation generators. It is not intended to be involved in providing application functionality.

Hackerpilot · 2014-05-07T04:47:23Z

Looks useful.

TurkeyMan · 2014-05-07T06:16:04Z

Should it be called 'comment'?
It looks like it's only intended to work with ddoc comment blocks (/** */ or ///), so possibly misleading to call the trait 'comment', since it's only a specific form of comment?

TurkeyMan · 2014-05-07T06:17:12Z

test/runnable/traits.d

I find it curious that these strings have a '\n' at the end... why would that be desirable or expected?

The newline is part of the comment as far as lexical analysis is concerned. If you need the stripped comment, just run it through std.string.strip.

Newbie question: It looks like StringExp works with C-style strings and doesn't take ownership of its data. Stripping a string may require making a copy of it. How can I do this without leaking memory?

DMD leaks memory all over. The idea is that a GC should be used in DMD, it just isn't done that way today. So in general, leaking memory is 'fine'...

Although I question the need to do any string manipulation in the compiler.

jacob-carlborg · 2014-05-07T06:22:45Z

Please squash your commits when done.

MasonMcGill · 2014-05-07T06:24:06Z

In DMD, internally, the "comment" field of a symbol ends in a newline. I suppose this has to do with how they are initially parsed. Generally, spaces newlines will be stripped before HTML or PDF documentation generation, so it doesn't seem like a major issue.

With respect to the term "comment", this is the field name in the DMD JSON output.

I think "notes" or "documentation" might also be good trait names, but I don't know if either is good enough to warrant breaking consistency with DMD JSON output and DMD internals. Thoughts?

jacob-carlborg · 2014-05-07T06:24:30Z

Could you please make a pull request for updating the documentation as well.

jacob-carlborg · 2014-05-07T06:26:59Z

I agree with @TurkeyMan. It might be confusing if it only works for ddoc comments. "documentation", "ddoc" or similar would be better.

TurkeyMan · 2014-05-07T06:30:53Z

I think the newline should be stripped before returning it as a __traits(), otherwise user code will constantly have to strip it manually.

"documentation" or "ddoc" is probably what I'd choose.
The way I see it, most end-users aren't DMD contributors, so I don't think that consistency with the compiler source has any significant weight if the trade-off is language clarity.
I'd also imagine very few users would be particularly upset that a different term was used in the json output. Most users never use/look at that either, it's something that's more likely to be digested by tooling.

For my money, code clarity trumps.

MasonMcGill · 2014-05-07T06:35:45Z

My impression (mostly based on experience with Python documentation) is that "ddoc" refers to the standard way to associate comments with symbols, and "documentation" may refer to any kind of documentation (web-based tutorials, examples, etc.)

I read __traits(comment, expr) as "get me the comment associated with expr", and since ddoc is the way to associate comments with symbols, it's unambiguous.

However, if more people prefer "documentation", I'm happy to change it. I'm honestly more interested in using this feature than designing it : )

jacob-carlborg · 2014-05-07T06:37:18Z

I read __traits(comment, expr) as "get me the comment associated with expr", and since ddoc is the way to associate comments with symbols, it's unambiguous.

Yes, but does it work both for ddoc comments and regular comments?

TurkeyMan · 2014-05-07T06:43:12Z

I guess the argument is that there's no rule to associate a regular comment with a particular symbol, so the concept is kinda meaningless. Doesn't mean it will always be that way in the future though...
I think I prefer 'ddoc', it's quite specific that it intends ddoc style associated comments.

MasonMcGill · 2014-05-07T06:45:33Z

With regards to the newline and extra spaces, should only the first/last be stripped, or all of them? Python leaves them in, so formatting applications can choose to deal with them as they wish.

To me, formatting docstrings seems like a separate issue. It's handled by different code, and used for more than just __traits (like JSON and HTML output). I'm sure anyone interested in working on it could do so before or after the new trait is added.

MasonMcGill · 2014-05-07T06:48:50Z

I don't know how many D users know that D's documentation comments are called "ddoc". Like Python's "docstrings", I wouldn't be surprised if many people have seen them but don't know they have a name.

"documentation" and "comment" seem more discoverable.

MasonMcGill · 2014-05-07T06:52:41Z

It looks like documentation comment retrieval is performed with __doc__ in Python and documentation in Common Lisp:
http://en.wikipedia.org/wiki/Docstring

yebblies · 2014-05-07T06:53:58Z

With regards to the newline and extra spaces, should only the first/last be stripped, or all of them? Python leaves them in, so formatting applications can choose to deal with them as they wish.

Leave internal, strip at beginning and end. Leading/trailing whitespace does not add information.

"documentation" and "comment" seem more discoverable.

"documentation" is fine but "comment" is imprecise. There is no need to match the internal compiler name unless the alternative is inventing new terminology.

ghost · 2014-05-07T06:56:35Z

~~-D is not required for this feature to work.~~

Edit: Wait, the tests test for empty strings. I'll investigate.

Edit2: Well, they should not require -D, or the trait is useless for metaprogramming. I'll see if this can be worked around.

jacob-carlborg · 2014-05-07T07:04:35Z

The documentation uses "Ddoc".

TurkeyMan · 2014-05-07T07:06:54Z

WRT worrying about 'ddoc' being a word that's known by many users; I expect that this __traits will likely be wrapped up in a suite of phobos helper's, which may parse and return details like args, return values, etc... Users probably won't come in contact with the __traits directly once it's fleshed out.

ghost · 2014-05-07T07:13:36Z

Apparently the lexer is set up so it completely discards comments when -D is not enabled. This can be worked around by changing this line in module.c:

Parser p(this, buf, buflen, docfile != NULL);

To:

// always lex comments, __traits(comment) might require it
bool doDocComment = true;
Parser p(this, buf, buflen, doDocComment);

The above just forwards this option to the Lexer.

The question is will this slow down compilation? I haven't tested the performance. Either way it's really awkward that you would need to enable -D in order to use a trait, you're essentially writing documentation files to disk even if all you're doing is metaprogramming with a trait.

CC'ing @WalterBright and @9rnsr: Would always lexing ddoc comments introduce a major slowdown in compilation speed? If not then I think we should always lex them, otherwise this trait becomes awkward to use.

MasonMcGill · 2014-05-07T07:13:37Z

True "Ddoc" is in the documentation, but I still suspect more people have seen Ddoc comments than know what they are.

I'm in favor of "documentation", for the reason above, for its use in other languages, and because I'm not sure what the proper CamelCase capitalization of "Ddoc" is. Also, new users looking for the feature might be more likely to search dlang.org for "documentation" than "ddoc".

WRT worrying about 'ddoc' being a word that's known by many users; I expect that this __traits will likely be wrapped up in a suite of phobos helper's, which may parse and return details like args, return values, etc... ie, users probably won't come in contact with the __traits directly once it's fleshed out.

This is true. But the __traits name will probably inform the std.trait name, so we might as well sort it out now.

MasonMcGill · 2014-05-07T07:19:07Z

Apparently the lexer is set up so it completely discards comments when -D is not enabled. This can be worked around by changing this line in module.c.

This is great! Hopefully extracting ddoc comments isn't too much overhead.

alexrp · 2014-05-07T07:48:48Z

I disagree with making this trait silently fail for symbols that don't exist. If that's the behavior you want (and I sincerely believe this is the exception, not the rule) you can get it by checking with __traits(compiles, ...) first.

Failing loudly by default in that case also makes sense if we're going to make it so this feature doesn't depend on -D.

MasonMcGill · 2014-05-07T07:54:32Z

I disagree with making this trait silently fail for symbols that don't exist. If that's the behavior you want (and I sincerely believe this is the exception, not the rule) you can get it by checking with __traits(compiles, ...) first.

This is the behavior I'm most unsure about, as well. Does anyone have a conflicting opinion?

bearophile · 2014-05-07T08:48:13Z

It is not intended to be involved in providing application functionality.<

People will use it in all ways possible.

__traits(documentation, expr) or __traits(ddoc, expr) are fine for me. The second is a little shorter, and a little more precise. I'd like this feature to work even when you don't use -D. I suggest to add to the unittest the retrieval of the module ddoc (the one at the top). Automatically stripping the leading newline seems good. I think it's acceptable to return an empty string when there is no ddoc available.

TurkeyMan · 2014-05-07T11:29:06Z

I agree that it shouldn't silently succeed if the symbol doesn't exist.
If the symbol doesn't exist; error. If the symbol exists but isn't documented; empty string.

Also, strip whitespace from both ends?

MasonMcGill · 2014-05-07T12:25:53Z

test/compilable/extra-files/json.out

This file is the expected output for a JSON emission test. It was changed because DMD no longer ignores documentation comments. Previously, DMD would ignore Ddoc comments unless it was run with "-D".

TurkeyMan · 2014-05-07T16:40:13Z

I often work with systems where you interact with editors and various other tooling, and that tooling benefits significantly from being able to present API information in the editor's GUI.
Compile time reflection is used to provide data via an editor API which the tooling can make use of. Code being able to tell you about itself is the definition of reflection, and I don't see why a line should be drawn at the human readable information when everything else is thoroughly reflected.

To separate this data from the binary is error-prone in my experience, especially when remote debugging (which is standard practise in gamedev) and the filesystem is not present. Without embedding such data in the binary, an exe is represented by a bundle of files including the associated metadata (docs in this case), and losing sync becomes very likely.

I definitely want the reflected documentation embedded in the exe. It simplifies everything.
Currently, I use UDA's to manually tag duplicates of the documentation which is also present in the ddoc header, and I find these duplicate strings often fall out of sync. DRY man ;)
/** blah blah */
@editor(x, y, "blah blah")
void func();

MasonMcGill · 2014-05-07T22:56:42Z

I cannot see benefit for the processing of ddoc comment during compilation. I think that ddoc comment is meta-information which is higher than the code. I think processing high-level layer information (ddoc comment) in low level layer (CTFE interpretation) is bad information model.

I think you're right when it comes to application or systems programming, but when it comes to library programming, Ddoc notes have become more than comments on the source: they're essentially a user interface. D has already bent over backwards to support interfacing with lower level languages (extern (C), extern (C++), etc.), even though one could argue that ABI directives are "lower than the source". Consider this trait a small feature to improve interfacing with scripting languages, development tools, and human readers.

D currently has every facility necessary for generating foreign language bindings at compile-time, except this. In my domain (academic science), providing a library-based replacement for C++ and SWIG--with no need for Makefiles, configuration files, or manual memory management--could be huge.

I'm developing a system to that end, but currently I'm stuck with a workaround similar to Manu's: using

@("
  This function bakes beans.
")
void bake(Bean b) { /* ... */ }

instead of

/**
 * This function bakes beans.
 */
void bake(Bean b) { /* ... */ }

I opened this pull request because I didn't want to ask my users (who may want to wrap existing code) to do the same.

MartinNowak · 2014-05-09T17:57:40Z

I'm kind of worried about documentation affecting the semantic meaning of a program.

TurkeyMan · 2014-05-10T01:58:34Z

Don't be worried, it's amazing! :)

SFrijters · 2014-06-26T09:12:46Z

This functionality would be quite useful to me; is there any chance this will be ready for the 2.066 release?

MasonMcGill · 2014-06-26T18:48:05Z

Should I add the compiler flag discussed with @WalterBright and @jacob-carlborg (see file comments)?

CyberShadow · 2014-08-05T01:24:28Z

This feature would be nice to have. The alternative would be that whenever documentation reflection is required, you are forced to use UDAs for documentation, which is rather ugly, unusable with -D, and doesn't work everywhere (e.g. enum members).

I don't see a problem with documentation comments affecting program behavior. I think this only emphasizes the reasons for which we have regular comments and documentation comments.

Also, I'm very skeptical about the memory usage argument. The ASCII text for all documentation is a drop in the bucket of the current memory consumption needed to compile a program. The entire Phobos source code is 8 MB in total, which is practically nothing compared to how much memory is needed to actually build it.

[Shameless plug] Here is my use case: http://blog.thecybershadow.net/2014/08/05/ae-utils-funopt/

MartinNowak · 2014-08-20T14:08:06Z

I would certainly agree to this if __traits(documentation, symbol) were only available at runtime (as a string constant). Having it available during compilation means it can affect program execution which goes against normal intuition and complicates any refactoring/language tools.

/// a + b
int foo(int a, int b) { return mixin(__traits(documentation, __FUNCTION__)); }

Unless there are good reasons to allow this, we shouldn't accidentally introduce it.

CyberShadow · 2014-08-20T14:18:12Z

How would that even work? You can't e.g. enumerate a struct's methods at runtime.

MartinNowak · 2014-08-20T14:20:39Z

How would that even work? You can't e.g. enumerate a struct's methods at runtime.

Make __traits(documentation, __FUNCTION__) return a reference to a constant string symbol, so it can't be interpreted at compile time but read at runtime.

mihails-strasuns · 2014-08-20T14:54:07Z

Making it runtime string / struct sounds like a good compromise to me.

TurkeyMan · 2014-08-21T03:57:15Z

It sounds pretty lame to me. There can be an awful lot of text in the docs, and I don't want my exe bigger by that amount of raw data. It needs to be processable, so I can pick and choose the bits that are relevant.

Like we've said before, there are many much better ways to produce a string literal that may be operated on in D. People aren't going to go around making /** ... */ strings 'because they can', and if they do, why is that your problem? I am confident you're not going to see it become some sort of 'trend' which will get out of control.

The reason I want this is so I can populate separate editor tools with information relating to the objects they are editing. The binary defines the objects, and the properties they have. If you update the binary, typically you have to update the editor. It's a classic problem for decades in my industry, and it's a lame state of affairs.
With this __traits, we could parse the documentation blocks and feed it through the existing editor API to populate the editor UI with useful and correct information that can't drift out of sync with the rest of the toolset.
The key advantage is that, since the state of editor metadata is embedded in the binary itself, it's possible for a single editor instance to edit multiple binaries (a daily routine when dealing with cross-platform code), which has traditionally been a huge pain in the butt.

Currently, we do what I have described by abusing UDA's. We stick editor documentation text in a UDA, and it is almost always an exact duplicate of the ddoc comment.
For things that are exposed to the editor, the documentation is repeated; once for ddoc, and once again pasted in a string in a UDA to be supplied via the editor API to the tools.
This is pretty error prone; nobody is familiar with it yet, highly vulnerable to human error, and if we agree that duplicating code sucks, duplicating comment data sucks even more, because you never get any warning when it goes out of sync...

MartinNowak · 2014-08-27T11:02:58Z

Well accessing documentation at runtime can also change the program behavior so I withdraw my argument. In any case we should carefully evaluate this solution against alternatives, e.g. dmd's json output or customized ddoc macros.

mleise · 2014-11-05T00:21:35Z

In the way the way Ddoc affects a web browser's behavior I think it is a good thing. The program might itself want to add tool tips for any See_Also in the Ddoc. And being a systems language there is a lot of things people can do, but shouldn't, like going around bounds checks with arr.ptr[idx]. Add a comment to the trait's documentation: "Documentation is not code and it is not checked for semantic errors. So use this trait only to format and display documentation and the appropriate other traits to reflect on parameter types and similar in order to build code."

bearophile · 2014-11-05T10:38:33Z

@mleise >"Documentation is not code and it is not checked for semantic errors.<

Currently the D compilers thankfully verify the correctness of the arguments of functions inside the function ddoc.

mleise · 2014-11-05T10:44:42Z

@bearophile It checks that you didn't miss(spell) a name, yes. But we still don't have exception lists on functions that the compiler could compare with Ddoc "Throws:" for example.

bearophile · 2014-11-06T00:30:19Z

@mleise I think probably there's no interest in reintroducing checked exceptions from the window after they were thrown out of the door.

andralex · 2015-02-07T22:27:57Z

This has apparently reached an impasse. Should we close in a few days?

DmitryOlshansky · 2015-06-26T12:13:39Z

Few days struck long ago, just saying...

CyberShadow · 2015-06-26T12:17:51Z

Currently, we do what I have described by abusing UDA's. We stick editor documentation text in a UDA, and it is almost always an exact duplicate of the ddoc comment.

I think this is a good argument in favor of this change. Either that, or there should be a way to do the reverse (populate symbol documentation somehow from UDAs).

Kapps · 2015-07-02T15:19:38Z

This is a very useful feature. The editor documentation is likely the most common use case, but it is a very prevalent one. C# has the [Description] attribute, and in practice I've found that it's pretty much always a copy and paste of the documentation, with the result being that I tend to not document the property in order to avoid the duplication.

Yes, it's possible to go out of your way to abuse it in a way that makes no sense, but that's possible with virtually every feature in D. You could inject things FILE or LINE in your code in much the same way to invoke different code depending on the module/line/function/package calling your function. You could even use something like "#line 99999" and invoke special code inside a static if to add members to a struct if LINE is 99999. Many features add potential for abuse like this, and while it is a good thing to avoid, at some point you just have to hope people are sane enough to not abuse them. That being said, I do like the idea of it not being possible to use at compile-time, but it would be nice to select at compile-time (by using the trait) which functions should actually have the documentation text included in the runtime executable.

wilzbach · 2017-06-08T06:30:20Z

@MasonMcGill and others: I revived this PR at #6872
Feedback on the revival is welcome ;-)

TurkeyMan reviewed May 7, 2014
View reviewed changes

Add support for __traits(documentation, ...)

1d3aa7e

MasonMcGill reviewed May 7, 2014
View reviewed changes

MasonMcGill mentioned this pull request May 7, 2014

Documentation for __traits(documentation, ...) dlang/dlang.org#574

Closed

MasonMcGill changed the title ~~Adding support for __traits(comment, ...)~~ Adding support for __traits(documentation, ...) May 7, 2014

ghost added Enhancement labels Aug 4, 2014

wilzbach added a commit to wilzbach/dmd that referenced this pull request Jun 8, 2017

Revive dlang#3531 - Adding support for __traits(documentation, ...)

ec393c1

wilzbach added a commit to wilzbach/dmd that referenced this pull request Jun 8, 2017

Revive dlang#3531 - Adding support for __traits(documentation, ...)

e399941

wilzbach added a commit to wilzbach/dmd that referenced this pull request Jun 8, 2017

Revive dlang#3531 - Adding support for __traits(documentation, ...)

1814029

wilzbach mentioned this pull request Jun 8, 2017

Revive #3531 - Adding support for __traits(documentation, ...) #6872

Closed

wilzbach closed this Jun 8, 2017

Uh oh!

Conversation

MasonMcGill commented May 7, 2014

Uh oh!

Hackerpilot commented May 7, 2014

Uh oh!

TurkeyMan commented May 7, 2014

Uh oh!

TurkeyMan May 7, 2014

Choose a reason for hiding this comment

Uh oh!

alexrp May 7, 2014

Choose a reason for hiding this comment

Uh oh!

MasonMcGill May 7, 2014

Choose a reason for hiding this comment

Uh oh!

alexrp May 7, 2014

Choose a reason for hiding this comment

Uh oh!

jacob-carlborg commented May 7, 2014

Uh oh!

MasonMcGill commented May 7, 2014

Uh oh!

jacob-carlborg commented May 7, 2014

Uh oh!

jacob-carlborg commented May 7, 2014

Uh oh!

TurkeyMan commented May 7, 2014

Uh oh!

MasonMcGill commented May 7, 2014

Uh oh!

jacob-carlborg commented May 7, 2014

Uh oh!

TurkeyMan commented May 7, 2014

Uh oh!

MasonMcGill commented May 7, 2014

Uh oh!

MasonMcGill commented May 7, 2014

Uh oh!

MasonMcGill commented May 7, 2014

Uh oh!

yebblies commented May 7, 2014

Uh oh!

ghost commented May 7, 2014

Uh oh!

jacob-carlborg commented May 7, 2014

Uh oh!

TurkeyMan commented May 7, 2014

Uh oh!

ghost commented May 7, 2014

Uh oh!

MasonMcGill commented May 7, 2014

Uh oh!

MasonMcGill commented May 7, 2014

Uh oh!

alexrp commented May 7, 2014

Uh oh!

MasonMcGill commented May 7, 2014

Uh oh!

bearophile commented May 7, 2014

Uh oh!

TurkeyMan commented May 7, 2014

Uh oh!

MasonMcGill May 7, 2014

Choose a reason for hiding this comment

Uh oh!

TurkeyMan commented May 7, 2014

Uh oh!

MasonMcGill commented May 7, 2014

Uh oh!

MartinNowak commented May 9, 2014

Uh oh!

TurkeyMan commented May 10, 2014

Uh oh!

SFrijters commented Jun 26, 2014

Uh oh!

MasonMcGill commented Jun 26, 2014

Uh oh!

CyberShadow commented Aug 5, 2014