Generate baselines for HtmlDocumentTest#2591
Conversation
NTaylorMullen
left a comment
There was a problem hiding this comment.
It's awesome to see it all come together. Had a few specific comments and one general comment:
It was hard to understand what source was parsed compared to the old serialization. This is due to the extra nesting we now have which is of course by design. It'd probably be valuable to spit out the combined token content of say an HtmlTextLiteral (same for other nodes of similar function) so what's being parsed is extra clear.
| SyntaxKind.HtmlTextLiteral - [16..18) - FullWidth: 2 - Slots: 1 - Gen<None> - SpanEditHandler;Accepts:Any | ||
| SyntaxKind.Whitespace;[ ]; | ||
| SyntaxKind.SingleQuote;[']; | ||
| SyntaxKind.HtmlBlock - [18..21) - FullWidth: 3 - Slots: 1 |
There was a problem hiding this comment.
What's the difference between HtmlBlock and HtmlMarkupBlock?
There was a problem hiding this comment.
This may change when I do pass 2 of HtmlParser rewrite but as of now here is what they mean,
HtmlBlock is just a holder that contains bunch of other nodes. E.g. Attribute value, where it can be a combination of different kinds of values
HtmlMarkupBlock is more conceptually meaningful and acts as a parent in multiple cases
There was a problem hiding this comment.
Think of HtmlBlock as just a wrapper for SyntaxList<RazorSyntaxNode>
There was a problem hiding this comment.
Super confusing, we should change that 😄 (don't care if we do now or later)
There was a problem hiding this comment.
I agree. I should change the name atleast.
| SyntaxKind.Whitespace;[ ]; | ||
| SyntaxKind.Text;[Bar]; | ||
| SyntaxKind.HtmlDocument - [0..14) - FullWidth: 14 - Slots: 1 - [Foo </div> Bar] | ||
| SyntaxKind.HtmlMarkupBlock - [0..14) - FullWidth: 14 - Slots: 1 |
There was a problem hiding this comment.
Is there ever a line that doesn't start with SyntaxKind.? Seems repetitive is all.
There was a problem hiding this comment.
Good point. I'll clean it up
There was a problem hiding this comment.
Is there a point to including the bounds + the width + the slots? I would imagine we just want the bounds.
The slot count from the number of children.
There was a problem hiding this comment.
Is there a point to including the bounds + the width + the slots?
I'll remove slots. But I feel the width is nice to have for easier understanding and debugging.
One other thing to note is that I displayed the entire content just for the top node. This way we don't have to switch back to the test to find the input. I know @NTaylorMullen wanted this.
| SyntaxKind.LeftBrace;[{]; | ||
| Code span - Gen<Stmt> - [LF] - AutoCompleteEditHandler;Accepts:Any,AutoComplete:[<null>];AtEOL - (2:0,2) - Tokens:1 | ||
| SyntaxKind.HtmlDocument - [0..17) - FullWidth: 17 - Slots: 1 - [@{LF} <html>LF] | ||
| SyntaxKind.HtmlMarkupBlock - [0..17) - FullWidth: 17 - Slots: 1 |
There was a problem hiding this comment.
Will every HtmlDocument always have 1 child which is an HtmlMarkupBlock?
There was a problem hiding this comment.
Yes. I added HtmlDocument to act as an ultimate parent for an HTML document. In other cases like if we parse a block of html from within a CSharp context, it won't be wrapped in an HtmlDocument
| SyntaxKind.HtmlDocument - [0..15) - FullWidth: 15 - Slots: 1 - [@{LF} LF<html>] | ||
| SyntaxKind.HtmlMarkupBlock - [0..15) - FullWidth: 15 - Slots: 1 | ||
| SyntaxKind.HtmlTextLiteral - [0..0) - FullWidth: 0 - Slots: 1 - Gen<Markup> - SpanEditHandler;Accepts:Any | ||
| SyntaxKind.Unknown;[]; |
There was a problem hiding this comment.
? Looks like it was there before as well but what does this represent?
There was a problem hiding this comment.
Correct me if I am wrong, I assume this is to get HTML intellisense before the csharp transition
There was a problem hiding this comment.
Ooooh this is a marker symbol. We should make this more first-class in the syntax tree. We should have specific node types for the C# and Html variants of this.
There was a problem hiding this comment.
How do you feel about renaming SyntaxKind.Unknown to SyntaxKind.Marker and wrap them in Html and CSharp literals appropriately? Right now everything that is SyntaxKind.Unknown is a marker symbol
| SyntaxKind.ForwardSlash;[/]; | ||
| SyntaxKind.Text;[a]; | ||
| SyntaxKind.CloseAngle;[>]; | ||
| SyntaxKind.HtmlTagBlock - [47..51) - FullWidth: 4 - Slots: 1 |
There was a problem hiding this comment.
Did this change parents? Just double checking because the old TagBlock had the same parental relationship as its sibling HtmlTextLiteral. Now it looks like this block was indented 1 parent but the sibling above was not.
There was a problem hiding this comment.
I don't think parents changed. The open tag block, content of the tag (Email Me), and the close tag block are all siblings just like before. It's probably that Github diff is being weird
| SyntaxKind.RazorMetaCode - [1..2) - FullWidth: 1 - Slots: 1 - Gen<None> - SpanEditHandler;Accepts:None | ||
| SyntaxKind.LeftBrace;[{]; | ||
| SyntaxKind.CSharpCodeBlock - [2..20) - FullWidth: 18 - Slots: 1 | ||
| SyntaxKind.CSharpStatementLiteral - [2..4) - FullWidth: 2 - Slots: 1 - Gen<Stmt> - AutoCompleteEditHandler;Accepts:Any,AutoComplete:[<null>];AtEOL |
There was a problem hiding this comment.
Ya, thinking more and more expression literals and statement literals should just be CSharpTextLiteral. My reasoning is that Statement is overloaded in C# and in Razor i.e. @{ ... } and if (...) {...}
There was a problem hiding this comment.
Okay that makes sense. This is a good time to bring up two other literal types I've defined,
CSharpHiddenLiteralSyntax - to represent the text that we don't want to render in the output (SpanChunkGenerator.Null in the old world)
CSharpNoneLiteralSyntax - to represent the text where we don't want any design time intellisense (SpanKind.None in the old world)
Thoughts on those?
There was a problem hiding this comment.
CSharpNoneLiteralSyntax
Hmm, is it prefixed with CSharp because it contains CSharp tokens? Makes me think we should have an unclassified token type / span kind. Basically if the node was named UnclassifiedTextLiteralSyntax with unclassified tokens it'd truly represent what was intended.
There was a problem hiding this comment.
As for CSharpHiddenLiteralSyntax, i'm uncertain. What's an example of a SpankChunkGenerator.Null again? Having a hard time remembering where we do that.
There was a problem hiding this comment.
Missed these questions earlier,
Hmm, is it prefixed with CSharp because it contains CSharp tokens?
We currently use it in only one place. Whitespace at the end of SingleLine extensible directives. Renaming it to unclassified should work, Actually, it can also contain CSharp comments.
What's an example of a SpankChunkGenerator.Null again?
One example is just like in #2594,
@{ @@Da }
^
That transition is a code span with SpanChunkGenerator.Null
| SyntaxKind.HtmlTextLiteral - [0..0) - FullWidth: 0 - Slots: 1 - Gen<Markup> - SpanEditHandler;Accepts:Any | ||
| SyntaxKind.Unknown;[]; | ||
| SyntaxKind.CSharpCodeBlock - [0..53) - FullWidth: 53 - Slots: 1 | ||
| SyntaxKind.CSharpDirective - [0..53) - FullWidth: 53 - Slots: 2 |
There was a problem hiding this comment.
I'd rename CSharpDirectiveX => RazorDirectiveX for your nodes.
A directive consists of a few things.
- transition
- Some identifier (in this case
section) - A place for parameters/tokens
- Possible body
Does the current implementation provide these separations? I'd figure that a directive node would have access to all of these components
There was a problem hiding this comment.
I'd rename CSharpDirectiveX => RazorDirectiveX for your nodes.
Hmm. But directives aren't all metacode right? Most of the time, it contains csharp too. Right now anything that is named RazorxyzSyntax are all things that are either metacode or comments.
Does the current implementation provide these separations? I'd figure that a directive node would have access to all of these components
It does. https://github.com/aspnet/Razor/blob/feature/razor-parser/src/Microsoft.AspNetCore.Razor.Language/Syntax/Syntax.xml#L168-L181
One difference though is that I've combined 3 and 4 as just an optional body. I didn't think having that separate would provide much value. We can always add it later if we need it.
There was a problem hiding this comment.
It does. https://github.com/aspnet/Razor/blob/feature/razor-parser/src/Microsoft.AspNetCore.Razor.Language/Syntax/Syntax.xml#L168-L181
One difference though is that I've combined 3 and 4 as just an optional body. I didn't think having that separate would provide much value. We can always add it later if we need it.
Ah, ya i'd definitely break that out at some point.
Hmm. But directives aren't all metacode right? Most of the time, it contains csharp too. Right now anything that is named
RazorxyzSyntaxare all things that are either metacode or comments.
They aren't but they aren't C# either. They don't default to C# or Html, their structure is defined by a descriptor; I bring up this point because that's what differentiates CodeBlock from MarkupBlock.
| SyntaxKind.Text;[Bar]; | ||
| SyntaxKind.HtmlDocument - [0..14) - FullWidth: 14 - Slots: 1 - [Foo </div> Bar] | ||
| SyntaxKind.HtmlMarkupBlock - [0..14) - FullWidth: 14 - Slots: 1 | ||
| SyntaxKind.HtmlTextLiteral - [0..4) - FullWidth: 4 - Slots: 1 - Gen<Markup> - SpanEditHandler;Accepts:Any |
There was a problem hiding this comment.
I've always felt like these things are kinda gibberish to me Gen<Markup> - SpanEditHandler;Accepts:Any - maybe I just never learned how to read it.
There was a problem hiding this comment.
Gen<Markup> is just the chunk generator which will go away. SpanEditHandler;Accepts:Any is just the EditHandler type and the AcceptedCharacters
| SyntaxKind.ForwardSlash;[/]; | ||
| SyntaxKind.Text;[div]; | ||
| SyntaxKind.CloseAngle;[>]; | ||
| SyntaxKind.HtmlTextLiteral - [10..14) - FullWidth: 4 - Slots: 1 - Gen<Markup> - SpanEditHandler;Accepts:Any |
There was a problem hiding this comment.
So an interesteing discussion is whether we want to go with the nomenclature Html... or Markup.... I like Markup better because it's more general. In the Blazor world we anticipate supporting non-HTML markup int he future. Putting HTML everywhere is a little less good to me. Thoughts?
There was a problem hiding this comment.
Interesting. Just to clear, do you mean you prefer Markup.. for everything that is HTML or specifically for a few nodes like TextLiteral?
There was a problem hiding this comment.
Should be consistent. Markup works for me.
There was a problem hiding this comment.
In the Blazor world we anticipate supporting non-HTML markup int he future.
@NTaylorMullen I understand we want it to be consistent. But if we name everything as Markup.., what would differentiate HTML and non-HTML markup when we support it in the future?
There was a problem hiding this comment.
We'd name it Blazor or something equivalent for the type of Markup that we're adding (if we had to). Html just seems too specific because it makes it seem like we understand things like JavaScript, CSS etc. I could get on board for HtmlStartTag though (when the time comes) because we know it's Html.
|
🆙 📅
Feedback not addressed (Will do at a later stage):
@NTaylorMullen @rynowak, No need to review the source changes, just reviewing the baseline changes should be sufficient. |
| return base.VisitMarkupTextLiteral(node); | ||
| } | ||
|
|
||
| public override SyntaxNode VisitMarkupEscapedTextLiteral(MarkupEscapedTextLiteralSyntax node) |
There was a problem hiding this comment.
Wait,MarkupEscapedTextLiteralSyntax and the CSharp equivalent ones are only used for escaped transitions right? is EscapedTextLiteral the right wording?
There was a problem hiding this comment.
Spoke offline. This is for more cases than just the transitions. Going to rename this to MarkupEphemeralTextLiteralSyntax to indicate these are represented in the tree but are not significant and won't be rendered in the output (This is one of the scenarios that should be considered to represent as trivia)
|
🆙 📅
|
| <Field Name="ValueSuffix" Type="MarkupTextLiteralSyntax" Optional="true" /> | ||
| </Node> | ||
| <Node Name="HtmlLiteralAttributeValueSyntax" Base="HtmlSyntaxNode"> | ||
| <Node Name="HtmlLiteralAttributeValueSyntax" Base="MarkupSyntaxNode"> |
There was a problem hiding this comment.
Some stuff is 'Markup' and some is still Html?
There was a problem hiding this comment.
Yes. This is intentional. I named the constructs we know are Html as Html like Attributes, HtmlComments etc
There was a problem hiding this comment.
so, what's the diff between 'markup' and html?
There was a problem hiding this comment.
Anything that is parsed in a markup context(read: not CSharp) and not necessarily HTML specific like text literals, Tags(they can also be xml tags) etc are "markup". Things that are specific to certain type of markup like HTML, XML etc are named respectively. For example, I can imagine having a node for XMLDeclaration and that will be named XMLDeclarationSyntax as opposed to MarkupDeclarationSyntax.
There was a problem hiding this comment.
Right, and attributes are in all forms of markup right?
There was a problem hiding this comment.
Spoke offline. We'll just name everything Markup. That will always be correct because Markup is a superset of Html. We can revisit and rename certain nodes if there is a need in the future.
There was a problem hiding this comment.
Right, and attributes are in all forms of markup right?
We can revisit and rename certain nodes if there is a need in the future.
Don't see why we made things we know are Html less specific. I disagree with the all-markup naming but don't feel strongly enough to push back. Need a RIP emoji 😉
There was a problem hiding this comment.
I think whatever you guys talked about here is wicked confusing. Doesn't pass the dan test.
|
🆙 📅 |
|
@rynowak signed-off offline |
* Generate baselines for HtmlDocumentTest * Feedback * More feedback * Rename all Html to Markup
* Generate baselines for HtmlDocumentTest * Feedback * More feedback * Rename all Html to Markup
* Generate baselines for HtmlDocumentTest * Feedback * More feedback * Rename all Html to Markup
* Generate baselines for HtmlDocumentTest * Feedback * More feedback * Rename all Html to Markup
* Generate baselines for HtmlDocumentTest * Feedback * More feedback * Rename all Html to Markup
* Generate baselines for HtmlDocumentTest * Feedback * More feedback * Rename all Html to Markup
Suggest reviewing this with w=1
@NTaylorMullen @rynowak, suggest keeping a lookout for the following when reviewing this,
Feedback and suggestions welcome on how to improve the diff in some way to make it more review friendly