This repository was archived by the owner on Dec 19, 2018. It is now read-only.

Generate baselines for HtmlDocumentTest #2591

Merged
ajaybhargavb merged 4 commits into feature/razor-parser from ajbaaska/baseline-html-document on Sep 18, 2018
Conversation

Contributor

ajaybhargavb commented Sep 12, 2018

Suggest reviewing this with w=1 (the GitHub diff URL parameter that hides whitespace-only changes).

@NTaylorMullen @rynowak, suggest keeping a lookout for the following when reviewing this,

  • Inconsistencies
  • Node names and structure
  • How each node is serialized (what information to include/exclude when serializing the syntax nodes)

Feedback and suggestions are welcome on how to make the diff more review-friendly.

NTaylorMullen left a comment

It's awesome to see it all come together. Had a few specific comments and one general comment:

It was hard to understand what source was parsed compared to the old serialization. This is due to the extra nesting we now have which is of course by design. It'd probably be valuable to spit out the combined token content of say an HtmlTextLiteral (same for other nodes of similar function) so what's being parsed is extra clear.
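One way to read the suggestion above is to append the concatenated text of a literal node's tokens to that node's header line. The following is a minimal sketch of that idea, not the serializer used in this PR; `Token`, `LiteralNode`, and `BaselineWriter` are hypothetical stand-ins for the real syntax types.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the real syntax types; all names are assumptions.
public sealed record Token(string Kind, string Content);

public sealed record LiteralNode(string Kind, int Start, IReadOnlyList<Token> Tokens)
{
    // Combined source text covered by this node.
    public string Content => string.Concat(Tokens.Select(t => t.Content));

    // Exclusive end position, derived from the combined content.
    public int End => Start + Content.Length;
}

public static class BaselineWriter
{
    // Writes the node header with its combined content, then one indented line per token.
    public static string Write(LiteralNode node)
    {
        var lines = new List<string>
        {
            $"{node.Kind} - [{node.Start}..{node.End}) - [{node.Content}]"
        };
        lines.AddRange(node.Tokens.Select(t => $"    {t.Kind};[{t.Content}];"));
        return string.Join("\n", lines);
    }
}
```

With the whitespace and single-quote tokens from the snippet below, the header line would carry the two-character content next to the bounds, so a reviewer would not need to reassemble it from the child tokens.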

SyntaxKind.HtmlTextLiteral - [16..18) - FullWidth: 2 - Slots: 1 - Gen<None> - SpanEditHandler;Accepts:Any
SyntaxKind.Whitespace;[ ];
SyntaxKind.SingleQuote;['];
SyntaxKind.HtmlBlock - [18..21) - FullWidth: 3 - Slots: 1

What's the difference between HtmlBlock and HtmlMarkupBlock?

Contributor Author

This may change when I do pass 2 of the HtmlParser rewrite, but as of now here is what they mean:
HtmlBlock is just a holder that contains a bunch of other nodes, e.g. an attribute value, which can be a combination of different kinds of values.
HtmlMarkupBlock is more conceptually meaningful and acts as a parent in multiple cases.

Contributor Author

Think of HtmlBlock as just a wrapper for SyntaxList<RazorSyntaxNode>


Super confusing, we should change that 😄 (don't care if we do now or later)

Contributor Author

I agree. I should change the name at least.

SyntaxKind.Whitespace;[ ];
SyntaxKind.Text;[Bar];
SyntaxKind.HtmlDocument - [0..14) - FullWidth: 14 - Slots: 1 - [Foo </div> Bar]
SyntaxKind.HtmlMarkupBlock - [0..14) - FullWidth: 14 - Slots: 1

Is there ever a line that doesn't start with SyntaxKind.? Seems repetitive is all.

Contributor Author

Good point. I'll clean it up

Member

Is there a point to including the bounds + the width + the slots? I would imagine we just want the bounds.

The slot count can be inferred from the number of children.

Contributor Author

Is there a point to including the bounds + the width + the slots?

I'll remove slots. But I feel the width is nice to have for easier understanding and debugging.

One other thing to note is that I displayed the entire content just for the top node. This way we don't have to switch back to the test to find the input. I know @NTaylorMullen wanted this.
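The format described here (bounds and width on every node, no slot count, full content only on the top node) could be produced by a small recursive writer. This is a hedged sketch under those assumptions; `Node` and `TreeSerializer` are hypothetical names and not the test infrastructure in this PR.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical tree node; not the real Razor syntax node type.
public sealed class Node
{
    public Node(string kind, int start, string text, params Node[] children)
    {
        Kind = kind;
        Start = start;
        Text = text;
        Children = children;
    }

    public string Kind { get; }
    public int Start { get; }
    public string Text { get; }                  // full source text covered by the node
    public IReadOnlyList<Node> Children { get; }
}

public static class TreeSerializer
{
    public static string Serialize(Node root)
    {
        var builder = new StringBuilder();
        Write(root, 0, builder);
        return builder.ToString();
    }

    private static void Write(Node node, int depth, StringBuilder builder)
    {
        var width = node.Text.Length;
        builder.Append(' ', depth * 4);          // indentation shows nesting
        builder.Append($"{node.Kind} - [{node.Start}..{node.Start + width}) - FullWidth: {width}");
        if (depth == 0)
        {
            // Only the top node echoes the whole input, so the test source is visible once.
            builder.Append($" - [{node.Text}]");
        }
        builder.Append('\n');

        // The slot count is omitted: it is visible from the number of child lines.
        foreach (var child in node.Children)
        {
            Write(child, depth + 1, builder);
        }
    }
}
```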

SyntaxKind.LeftBrace;[{];
Code span - Gen<Stmt> - [LF] - AutoCompleteEditHandler;Accepts:Any,AutoComplete:[<null>];AtEOL - (2:0,2) - Tokens:1
SyntaxKind.HtmlDocument - [0..17) - FullWidth: 17 - Slots: 1 - [@{LF} <html>LF]
SyntaxKind.HtmlMarkupBlock - [0..17) - FullWidth: 17 - Slots: 1

Will every HtmlDocument always have 1 child which is an HtmlMarkupBlock?

Contributor Author

ajaybhargavb Sep 13, 2018

Yes. I added HtmlDocument to act as an ultimate parent for an HTML document. In other cases like if we parse a block of html from within a CSharp context, it won't be wrapped in an HtmlDocument

SyntaxKind.HtmlDocument - [0..15) - FullWidth: 15 - Slots: 1 - [@{LF} LF<html>]
SyntaxKind.HtmlMarkupBlock - [0..15) - FullWidth: 15 - Slots: 1
SyntaxKind.HtmlTextLiteral - [0..0) - FullWidth: 0 - Slots: 1 - Gen<Markup> - SpanEditHandler;Accepts:Any
SyntaxKind.Unknown;[];

? Looks like it was there before as well but what does this represent?

Contributor Author

Correct me if I am wrong, but I assume this is to get HTML IntelliSense before the C# transition.


Ooooh this is a marker symbol. We should make this more first-class in the syntax tree. We should have specific node types for the C# and Html variants of this.

Contributor Author

How do you feel about renaming SyntaxKind.Unknown to SyntaxKind.Marker and wrapping the markers in Html and CSharp literals as appropriate? Right now everything that is SyntaxKind.Unknown is a marker symbol.


Sounds great to me.
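As a rough illustration of what a first-class marker could look like: a marker is a token that holds a position in the tree but covers no source text, which matches the zero-width [0..0) literal in the baseline above. The types below are hypothetical and heavily trimmed; they are not the real SyntaxKind enum or token type.

```csharp
// Hypothetical, trimmed-down kinds; the real enum has many more members.
public enum SketchKind
{
    Text,
    Whitespace,
    Marker
}

public readonly record struct SketchToken(SketchKind Kind, string Content)
{
    // A marker occupies a position in the tree but covers no source text.
    public static SketchToken CreateMarker() => new(SketchKind.Marker, string.Empty);

    public bool IsMarker => Kind == SketchKind.Marker && Content.Length == 0;
}
```

Naming the kind after its purpose means a zero-width token no longer reads as a parse failure in the baselines.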

SyntaxKind.ForwardSlash;[/];
SyntaxKind.Text;[a];
SyntaxKind.CloseAngle;[>];
SyntaxKind.HtmlTagBlock - [47..51) - FullWidth: 4 - Slots: 1

Did this change parents? Just double checking because the old TagBlock had the same parental relationship as its sibling HtmlTextLiteral. Now it looks like this block was indented 1 parent but the sibling above was not.

Contributor Author

I don't think parents changed. The open tag block, the content of the tag (Email Me), and the close tag block are all siblings just like before. It's probably the GitHub diff being weird.

SyntaxKind.RazorMetaCode - [1..2) - FullWidth: 1 - Slots: 1 - Gen<None> - SpanEditHandler;Accepts:None
SyntaxKind.LeftBrace;[{];
SyntaxKind.CSharpCodeBlock - [2..20) - FullWidth: 18 - Slots: 1
SyntaxKind.CSharpStatementLiteral - [2..4) - FullWidth: 2 - Slots: 1 - Gen<Stmt> - AutoCompleteEditHandler;Accepts:Any,AutoComplete:[<null>];AtEOL

Ya, thinking more and more expression literals and statement literals should just be CSharpTextLiteral. My reasoning is that Statement is overloaded in C# and in Razor i.e. @{ ... } and if (...) {...}

Contributor Author

Okay that makes sense. This is a good time to bring up two other literal types I've defined,
CSharpHiddenLiteralSyntax - to represent the text that we don't want to render in the output (SpanChunkGenerator.Null in the old world)
CSharpNoneLiteralSyntax - to represent the text where we don't want any design time intellisense (SpanKind.None in the old world)
Thoughts on those?


CSharpNoneLiteralSyntax

Hmm, is it prefixed with CSharp because it contains CSharp tokens? Makes me think we should have an unclassified token type / span kind. Basically if the node was named UnclassifiedTextLiteralSyntax with unclassified tokens it'd truly represent what was intended.


As for CSharpHiddenLiteralSyntax, I'm uncertain. What's an example of a SpanChunkGenerator.Null again? Having a hard time remembering where we do that.

Contributor Author

ajaybhargavb Sep 17, 2018

Missed these questions earlier,

Hmm, is it prefixed with CSharp because it contains CSharp tokens?

We currently use it in only one place: whitespace at the end of SingleLine extensible directives. Renaming it to unclassified should work. Actually, it can also contain CSharp comments.

What's an example of a SpanChunkGenerator.Null again?

One example is just like in #2594,

@{ @@Da }
   ^

That transition is a code span with SpanChunkGenerator.Null

SyntaxKind.HtmlTextLiteral - [0..0) - FullWidth: 0 - Slots: 1 - Gen<Markup> - SpanEditHandler;Accepts:Any
SyntaxKind.Unknown;[];
SyntaxKind.CSharpCodeBlock - [0..53) - FullWidth: 53 - Slots: 1
SyntaxKind.CSharpDirective - [0..53) - FullWidth: 53 - Slots: 2

I'd rename CSharpDirectiveX => RazorDirectiveX for your nodes.

A directive consists of a few things.

  1. transition
  2. Some identifier (in this case section)
  3. A place for parameters/tokens
  4. Possible body

Does the current implementation provide these separations? I'd figure that a directive node would have access to all of these components

Contributor Author

I'd rename CSharpDirectiveX => RazorDirectiveX for your nodes.

Hmm. But directives aren't all metacode right? Most of the time, it contains csharp too. Right now anything that is named RazorxyzSyntax are all things that are either metacode or comments.

Does the current implementation provide these separations? I'd figure that a directive node would have access to all of these components

It does. https://github.com/aspnet/Razor/blob/feature/razor-parser/src/Microsoft.AspNetCore.Razor.Language/Syntax/Syntax.xml#L168-L181
One difference though is that I've combined 3 and 4 as just an optional body. I didn't think having that separate would provide much value. We can always add it later if we need it.
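To make the trade-off concrete, here is a minimal sketch of a directive shape with a transition, a keyword, and a single optional body that folds parameters and block body together. The field names are assumptions for illustration only; they are not copied from Syntax.xml.

```csharp
#nullable enable

// Hypothetical shape: the field names are assumptions, not the definitions in Syntax.xml.
public sealed record DirectiveSketch(string Transition, string Keyword, string? Body)
{
    // Parameters/tokens and the block body are folded into one optional Body,
    // mirroring the "combined 3 and 4" choice described above.
    public bool HasBody => Body is not null;

    // Reassembles the source text the directive covers.
    public string ToSource() => Transition + Keyword + (Body ?? string.Empty);
}
```

Splitting Body into separate parameter and block fields later would be an additive change to a shape like this, which is consistent with the "we can always add it later" reasoning.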


It does. https://github.com/aspnet/Razor/blob/feature/razor-parser/src/Microsoft.AspNetCore.Razor.Language/Syntax/Syntax.xml#L168-L181
One difference though is that I've combined 3 and 4 as just an optional body. I didn't think having that separate would provide much value. We can always add it later if we need it.

Ah, ya I'd definitely break that out at some point.

Hmm. But directives aren't all metacode right? Most of the time, it contains csharp too. Right now anything that is named RazorxyzSyntax are all things that are either metacode or comments.

They aren't but they aren't C# either. They don't default to C# or Html, their structure is defined by a descriptor; I bring up this point because that's what differentiates CodeBlock from MarkupBlock.

SyntaxKind.Text;[Bar];
SyntaxKind.HtmlDocument - [0..14) - FullWidth: 14 - Slots: 1 - [Foo </div> Bar]
SyntaxKind.HtmlMarkupBlock - [0..14) - FullWidth: 14 - Slots: 1
SyntaxKind.HtmlTextLiteral - [0..4) - FullWidth: 4 - Slots: 1 - Gen<Markup> - SpanEditHandler;Accepts:Any
Member

I've always felt like these things are kinda gibberish to me Gen<Markup> - SpanEditHandler;Accepts:Any - maybe I just never learned how to read it.

Contributor Author

Gen<Markup> is just the chunk generator which will go away. SpanEditHandler;Accepts:Any is just the EditHandler type and the AcceptedCharacters
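For readers who find the trailing annotation hard to parse by eye, the simple form can be split mechanically into the two parts named above. This is a sketch of a hypothetical helper, `AnnotationReader`, that only handles the plain "Handler;Accepts:Value" shape; longer forms such as the AutoCompleteEditHandler lines carry extra fields that it does not interpret.

```csharp
using System;

public static class AnnotationReader
{
    // Splits "SpanEditHandler;Accepts:Any" into the edit handler type and the accepted characters.
    // Only the simple two-part form is understood; anything else yields an empty Accepts value.
    public static (string Handler, string Accepts) Read(string annotation)
    {
        var parts = annotation.Split(';');
        var handler = parts[0];
        var accepts = parts.Length == 2 && parts[1].StartsWith("Accepts:", StringComparison.Ordinal)
            ? parts[1].Substring("Accepts:".Length)
            : string.Empty;
        return (handler, accepts);
    }
}
```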

SyntaxKind.ForwardSlash;[/];
SyntaxKind.Text;[div];
SyntaxKind.CloseAngle;[>];
SyntaxKind.HtmlTextLiteral - [10..14) - FullWidth: 4 - Slots: 1 - Gen<Markup> - SpanEditHandler;Accepts:Any
Member

So an interesting discussion is whether we want to go with the nomenclature Html... or Markup.... I like Markup better because it's more general. In the Blazor world we anticipate supporting non-HTML markup in the future. Putting HTML everywhere is a little less good to me. Thoughts?

Contributor Author

Interesting. Just to be clear, do you mean you prefer Markup.. for everything that is HTML or specifically for a few nodes like TextLiteral?


Should be consistent. Markup works for me.

Contributor Author

In the Blazor world we anticipate supporting non-HTML markup in the future.

@NTaylorMullen I understand we want it to be consistent. But if we name everything as Markup.., what would differentiate HTML and non-HTML markup when we support it in the future?


We'd name it Blazor or something equivalent for the type of Markup that we're adding (if we had to). Html just seems too specific because it makes it seem like we understand things like JavaScript, CSS etc. I could get on board for HtmlStartTag though (when the time comes) because we know it's Html.

Contributor Author

ajaybhargavb commented Sep 18, 2018

🆙 📅

  • Removed HtmlBlock vs HtmlMarkupBlock confusion. They are now GenericBlock and MarkupBlock.
  • Removed SyntaxKind. prefix from the baselines
  • Rename HtmlDocument to RazorDocument
  • Change CSharpHiddenLiteralSyntax to CSharpEscapedTextLiteralSyntax and MarkupEscapedTextLiteralSyntax (No baseline changes because related tests haven't been ported yet)
  • Change CSharpNoneLiteralSyntax to UnclassifiedTextLiteralSyntax (No baseline changes - same reason)
  • Rename CSharpDirectiveX to RazorDirectiveX
  • Renamed Html... to Markup.. wherever it makes sense
  • Don't display SlotCount in the baselines
  • Renamed Unknown to Marker

Feedback not addressed (Will do at a later stage):

  • Combine CSharpStatementLiteral and CSharpExpressionLiteral

@NTaylorMullen @rynowak, No need to review the source changes, just reviewing the baseline changes should be sufficient.

return base.VisitMarkupTextLiteral(node);
}

public override SyntaxNode VisitMarkupEscapedTextLiteral(MarkupEscapedTextLiteralSyntax node)

Wait, MarkupEscapedTextLiteralSyntax and the CSharp equivalent ones are only used for escaped transitions, right? Is EscapedTextLiteral the right wording?

Contributor Author

Spoke offline. This is for more cases than just the transitions. Going to rename this to MarkupEphemeralTextLiteralSyntax to indicate that these are represented in the tree but are not significant and won't be rendered in the output. (This is one of the scenarios we should consider representing as trivia.)
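The distinction can be sketched as follows: ephemeral text stays in the tree, so the original source is still recoverable, but a renderer skips it. `LiteralSketch` and `RenderSketch` are hypothetical names, and treating the first @ of an escaped @@ as the ephemeral part is an assumption made for this example.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical literal: IsEphemeral marks text kept in the tree but not rendered.
public sealed record LiteralSketch(string Text, bool IsEphemeral = false);

public static class RenderSketch
{
    // Rendered output skips ephemeral literals.
    public static string Render(IEnumerable<LiteralSketch> literals) =>
        string.Concat(literals.Where(l => !l.IsEphemeral).Select(l => l.Text));

    // The full source text is still recoverable from the tree.
    public static string Source(IEnumerable<LiteralSketch> literals) =>
        string.Concat(literals.Select(l => l.Text));
}
```

For a sequence such as "a", an ephemeral "@", and "@b.com", Source returns the original a@@b.com while Render drops the ephemeral character.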

Contributor Author

🆙 📅

  • Rename MarkupEscapedTextLiteral to MarkupEphemeralTextLiteral
  • Updated more cases to use the above node (there are now some related baseline changes)

<Field Name="ValueSuffix" Type="MarkupTextLiteralSyntax" Optional="true" />
</Node>
- <Node Name="HtmlLiteralAttributeValueSyntax" Base="HtmlSyntaxNode">
+ <Node Name="HtmlLiteralAttributeValueSyntax" Base="MarkupSyntaxNode">
Member

Some stuff is 'Markup' and some is still Html?

Contributor Author

Yes. This is intentional. I kept the Html prefix for constructs we know are HTML, such as attributes and HTML comments.

Member

so, what's the diff between 'markup' and html?

Contributor Author

ajaybhargavb Sep 18, 2018

Anything that is parsed in a markup context (read: not CSharp) and is not necessarily HTML-specific, such as text literals and tags (which can also be XML tags), is "markup". Things that are specific to a certain type of markup, such as HTML or XML, are named accordingly. For example, I can imagine having a node for an XML declaration, and it would be named XMLDeclarationSyntax as opposed to MarkupDeclarationSyntax.

Member

Right, and attributes are in all forms of markup right?

Contributor Author

Spoke offline. We'll just name everything Markup. That will always be correct because Markup is a superset of Html. We can revisit and rename certain nodes if there is a need in the future.


Right, and attributes are in all forms of markup right?
We can revisit and rename certain nodes if there is a need in the future.

Don't see why we made things we know are Html less specific. I disagree with the all-markup naming but don't feel strongly enough to push back. Need a RIP emoji 😉

Member

I think whatever you guys talked about here is wicked confusing. Doesn't pass the dan test.

Contributor Author

🆙 📅

Contributor Author

@rynowak signed-off offline

ajaybhargavb merged commit 140a11f into feature/razor-parser Sep 18, 2018
ajaybhargavb deleted the ajbaaska/baseline-html-document branch September 18, 2018 23:07
ajaybhargavb added a commit that referenced this pull request Sep 20, 2018
* Generate baselines for HtmlDocumentTest

* Feedback

* More feedback

* Rename all Html to Markup
ajaybhargavb added a commit that referenced this pull request Sep 27, 2018
ajaybhargavb added a commit that referenced this pull request Sep 27, 2018
ajaybhargavb added a commit that referenced this pull request Nov 2, 2018
ajaybhargavb added a commit that referenced this pull request Nov 9, 2018
ajaybhargavb added a commit that referenced this pull request Nov 10, 2018