Skip to content
This repository was archived by the owner on Oct 17, 2021. It is now read-only.

Added support for DOM walking using a visitor#13

Merged
mattt merged 12 commits intoSwiftDocOrg:masterfrom
regexident:visitor
Jan 14, 2021
Merged

Added support for DOM walking using a visitor#13
mattt merged 12 commits intoSwiftDocOrg:masterfrom
regexident:visitor

Conversation

@regexident
Copy link
Member

@regexident regexident commented Apr 28, 2020

While accessing certain nodes in a document via node.children works alright for shallow document structures it becomes cumbersome as soon as you are interested in some more interesting queries such as "collect all links, from anywhere in the document".

(Writing a visitor pattern in time of a pandemic might not have been the most responsible thing, but I did wear a 😷, I promise.)

@regexident
Copy link
Member Author

The reason all func visit(…: …) methods on protocol Visitor return -> VisitorContinueKind, regardless of whether the passed-in type actually has children to visit, is to make the API more resilient against future changes.

@regexident regexident marked this pull request as draft April 28, 2020 14:27
@mattt
Copy link
Contributor

mattt commented Apr 28, 2020

@regexident What a great addition to this library! I was in such a rush to jump to HTML-rendered output and XPath queries for swift-doc, that I neglected to appreciate the beauty of the CommonMark AST for what it was. Really inspiring stuff, and I'm a big fan of the API naming consistency with SwiftSyntax.

Is there anything left to do before this is ready for review?

@regexident
Copy link
Member Author

I made it into a draft for now as I wanted to try getting support for subtype/override semantics, which needed me to change some stuff.

This would allow for visiting super-types, which can be super useful:

func visit(node: Node) ->func visit(block: Block) ->func visit(containerBlock: ContainerBlock) ->func visit(leafBlock: LeafBlock) ->func visit(inline: Inline) -> 

But with that one would either have to gave these higher-level visit methods have no say in controlling the walk via VisitorContinueKind, or extend VisitorContinueKind with an case .inherit and implement some kind of overriding.

I am currently working on the latter as that would further more allow for really nice (i.e. need for very little boiler-plate when implementing Visitor) support for building visitors that, e.g. with the exception of Paragraph skip all Blocks:

final class TestVisitor: Visitor {
    var numberOfTexts: Int = 0

    func visit(block: Block) -> VisitorContinueKind {
        return .skipChildren
    }

    func visit(paragraph: Paragraph) -> VisitorContinueKind {
        return .visitChildren
    }

    func visit(text: Text) -> VisitorContinueKind {
        self.numberOfTexts += 1

        return .visitChildren
    }
}

protocol Visitor's default implementations of func visit(…:) would default to .inherit, instead of .visitChildren, with self.defaultContinueKind being the implicit root.

The order of precedence would be (a > b meaning a can override b):

T > Inline > LeafBlock > ContainerClock > Block > Node

Inline, LeafBlock and ContainerClock should be disjoint sets, but given that we cannot guarantee this from a type-system perspective I defined an order that should reflect the nesting in a document, akin to "cascading" style sheets.

All of this complexity should remain fairly hidden in the package and require minimal configuration.

@mattt
Copy link
Contributor

mattt commented Apr 28, 2020

@regexident Just a quick heads-up: We're in the process of changing how containers are implemented (#11). I'll try to integrate that in the next day or two so that it doesn't block your work here.

@regexident
Copy link
Member Author

regexident commented Apr 28, 2020

We might also want to have at least a generic "end" event for Visitor:

public protocol Visitor {
    // ...
    func didEndVisit()
}

So that a Visitor could maintain an internal stack during traversal such as for inserting closing tags while building HTML from a document.

@mattt
Copy link
Contributor

mattt commented Apr 28, 2020

We might also want to have least add a generic "end" event for Visitor [...] so that a Visitor could maintain an internal stack during traversal such as for inserting closing tags while building HTML from a document.

The way this is done in SwiftSyntax is with visitPost(_:) complements to visit(_:). So the last method that a visitor calls would be visitPost(_:) for the Document.

@regexident
Copy link
Member Author

regexident commented Apr 28, 2020

Yeah. My only concern would be that it would double the API-surface.

My thought was that if you needed info about the node post visit you could just keep a stack of nodes. That would certainly not be as efficient as having visitPost(…:) , but due to nodes being classes should be reasonably efficient memory-wise.

That being said visitPost(…:) would certainly be cleaner.

Anyway, I would be fine with either. I'll happily leave that decision up to you. 🙂

@regexident regexident marked this pull request as ready for review April 28, 2020 16:23
@regexident
Copy link
Member Author

regexident commented Apr 28, 2020

Done.

Lemme know what kind of visitPost(…:) API you would prefer and I'll quickly add it. 🙂

Also, rather than merge now I'd say let's wait for #11 first, and I'll rebase once that's been merged.

@mattt
Copy link
Contributor

mattt commented Apr 29, 2020

Lemme know what kind of visitPost(…:) API you would prefer and I'll quickly add it. 🙂

While it would offer a smaller API surface area, I think using a single method for all "did visit" delegate methods has a strong likelihood of causing confusion. For example, a user may mistakenly see visit for text elements as a specialization for visit for inline elements, and expect only one of them to be called; so one "did visit" callback instead of two.

Given the strong example set by SwiftSyntax and the clarity of their adopted approach, I think visitPost is a better option — in spite of its API surface area.

If you're particularly concerned about API surface area for consumers, one option might be to borrow the approach used by SwiftSyntax with its AnyVisitor class

Also, rather than merge now I'd say let's wait for #11 first, and I'll rebase once that's been merged.

Agreed!

@regexident
Copy link
Member Author

I just added the func visitPosts. 🙂

@regexident
Copy link
Member Author

With #14 closed (🙁) this PR should be ready to merge now.

@mattt
Copy link
Contributor

mattt commented May 23, 2020

@regexident Thanks for the reminder. Working to get this merged now!

Copy link
Contributor

@mattt mattt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again for your work on this, @regexident. Just a few pieces of feedback for your consideration.

In addition to the inline comments, two other requests:

  • Add a changelog entry
  • Move Visitor.swift and Visitable.swift to the Supporting Types directory.

@regexident
Copy link
Member Author

Should be resolved. 🙂

@gonzalonunez
Copy link

gonzalonunez commented Sep 11, 2020

Hi!

I'd love to try this, is there anything else needed to get this merged? Happy to be of any help that I can.

@mattt I don't expect you to remember 😀, but we met briefly over dinner with some folks when you gave a presentation at Airbnb! I'm now at Primer (where we almost crossed paths I believe!), and the Education team writes projects in Markdown.

I'm exploring using CommonMark to traverse the AST (for some custom components) + render these projects natively in a UICollectionView.

@regexident
Copy link
Member Author

Hey @mattt, is there anything else needed to get this PR merged?

@regexident
Copy link
Member Author

Hate to bother you, @mattt, but any chance to get this merged at some point?
(I have two projects where this would be tremendously helpful, but I'd rather not maintain a fork.)

@regexident regexident requested a review from mattt January 14, 2021 14:35
@mattt
Copy link
Contributor

mattt commented Jan 14, 2021

@regexident Thank you for the final nudge on this and your extraordinary patience in waiting for this to get merged. I'll cut a new minor release with this (and your wonderful memory leak fixes) ASAP.

Edit: This is now live in 0.5.0

@mattt mattt merged commit 1198cc7 into SwiftDocOrg:master Jan 14, 2021
@regexident
Copy link
Member Author

Thank you so much @mattt!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants