@tompng
Member

@tompng tompng commented Nov 29, 2025

To replace RDoc::Parser::RipperStateLex

@tompng tompng requested a deployment to fork-preview-protection November 29, 2025 07:48 — with GitHub Actions Waiting
@tompng tompng force-pushed the prism_syntax_highlighter branch from cf98f2b to 870b68f Compare November 29, 2025 08:15
@tompng tompng requested a deployment to fork-preview-protection November 29, 2025 08:15 — with GitHub Actions Waiting
@tompng tompng force-pushed the prism_syntax_highlighter branch from 870b68f to 5db5b5d Compare November 29, 2025 08:16
@tompng tompng requested a deployment to fork-preview-protection November 29, 2025 08:16 — with GitHub Actions Waiting
@tompng tompng force-pushed the prism_syntax_highlighter branch from 5db5b5d to 0c393a5 Compare November 29, 2025 08:42
@tompng tompng requested a deployment to fork-preview-protection November 29, 2025 08:42 — with GitHub Actions Waiting
@tompng tompng force-pushed the prism_syntax_highlighter branch from 0c393a5 to 46cc1f8 Compare November 29, 2025 09:23
@tompng tompng temporarily deployed to fork-preview-protection November 29, 2025 09:23 — with GitHub Actions Inactive
@matzbot
Collaborator

matzbot commented Nov 30, 2025

🚀 Preview deployment available at: https://1dae1d57.rdoc-6cd.pages.dev (commit: 46cc1f8)

Member

@st0012 st0012 left a comment

Why don't we make Prism a dependency and just remove RipperStateLex?

@tompng
Member Author

tompng commented Dec 10, 2025

> Why don't we make Prism a dependency and just remove RipperStateLex?

RipperStateLex is still used in parser/ruby.rb for parsing, so we can't remove it now.
This new tokenizer doesn't generate the state bits that parser/ruby.rb requires, so it can't replace RipperStateLex there.
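
For readers unfamiliar with the state bits mentioned above, here is a minimal sketch of what Ripper exposes. It assumes the state bits are lexer state flags such as Ripper's `EXPR_*` values. The helper name `lex_with_state` is hypothetical and is not part of RDoc.

```ruby
require "ripper"

# Hypothetical helper: return each token of +source+ as a hash that
# includes the lexer state Ripper reports for that token.
def lex_with_state(source)
  Ripper.lex(source).map do |(line, column), event, text, state|
    { line: line, column: column, event: event, text: text, state: state }
  end
end

tokens = lex_with_state("a = 1")
first = tokens.first
# first[:event] is :on_ident and first[:text] is "a".
# first[:state] is a Ripper::Lexer::State, which can be queried bitwise:
assign = tokens.find { |t| t[:event] == :on_op }
assign[:state].allbits?(Ripper::EXPR_BEG)
```

A tokenizer that only yields token kinds and text has no equivalent of the fourth element, which would be consistent with the limitation described here.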

This pull request removes parser/prism_ruby.rb's dependency on RipperStateLex. It tokenizes to a compatible token stream so that the same syntax highlighter (TokenStream.to_html) can be used, while keeping colorization as unchanged as possible.
Because of this constraint, the tokenizer logic is a bit more complicated than it needs to be:

  • Converting Prism token names to Ripper token names
  • Squashing tokens, which is generally impossible if there is a heredoc. (≒ buggy)
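
The two steps above can be sketched roughly as follows. This is an illustration only, not the code from this pull request: the mapping is a small hand-picked subset, the `Token` struct and both helper names are hypothetical, and `:on_unknown` is a placeholder rather than a real Ripper event. The use of `:on_tstring` as the squashed token name is also an assumption.

```ruby
# Partial mapping from Prism token types to Ripper event names (illustrative subset).
PRISM_TO_RIPPER = {
  IDENTIFIER: :on_ident,
  CONSTANT: :on_const,
  INTEGER: :on_int,
  COMMENT: :on_comment,
  STRING_BEGIN: :on_tstring_beg,
  STRING_CONTENT: :on_tstring_content,
  STRING_END: :on_tstring_end,
  NEWLINE: :on_nl,
}.freeze

# Hypothetical token shape: a Ripper-style event name plus the source text.
Token = Struct.new(:event, :text)

# Translate a Prism token type into a Ripper-style event name.
def ripper_event_for(prism_type)
  return :on_kw if prism_type.to_s.start_with?("KEYWORD_")
  PRISM_TO_RIPPER.fetch(prism_type, :on_unknown) # placeholder for unmapped types
end

# Squash each tstring_beg ... tstring_end run into one :on_tstring token.
# Nested strings (interpolation) are deliberately not handled in this sketch.
def squash_strings(tokens)
  result = []
  buffer = nil
  tokens.each do |tok|
    if buffer
      buffer << tok.text
      if tok.event == :on_tstring_end
        result << Token.new(:on_tstring, buffer)
        buffer = nil
      end
    elsif tok.event == :on_tstring_beg
      buffer = tok.text.dup
    else
      result << tok
    end
  end
  result << Token.new(:on_tstring, buffer) if buffer # unterminated string
  result
end

# Hand-written token list for: puts "hi"
prism_tokens = [
  [:IDENTIFIER, "puts"],
  [:STRING_BEGIN, "\""],
  [:STRING_CONTENT, "hi"],
  [:STRING_END, "\""],
]
converted = prism_tokens.map { |type, text| Token.new(ripper_event_for(type), text) }
squashed = squash_strings(converted)
# squashed.map(&:event) => [:on_ident, :on_tstring]
```

Squashing like this relies on the merged tokens being adjacent in the source. A heredoc body starts on the line after its opener, so other tokens on the opener's line sit between the two, which is one way to read why squashing is "generally impossible" in that case.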

We can change this, but doing so would also change the syntax highlighting result. It may also be a relatively large change.

@st0012
Member

st0012 commented Dec 10, 2025

So at the moment we have

  • 1 tokenizer using Ripper
  • 1 parser using Ripper
  • 1 parser using Prism

And this PR will add another tokenizer using Prism. Is this correct?

> Have Prism token name to Ripper token name conversion

Will we avoid this if we fully migrate to the Prism parser?

I think my main concern is that after this we'll have 2 tokenizers and 2 parsers, but it's not clear when we'll be able to drop the old ones.
Do we know:

  • Will this change make migrating to Prism easier?
  • Will migrating to Prism make this or a similar change simpler?
