
Refactor lexer logic #209

Merged
6cdh merged 6 commits into jeapostrophe:master from 6cdh:better-hover
Apr 14, 2026

Conversation

@6cdh
Collaborator

@6cdh 6cdh commented Apr 13, 2026

This is the first PR of the hover enhancement plan: #206. I expect to need 4 PRs in total for this better-hover plan.

This PR:

  • Fix some service behavior. When start = end (an empty range), it means the identifier is not found in the source file but exists in the expanded syntax. The old behavior added such items anyway by giving them a range of length 1. It no longer adds items of this kind.
  • Refactor and extract some helper functions from the doc-hover function.
  • Split the lexer logic out into doclib/lexer.rkt and add a thread-safe lazy cache abstraction for it. The Doc struct now has a field that stores a lazy cache of the lexer snapshot. The snapshot resets after each edit and is only recomputed in query functions when needed, so it's lazy. And it's safe because the lazy abstraction is semaphore-protected.
  • Rewrite the lexer callers to use the lexer module.
  • Remove the doc-get-symbols and doc-guess-token APIs because of their inappropriate design. doc-get-symbols returned an interval map of symbol tokens, which is the full data; a good design should provide a query-based interface rather than return everything. doc-guess-token had weird edge-case behavior that didn't match its name, and was also too complex to describe in a sentence.
  • Add a doc-token-at API that really returns the token at the given position, or false if no token exists at that position.
  • Add a doc-token-prefix-at API that returns the content of the token at the given position, but only the part before that position.
  • Fix an off-by-one bug in the document-symbol logic. It happened because the lexer returns 1-based indexed data while everywhere else expects 0-based indexing. The fix makes document-symbol slightly more useful. This kind of bug can't reappear now that all lexer logic lives in a single place.

For the lexer, each pass does a full tokenize of the buffer and stores all tokens in a sorted vector. Each pass takes ~2ms on doclib/doc.rkt. At most one pass runs if there are no further edits.
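One useful property of a full pass is that a single left-to-right scan emits the token vector already sorted by start position, so no separate sort step is needed before binary-search queries. A hedged Python sketch (the token pattern here is made up for illustration; the real tokenizer is the Racket lexer):

```python
import re

# Hypothetical token pattern for illustration only; the actual
# tokenization is done by the Racket lexer, not this regex.
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_-]*|\d+|[()\[\]]")

def tokenize(text: str) -> list[tuple[int, int, str]]:
    """One full pass over `text`, emitting (start, end, text) tuples.

    The result is sorted by start position for free, because
    finditer scans strictly left to right.
    """
    return [(m.start(), m.end(), m.group()) for m in TOKEN_RE.finditer(text)]
```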

I also explored other options for the lexer:

  • Full, eager lexer. The lexer runs eagerly after each edit. It's the simplest option, but the computation might be wasted, and each pass requires a full copy of the current text buffer, which partially defeats the purpose of the efficient text buffer.
  • Incremental, eager lexer. The lexer runs after each edit, but incrementally. This would be fast and efficient, but complex: the lexer can be stateful, so we'd need a data structure that stores the lexer state for each token, which is more complex than the currently used sorted vector.
  • Full, lazy lexer. This is the method used now. It's not simpler than option 1, but much simpler than option 2, with no wasted computation. The only risk is that it writes data on the query path, which is assumed to be read-only. I explored many concurrency options to make it safe, and a semaphore-protected lazy cache abstraction is the simplest. I didn't use data/interval-map because it's almost 30x slower than building a sorted vector.
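The lazy-cache idea in option 3 can be sketched as follows (a Python sketch with a threading.Lock standing in for Racket's semaphore; the class and method names are made up for illustration, not the actual API): the cache is cleared on every edit and recomputed under the lock on the query path, so concurrent queries never observe a partially written snapshot and at most one recomputation runs per burst of edits.

```python
import threading
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")

class LazyCache(Generic[T]):
    """A lock-protected lazy cache: compute on first query, reuse until reset."""

    def __init__(self, compute: Callable[[], T]) -> None:
        self._compute = compute
        self._value: Optional[T] = None
        self._valid = False
        self._lock = threading.Lock()  # stands in for Racket's semaphore

    def reset(self) -> None:
        """Called after every edit: drop the cached snapshot."""
        with self._lock:
            self._valid = False
            self._value = None

    def force(self) -> T:
        """Called on the query path: recompute only if needed."""
        with self._lock:
            if not self._valid:
                self._value = self._compute()
                self._valid = True
            return self._value
```

This is why the write-on-read-path risk is contained: the only mutation a query performs is inside force, and it is serialized with every other force and reset by the same lock.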

6cdh added 6 commits April 8, 2026 20:10
When the text range is empty (start = end), ignore the item.
- Make lexer lazy, only runs when needed
- Use a sorted vector to store tokens
- Fix the document symbol off by one bug

API changes:

- Remove doc-get-symbols
- Remove doc-guess-token
- Add doc-token-at which returns the token at a given position
- Add doc-token-prefix-at which returns the token prefix before a given position
@6cdh
Collaborator Author

6cdh commented Apr 13, 2026

The Resyntax CI failed because of an upstream problem: sorawee/pretty-expressive#5

EDIT: fixed

@6cdh 6cdh requested a review from dannypsnl April 13, 2026 13:23
@6cdh 6cdh merged commit 5cbf431 into jeapostrophe:master Apr 14, 2026
19 of 20 checks passed
@6cdh 6cdh deleted the better-hover branch April 14, 2026 11:19
