Hello, and thanks for this useful library.
In most cases, when building a parser or tokenizer on top of strscan, I don't need a method that peeks at the character at some offset without moving the position. But while building https://github.com/le0pard/json_mend to repair broken JSON, I found that in many cases I need to look ahead in the string to understand which kind of broken JSON I am dealing with. For now I use this method:
# Peeks the next character without advancing the scanner
def peek_char(offset = 0)
  # Handle the common 0-offset case
  if offset.zero?
    # peek(1) returns the next BYTE, not character
    byte_str = @scanner.peek(1)
    return nil if byte_str.empty?

    # Fast path: If it's a standard ASCII char (0-127), return it directly.
    # This avoids the regex overhead for standard JSON characters ({, [, ", etc).
    return byte_str if byte_str.getbyte(0) < 128

    # Slow path: If it's a multibyte char (e.g. “), use regex to match the full character.
    return @scanner.check(/./m)
  end

  # For offsets > 0, we must scan to skip correctly (as characters can be variable width)
  saved_pos = @scanner.pos
  res = nil
  (offset + 1).times do
    res = @scanner.getch
    break if res.nil?
  end
  @scanner.pos = saved_pos
  res
end

As you can see, I can use check(/./m) to get the first character without advancing, but a regex is not that fast (which is why the byte_str.getbyte(0) < 128 optimization exists at all). To read the character at some offset, I have to loop with getch and then restore the original position.
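To make the byte versus character difference concrete, here is a small sketch. The sample string is only an illustration; it starts with “ (U+201C), which takes three bytes in UTF-8:

```ruby
require "strscan"

# "\u201C" is the left double quotation mark, 3 bytes in UTF-8 (E2 80 9C).
scanner = StringScanner.new("\u201Cquoted\u201D")

first_byte = scanner.peek(1)      # byte-based: only the first byte of the character
first_char = scanner.check(/./m)  # character-based: the whole character

puts first_byte.bytesize  # 1
puts first_char.bytesize  # 3
puts scanner.pos          # 0 (neither call moved the scan pointer)
```

So peek(1) alone is only safe when the next byte is ASCII, which is what the getbyte(0) < 128 check guards.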
It would be good if the library had a method similar to peek that works with characters instead of bytes (it could be named peekch).
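To show the behavior I have in mind, here is a rough reference sketch. The peekch helper below is hypothetical and not part of strscan; it is written as a plain method that takes the scanner, assumes a non-negative integer offset, and uses a regex only to pin down the semantics, not as a suggestion for how a fast implementation should work:

```ruby
require "strscan"

# Hypothetical helper (not part of strscan): returns the character located
# `offset` characters ahead of the scan pointer without moving it, or nil
# when fewer than offset + 1 characters remain.
def peekch(scanner, offset = 0)
  # check is anchored at the current position and does not advance it;
  # the /m flag lets "." match newlines as well.
  matched = scanner.check(/.{#{offset + 1}}/m)
  matched && matched[-1]
end

scanner = StringScanner.new("{\u201Ckey\u201D: 1}")

peekch(scanner)      # "{"
peekch(scanner, 1)   # "\u201C", a multibyte character returned whole
peekch(scanner, 2)   # "k"
peekch(scanner, 99)  # nil, past the end of the string
scanner.pos          # 0, the scan pointer never moved
```

A built-in version could presumably avoid both the regex and the getch loop, which is the point of this request.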
Thanks