Proposal: Method to peek character (like peek for bytes) #183

@le0pard

Hello. Thanks for this useful lib

In most cases, when building a parser/tokenizer on top of strscan, I don't need a method that peeks at a character at some offset without moving the position. But while building https://github.com/le0pard/json_mend to repair broken JSON, I found that I often need to look ahead in the string so I can tell which kind of broken JSON I am dealing with. For now I have this method:

    # Peeks the next character without advancing the scanner
    def peek_char(offset = 0)
      # Handle the common 0-offset case
      if offset.zero?
        # peek(1) returns the next BYTE, not character
        byte_str = @scanner.peek(1)
        return nil if byte_str.empty?

        # Fast path: If it's a standard ASCII char (0-127), return it directly.
        # This avoids the regex overhead for standard JSON characters ({, [, ", etc).
        return byte_str if byte_str.getbyte(0) < 128

        # Slow path: If it's a multibyte char (e.g. “), use regex to match the full character.
        return @scanner.check(/./m)
      end

      # For offsets > 0, we must scan to skip correctly (as characters can be variable width)
      saved_pos = @scanner.pos
      res = nil
      (offset + 1).times do
        res = @scanner.getch
        break if res.nil?
      end
      @scanner.pos = saved_pos
      res
    end

As you can see, I can use check(/./m) to get the first character without advancing, but the regex is not that fast (which is why the byte_str.getbyte(0) < 128 optimization exists at all). To read at some offset, I need to loop with getch and then restore the original position.
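To make the byte/character difference concrete, here is a minimal sketch using only existing strscan calls (peek, check, pos). The helper name first_byte_and_char is made up for this illustration and is not part of strscan or json_mend.

```ruby
require "strscan"

# Illustrative helper (hypothetical name): returns what peek(1) and
# check(/./m) see at the current position, plus the position afterwards.
def first_byte_and_char(text)
  scanner = StringScanner.new(text)
  byte = scanner.peek(1)      # one BYTE of the next character
  char = scanner.check(/./m)  # the whole next CHARACTER, no advance
  [byte, char, scanner.pos]
end

byte, char, pos = first_byte_and_char("“quoted”")
puts byte.bytesize  # peek(1) gave a single byte (0xE2, the lead byte of “)
puts char           # check(/./m) gave the full 3-byte character
puts pos            # neither call moved the scanner
```

For plain ASCII input both calls return the same one-character string, which is why the fast path above is safe.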

It would be good if the library had a method similar to peek, but one that works with characters (the name could be peekch).
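Roughly, the behavior I have in mind could be sketched like this. This is only a stand-in written as a free function on top of check; the name peekch, its offset argument, and returning nil past the end are my assumptions for the proposal, not an existing strscan API. It also still goes through a regex, so it shows the semantics, not the speed a native method could offer.

```ruby
require "strscan"

# Hypothetical stand-in for the proposed method: returns the character
# `offset` characters ahead of the current position without advancing,
# or nil if the input ends before that character.
def peekch(scanner, offset = 0)
  return nil if offset.negative?

  # Match `offset` characters plus one more; check does not move the scanner.
  matched = scanner.check(/.{#{offset}}./m)
  matched && matched[-1]
end

scanner = StringScanner.new("“a”b")
p peekch(scanner)     # first character, the 3-byte “
p peekch(scanner, 1)  # "a"
p peekch(scanner, 4)  # nil, only four characters are available
p scanner.pos         # still 0
```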

Thanks
