
In JRuby, StringScanner.new("") can only hold Encoding:US-ASCII encoding. #78

@naitoh

No Problem case (Ruby 3.3.0) 🙆‍♂️

$ ruby -v
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin22]
$ gem list strscan

*** LOCAL GEMS ***

strscan (3.0.8, default: 3.0.7)
$ irb
> require 'strscan'
=> true
> s = StringScanner.new("test")
=> #<StringScanner 0/4 @ "test">
> s.rest.encoding
=> #<Encoding:UTF-8>
> s = StringScanner.new("")
=> #<StringScanner fin>
> s.rest.encoding
=> #<Encoding:UTF-8>
> s.string.force_encoding("ASCII-8BIT")
=> ""
> s.rest.encoding
=> #<Encoding:ASCII-8BIT>
> s.string.force_encoding("UTF-8")
=> ""
> s.rest.encoding
=> #<Encoding:UTF-8>

Problem case (JRuby 9.4.5.0) 🙅

$ ruby -v
jruby 9.4.5.0 (3.1.4) 2023-11-02 1abae2700f Java HotSpot(TM) 64-Bit Server VM 25.121-b13 on 1.8.0_121-b13 +jit [x86_64-darwin]
$ gem list strscan

*** LOCAL GEMS ***

strscan (3.0.8 java, default: 3.0.7 java)
$ irb
> require 'strscan'
=> true
> s = StringScanner.new("test")
=> #<StringScanner 0/4 @ "test">
> s.rest.encoding
=> #<Encoding:UTF-8>
> s = StringScanner.new("")
=> #<StringScanner fin>
> s.rest.encoding
=> #<Encoding:US-ASCII>
> s.string.force_encoding("UTF-8")
=> ""
> s.rest.encoding
=> #<Encoding:US-ASCII>

On JRuby, a scanner created with StringScanner.new("") always reports Encoding:US-ASCII from s.rest.encoding, even after s.string.force_encoding("UTF-8").
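The two sessions above can be condensed into one script that runs unchanged on both interpreters. This is only a sketch: `rest_encodings` is a hypothetical helper written for this report, not part of strscan, and no expected output is listed because it depends on the runtime.

```ruby
require 'strscan'

# Hypothetical helper (not part of strscan): builds a scanner over a copy of
# `source`, retags the scanner's string with each encoding name in turn, and
# records what `rest.encoding` reports after each retag.
def rest_encodings(source, encoding_names)
  scanner = StringScanner.new(source.dup)
  encoding_names.map do |name|
    scanner.string.force_encoding(name)
    scanner.rest.encoding
  end
end

puts RUBY_DESCRIPTION
# Empty source: the case this issue is about.
p rest_encodings("", ["ASCII-8BIT", "UTF-8"])
# Non-empty source: shown for comparison.
p rest_encodings("test", ["ASCII-8BIT", "UTF-8"])
```

Comparing the two printed arrays on each interpreter shows whether the empty-string case follows the encoding of the underlying string.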

The above causes the following differences in behavior.

  • Ruby 3.3.0
> s = StringScanner.new("")
=> #<StringScanner fin>
> s.string = s.rest + "test"
=> "test"
> s.rest.encoding
=> #<Encoding:UTF-8>
  • JRuby 9.4.5.0
> s = StringScanner.new("")
=> #<StringScanner fin>
> s.string = s.rest + "test"
=> "test"
> s.rest.encoding
=> #<Encoding:US-ASCII>
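A likely reason the US-ASCII tag survives the concatenation: when the left operand of `String#+` is empty and ASCII-compatible and the right operand is ASCII-only, CRuby keeps the left operand's encoding. The sketch below simulates the US-ASCII empty string that JRuby's `rest` reportedly returns. It assumes JRuby's `String#+` follows the same rule, and both helper names are hypothetical.

```ruby
require 'strscan'

# Hypothetical helper: encoding of `"" + suffix` when the empty left-hand
# string is tagged with `empty_encoding`.
def concat_encoding(empty_encoding, suffix)
  empty = String.new("", encoding: empty_encoding)
  (empty + suffix).encoding
end

# Hypothetical helper: mimics `s.string = s.rest + suffix`, with `s.rest`
# replaced by an empty string tagged `empty_encoding` (a simulation of the
# value reported for JRuby, not a call into JRuby itself).
def rest_encoding_after_assign(empty_encoding, suffix)
  scanner = StringScanner.new("")
  simulated_rest = String.new("", encoding: empty_encoding)
  scanner.string = simulated_rest + suffix
  scanner.rest.encoding
end

p concat_encoding(Encoding::US_ASCII, "test")       # ASCII-only suffix
p concat_encoding(Encoding::US_ASCII, "caf\u00e9")  # suffix with a non-ASCII character
p rest_encoding_after_assign(Encoding::US_ASCII, "test")
```

If this is the mechanism, the difference would only show up when the appended text is pure ASCII; a suffix containing non-ASCII characters would switch the result to the suffix's encoding.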

Appending with << appears to be unaffected.

  • Ruby 3.3.0
> s = StringScanner.new("")
=> #<StringScanner fin>
> s << "test"
=> #<StringScanner 0/4 @ "test">
> s.rest.encoding
=> #<Encoding:UTF-8>
  • JRuby 9.4.5.0
> s = StringScanner.new("")
=> #<StringScanner fin>
> s << "test"
=> #<StringScanner 0/4 @ "test">
> s.rest.encoding
=> #<Encoding:UTF-8>
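The two ways of adding text to an empty scanner can be compared side by side. This is a minimal sketch with hypothetical helper names; what the reassign path reports depends on the interpreter, so no expected output is given.

```ruby
require 'strscan'

# Hypothetical helper: adds `suffix` by reassigning the string from `rest`,
# the path that reports US-ASCII on JRuby 9.4.5.0 in this issue.
def encoding_via_reassign(suffix)
  scanner = StringScanner.new(String.new("", encoding: Encoding::UTF_8))
  scanner.string = scanner.rest + suffix
  scanner.rest.encoding
end

# Hypothetical helper: adds `suffix` with `<<`, the path that reports UTF-8
# on both interpreters in this issue.
def encoding_via_append(suffix)
  scanner = StringScanner.new(String.new("", encoding: Encoding::UTF_8))
  scanner << suffix
  scanner.rest.encoding
end

puts RUBY_DESCRIPTION
p encoding_via_reassign("test")
p encoding_via_append("test")
```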
