add support for symbol_keys equivalent to OJ / JSON.parse #22

@danmayer

I recently wrote up a post about how much faster simdjson_ruby can be than OJ and other options.

see blog post

While that is all true, the advantage disappears when upstream code expects symbol keys. I would like to add support for creating symbols while the hash is being built, rather than having to convert the keys in a second pass afterwards.
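To illustrate the difference between the two approaches, here is a minimal sketch that uses only the stdlib JSON parser. `deep_symbolize` is a hypothetical helper written for this example (it is not part of simdjson_ruby, Oj, or ActiveSupport); it stands in for the second pass that currently has to run after `Simdjson.parse`.

```ruby
require 'json'

# Hypothetical helper for illustration only: walks an already-parsed
# structure and rebuilds every hash with symbol keys (the "second pass").
def deep_symbolize(value)
  case value
  when Hash
    value.each_with_object({}) { |(key, val), out| out[key.to_sym] = deep_symbolize(val) }
  when Array
    value.map { |item| deep_symbolize(item) }
  else
    value
  end
end

json = '{"user": {"name": "dan", "tags": [{"id": 1}]}}'

# Two passes: parse to string keys, then traverse and rebuild every hash.
two_pass = deep_symbolize(JSON.parse(json))

# One pass: the parser creates symbol keys while it builds the hash.
one_pass = JSON.parse(json, symbolize_names: true)

puts two_pass == one_pass
```

Both paths produce the same structure, but the two-pass version allocates a second copy of every hash, which is the overhead a parser-level option would avoid.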

Here is my benchmark, updated to produce symbolized keys for all implementations:

require 'benchmark/ips'
require 'json'
require 'oj'
require 'simdjson'
require 'memory_profiler'
require 'rails' # loads ActiveSupport, used here for deep_symbolize_keys

json = File.read("./json_data.json")

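# Sanity check: all three parsers must produce identical symbol-keyed output.
puts "ensure these match"
oj_result       = Oj.load(json.dup, symbol_keys: true)
simdjson_result = Simdjson.parse(json.dup).deep_symbolize_keys!
stdlib_result   = JSON.parse(json.dup, symbolize_names: true)
puts oj_result == simdjson_result && simdjson_result == stdlib_result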

Benchmark.ips do |x|
  x.config(:time => 15, :warmup => 3)

  x.report("oj parse") { Oj.load(json.dup, symbol_keys: true) }
  x.report("simdjson parse") { Simdjson.parse(json.dup).deep_symbolize_keys }
  x.report("stdlib JSON parse") { JSON.parse(json.dup, symbolize_names: true) }

  x.compare!
end

The resulting output shows that all the perf improvements of the parser are lost to having to do a second pass for symbolizing, at least in the case of large JSON files.

ensure these match
true
Warming up --------------------------------------
            oj parse   101.000  i/100ms
      simdjson parse    44.000  i/100ms
   stdlib JSON parse    58.000  i/100ms
Calculating -------------------------------------
            oj parse      1.016k (± 4.9%) i/s -     15.251k in  15.051368s
      simdjson parse    420.256  (± 6.7%) i/s -      6.292k in  15.052436s
   stdlib JSON parse    503.879  (±11.1%) i/s -      7.482k in  15.037979s

Comparison:
            oj parse:     1016.2 i/s
   stdlib JSON parse:      503.9 i/s - 2.02x  (± 0.00) slower
      simdjson parse:      420.3 i/s - 2.42x  (± 0.00) slower

I haven't written a C extension for Ruby in years, but I'm happy to help if I can get the full build / test cycle working. Or, if this all makes sense to you, I'm happy to review a PR if you think this is a good idea and know how to tackle it.
