refactor: trim output allocations and unify unknown-field sort #96

Merged: Boshen merged 2 commits into main from refactor/perf-output-and-sort-buckets on May 8, 2026

Boshen commented May 8, 2026

Summary

Refactor sort_package_json from first principles, trimming allocations and simplifying bucket logic.

  • Output pipeline: stream into a single Vec<u8> (BOM bytes prepended, serde_json::to_writer{,_pretty} appends, single from_utf8_unchecked at the end). Replaces the old path of to_string_pretty → String::push('\n') → reallocate-with-BOM, which incurred a full output-sized memcpy on BOM inputs and a UTF-8 validation walk in general.
  • sort_object_keys: collapse non_private and private into one unknown Vec sorted by composite (starts_with('_'), name). One alloc, one sort; the underscore check leaves the per-entry hot path.
  • sort_object_by_key_order: drop the obj.sort_keys() + per-key shift_remove pattern (each shift_remove shifts the IndexMap tail — O(k·n)). Replaced with single-pass classify into a slot Vec + remainder, then merge.
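
The single-pass classify described in the last bullet can be sketched with the standard library alone. This is an illustration under assumptions, not the PR's code: `sort_by_key_order` is a hypothetical name, a `Vec<(String, V)>` stands in for the IndexMap entries, a `HashMap` lookup is one possible way to find a key's slot, and the remainder is sorted alphabetically on the assumption that the old `obj.sort_keys()` ordering of unlisted keys is preserved.

```rust
use std::collections::HashMap;

/// Hypothetical sketch: put keys listed in `key_order` first (in that order),
/// then every other key in alphabetical order.
fn sort_by_key_order<V>(entries: Vec<(String, V)>, key_order: &[&str]) -> Vec<(String, V)> {
    // Map each known key to its slot index so classification is O(1) per entry.
    let index: HashMap<&str, usize> = key_order
        .iter()
        .enumerate()
        .map(|(i, k)| (*k, i))
        .collect();

    // One slot per known key; `None` means the key was absent from the input.
    let mut slots: Vec<Option<(String, V)>> = Vec::with_capacity(key_order.len());
    slots.resize_with(key_order.len(), || None);
    let mut remainder: Vec<(String, V)> = Vec::new();

    // Single pass: each entry moves either into its slot or into the remainder.
    // Nothing is removed from the middle of a collection, so no tail is shifted.
    for (key, value) in entries {
        let slot = index.get(key.as_str()).copied();
        match slot {
            Some(i) => slots[i] = Some((key, value)),
            None => remainder.push((key, value)),
        }
    }

    // Keys are unique in a JSON object, so an unstable sort is sufficient.
    remainder.sort_unstable_by(|a, b| a.0.cmp(&b.0));

    // Merge: occupied slots in declared order, then the sorted remainder.
    let mut out = Vec::with_capacity(slots.len() + remainder.len());
    out.extend(slots.into_iter().flatten());
    out.extend(remainder);
    out
}
```

Because every entry is moved exactly once, the cost is one pass over the input plus a sort of the unlisted keys, rather than one tail shift per listed key.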

Net: −41 lines, all tests pass (snapshot + idempotency + size-limit + UTF-8 BOM), cargo clippy --all-targets -- -D warnings clean.

Benchmarks

cargo bench vs. main on this machine:

| Benchmark | Δ |
| --- | --- |
| sort already sorted package.json | −6.06% |
| sort large package.json | −4.36% |
| sort small package.json | within noise |
| sort minimal package.json | within noise |

The wins scale with output size — they come from eliminating the extra String copy and the post-serialize UTF-8 walk.

Stream output into a single byte buffer (BOM bytes prepended, serde_json
appends, one `from_utf8_unchecked` at the end) instead of building a
String, pushing '\n', and re-allocating to prepend the BOM. Saves a full
output-sized memcpy on BOM inputs and the post-serialize UTF-8 walk in
general.
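
A minimal sketch of the single-buffer output path described above, using only the standard library. Assumptions: `write_body` is a hypothetical stand-in for `serde_json::to_writer` / `to_writer_pretty`, and `render` and `UTF8_BOM` are illustrative names. The sketch uses the checked `String::from_utf8` because the stand-in writer gives no UTF-8 guarantee; the commit describes `from_utf8_unchecked`, which skips that validation walk.

```rust
use std::io::{self, Write};

/// UTF-8 encoding of the byte order mark U+FEFF.
const UTF8_BOM: &[u8] = b"\xEF\xBB\xBF";

/// Hypothetical stand-in for a serializer that appends to any `Write` target.
fn write_body<W: Write>(mut w: W, body: &str) -> io::Result<()> {
    w.write_all(body.as_bytes())
}

/// Build the whole output in one byte buffer: BOM first (if the input had one),
/// then the serialized body, then the trailing newline.
fn render(body: &str, had_bom: bool) -> io::Result<String> {
    let mut buf: Vec<u8> = Vec::with_capacity(UTF8_BOM.len() + body.len() + 1);
    if had_bom {
        buf.extend_from_slice(UTF8_BOM);
    }
    write_body(&mut buf, body)?;
    buf.push(b'\n');
    // `String::from_utf8` takes ownership of the Vec, so no bytes are copied.
    String::from_utf8(buf).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}
```

Since the BOM is written before the body, nothing has to be shifted or re-allocated afterwards to make room for it.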

Collapse `non_private` and `private` into a single `unknown` Vec sorted
by `(starts_with('_'), name)` — one allocation, one sort, and the
underscore check leaves the per-entry hot path.
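
The composite sort key can be illustrated as follows. This is a sketch with a hypothetical function name (`sort_unknown`) operating on bare key strings rather than the crate's actual entry type. Since `false < true`, keys without a leading underscore sort first, and each group is ordered by name.

```rust
/// Hypothetical sketch: sort unknown keys so that non-underscore keys come
/// first and underscore-prefixed ("private") keys come last, each group
/// ordered by name.
fn sort_unknown(mut unknown: Vec<String>) -> Vec<String> {
    unknown.sort_unstable_by(|a, b| {
        // Tuples compare lexicographically: the bool decides the group,
        // the name breaks ties inside a group.
        (a.starts_with('_'), a.as_str()).cmp(&(b.starts_with('_'), b.as_str()))
    });
    unknown
}
```

A plain byte-order sort would not give this grouping, because `_` (0x5F) sorts after uppercase ASCII letters but before lowercase ones.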

Replace `sort_object_by_key_order`'s `obj.sort_keys()` + per-key
`shift_remove` (each shift moves the IndexMap tail) with a single-pass
classify into a slot Vec + remainder, then merge.

cargo bench (vs. main):
  sort already sorted package.json: -6.06%
  sort large package.json:          -4.36%

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

codspeed-hq Bot commented May 8, 2026

Merging this PR will not alter performance

✅ 4 untouched benchmarks


Comparing refactor/perf-output-and-sort-buckets (7e85b85) with main (103dafd)


`Vec::with_capacity(input.len())` was sized to fit the JSON body
exactly when the input is already pretty-printed, so the trailing
`buf.push(b'\n')` would always trip a reallocation. For the minimal
benchmark (~77 byte input) this caused a measurable regression (CodSpeed
reported +13.83%) — `to_string_pretty` previously used serde's default
128-byte capacity, which had the headroom we lacked.

Pad the capacity by 16 bytes so the newline push never reallocates, while
still keeping the win for larger inputs where the hint avoids serde's
repeated 128 → 256 → 512 → … growth.
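
The capacity padding can be sketched as below. The names `HEADROOM` and `with_trailing_newline` are hypothetical; only the 16-byte pad comes from the commit message. The point is that a buffer sized to exactly the body length has no room for the final newline, while a small fixed pad does.

```rust
/// Extra bytes reserved beyond the input length (16, as in the commit message).
const HEADROOM: usize = 16;

/// Copy an already-formatted body into a fresh buffer and append a newline.
fn with_trailing_newline(body: &[u8]) -> Vec<u8> {
    // Reserving `body.len()` alone would leave zero spare bytes, so the push
    // below would have to grow the allocation. The pad keeps it in place.
    let mut buf = Vec::with_capacity(body.len() + HEADROOM);
    buf.extend_from_slice(body);
    buf.push(b'\n');
    buf
}
```

`Vec::with_capacity` guarantees at least the requested capacity, so the body plus the newline always fits in the initial allocation.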

cargo bench (vs. main):
  sort small package.json:           -3.08%
  sort already sorted package.json:  -4.34%
  sort minimal package.json:         -1.35%
  sort large package.json:           -1.37%

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Boshen merged commit 2c6d8aa into main May 8, 2026
5 checks passed
Boshen deleted the refactor/perf-output-and-sort-buckets branch May 8, 2026 13:16
oxc-guard Bot mentioned this pull request May 8, 2026