spec: update constraint ID system #570
Review
This PR adds stable, content-derived constraint IDs to the spec rendering pipeline: an FNV-1a hash of each constraint's structure is encoded into a short 4-char human-readable tag (e.g. …). The approach is sound. Three bugs found:
Medium: Crash on empty variable groups (src.typ:191–197)
Low: …
Codex Code Review
No actionable issues found in the PR diff. The changes are limited to the Typst spec rendering/tagging logic and chip metadata. I did not find security-relevant runtime changes, VM behavior changes, or significant performance concerns. I could not run a full Typst/shiroa render because neither …
Codex Code Review
Findings:
No security issues found in the PR diff. I did not run a Typst render because …
Review: spec/constraint_id
Overview
Replaces sequential constraint indices with content-derived IDs using FNV-1a hashing. Variables are identified by (chip, group, position, type), so renaming a variable doesn't change constraint IDs, but reordering does. The 4-character Base32 ID (20 bits of entropy) is stable across reorganisation and signals stale IDs when a constraint's semantics change. TOML chips all gain a …
Bugs
[High] …
[Medium] …
[Low] Wrong variable in error message …
FNV-1a implementation
The two-limb 64-bit multiplication is mathematically correct: carry propagation from …
Minor observations
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
RobinJadoul left a comment
Very brief look only, so far; haven't properly looked at the canonicalization/serialization of the constraints yet
///
/// This implementation operates on two 32-bit limbs, rather than a single
/// 64-bit limb, since Typst does not support u64s.
#let FNV-1a(bytes) = {
Any reasoning for not using a typst package for hashing?
e.g. digestify should be running in wasm, so speed should not be a massive concern (probably)
- `digestify` is available under the MIT license. When I skimmed the license last week, I thought that using it might require our spec to also be released under the MIT (or a compatible) license. Having a closer look now, this might not actually be the case. Still, I'm not a legal expert, so I decided to avoid it just in case.
- The other hashing package (`jumble`) is slower than this implementation.
- It was fun to learn a bit about non-cryptographic hashing, and implement this technique here.
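For anyone who wants to cross-check the two-limb arithmetic mentioned in the doc comment above, here is a minimal Python model of it. This is a sketch, not the PR's Typst code: the function names are hypothetical, and it assumes the standard 64-bit FNV-1a parameters (offset basis `0xcbf29ce484222325`, prime `0x100000001b3`). The plain 64-bit version is only there as a reference to compare against.

```python
MASK32 = 0xFFFFFFFF

# Standard 64-bit FNV-1a parameters, split into (high, low) 32-bit limbs.
OFFSET_HI, OFFSET_LO = 0xCBF29CE4, 0x84222325  # offset basis 0xcbf29ce484222325
PRIME_HI, PRIME_LO = 0x00000100, 0x000001B3    # prime 0x00000100000001b3


def fnv1a_64_limbs(data: bytes) -> tuple[int, int]:
    """FNV-1a (64-bit) computed on two 32-bit limbs; returns (hi, lo)."""
    hi, lo = OFFSET_HI, OFFSET_LO
    for byte in data:
        lo ^= byte  # XOR the byte into the lowest 8 bits
        low_product = lo * PRIME_LO
        carry = low_product >> 32  # overflow of the low limb
        # Cross terms plus the carry; hi * PRIME_HI would sit at bit 64
        # and above, so it vanishes modulo 2**64.
        hi = (hi * PRIME_LO + lo * PRIME_HI + carry) & MASK32
        lo = low_product & MASK32
    return hi, lo


def fnv1a_64(data: bytes) -> int:
    """Reference FNV-1a (64-bit) using ordinary big integers."""
    h = 0xCBF29CE484222325
    for byte in data:
        h ^= byte
        h = (h * 0x100000001B3) & 0xFFFFFFFFFFFFFFFF
    return h
```

Recombining the limbs as `(hi << 32) | lo` should match `fnv1a_64` for any input, which makes a handy sanity check when porting the routine.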
let indices = (("",) + iters_of(constraint).map(it => it.at(0))).join(".")
let z-fill(s) = "0" * calc.max(2 - s.len(), 0) + s
Appears a few times, maybe lift to the top level.
It is used twice (in src.typ and chip.typ). Since those two files don't have any shared dependencies/don't depend on one another, I'd have to create a new file that both can depend on. I didn't feel like doing this just for this single one-line function.
let z-fill(s) = "0" * calc.max(2 - s.len(), 0) + s
let ref-tag(i) = raw(tag) + sub("/" + z-fill(str(i)))
return (
context super[#emph(z-fill(str(counter(figure.where(kind: counter-kind)).get().at(0) + 1)))],
counter(...).final() could give us the total count of constraints, to maybe be able to z-fill more than 2 digits if required
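To illustrate the suggestion, here is a small Python model of the padding logic (not Typst, and not the PR's code). `z_fill` mirrors the one-liner quoted above with the width made a parameter, and `width_for` is a hypothetical helper that derives the width from the total number of constraints, which in Typst would presumably come from `counter(...).final()`.

```python
def z_fill(s: str, width: int = 2) -> str:
    """Left-pad s with zeros up to width; longer strings are left untouched."""
    return "0" * max(width - len(s), 0) + s


def width_for(total: int, minimum: int = 2) -> int:
    """Digits needed to print every index up to total, never fewer than minimum."""
    return max(len(str(total)), minimum)
```

With 150 constraints, `z_fill("7", width_for(150))` pads to three digits, while the two-digit default is kept for any total below 100.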
}
/// Converts a byte array to a hexadecimal string
#let bytes-to-hex(bytes) = {
Is there much need to carry hex around instead of going straight from bytes to base 32?
not really. dropped it
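For reference, going straight from hash bytes to a 4-character Base32 tag (4 × 5 = 20 bits) could look roughly like the Python sketch below. Two things here are assumptions, since the thread does not show them: the alphabet (RFC 4648 is used as a stand-in) and which 20 bits of the digest are kept (the most significant ones here). `short_tag` is a hypothetical name.

```python
# Assumption: RFC 4648 Base32 alphabet; the PR may use a different one.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"


def short_tag(digest: bytes, length: int = 4) -> str:
    """Encode the top 5 * length bits of digest as Base32 characters."""
    total_bits = len(digest) * 8
    needed_bits = 5 * length
    if total_bits < needed_bits:
        raise ValueError("digest too short for requested tag length")
    value = int.from_bytes(digest, "big")
    top = value >> (total_bits - needed_bits)  # keep only the leading bits
    chars = []
    for i in range(length):
        shift = 5 * (length - 1 - i)  # most significant group first
        chars.append(ALPHABET[(top >> shift) & 0x1F])
    return "".join(chars)
```

No intermediate hex string is needed: the digest is read as one big-endian integer and sliced into 5-bit groups directly.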
99898d2 to a27e78f
This PR introduces content-derived constraint IDs, i.e., the ID of a constraint is derived from its content, rather than its index in the constraint list. This … i.e., two constraints with the same ID are (100 − ε)% certain to be identical.
Note: the ID is derived in such a way that updating a variable name does not lead to an ID update. This comes at the expense of an ID update when variables are reordered.
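A toy Python model of that trade-off, for illustration only. The review summary says variables are identified by (chip, group, position, type); everything else here is an assumption, including the `Variable` fields, the `/` and `|` serialization, and the function names. The real canonicalization lives in the Typst sources and may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variable:
    name: str      # display name; deliberately NOT part of the ID
    chip: str
    group: str
    position: int  # index of the variable within its group
    type: str


def canonical_bytes(operator: str, variables: list[Variable]) -> bytes:
    """Hypothetical serialization: the operator plus positional variable keys."""
    parts = [operator]
    for v in variables:
        parts.append(f"{v.chip}/{v.group}/{v.position}/{v.type}")
    return "|".join(parts).encode("utf-8")


def fnv1a_64(data: bytes) -> int:
    """Standard 64-bit FNV-1a."""
    h = 0xCBF29CE484222325
    for byte in data:
        h ^= byte
        h = (h * 0x100000001B3) & 0xFFFFFFFFFFFFFFFF
    return h


def constraint_id(operator: str, variables: list[Variable]) -> int:
    return fnv1a_64(canonical_bytes(operator, variables))
```

Because `name` never enters `canonical_bytes`, renaming a variable leaves the hash input unchanged, whereas swapping two variables' positions changes the serialized bytes and therefore (almost certainly) the ID.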
Before:

After:
