Problem
fuzzy_match() in scripts/verify_test_fidelity.py uses substring word matching (sum(1 for w in words if w in existing)). The words come from lowercased/stripped TS titles, while Python test names join their words with _. Because the check is a substring test against the whole name, a word like post will match any Python test whose name contains the substring post (e.g. test_postable_object_...). Combined with the 60% threshold, this can produce surprising matches when TS test titles use hyphens or compound terms.
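A minimal sketch of the failure mode, for illustration only. Only the expression `w in existing` and the 60% threshold come from this issue; the function name `substring_score`, the `>=` comparison, and the sample title words and test name are assumptions, not the actual contents of scripts/verify_test_fidelity.py.

```python
# Hypothetical reconstruction of the current scoring, NOT the real fuzzy_match().
THRESHOLD = 0.6  # the 60% threshold; whether the real code uses >= or > is assumed


def substring_score(words, existing):
    """Fraction of title words that occur anywhere inside the test name."""
    if not words:
        return 0.0
    # `w in existing` is a substring test because `existing` is a str.
    return sum(1 for w in words if w in existing) / len(words)


# Hypothetical inputs: words from a TS title, and an unrelated Python test name.
words = ["post", "object"]
existing = "test_postable_object_is_serialized"

score = substring_score(words, existing)
# "post" is found inside "postable", so both words count as hits.
print(score, score >= THRESHOLD)
```

Here the TS title is about a post, while the Python test is about a postable object, yet every word scores as a hit.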
Proposed work
- Switch from substring (w in existing) to whole-word matching (w in existing.split("_")).
- Add unit tests for the matcher covering (a) hyphen-stripped titles, (b) compound words, (c) the 60% threshold edge case.
- Re-run --strict to confirm no regressions after the tightening.
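The proposed tightening and its unit tests could look roughly like the sketch below. The names `whole_word_score` and `is_match`, the `>=` comparison, and all sample titles and test names are hypothetical. The real fuzzy_match() may be structured differently, and how it strips hyphens from TS titles is not shown in this issue, so the words are passed in pre-split here.

```python
THRESHOLD = 0.6  # the 60% threshold from this issue; >= (inclusive) is assumed


def whole_word_score(words, existing):
    """Fraction of title words that equal a whole _-separated token of the name."""
    if not words:
        return 0.0
    tokens = set(existing.split("_"))
    return sum(1 for w in words if w in tokens) / len(words)


def is_match(words, existing, threshold=THRESHOLD):
    return whole_word_score(words, existing) >= threshold


# (b) compound words: "post" no longer matches inside "postable".
assert whole_word_score(["post", "object"], "test_postable_object_is_serialized") == 0.5
assert not is_match(["post", "object"], "test_postable_object_is_serialized")

# Exact tokens still match in full.
assert is_match(["post", "object"], "test_post_object_is_serialized")

# (c) threshold edge: 3 of 5 words is exactly 60%, 2 of 5 is below it.
title = ["creates", "a", "new", "post", "draft"]
assert is_match(title, "test_creates_new_post")
assert not is_match(title, "test_creates_post")
```

Case (a), hyphen-stripped titles, depends on the normalization step: a title word produced by removing a hyphen (for example a hypothetical "rerun") would not equal the separate tokens "re" and "run", so that case should be tested against the real title-splitting code rather than a stand-in.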
Why it matters
False-positive fuzzy matches silently hide real coverage gaps, which is exactly the kind of failure --strict is supposed to prevent. The current behavior is conservative (no known false positives at 588/588) but brittle against future TS test additions.
Related