Fix #143: [Model] NAESatisfiability by GiggleLiu · Pull Request #717 · CodingThrust/problem-reductions

GiggleLiu · 2026-03-20T19:28:03Z

Summary

Add the implementation plan for NAESatisfiability and execute issue #143 in follow-up commits.

Fixes #143

GiggleLiu · 2026-03-20T19:50:34Z

Implementation Summary

Changes

Added NAESatisfiability in src/models/formula/nae_satisfiability.rs with registry metadata, example-db registration, and shared validation via try_new plus serde try_from.
Added model coverage in src/unit_tests/models/formula/nae_satisfiability.rs, including the issue example, complement symmetry, satisfying-solution count, and invalid-clause rejection.
Wired the model into formula exports and the public prelude via src/models/formula/mod.rs, src/models/mod.rs, and src/lib.rs.
Added pred create NAESAT support and help text updates in problemreductions-cli/src/commands/create.rs and problemreductions-cli/src/cli.rs.
Added the paper entry and display name in docs/paper/reductions.typ using the canonical example.

Deviations from Plan

Instead of enforcing the clause-length invariant only in new(), the final implementation shares that validation through try_new and serde try_from so invalid JSON is rejected and pred create returns a normal error instead of panicking.

Open Questions

None.

codecov · 2026-03-20T21:01:26Z

Codecov Report

❌ Patch coverage is 98.46154% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 97.61%. Comparing base (fc1cdf8) to head (215c2f2).
⚠️ Report is 5 commits behind head on main.

Files with missing lines	Patch %	Lines
src/models/formula/nae_satisfiability.rs	97.05%	3 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #717      +/-   ##
==========================================
+ Coverage   97.60%   97.61%   +0.01%     
==========================================
  Files         389      391       +2     
  Lines       48478    48673     +195     
==========================================
+ Hits        47316    47514     +198     
+ Misses       1162     1159       -3

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

GiggleLiu · 2026-03-21T04:32:36Z

Agentic Review Report

Structural Check

Structural Review: model NAESatisfiability

Structural Completeness

#	Check	Status
1	Model file exists	PASS — `src/models/formula/nae_satisfiability.rs` is present
2	`inventory::submit!` present	PASS — problem schema entry is registered
3	`#[derive(...Serialize, Deserialize)]` on struct	PASS — `NAESatisfiability` derives both
4	`Problem` trait impl	PASS — `impl Problem for NAESatisfiability` exists
5	`OptimizationProblem` or `SatisfactionProblem` impl	PASS — `impl SatisfactionProblem for NAESatisfiability` exists
6	`#[cfg(test)]` + `#[path = "..."]` test link	PASS — model file links its dedicated test module
7	Test file exists	PASS — `src/unit_tests/models/formula/nae_satisfiability.rs` is present
8	Test file has >= 3 test functions	PASS — 13 `test_...` functions found
9	Registered in `{C}/mod.rs`	PASS — wired in `src/models/formula/mod.rs`
10	Re-exported in `models/mod.rs`	PASS — re-exported in `src/models/mod.rs`
11	`declare_variants!` entry exists	PASS — default sat variant is declared
12	CLI `resolve_alias` entry	PASS — alias resolution is registry-backed via `aliases: &["NAESAT"]`, and `pred create NAESAT ...` resolves successfully
13	CLI `create` support	PASS — `problemreductions-cli/src/commands/create.rs` handles `NAESatisfiability`
14	Canonical model example registered	PASS — the formula example spec is chained into the example DB, and `pred create --example NAESAT` resolves successfully
15	Paper `display-name` entry	PASS — `docs/paper/reductions.typ` includes `NAESatisfiability` in `display-name`
16	Paper `problem-def` block	PASS — `docs/paper/reductions.typ` includes a `problem-def("NAESatisfiability")` block
17	Blacklisted auto-generated file check	PASS — none of the blacklisted generated files appear in the PR diff

Build Status

make test: PASS
make clippy: PASS

Semantic Review

evaluate() correctness: ISSUE — config_to_assignment() maps every value other than 1 to false, and literal_value() treats missing variables as false, so evaluate() accepts wrong-length and non-binary configs instead of rejecting them; is_valid_solution() inherits the same behavior.
dims() correctness: OK — dims() returns vec![2; self.num_vars], matching the binary configuration space.
Serialization/validation behavior: ISSUE — try_new() and validate_clause_lengths() only enforce clause length >= 2; they do not reject literal 0 or |literal| > num_vars, so malformed instances can deserialize successfully and later panic or receive unintended semantics in literal_value().
Registry/example/paper integration: OK — registry wiring, CLI creation, canonical example lookup, and paper integration all work from this worktree.
Complexity metadata: ISSUE — declare_variants! registers "2^num_variables" instead of the issue-required "2^num_vars".

Issue Compliance

#	Check	Status
1	Problem name matches issue	OK — implemented as `NAESatisfiability`
2	Mathematical definition matches	ISSUE — NAE semantics are correct for well-formed clauses, but constructor/deserialization do not enforce that literals stay within the declared `num_vars` domain
3	Problem type (opt/sat) matches	OK — satisfaction problem with `Metric = bool`
4	Type parameters match	OK — no type parameters, matching the issue
5	Configuration space matches	OK — binary assignments of length `num_vars`
6	Feasibility check matches	ISSUE — malformed configs outside that space are still evaluated instead of rejected
7	Objective function matches	OK — for well-formed inputs/configs, it returns `true` iff every clause has at least one true literal and at least one false literal
8	Complexity matches	ISSUE — metadata uses `"2^num_variables"` rather than `"2^num_vars"`

Summary

17/17 structural checks passed
5/8 issue compliance checks passed
evaluate() accepts malformed configs instead of rejecting them.
try_new() / deserialization do not reject literals outside the declared variable range, and literal 0 can later panic in literal_value().
declare_variants! does not match the issue-required complexity string.

Quality Check

Quality Review

Design Principles

DRY: ISSUE — nae_satisfiability.rs:106 adds another hand-rolled CNF literal evaluator beside sat.rs:55 and sat.rs:212, so literal semantics now need to stay synchronized across three call paths.
KISS: OK — the core NAE check stays a single linear scan in nae_satisfiability.rs:116.
HC/LC: OK — model logic remains isolated in nae_satisfiability.rs:41, while CLI wiring stays in create.rs:1309.

HCI (if CLI/MCP changed)

Error messages: ISSUE — create.rs:1316 routes NAESAT through create.rs:3620, which still says "SAT problems require --clauses", and invalid short clauses bubble from nae_satisfiability.rs:174 via create.rs:1318 without an NAE-specific usage hint.
Discoverability: OK — the general help text and example catalog both advertise the new problem at cli.rs:221 and create.rs:399.
Consistency: ISSUE — successful guidance uses the alias NAESAT in cli.rs:221, while failure paths mention either NAESatisfiability in create.rs:1312 or generic "SAT problems" in create.rs:3620.
Least surprise: OK — once required flags are present, the create flow matches the existing SAT-family pattern in create.rs:1295 and create.rs:1309.
Feedback: OK — the branch serializes through the standard ser(...) path in create.rs:1318, so there is no silent special-case behavior.

Test Quality

Naive test detection: ISSUE
nae_satisfiability.rs:63 checks only len() == 10 plus two member solutions; it does not verify the full satisfying set or cross-check against an independent oracle, unlike neighboring SAT/KSAT parity coverage.
The added suite starting at nae_satisfiability.rs:19 never exercises a non-empty unsatisfiable NAE instance, so it misses the key negative path for evaluate, is_valid_solution, and find_satisfying.
The new CLI branch at create.rs:1309 has no accompanying CLI test in the touched create.rs test module, leaving pred create NAESAT parsing and error text unguarded.

Issues

Critical (Must Fix)

None.

Important (Should Fix)

pred create NAESAT failure paths are not NAE-specific: missing --clauses reports a SAT-generic message, and invalid clause-length errors lack the usual usage guidance. See create.rs:1316, create.rs:3620, and nae_satisfiability.rs:174.
Test coverage is too light for the changed surface area: there is no CLI coverage for the new create branch and no non-empty unsatisfiable NAE formula case. See create.rs:1309 and nae_satisfiability.rs:19.

Minor (Nice to Have)

The new model repeats SAT-style config/literal handling instead of reusing a shared helper, increasing maintenance cost for CNF semantics. See nae_satisfiability.rs:102, nae_satisfiability.rs:106, sat.rs:55, and sat.rs:176.

Summary

Important: pred create NAESAT error handling is not NAE-specific and gives inconsistent guidance across failure paths.
Important: the added tests miss both CLI coverage and a non-empty unsatisfiable NAE case.
Minor: NAE-SAT duplicates existing CNF literal/config scaffolding instead of sharing it.

Agentic Feature Tests

Feature: NAESatisfiability (NAESAT)
Use case: downstream CLI user discovers the new model, inspects it, creates the canonical example, and solves it from the worktree-local CLI.
CLI used: ./target/debug/pred
Outcome: partial

Checks

pred list: pass. NAESatisfiability appears in the catalog with alias NAESAT.
pred show NAESatisfiability: pass. Description, complexity, and fields display correctly.
pred show NAESAT: pass. Alias resolves to the same canonical model details.
pred create --example NAESatisfiability: pass. Canonical example JSON was written successfully.
pred create --example NAESAT: pass. Alias also works; the alias-generated JSON is identical to the canonical output.
pred inspect <example>: pass. The created example is a NAESatisfiability problem and reports solvers: ["brute-force"].
pred solve <example>: fail. Bare solve exits with an ILP-path error instead of using the only advertised solver.
pred solve <example> --solver brute-force: pass. Solves the example and returns a satisfying assignment with evaluation: "true".
pred create --example NAESAT | pred solve - --solver brute-force: pass. Piped alias workflow works end-to-end.

Issue

confirmed | medium | Bare pred solve fails on the canonical example for this model.
Reproduction:

./target/debug/pred create --example NAESatisfiability -o target/agentic-tests/nae-canonical.json
./target/debug/pred inspect target/agentic-tests/nae-canonical.json
./target/debug/pred solve target/agentic-tests/nae-canonical.json

Observed:

Error: No reduction path from NAESatisfiability to ILP or ILP solver found no solution. Try `--solver brute-force`.

The same file advertises solvers: ["brute-force"], and the workaround succeeds:

./target/debug/pred solve target/agentic-tests/nae-canonical.json --solver brute-force

Recommended fix: when a problem advertises only brute-force, either auto-select it for bare pred solve or surface that requirement before the user hits the default ILP failure.

Not reproducible in current worktree: none.

Generated by review-pipeline

GiggleLiu added 3 commits March 21, 2026 03:27

Add plan for #143: [Model] NAESatisfiability

6c48987

Implement #143: [Model] NAESatisfiability

75ad496

chore: remove plan file after implementation

9824fca

Merge remote-tracking branch 'origin/main' into issue-143

215c2f2

GiggleLiu merged commit accbb66 into main Mar 21, 2026
3 checks passed

GiggleLiu mentioned this pull request Mar 21, 2026

[Model] NAESatisfiability #143

Closed

6 tasks

GiggleLiu deleted the issue-143 branch April 12, 2026 00:51

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix #143: [Model] NAESatisfiability#717

Fix #143: [Model] NAESatisfiability#717
GiggleLiu merged 4 commits intomainfrom
issue-143

GiggleLiu commented Mar 20, 2026

Uh oh!

GiggleLiu commented Mar 20, 2026

Uh oh!

codecov bot commented Mar 20, 2026 •

edited

Loading

Uh oh!

GiggleLiu commented Mar 21, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

GiggleLiu commented Mar 20, 2026

Summary

Uh oh!

GiggleLiu commented Mar 20, 2026

Implementation Summary

Changes

Deviations from Plan

Open Questions

Uh oh!

codecov bot commented Mar 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

GiggleLiu commented Mar 21, 2026

Agentic Review Report

Structural Check

Structural Review: model NAESatisfiability

Structural Completeness

Build Status

Semantic Review

Issue Compliance

Summary

Quality Check

Quality Review

Design Principles

HCI (if CLI/MCP changed)

Test Quality

Issues

Critical (Must Fix)

Important (Should Fix)

Minor (Nice to Have)

Summary

Agentic Feature Tests

Agentic Feature Tests

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

codecov bot commented Mar 20, 2026 •

edited

Loading