Photonics2D.optimize returns a design whose simulate() score doesn't match optimization_history[-1] (double-projection bug) #250

@mkeeler43

Description

Bug: in Photonics2D, simulate(returned_design) ≠ optimization_history[-1]

What you'd expect

If you optimise a design and then call simulate() on the design optimize() returns, the score simulate() reports should match the last value in optimization_history. They're meant to be the same number.

What actually happens

The two disagree by ~50%.

import numpy as np
from engibench.utils.all_problems import BUILTIN_PROBLEMS

problem = BUILTIN_PROBLEMS["photonics2d"](seed=7)
start = np.full(problem.design_space.shape, 0.5, dtype=np.float32)

design, history = problem.optimize(start, config={"num_optimization_steps": 30})
score = problem.simulate(design)

print("history[-1] =", history[-1].obj_values[0])   # 3.976
print("simulate()  =", score[0])                    # 1.826

Why

Both optimize and simulate push the design rho through the same internal pipeline before scoring it:

rho  →  blur  →  project  →  ε(rho)  →  FDFD  →  score

That's defined once in epsr_parameterization and re-used everywhere. The optimiser's history records the score of rho after this pipeline, step by step.
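For intuition, that shared pipeline can be sketched roughly as below. This is a standalone illustration, not the library's actual implementation: the box-blur kernel, β, η, and the permittivity bounds are stand-in values.

```python
import numpy as np

def blur(rho: np.ndarray, radius: int = 2) -> np.ndarray:
    """Separable box-filter blur (illustrative stand-in for the real kernel)."""
    k = np.ones(2 * radius + 1) / (2 * radius + 1)
    rho = np.apply_along_axis(lambda v: np.convolve(v, k, mode="same"), 0, rho)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="same"), 1, rho)

def project(rho: np.ndarray, beta: float = 8.0, eta: float = 0.5) -> np.ndarray:
    """tanh projection that pushes densities toward {0, 1} as beta grows."""
    num = np.tanh(beta * eta) + np.tanh(beta * (rho - eta))
    den = np.tanh(beta * eta) + np.tanh(beta * (1.0 - eta))
    return num / den

def epsr_parameterization(rho: np.ndarray, eps_min: float = 1.0,
                          eps_max: float = 12.0) -> np.ndarray:
    """rho -> blur -> project -> epsilon: the map both paths score through."""
    rho_bar = project(blur(rho))
    return eps_min + (eps_max - eps_min) * rho_bar
```

The key property is that the score is always attached to the ε produced by one pass through this map, so whatever rho is returned must be the rho that was fed into it.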

The bug is that optimize() runs project one more time on the design before returning it (v0.py:457–462):

rho_optimum = rho_optimum_flat.reshape((num_elems_x, num_elems_y))
rho_optimum = operator_proj(rho_optimum, ...)   # ← extra projection
rho_optimum = np.rint(rho_optimum)              # ← then rounding
return rho_optimum.astype(np.float32), opti_steps_history

So the design that gets returned is not the design Adam was scoring. When you call simulate() on the returned design, the pipeline runs blur+project on something that has already been projected — the second blur smears the binary pattern, the second projection lands on a different ε, and the score drops.

Picture:

What history recorded:    rho  ─→ blur ─→ project ─→ score = 3.976
What is returned:         rho  ─→ project ─→ rint ─→ returned_design
What simulate computes:   returned_design  ─→ blur ─→ project ─→ score = 1.826
                                              ^^^^^^^^^^^^^^^^^
                                            the "extra round" through
                                            blur+project causes the gap

(np.rint itself doesn't cost anything — at high β the projection is already nearly binary, so rounding is a no-op. The damage comes from the extra project.)
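The mechanism is easy to reproduce with any blur + tanh-projection pair. This standalone sketch (illustrative kernel and β, not the library's actual ones) shows that feeding an already-processed field through blur+project a second time lands on a different field:

```python
import numpy as np

def blur(rho, radius=2):
    """Separable box blur (illustrative)."""
    k = np.ones(2 * radius + 1) / (2 * radius + 1)
    rho = np.apply_along_axis(lambda v: np.convolve(v, k, mode="same"), 0, rho)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="same"), 1, rho)

def project(rho, beta=8.0, eta=0.5):
    """tanh projection toward {0, 1}."""
    return (np.tanh(beta * eta) + np.tanh(beta * (rho - eta))) / (
        np.tanh(beta * eta) + np.tanh(beta * (1.0 - eta)))

rng = np.random.default_rng(0)
rho = rng.uniform(0.2, 0.8, size=(16, 16))

once = project(blur(rho))    # the field the optimiser's history scored
twice = project(blur(once))  # the field simulate() sees after the extra round
# The second pass does not reproduce the first:
gap = float(np.max(np.abs(once - twice)))
```

Blur smears the projected (near-binary) pattern and the second projection then snaps it to a different boundary, which is exactly the gap mechanism described above.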

Fix

Just don't post-process. Return the design Adam was actually optimising:

# v0.py:457–462
rho_optimum = rho_optimum_flat.reshape((num_elems_x, num_elems_y)).astype(np.float32)
return rho_optimum, opti_steps_history

Now simulate() runs the pipeline once on the same rho Adam was scoring, and the numbers agree.

Proof

Same script as above, with the fix applied:

BEFORE: history[-1] = 3.976,  simulate() = 1.826,  gap = +2.150
AFTER:  history[-1] = 3.976,  simulate() = 3.990,  gap = +0.014   ← float noise

Regression test

import numpy as np
from engibench.utils.all_problems import BUILTIN_PROBLEMS

def test_optimize_simulate_consistency():
    problem = BUILTIN_PROBLEMS["photonics2d"](seed=0)
    start = np.full(problem.design_space.shape, 0.5, dtype=np.float32)
    design, history = problem.optimize(start, config={"num_optimization_steps": 20})
    sim = problem.simulate(design)
    assert np.isclose(sim[0], history[-1].obj_values[0], rtol=1e-2)

Why this matters for users

Anything that compares "the score the optimiser thought it achieved" to "the score simulate() reports" is reading two different numbers. In particular:

  • The published dataset was generated with the buggy code. The saved optimal_design is the binarised post-projection design, while the saved optimization_history was logged on the continuous design. So dataset["total_overlap"] (= simulate(saved_design)) is much smaller than |optimization_history[-1]| for the same row — by exactly the same gap mechanism.
    • With the fix, fresh simulate(returned_design) ≈ history[-1] — they agree, as you'd expect.
    • Existing dataset entries are frozen artifacts of the bug; comparing fresh fixed runs to old saved total_overlap will look favourable until the dataset is regenerated.
  • A separate, smaller note: the saved optimization_history is sign-flipped vs. what current optimize() returns (stored as negative −(total_overlap − penalty), current code returns positive). Worth either regenerating the dataset or adding a sign conversion in the loader.
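A loader-side shim for that sign note could look like the sketch below. The function name and the all-negative heuristic are my assumptions, not an existing EngiBench API:

```python
import numpy as np

def normalize_history_signs(history_values):
    """Flip the legacy sign convention on a stored optimization history.

    Legacy datasets stored -(total_overlap - penalty); current optimize()
    reports the positive value. Heuristic (assumption): a history that is
    entirely non-positive is treated as legacy and flipped; mixed-sign or
    positive histories pass through unchanged.
    """
    values = np.asarray(history_values, dtype=np.float64)
    if values.size and np.all(values <= 0.0):
        return -values
    return values
```

For example, `normalize_history_signs([-3.976, -2.0])` yields `[3.976, 2.0]`, while an already-positive history is returned as-is.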

Possible same-class bug elsewhere

It would be worth sweeping the other Problem subclasses for optimize() methods that post-process the design before returning it without re-scoring the returned design in the final history step.
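A generic checker for such a sweep might look like this. `check_optimize_simulate_consistency` and the `FakeProblem` stand-in are hypothetical scaffolding around the interface shown in this issue (optimize() returning (design, history) with per-step .obj_values, simulate() returning scores); a real sweep would loop it over BUILTIN_PROBLEMS:

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Step:
    """Minimal stand-in for one recorded optimization step."""
    obj_values: tuple

def check_optimize_simulate_consistency(problem, start, rtol=1e-2, config=None):
    """Return (ok, history_score, sim_score) for one problem instance."""
    design, history = problem.optimize(start, config=config or {})
    sim = problem.simulate(design)
    hist_score = history[-1].obj_values[0]
    return bool(np.isclose(sim[0], hist_score, rtol=rtol)), hist_score, sim[0]

# Tiny fake problem to exercise the checker; its optimize() returns exactly
# the design it scored, so the check passes.
class FakeProblem:
    def optimize(self, start, config=None):
        design = start * 2.0
        return design, [Step(obj_values=(float(design.sum()),))]

    def simulate(self, design):
        return (float(design.sum()),)

ok, hist, sim = check_optimize_simulate_consistency(FakeProblem(), np.ones((2, 2)))
```

A subclass with the same double-projection pattern would fail this check the same way Photonics2D does above.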
