Update transpilation optimizations with SABRE tutorial to new template #4970
henryzou50 wants to merge 31 commits into
Restructure the tutorial to match the standard template format with learning outcomes, prerequisites, and the four-step Qiskit patterns workflow.

Key changes:
- Remove qiskit_serverless content (runtime performance issues, will revisit separately)
- Add small-scale simulator section using qiskit_aer with noise model from real backend, running 10 trials with error bars for statistical reliability
- Add large-scale hardware section comparing basic, decay, and lookahead SABRE heuristics across multiple seeds
- Improve plots with percentage annotations, value labels, and side-by-side fidelity bar charts
- Use consistent "2Q depth" labeling throughout
- Add analysis commentary connecting transpilation quality to execution fidelity
- Update requirements to Qiskit SDK v2.0+ and add qiskit-aer
- Add next steps with links to custom transpiler pass, transpiler plugins, and DAG representation guides
- Ran tox -e fix
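For readers unfamiliar with the "2Q depth" label used above, here is a minimal sketch of the metric in plain Python. It assumes a circuit is given as a list of qubit-index tuples (a hypothetical representation, not the notebook's code) and counts only two-qubit gates toward depth. In Qiskit the same quantity is commonly obtained with `circuit.depth(lambda inst: inst.operation.num_qubits == 2)`.

```python
def two_qubit_depth(gates, num_qubits):
    """Depth counting only two-qubit gates.

    `gates` is a list of tuples of qubit indices, in program order.
    """
    layer = [0] * num_qubits  # last 2Q layer occupied on each qubit
    for qubits in gates:
        if len(qubits) != 2:
            continue  # single-qubit gates do not contribute to 2Q depth
        a, b = qubits
        new_layer = max(layer[a], layer[b]) + 1
        layer[a] = layer[b] = new_layer
    return max(layer, default=0)


n = 15
# Star-topology GHZ: H on the hub (qubit 0), then CX from the hub to every other qubit.
star = [(0,)] + [(0, k) for k in range(1, n)]
# Three CX gates on disjoint qubit pairs can all run in one layer.
pairs = [(0, 1), (2, 3), (4, 5)]

star_depth = two_qubit_depth(star, n)
pairs_depth = two_qubit_depth(pairs, 6)
print(star_depth, pairs_depth)
```

Because every CX in the star shares the hub qubit, none of them can run in parallel, so the 2Q depth grows linearly with the number of qubits even before any SWAPs are inserted.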
kaelynj left a comment
This looks good to me so far! Just a handful of style/grammar changes to make.
| "pm_2.layout.replace(index=2, passes=sl_2)\n", | ||
| "pm_3.layout.replace(index=2, passes=sl_3)\n", |
It might make sense to split the pass manager construction and mutation into multiple cells. Then you can visualize the pass manager and show where the indexing and modifications come from. Using the PassManager API to modify an existing pass manager is kind of opaque at the best of times, and I think visualizing the details helps explain why this works.
Done!
Step 2 is now split into separate "inspect → modify → run" cells: pm_1.layout.draw() shows the default layout stage so the [2] index and the ConditionalController wrapping SabreLayout are visible, and a follow-up pm_3.layout.draw() after the replacement makes the structural change explicit.
The replacement also re-wraps the new SabreLayout in the same ConditionalController + BarrierBeforeFinalMeasurements as the preset, so the only behavioral difference from the default is the SABRE configuration itself. For the most part, this re-wrap wouldn't matter, but I added it here for consistency.
| "**Key takeaways:**\n", | ||
| "- The `decay` and `lookahead` heuristics are substantially better than `basic` for non-trivial circuits. Always prefer one of the two for production workloads.\n", | ||
| "- The best heuristic depends on your circuit and hardware. Testing multiple heuristics with multiple seeds is the most reliable strategy.\n", | ||
| "- For even broader exploration of the layout space, consider parallelizing seed trials with [Qiskit Serverless](https://quantum.cloud.ibm.com/docs/en/guides/serverless)." |
I'm not sure I buy this as a conclusion for a recommendation. The notebook already demonstrates how you can efficiently try more seeds by increasing the trial count locally in the pass. I can't imagine a scenario where trying to distribute that work across multiple remote nodes provides any speed benefit besides insanely large numbers of trials/seeds or intractably large circuits (like billions of gates). The overhead of resource deployment and communication will far outweigh the runtime of increasing the number of trials in your local thread pool. If you want to try even more seed values you should just increase the trial number further.
Agreed, and good points. I adjusted the key takeaways section to recommend bumping swap_trials / layout_trials locally instead, and it now notes that SABRE already parallelizes trials across local threads, so distribution overhead would dominate any speedup at this work-per-trial.
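As a rough illustration of why raising the local trial count is the cheap option, here is a toy best-of-N sketch. It is not SABRE: `toy_routing_cost` is a hypothetical stand-in that just draws a random swap count per seed, and the thread pool only mirrors the shape of running independent trials locally and keeping the best one.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def toy_routing_cost(seed):
    """Hypothetical stand-in for one randomized routing trial: returns a swap count."""
    return random.Random(seed).randint(20, 60)


def best_of(trials, base_seed=0):
    """Run `trials` independent toy trials in a local thread pool and keep the best."""
    seeds = [base_seed + i for i in range(trials)]
    with ThreadPoolExecutor() as pool:
        costs = list(pool.map(toy_routing_cost, seeds))
    return min(costs)


few = best_of(5)
many = best_of(50)
print(few, many)
```

Since the 50-trial run covers a superset of the seeds used by the 5-trial run, its best result can never be worse. Each extra trial costs one more local function call, with no deployment or communication overhead.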
| "- [Write a custom transpiler pass](https://quantum.cloud.ibm.com/docs/en/guides/custom-transpiler-pass): build your own transpilation logic\n", | ||
| "- [Transpiler plugins](https://quantum.cloud.ibm.com/docs/en/guides/transpiler-plugins): extend Qiskit's transpilation pipeline with third-party passes\n", | ||
| "- [DAG representation](https://quantum.cloud.ibm.com/docs/en/guides/DAG-representation): understand the directed acyclic graph used internally by the transpiler\n", | ||
| "</Admonition>" |
I know it's in the introduction, but do you want to link to the LightSABRE and SABRE papers here too?
| "metadata": {}, | ||
| "source": [ | ||
| "### Step 1: Map classical inputs to a quantum problem\n", | ||
| "\n", | ||
| "A **GHZ (Greenberger-Horne-Zeilinger)** circuit is a quantum circuit that prepares an entangled state where all qubits are either in the `|0...0⟩` or `|1...1⟩` state. The GHZ state for $n$ qubits is mathematically represented as:\n", | ||
| "$$ |\\text{GHZ}\\rangle = \\frac{1}{\\sqrt{2}} \\left( |0\\rangle^{\\otimes n} + |1\\rangle^{\\otimes n} \\right) $$\n", | ||
| "We construct a **star-topology GHZ circuit** with 15 qubits. The first qubit is the hub, with CNOT gates connecting it directly to every other qubit. This topology creates a challenging layout problem because it does not map trivially to the device's coupling map.\n", |
If we're showing a star topology GHZ circuit it feels like we should also show star prerouting: https://quantum.cloud.ibm.com/docs/en/api/qiskit/qiskit.transpiler.passes.StarPreRouting
SABRE struggles to find this, but there is a known optimal routing available. So while it's important to show how to experiment with SABRE to improve layout, it's equally important to show that there is sometimes a path to apply specific non-general optimization techniques if you know they improve quality. Star prerouting applies when the entire circuit is a single star and the backend has a linear path large enough for the optimal routing.
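To make the star-to-linear idea concrete, here is a simplified sketch in plain Python. It is an illustration of the general technique only, not the implementation of Qiskit's `StarPreRouting`: the hub starts at one end of a line, and after each CX a SWAP moves the hub one step along, so every two-qubit operation acts on neighboring positions. The helper names and the tuple-based gate format are hypothetical.

```python
def star_to_linear(n):
    """Nearest-neighbor schedule on a line for an n-qubit star GHZ (hub starts at position 0)."""
    ops = [("h", 0)]
    for pos in range(n - 1):
        ops.append(("cx", pos, pos + 1))        # hub (now at `pos`) entangles its neighbor
        if pos < n - 2:
            ops.append(("swap", pos, pos + 1))  # move the hub one step along the line
    return ops


def run_branch(ops, n, hub_value):
    """Follow one computational-basis branch (hub = 0 or 1 after H) through the schedule."""
    bits = [0] * n
    for op in ops:
        if op[0] == "h":
            bits[op[1]] = hub_value
        elif op[0] == "cx":
            _, control, target = op
            bits[target] ^= bits[control]
        elif op[0] == "swap":
            _, a, b = op
            bits[a], bits[b] = bits[b], bits[a]
    return bits


n = 15
ops = star_to_linear(n)
swap_count = sum(1 for op in ops if op[0] == "swap")
print(swap_count, run_branch(ops, n, 1))
```

The two branches end in all zeros and all ones, which is the GHZ state up to a relabeling of qubits, and the schedule uses a fixed n - 2 SWAPs with no search involved.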
Great suggestion! StarPreRouting is now integrated as a fourth comparison point in both the small-scale example and the large-scale example. There's also a short note in Step 1 introducing the pass for context.
Co-authored-by: Kaelyn Ferris <43348706+kaelynj@users.noreply.github.com>
Co-authored-by: Matthew Treinish <mtreinish@kortar.org>
…orial
- Step 1: Add note about StarPreRouting as a specialized alternative for star-topology GHZ circuits, with link to the API reference.
- Step 2: Split the monolithic pass-manager construction cell into inspect → modify → run cells, and visualize the layout stage with `pm.layout.draw()` so readers can see where SabreLayout sits and how the structure changes after replacement.
- Step 2 (small + large scale): Re-wrap the custom SabreLayout (and the custom SabreSwap in the large-scale loop) in the same `ConditionalController` + `BarrierBeforeFinalMeasurements` that the default preset uses. This preserves the `_vf2_match_not_found` / `_swap_condition` gating and the protective barrier, so the only behavioral difference from the default is the SABRE configuration itself rather than silently disabling VF2's perfect-layout fallback.
- Analysis: Drop the Qiskit Serverless seed-parallelization recommendation; bumping `swap_trials`/`layout_trials` locally is more efficient since SABRE already parallelizes trials across threads.
- Next steps: Add links to the SABRE and LightSABRE papers.

Note: will expand on StarPreRouting in a later commit.
Co-authored-by: abbycross <across@us.ibm.com>
Demonstrates that for circuits with a known structure, a specialized pre-routing pass can outperform any SABRE heuristic. Applies StarPreRouting to the same 100-qubit star-topology GHZ circuit used in the SABRE heuristic comparison, transpiles with the default level-3 preset, runs on hardware, and plots the result against the basic / decay / lookahead baselines. Refreshes the existing hardware-comparison output images (re-run during this update) and adds the new StarPreRouting comparison plot.
| "prerouter = PassManager([StarPreRouting()])\n", | ||
| "qc_linear = prerouter.run(qc)\n", | ||
| "\n", | ||
| "# Transpile the pre-routed circuit with the default level-3 pass manager\n", | ||
| "pm_star = generate_preset_pass_manager(\n", | ||
| " optimization_level=3, backend=backend, seed_transpiler=seed\n", | ||
| ")\n", | ||
| "tqc_star = pm_star.run(qc_linear)\n", |
You can simplify this code slightly by running StarPreRouting as a pre_layout stage that will run right before layout. This then gets the efficiency improvements of not having to round trip between a dag and circuit multiple times and integrates the better routing as part of the single pass manager.
| "prerouter = PassManager([StarPreRouting()])\n", | |
| "qc_linear = prerouter.run(qc)\n", | |
| "\n", | |
| "# Transpile the pre-routed circuit with the default level-3 pass manager\n", | |
| "pm_star = generate_preset_pass_manager(\n", | |
| " optimization_level=3, backend=backend, seed_transpiler=seed\n", | |
| ")\n", | |
| "tqc_star = pm_star.run(qc_linear)\n", | |
| "prerouter = PassManager([StarPreRouting()])\n", | |
| "# Transpile the pre-routed circuit with the default level-3 pass manager\n", | |
| "pm_star = generate_preset_pass_manager(\n", | |
| " optimization_level=3, backend=backend, seed_transpiler=seed\n", | |
| ")\n", | |
| "pm_star.pre_layout = prerouter\n", | |
| "tqc_star = pm_star.run(qc)\n", |
Thanks Matt! I'm currently working on applying StarPreRouting to both the small-scale and large-scale comparison sections (will be in my next push), and while doing that I noticed qiskit's own test suite for StarPreRouting uses an even shorter pattern that avoids the explicit PassManager wrapper:
pm_star = generate_preset_pass_manager(
    optimization_level=3, backend=backend, seed_transpiler=seed
)
pm_star.init += StarPreRouting()

This is a single line on top of the default preset and also avoids the dag↔circuit round-trip (since it runs as part of the StagedPassManager's normal flow). Do you think this is preferable, or is there a reason to use a pre_layout stage instead, e.g. semantic clarity that StarPreRouting is layout-related rather than init-related, or a difference in how it interacts with the other init passes that I should be aware of?
- Remove the dedicated "Beating SABRE" subsection. StarPreRouting is now compared head-to-head as a fourth pass manager (pm_star) in the small-scale example and as a fourth entry alongside the basic / decay / lookahead heuristics in the large-scale example, using the canonical `pm.init += StarPreRouting()` pattern.
- Update all analysis markdown to match the new run results: pm_star now produces the shallowest small-scale circuit and ties pm_3 on fidelity (within error bars), and StarPreRouting substantially outperforms every SABRE heuristic on the large-scale hardware fidelity comparison.
- Fix value-label-vs-title overlap on the small-scale fidelity bar chart by computing the y-axis top from mean+std+headroom (so the labels never escape the plot area) and adding title pad.
- Refresh all hardware/simulation output images for the new run.
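The y-axis fix described in the commit message above can be sketched as follows. This is a minimal stdlib version under stated assumptions: the exact formula in the notebook is not shown in this thread, so the additive `headroom` value, the function name, and the fidelity numbers are placeholders for illustration, not results from the PR.

```python
from statistics import mean, stdev


def y_axis_top(trials_by_config, headroom=0.1):
    """Upper y-limit that leaves room for error bars and value labels.

    Assumes top = max over configs of (mean + std) + headroom.
    """
    tops = []
    for values in trials_by_config.values():
        m = mean(values)
        s = stdev(values) if len(values) > 1 else 0.0
        tops.append(m + s)
    return max(tops) + headroom


# Placeholder fidelities per pass manager (illustrative only).
example = {
    "pm_1": [0.60, 0.62, 0.58],
    "pm_star": [0.80, 0.82, 0.78],
}
top = y_axis_top(example)
print(round(top, 3))
# With matplotlib, this value would be passed as ax.set_ylim(0, top).
```

Deriving the limit from the data rather than hard-coding it keeps the value labels inside the plot area when a re-run shifts the means or widens the error bars.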
Thanks @kaelynj, @mtreinish, and @abbycross for the careful reviews and helpful feedback! I've worked through all the suggestions and pushed updates. One further point: since the tutorial title changed to "Transpilation optimization with SABRE", should I also adjust the notebook name to "transpilation-optimization-with-sabre.ipynb"?

Most of the changes here address @mtreinish's review. Summary of changes:
- Structure / readability of the small-scale
- StarPreRouting integration
- Analysis & content
- Other small fixes

Happy to iterate further if anything still looks off, especially around the

@henryzou50 I don't suggest changing the file name - it means we'd have to add a redirect.
Summary

Revised `transpilation-optimizations-with-sabre.ipynb` to follow the Tutorial_Template structure, splitting the tutorial into a small-scale simulator walkthrough and a large-scale hardware comparison of SABRE routing heuristics.

Key changes from the old notebook:
- … `qiskit_serverless` and `qiskit-ibm-catalog` for Part II. The revised version runs everything locally using standard Qiskit transpiler APIs
- … `qiskit_aer` with a real backend noise model to measure execution fidelity across three `SabreLayout` configurations, demonstrating how parameter tuning affects actual expectation values
- … `basic`, `decay`, and `lookahead` routing heuristics on real hardware, replacing the old serverless-based approach
- … `ibm_boston` to `least_busy()`

Tutorial structure:
- … `SabreLayout` parameters (`layout_trials`, `swap_trials`, `max_iterations`), transpile a star-topology GHZ circuit, and validate fidelity improvements with an Aer noise model
- … `basic`, `decay`, `lookahead`) at scale on real hardware with execution and analysis