Use pipeline.sh for the standard end-to-end workflow:

    # On the GPU host: ROME-only smoke benchmark on one model
    bash pipeline.sh --models gpt2-medium --n 1 --compute-cov
    # On the GPU host: structural benchmark + detector/new graph post-processing
    bash pipeline.sh --structural --models gpt2-medium --n 1 --compute-cov
    # Optional: orchestrate the same run over SSH from another machine
    bash pipeline.sh --remote user@gpu-host --structural --models gpt2-medium --n 1
    # Show all available pipeline options
    bash pipeline.sh --help

pipeline.sh runs locally by default. The default mode runs the ROME-only
benchmark via rome_benchmark.py; --structural switches to the structural
benchmark plus detector and new graph post-processing. Add --remote <host> when
you want the same workflow launched over SSH/tmux from another machine.
Structural pipeline graphs are written under pipeline_out/<run>/graphs/.
ROME and related commands are driven via the Hydra-based CLI in src/cli.py.

Single intervention:

    python -m src.cli command=rome model=gpt2-medium

Batch evaluation:

    python -m src.cli command=batch-rome model=gpt2-medium

Compute second-moment statistics (required before running ROME on a new model):

    python -m src.cli command=second-moment model=gpt2-medium

Cluster-local smoke test on a GPU host:

    # First cold run: downloads model/datasets locally and builds second moments
    python -m src.cli command=second-moment model=gpt2-medium
    # Single local edit smoke test
    python -m src.cli command=rome model=gpt2-medium
    # Local ROME-only benchmark (same benchmark family used by pipeline.sh default mode)
    python rome_benchmark.py --models gpt2-medium --n-tests 1 --start-idx 0 --output-dir ./analysis_out_local_rome

Notes:
- You can use other models, e.g. gpt2-xl or qwen3-4b (scroll down for the full model list).
- `python -m src.cli command=rome ...` does not auto-compute missing second moments unless `ROME_ALLOW_SECOND_MOMENT_AUTOCOMPUTE=1` is set.
- By default, model downloads are cached under `../models`, dataset downloads under `../datasets`, and computed covariance files under `./second_moment_stats`.
- A true cold first run on a GPU host can stay quiet for several minutes while `command=second-moment` downloads assets and builds the covariance file.
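The notes above suggest a simple pre-flight check before running an edit. The following is a minimal sketch, not part of the repository: the directory and environment variable come from the notes, while the helper names and the assumption that artifact file names contain the model key are hypothetical.

```python
import os
from pathlib import Path

# Default location and opt-in variable taken from the notes above.
SECOND_MOMENT_DIR = Path("./second_moment_stats")
AUTOCOMPUTE_ENV = "ROME_ALLOW_SECOND_MOMENT_AUTOCOMPUTE"


def find_second_moment_files(model_key, stats_dir=SECOND_MOMENT_DIR):
    """Return .pt files under stats_dir whose name contains model_key.

    Assumption: artifact names embed the model key. The real naming
    scheme may differ, and a short key can match a longer model name.
    """
    stats_dir = Path(stats_dir)
    if not stats_dir.is_dir():
        return []
    return sorted(p for p in stats_dir.rglob("*.pt") if model_key in p.name)


def autocompute_allowed(environ=None):
    """True only when the opt-in variable is set to exactly "1"."""
    environ = os.environ if environ is None else environ
    return environ.get(AUTOCOMPUTE_ENV) == "1"


def preflight(model_key, stats_dir=SECOND_MOMENT_DIR, environ=None):
    """Summarize whether command=rome is likely to find its statistics."""
    if find_second_moment_files(model_key, stats_dir):
        return "ready"
    if autocompute_allowed(environ):
        return "will-autocompute"
    return "run command=second-moment first"
```

Calling `preflight("gpt2-medium")` from the repository root before `command=rome` gives a quick hint about whether the second-moment step still needs to run.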
The default config is at src/config/config.yaml. Override any value on the command line using Hydra syntax (e.g. model=gpt2-large).
Alternatively, use the console fallback (no Hydra overhead):

    python -m src.cli --console rome --config src/config/config.yaml

Run a causal trace:

    python -m src.cli command=causal-trace model=gpt2-medium

To inspect the computed noise multiplier without running a full trace:

    python -m src.cli command=compute-multiplier model=gpt2-medium

covariance_a100_remote.sh computes second-moment statistics on a remote GPU node (e.g. A100) and pulls the resulting artifacts back locally.

    # Run with default models (deepseek-7b-base, granite4-micro, llama2-7b, mistral-7b-v0.1, mistral-7b-v0.3):
    ./covariance_a100_remote.sh user@gpu-host
    # Override models:
    MODEL_KEYS="gpt2-xl gpt-j-6b" ./covariance_a100_remote.sh user@gpu-host /path/to/Latium optim latium
    # Arguments: <user@host> [remote_repo_path] [remote_branch] [conda_env]

The script syncs model configs and src/rome/common.py to the remote, runs covariance computation per model, and downloads the .pt artifacts into second_moment_stats/.
src/causal_trace/layer_heuristic.py recommends the best MLP layer for ROME edits using multiple signals (causal trace, weight norms, spectral gap, architectural prior).
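To illustrate how several per-layer signals can be folded into one recommendation, here is a minimal sketch. It is not the logic of layer_heuristic.py: the min-max normalization, the weighted sum, the signal names, and the weights are all assumptions chosen for illustration.

```python
def min_max_normalize(values):
    """Scale a list of floats to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def recommend_layer(signals, weights):
    """Return (best_layer_index, combined_scores).

    signals: dict name -> per-layer scores (higher means better candidate).
    weights: dict name -> relative weight of that signal (hypothetical values).
    """
    num_layers = len(next(iter(signals.values())))
    total_weight = sum(weights[name] for name in signals)
    combined = [0.0] * num_layers
    for name, values in signals.items():
        if len(values) != num_layers:
            raise ValueError(
                f"signal {name!r} has {len(values)} layers, expected {num_layers}"
            )
        # Normalize each signal so that no single scale dominates the sum.
        for i, v in enumerate(min_max_normalize(values)):
            combined[i] += weights[name] * v / total_weight
    best = max(range(num_layers), key=combined.__getitem__)
    return best, combined


if __name__ == "__main__":
    # Toy values for a 5-layer model, for illustration only.
    toy_signals = {
        "causal_trace": [0.1, 0.4, 0.9, 0.3, 0.1],
        "weight_norm": [1.0, 1.2, 1.1, 1.5, 1.0],
        "spectral_gap": [0.2, 0.3, 0.5, 0.4, 0.2],
        "arch_prior": [0.0, 0.5, 1.0, 0.5, 0.0],
    }
    toy_weights = {
        "causal_trace": 0.5,
        "weight_norm": 0.1,
        "spectral_gap": 0.2,
        "arch_prior": 0.2,
    }
    best, scores = recommend_layer(toy_signals, toy_weights)
    print(best, [round(s, 3) for s in scores])
```

Normalizing before weighting keeps a signal with a large numeric range (such as a weight norm) from swamping a signal bounded in [0, 1].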
Example invocations:

    # CSV-only (no GPU needed):
    python -m src.causal_trace.layer_heuristic \
      --csvs analysis_out/causal_trace_deepseek*.csv \
      --num-layers 30
    # Full analysis (GPU + model):
    python -m src.causal_trace.layer_heuristic \
      --model deepseek-ai/deepseek-llm-7b-base \
      --layer-template 'model.layers.{}.mlp.down_proj' \
      --num-layers 30 \
      --csvs analysis_out/causal_trace_deepseek*.csv

structural_benchmark.py applies ROME edits across a dataset and evaluates all structural detectors (MSD, blind MSD, spectral, IPR) on the modified weights. Results are written as JSON to analysis_out/.
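IPR appears in the detector list above. As background, the standard inverse participation ratio of a vector is the sum of its fourth powers divided by the square of the sum of its squares. Whether structural_benchmark.py uses exactly this form, and which vectors it applies it to, is not stated here, so the sketch below is an assumption rather than the repository's implementation.

```python
def inverse_participation_ratio(vector):
    """Standard IPR: sum(x^4) / (sum(x^2))^2.

    Equals 1/n for a vector with n equal-magnitude entries (fully spread
    out) and 1.0 for a one-hot vector (fully concentrated).
    """
    squares = [x * x for x in vector]
    norm_sq = sum(squares)
    if norm_sq == 0:
        raise ValueError("IPR is undefined for the zero vector")
    return sum(s * s for s in squares) / (norm_sq * norm_sq)
```

A value near 1 means the mass sits in a few coordinates, while a value near 1/n means it is spread evenly across all n coordinates.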
For the lightweight payload used by the post-hoc detector and
paper_graphs.ipynb, run structural_benchmark.py --posthoc-only ...
or structural_benchmark.py --paper ...
(--analysis-profile paper remains the underlying profile name).

    python structural_benchmark.py \
      --model gpt2-large \
      --n-tests 30 \
      --start-idx 0 \
      --output-dir ./analysis_out \
      --spectral-top-k 50 \
      --trim-first-layers 2 \
      --trim-last-layers 2 \
      --spectral-neighbor-layers 1

Key arguments:

| Argument | Default | Description |
|---|---|---|
| `--model` | `gpt2-large` | Model name (must match a config in src/config/model/) |
| `--n-tests` | `30` | Number of ROME edits to benchmark |
| `--start-idx` | `0` | Starting index in the facts dataset |
| `--output-dir` | `./analysis_out` | Directory for JSON result files |
| `--spectral-top-k` | `50` | Top-K singular values used by the spectral detector |
| `--trim-first-layers` | `2` | Layers to exclude from the head of the model |
| `--trim-last-layers` | `2` | Layers to exclude from the tail of the model |
| `--n-prompts` | auto | Number of ROME prefix prompts (scales with model size if omitted) |
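The trimming arguments in the table can be pictured with a small sketch: drop the first and last layers, then z-score the remaining per-layer values to see which layer stands out. The per-layer score is hypothetical (for example, a top singular value), and the function names are made up for illustration; the detector's actual statistics may be computed differently.

```python
from statistics import mean, pstdev


def trim_layers(scores, trim_first=2, trim_last=2):
    """Drop head/tail layers; returns a list of (layer_index, score)."""
    end = max(len(scores) - trim_last, 0)
    return list(enumerate(scores))[trim_first:end]


def layer_z_scores(scores, trim_first=2, trim_last=2):
    """Z-score each kept layer against the mean and population standard
    deviation of all kept layers. Returns dict layer_index -> z-score."""
    kept = trim_layers(scores, trim_first, trim_last)
    if len(kept) < 2:
        raise ValueError("need at least two layers after trimming")
    values = [s for _, s in kept]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return {i: 0.0 for i, _ in kept}
    return {i: (s - mu) / sigma for i, s in kept}


if __name__ == "__main__":
    # Toy per-layer scores: the extreme values at both ends are ignored
    # because of trimming, so layer 5 is the one that stands out.
    toy = [9, 9, 1, 1, 1, 5, 1, 1, 9, 9]
    z = layer_z_scores(toy)
    print(max(z, key=z.get))
```

Trimming first matters here: without it, the large values at the head and tail would inflate the mean and standard deviation and mask the layer of interest.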
Detailed documentation for the detection methods is in the docs/ directory:
- docs/structural-docs.md: structural detector metrics (L2 discrepancy, relative discrepancy, directional coherence, MSD, IPR, etc.)
- docs/spectral-docs.md: spectral detector signals and the mathematics behind singular-value z-scores and ratio scores

With a final_n500_bundle/ artifact present at the repo root:

    bash scripts/bundle_graphs/run_all_graphs.sh --bundle-root final_n500_bundle

From the directory containing a downloaded bundle:

    bash final_n500_bundle/scripts_for_graphs/run_all_graphs.sh

The runner rebuilds per-model paper graphs, bundle summary graphs, windowed-detector reports, cohort figures, artifact grids, and refreshes the bundle indices.
| Supported Models | Causal Trace | Weight intervention | Mean ES (n=500) | Notes |
|---|---|---|---|---|
| gpt2-medium | ✔️ | ✔️ | 0.988 | works |
| gpt2-large | ✔️ | ✔️ | 0.986 | works |
| gpt2-xl | ✔️ | ✔️ | 0.986 | works |
| gpt-j-6b | ✔️ | ✔️ | 0.996 | works |
| qwen3-0.6b | ✔️ | ✔️ | ||
| qwen3-1.7b | ✔️ | ✔️ | ||
| qwen3-4b | ✔️ | ✔️ | 0.992 | |
| qwen3-8b | ✔️ | ✔️ | 1.000 | |
| granite4-micro | ✔️ | ✔️ | 0.978 | Weird architecture |
| mistral-7b-v0.1 | ✔️ | ✔️ | 0.948 | |
| mistral-7b-v0.3 | ✔️ | ✔️ | 0.934 | |
| llama2-7b | ✔️ | ✔️ | 0.614 | Weird architecture |
| falcon-7b | ✔️ | ✔️ | 0.976 | |
| opt-6.7b | ✔️ | ✔️ | 0.978 | |
| deepseek-7b-base | ✔️ | ✔️ | 0.976 | |
| llama3 | planned | |||
| gpt-neo | planned | |||
| qwen2.5 | planned | |||
| baichuan | planned | |||
| chatglm | planned | |||
| t5 | planned |
pipeline.sh runs either the ROME-only benchmark or the structural benchmark.
Run it directly on a GPU host after cloning the repo, or pass --remote <host>
to sync the repo and launch the selected mode over SSH/tmux.

    # Local ROME-only smoke benchmark on one model
    bash pipeline.sh --models gpt2-medium --n 1 --compute-cov
    # Local structural run with detector/new graph processing
    bash pipeline.sh --structural --models gpt2-medium --n 1 --compute-cov
    # Local structural run, then rebuild final-bundle paper graphs if the bundle is present
    bash pipeline.sh --structural --bundle-graphs --bundle-root final_n500_bundle
    # Remote run with env setup
    bash pipeline.sh --remote ubuntu@132.145.129.234 --setup-env
    # Remote structural benchmark, N=1 smoke test
    bash pipeline.sh --remote user@gpu-host --models gpt2-medium --n 1 --structural
    # Compute covariance first, then benchmark
    bash pipeline.sh --compute-cov --n 10
    # Specific models only
    bash pipeline.sh --models gpt2-xl mistral-7b-v0.1 --n 5

For structural runs, the current renderer outputs are under
pipeline_out/<run>/graphs/:
- rome_success_metrics/: stored ROME metric tables, heatmap, and bars
- detector_stacked_variants/: stacked SG/TE detector signal panels
- detector_layer_window/: strict and +/- window detector-layer scoring
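After a structural run it can be handy to confirm that the renderer produced all three output folders. The check below is a minimal sketch: the subdirectory names come from the list above, while the helper name and the "non-empty directory" criterion are assumptions.

```python
from pathlib import Path

# Subdirectory names taken from the list above.
EXPECTED_GRAPH_DIRS = (
    "rome_success_metrics",
    "detector_stacked_variants",
    "detector_layer_window",
)


def missing_graph_dirs(run_dir):
    """Return the expected graph subdirectories that are absent or empty.

    run_dir is one run folder, i.e. pipeline_out/<run>.
    """
    graphs = Path(run_dir) / "graphs"
    missing = []
    for name in EXPECTED_GRAPH_DIRS:
        sub = graphs / name
        if not sub.is_dir() or not any(sub.iterdir()):
            missing.append(name)
    return missing
```

An empty return value means every expected folder exists and contains at least one file; it says nothing about whether the graphs themselves are correct.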
| Flag | Default | Description |
|---|---|---|
| `--compute-cov` | off | Compute covariance matrices (otherwise uses existing) |
| `--n <int>` | 50 | Number of test edits per model |
| `--structural` | off | Run the structural benchmark and render the new per-run graph set under `pipeline_out/<run>/graphs/` |
| `--bundle-graphs` | off | After a structural run, rebuild graphs from `--bundle-root` |
| `--bundle-root <path>` | `final_n500_bundle` | Final bundle root used by `--bundle-graphs` |
| `--setup-env` | off | Set up conda env + deps on the remote host |
| `--remote <host>` | local current host | SSH target for remote execution |
| `--models <m1 ..>` | final paper model set | Override model list |
| `--output-dir <path>` | `./pipeline_out` | Output directory |
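When launching runs from a script, the flags in the table can be assembled programmatically. This is a minimal sketch that only builds the argument list and executes nothing; the function name and its keyword arguments are hypothetical, and it covers only a subset of the flags.

```python
import shlex


def build_pipeline_command(models=None, n=None, structural=False,
                           compute_cov=False, remote=None, output_dir=None):
    """Return the argv list for a pipeline.sh run; nothing is executed."""
    argv = ["bash", "pipeline.sh"]
    if remote:
        argv += ["--remote", remote]
    if structural:
        argv.append("--structural")
    if compute_cov:
        argv.append("--compute-cov")
    if models:
        # --models accepts several space-separated model names.
        argv += ["--models", *models]
    if n is not None:
        argv += ["--n", str(n)]
    if output_dir:
        argv += ["--output-dir", output_dir]
    return argv


if __name__ == "__main__":
    cmd = build_pipeline_command(
        models=["gpt2-medium"], n=1, structural=True, compute_cov=True
    )
    # Print a copy-pasteable shell line.
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a GPU host; keeping it as a list avoids shell-quoting problems with model names or paths.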
prefixtest/experiment.py measures how sensitive the spectral
detection pipeline is to the prefix/template used during the ROME edit. A
single fact is edited 20 times under different prefix strategies (self-generated,
template-based, external) while all other parameters remain fixed. The spectral
detector runs on each result, producing per-layer signal curves that reveal
which prefixes amplify or suppress the edit's spectral footprint.
An additional baseline_unedited run captures the spectral detector output on the original (unmodified) model weights, so that the edited curves can be compared against the clean noise floor.
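The comparison against the clean noise floor can be sketched as a per-layer subtraction. The data layout below (one list of per-layer signal values per prefix strategy, plus one baseline list) and the function names are assumptions for illustration; the experiment's real output format may differ.

```python
def footprint_over_baseline(edited_curves, baseline):
    """Per-strategy peak excess over the unedited baseline.

    edited_curves: dict strategy -> per-layer detector signal (list of numbers).
    baseline: per-layer signal from the baseline_unedited run.
    Returns dict strategy -> (peak_layer, peak_excess).
    """
    result = {}
    for strategy, curve in edited_curves.items():
        if len(curve) != len(baseline):
            raise ValueError(f"{strategy!r}: curve and baseline lengths differ")
        excess = [c - b for c, b in zip(curve, baseline)]
        peak_layer = max(range(len(excess)), key=excess.__getitem__)
        result[strategy] = (peak_layer, excess[peak_layer])
    return result


def rank_strategies(edited_curves, baseline):
    """Strategies sorted from largest to smallest peak excess."""
    peaks = footprint_over_baseline(edited_curves, baseline)
    return sorted(peaks, key=lambda s: peaks[s][1], reverse=True)


if __name__ == "__main__":
    # Toy curves over four layers, for illustration only.
    baseline = [1, 1, 1, 1]
    curves = {
        "self_generated": [1, 3, 1, 1],
        "template": [1, 1.5, 1, 1],
        "external": [1, 1, 1, 1.2],
    }
    print(rank_strategies(curves, baseline))
```

Strategies near the top of the ranking are the ones whose edits rise furthest above the unedited model's signal, which is one way to read "amplify" versus "suppress".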
Run the experiment locally:

    # Default: Qwen/Qwen3-8B, case 0
    python prefixtest/experiment.py
    # Custom model / case
    python prefixtest/experiment.py --model gpt2-large --case-idx 3

prefixtest/run_remote.sh automates upload, environment setup, and tmux-based execution
on a remote machine:

    # Launch (uploads code + second-moment stats, installs deps, starts in tmux)
    ./prefixtest/run_remote.sh                # default: Qwen/Qwen3-8B, case 0
    ./prefixtest/run_remote.sh gpt2-large 3   # custom model & case
    # Monitor progress
    ./prefixtest/run_remote.sh --status
    # Download results when finished
    ./prefixtest/run_remote.sh --fetch

The notebook prefixtest/prefixtest.ipynb is a thin wrapper around
prefixtest/prefixtest_support.py. It auto-discovers the latest artifact in
prefixtest/artifacts/ or analysis_out/, writes outputs into
prefixtest/output/, plots grouped layer-wise spectral curves with the
unedited baseline, adds composite-detector graphs, and shows summary tables.
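Auto-discovery of the latest artifact can be approximated by picking the most recently modified file across the two directories. This is a minimal sketch under assumptions: the real logic lives in prefixtest_support.py and may differ, and the `*.json` pattern and the function name are hypothetical.

```python
from pathlib import Path


def latest_artifact(search_dirs=("prefixtest/artifacts", "analysis_out"),
                    pattern="*.json"):
    """Return the most recently modified matching file, or None.

    Assumption: artifacts are JSON files sitting directly inside the
    search directories; missing directories are skipped.
    """
    candidates = []
    for directory in search_dirs:
        directory = Path(directory)
        if directory.is_dir():
            candidates.extend(p for p in directory.glob(pattern) if p.is_file())
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Selecting by modification time means a freshly fetched remote result takes precedence over older local artifacts, regardless of which directory it landed in.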
Related files:

    prefixtest/prefixtest.ipynb          # open in Jupyter
    prefixtest/prefixtest_support.py     # all data-loading and plotting logic
    prefixtest/output/                   # saved graphs and summary tables
    prefixtest/artifacts/                # selected local experiment artifacts

| Error code | Name of the error | Description |
|---|---|---|
| `1` | Help | Help invoked. Typically caused by incorrect script usage. |
| `2` | Resource already exists | Trying to create a resource that already exists. |
| `-1` | Unknown | An unknown error. Create a GitHub issue with the reproduction steps. |
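If these codes are surfaced as process exit statuses (an assumption, since the table does not say which script emits them), a wrapper script can map them back to names. Note that POSIX shells report exit statuses modulo 256, so a script that exits with -1 is observed as 255. The enum and function names below are hypothetical.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    """Error codes from the table above."""
    HELP = 1
    RESOURCE_EXISTS = 2
    UNKNOWN = -1


def classify_exit_status(status):
    """Map an exit status as seen by the shell (0..255) to an ExitCode.

    Returns None for success (0) and for statuses not in the table.
    An exit code of -1 wraps to 255 because statuses are taken modulo 256.
    """
    if status == 0:
        return None
    for code in ExitCode:
        if status == code % 256:
            return code
    return None
```

For example, a shell reporting status 255 after a run would map to `ExitCode.UNKNOWN`, which per the table is the case that warrants a GitHub issue with reproduction steps.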