- Modify the `prefix` in `environment.yaml` to point to the location where the environment will be stored.
- Create a `conda` environment using `conda env create -f environment.yaml`.
- Activate the environment with `conda activate subg_matching`.
- Navigate to `scripts/`. Run either the `run_large_dataset.sh` or the `run_new_dataset.sh` script.
- Set the `gpus` variable to a tuple of all the GPU indices available for the experiment. If only GPU 2 is available, set it to `(2)`. Any list of non-zero length works.
- Run `bash run_large_dataset.sh` on the command line. This starts training the model. By default, the model is evaluated on the test dataset at the end of training.
- In our original codebase, we used `wandb` to manage and monitor runs. However, we have set `WANDB_MODE=disabled` in the bash script, since we don't expect every user to be familiar with `wandb`. Users who have experience with it can delete the `WANDB_MODE=disabled` part of the command, so that it starts with `CUDA_VISIBLE_DEVICES=...`.
- Results are stored in the `<experiment_dir>/<experiment_id>` directory, which in this case is `experiments/rqX_custom_models`. This includes trained models, partial configs and logs. The train/validation scores are printed at every epoch in the corresponding log file, and the test score is evaluated at the end of training.
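As a hedged illustration of the `gpus` and `WANDB_MODE` steps above, the following bash sketch shows how a tuple of GPU indices can be declared and how the environment-variable prefix of a launch command changes once `WANDB_MODE=disabled` is removed. The `launch_prefix` helper and its `use_wandb` argument are hypothetical names introduced here for illustration. The scripts in `scripts/` may consume `gpus` differently.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; this is NOT the content of run_large_dataset.sh.

# A bash array ("tuple") of available GPU indices. A single entry is enough.
gpus=(2)

# Hypothetical helper: prints the environment-variable prefix for one GPU.
# Pass "true" as the second argument to leave wandb enabled.
launch_prefix() {
  gpu="$1"
  use_wandb="${2:-false}"
  if [ "$use_wandb" = "true" ]; then
    echo "CUDA_VISIBLE_DEVICES=${gpu}"
  else
    echo "WANDB_MODE=disabled CUDA_VISIBLE_DEVICES=${gpu}"
  fi
}

echo "Number of GPUs: ${#gpus[@]}"
for g in "${gpus[@]}"; do
  echo "default prefix:       $(launch_prefix "$g")"
  echo "wandb-enabled prefix: $(launch_prefix "$g" true)"
done
```

Note that `(2)` is a one-element array in bash, not the scalar `2`, which is why any non-empty list of indices is acceptable.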
- We provide additional files here - https://rebrand.ly/dessub.
- Some models have a different naming convention in the codebase than in the paper.
  - Models discussed in the main text
    - Our-Early-Best - `configs/edge_models/scoring=sinkhorn_pp=hinge___tp=sinkhorn_pp=hinge_when=post___unify=true.yaml`
    - Our-Late-Best - `configs/edge_models/scoring=sinkhorn_pp=hinge___tp=none.yaml`
    - GMN adaptation - `configs/node_models/scoring=attention_pp=identity___tp=attention_pp=identity_when=post.yaml`
    - IsoNet adaptation - `configs/edge_models/scoring=sinkhorn_pp=lrl___tp=none.yaml`
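The name mapping above can be kept in a small lookup function so that a paper name resolves to its config path. This is a minimal sketch: `config_for_model` is a hypothetical helper that is not part of the repository, and the paths are copied from the list above.

```bash
# Hypothetical helper: prints the config path for a model name used in the paper.
config_for_model() {
  case "$1" in
    "Our-Early-Best")
      echo "configs/edge_models/scoring=sinkhorn_pp=hinge___tp=sinkhorn_pp=hinge_when=post___unify=true.yaml" ;;
    "Our-Late-Best")
      echo "configs/edge_models/scoring=sinkhorn_pp=hinge___tp=none.yaml" ;;
    "GMN adaptation")
      echo "configs/node_models/scoring=attention_pp=identity___tp=attention_pp=identity_when=post.yaml" ;;
    "IsoNet adaptation")
      echo "configs/edge_models/scoring=sinkhorn_pp=lrl___tp=none.yaml" ;;
    *)
      # Only the four main-text models listed above are known here.
      echo "no config listed for: $1" >&2
      return 1 ;;
  esac
}

# Example: look up the config for one of the main-text models.
config_for_model "Our-Late-Best"
```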
- AIDS, MUTAG, PTC-FM, PTC-FR, PTC-MM and PTC-MR are under the directory `large_dataset`, while NCI-H23H, MOLT-4H, MCF-7H and MSRC-21 come under `new_dataset`.
- Training GraphSim for `new` datasets requires the `conv_pool_size: [4,4,3,3]` line to be active, while for `large` datasets, `conv_pool_size: [3,3,2,2]` should be chosen.
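The dataset grouping and the matching GraphSim setting can be summarised in a short bash sketch. Both functions are hypothetical helpers written for illustration, and the dataset names are spelled as in the text above, which may differ from the names used on disk. The sketch only reports which `conv_pool_size` line should be active and does not edit any config file.

```bash
# Hypothetical helper: maps a dataset name to the directory it lives under.
dataset_group() {
  case "$1" in
    AIDS|MUTAG|PTC-FM|PTC-FR|PTC-MM|PTC-MR) echo "large_dataset" ;;
    NCI-H23H|MOLT-4H|MCF-7H|MSRC-21)        echo "new_dataset" ;;
    *) echo "unknown dataset: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: prints the conv_pool_size value GraphSim needs
# for the group that the given dataset belongs to.
graphsim_conv_pool_size() {
  case "$(dataset_group "$1")" in
    large_dataset) echo "[3,3,2,2]" ;;
    new_dataset)   echo "[4,4,3,3]" ;;
    *) return 1 ;;
  esac
}

for ds in MUTAG MSRC-21; do
  echo "$ds: $(dataset_group "$ds"), conv_pool_size: $(graphsim_conv_pool_size "$ds")"
done
```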