A benchmark for evaluating learning-augmented caching algorithms, featuring multiple implemented predictors, including PLECO, POPU, LRB, and Parrot.
The experimental scripts, training/evaluation pipelines, data preprocessing, and result analysis were implemented and maintained by Jiaji Zhang.
Some parts of the project code were inspired by or adapted from existing open-source libraries, including:
Download the traces from the Dropbox link provided by ML caching with guarantees [1], and organize them into per-workload directories, e.g., `./traces/xalanc` for `xalanc_test.csv`.
We have also reorganized the Brightkite and Citi datasets to conform to a unified format.
You can easily download all trace files, the boost-trace pickles, and the GBM model checkpoints from the Releases page.
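The per-workload layout can be produced with a short script. The sketch below assumes only what the example above states: trace files are named `<workload>_test.csv` and live under a `./traces/<workload>/` directory; everything else (function name, download directory) is hypothetical:

```python
from pathlib import Path
import shutil

def organize_traces(download_dir: str, traces_root: str = "./traces") -> list:
    """Move each '<workload>_test.csv' into '<traces_root>/<workload>/'."""
    moved = []
    for csv_file in Path(download_dir).glob("*_test.csv"):
        workload = csv_file.stem.removesuffix("_test")  # 'xalanc_test' -> 'xalanc'
        target_dir = Path(traces_root) / workload
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / csv_file.name
        shutil.move(str(csv_file), str(target))
        moved.append(target)
    return moved
```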
We use Python 3.10.16 with Anaconda3 (Anaconda3-2021.05-Linux-x86_64.sh); we strongly recommend Anaconda for Python package management and virtual environments.
You can also set up your own environment with another Python version, but note that:
- If you want to use the `ParrotModel`, you need to install PyTorch yourself (in our benchmark, we use CUDA 12.4 to enable Torch GPU support).
- Because pickle formats differ across Python versions, the released `boost-traces.zip` may not work in your environment, so you may need to generate your own boost trace file.
A simple usage:

```shell
python -m benchmark --dataset xalanc --real --pred pleco --boost --boost_fr
```

You will see the benchmark results of all supported algorithms on the console.
```shell
python -m benchmark [--dataset DATASET] [--test_all] [--device DEVICE] (--oracle | --real)
                    [--pred {parrot,pleco,popu,pleco-bin,lrb,oracle_bin,oracle_dis}]
                    [--noise_type {dis,bin,logdis}] [--dump_file] [--output_root_dir OUTPUT_ROOT_DIR] [--verbose]
                    [--boost] [--num_workers NUM_WORKERS] [--boost_fr] [--boost_preds_dir BOOST_PREDS_DIR]
                    [--model_fraction MODEL_FRACTION] [--checkpoints_root_dir CHECKPOINTS_ROOT_DIR]
                    [--parrot_config_path PARROT_CONFIG_PATH] [--lightgbm_config_path LIGHTGBM_CONFIG_PATH]
```
Dataset

- `--dataset`: Benchmark source dataset name. Supported names: `brightkite`, `citi`, `astar`, `bwaves`, `bzip`, `cactusadm`, `gems`, `lbm`, `leslie3d`, `libq`, `mcf`, `milc`, `omnetpp`, `sphinx3`, `xalanc`. You can also add support for your own dataset.
- `--test_all`: This option only works for the `brightkite` and `citi` datasets, meaning the entire traces will be used. It cannot be used with the `parrot` or `lrb` predictors, as they require a training set.
Device

- `--device`: Model target device, like `cpu`, `cuda:0`, `cuda:1`, etc. This option only works for the `parrot` predictor; in all other cases we only use the CPU.
Mode

- `--real` and `--oracle`: You must specify exactly one of them on the command line.
Real Mode

- Supported predictors: `parrot`, `pleco`, `popu`, `pleco-bin`, `gbm`
- `verbose` defaults to True; `boost` uses pickle traces (`num_workers` has no effect); `noise_type` has no effect.
Oracle Mode

- Supported predictors: `oracle_dis`, `oracle_bin`
- `verbose` defaults to False; `boost` uses a multiprocessing pool (`num_workers` takes effect); `boost_preds_dir`, `model_fraction`, `checkpoints_root_dir`, `parrot_config_path`, and `lightgbm_config_path` have no effect.
Noise Type

- `logdis`: log-normal noise on the next request time (reuse distance)
- `dis`: normal noise on the next request time (reuse distance)
- `bin`: binary flipping noise on Belady's label (FBP label)
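For intuition, the three noise models above can be sketched as follows. This is a guess at the mechanics from the descriptions (additive Gaussian, multiplicative log-normal, Bernoulli flip); the benchmark's actual parameterization may differ:

```python
import math
import random

def noisy_nrt_dis(nrt: float, sigma: float, rng: random.Random) -> float:
    """'dis': additive normal noise on the next request time."""
    return max(0.0, nrt + rng.gauss(0.0, sigma))

def noisy_nrt_logdis(nrt: float, sigma: float, rng: random.Random) -> float:
    """'logdis': multiplicative log-normal noise on the next request time."""
    return nrt * math.exp(rng.gauss(0.0, sigma))

def noisy_label_bin(label: int, flip_prob: float, rng: random.Random) -> int:
    """'bin': flip Belady's binary label with probability flip_prob."""
    return 1 - label if rng.random() < flip_prob else label
```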
Algorithm

We implemented most of the existing algorithms in our benchmark; you can also implement and test your own algorithm in the benchmark.

| Algorithm | PLECO | POPU | Parrot | LRB | Oracle-Dis | Oracle-Bin |
| --- | --- | --- | --- | --- | --- | --- |
| Rand | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
| LRU | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
| Marker | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
| Predict (BlindOracle/LRB/Parrot) | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| PredictiveMarker [2] | ✔ | ✔ | ❌ | ❌ | ✔ | ❌ |
| LMarker [3] | ✔ | ✔ | ❌ | ❌ | ✔ | ❌ |
| LNonMarker [3] | ✔ | ✔ | ❌ | ❌ | ✔ | ❌ |
| Follower&Robust (F&R) [4] | ✔ | ✔ | ❌ | ❌ | ✔ | ❌ |
| Mark0 [5] | ❌ | ❌ | ❌ | ✔ | ❌ | ✔ |
| Mark&Predict [5] | ❌ | ❌ | ❌ | ❌ | ❌ | ✔ |
| CombineDeterministic [6] | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| CombineRandom [6] | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| The Guard Framework | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Guard&BlindOracle | ✔ | ✔ | ❌ | ❌ | ✔ | ❌ |
| Guard&LRB | ❌ | ❌ | ❌ | ✔ | ❌ | ✔ |
| Guard&Parrot | ❌ | ❌ | ✔ | ❌ | ❌ | ❌ |
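As a flavor of the simplest entry in the table, a BlindOracle-style `Predict` policy evicts the cached page whose predicted next request time (NRT) is furthest in the future. The sketch below is illustrative only (the function names and the dict-per-step prediction interface are our assumptions, not the benchmark's API):

```python
def predict_evict(cache: set, predicted_nrt: dict):
    """Evict the cached page with the largest predicted next request time.

    Pages missing from predicted_nrt are treated as never requested again.
    """
    return max(cache, key=lambda p: predicted_nrt.get(p, float("inf")))

def run_predict(requests, cache_size, predictions):
    """Count hits; predictions[i] maps pages to the NRT predicted at step i."""
    cache, hits = set(), 0
    for i, page in enumerate(requests):
        if page in cache:
            hits += 1
        else:
            if len(cache) >= cache_size:
                cache.discard(predict_evict(cache, predictions[i]))
            cache.add(page)
    return hits
```

With perfect predictions this policy coincides with Belady's offline optimum, which is why every predictor column in the table supports it.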
Predictor

- `pleco`: a PLECO model that predicts next request times (NRT).
- `popu`: a popularity model that predicts next request times (NRT).
- `lrb`: a gradient boosting machine based on Delta and EDC features that predicts Belady's binary labels.
- `oracle_dis`: an offline predictor that predicts next request times (NRT), potentially affected by noise (`dis`, `logdis`).
- `oracle_bin`: an offline predictor that predicts Belady's binary labels, potentially affected by noise (`bin`).
- `parrot`: a Parrot imitation model that predicts an eviction priority for each page.
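To make the NRT interface concrete, a popularity-style predictor in the spirit of `popu` could estimate each page's request rate from the history seen so far and predict its next request at `now + 1/rate`. This is a rough sketch of the idea, not the benchmark's exact model:

```python
from collections import defaultdict

class PopularityPredictor:
    """Predict next request time as now + 1 / empirical request rate."""

    def __init__(self):
        self.counts = defaultdict(int)  # requests seen per page
        self.now = 0                    # logical time = total requests seen

    def record(self, page) -> None:
        self.counts[page] += 1
        self.now += 1

    def predict_nrt(self, page) -> float:
        if self.counts[page] == 0:
            return float("inf")  # never seen: assume never requested again
        rate = self.counts[page] / self.now
        return self.now + 1.0 / rate
```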
Dump and Verbose

- `dump_file`: enables result dumping, saving test results as `$output_root_dir/$dataset/$fraction/$predictor.csv`.
- `verbose`: enables detailed output. When enabled, all statistics (hits, misses, hit rates, and competitive ratios) are displayed; otherwise, only the hit rate and competitive ratio are shown.
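A minimal sketch of what such a dump might look like, assuming the hit rate is `hits / (hits + misses)` and the empirical competitive ratio is the algorithm's miss count over OPT's (the helper names and CSV layout are our invention, only the output path format comes from the option description above):

```python
import csv
from pathlib import Path

def summarize(hits: int, misses: int, opt_misses: int) -> dict:
    """Hit rate and empirical competitive ratio (algorithm misses / OPT misses)."""
    total = hits + misses
    return {
        "hits": hits,
        "misses": misses,
        "hit_rate": hits / total if total else 0.0,
        "competitive_ratio": misses / opt_misses if opt_misses else float("inf"),
    }

def dump_result(row: dict, output_root_dir, dataset, fraction, predictor) -> Path:
    """Write one result row to $output_root_dir/$dataset/$fraction/$predictor.csv."""
    out = Path(output_root_dir) / dataset / str(fraction) / f"{predictor}.csv"
    out.parent.mkdir(parents=True, exist_ok=True)
    with out.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(row))
        writer.writeheader()
        writer.writerow(row)
    return out
```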
Boost

- In `real` mode, `boost` uses a pickle file from `boost_preds_dir` (creating it if it doesn't exist) to accelerate testing.
- In `oracle` mode, `boost` utilizes a multiprocessing pool to handle all test tasks concurrently.
- We also provide the `boost_fr` parameter, which enables the `Follower&Robust` algorithm to use the Belady trace for calculating the Belady cost within the algorithm. Without this optimization, the entire test process would be significantly slower.
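The oracle-mode pooling pattern can be sketched as follows (a generic illustration of fanning test tasks out over `num_workers` processes; the task function and its arguments are placeholders, not the benchmark's internals):

```python
from multiprocessing import Pool

def run_tasks(task_fn, args_list, num_workers: int = 1):
    """Run one benchmark task per argument, concurrently when num_workers > 1.

    task_fn must be picklable (a top-level function) for the pool to work.
    """
    if num_workers <= 1:
        return [task_fn(args) for args in args_list]  # serial fallback
    with Pool(processes=num_workers) as pool:
        return pool.map(task_fn, args_list)  # preserves input order
```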
Model Config

To use the Parrot or GBM predictors, you must specify the following parameters (or use the defaults):

- `model_fraction` (default 1): the fraction of the training set used for training. (In benchmarking, you must select a fraction corresponding to an existing pre-trained model.)
- `checkpoints_root_dir`: the checkpoints directory.
- `parrot_config_path`: the `parrot` predictor's config file.
- `lightgbm_config_path`: the `gbm` predictor's config file.
```shell
python -m model.lightgbm [--dataset DATASET] [--device DEVICE]
                         [--model_fraction MODEL_FRACTION] [--model_config_path MODEL_CONFIG_PATH]
                         [--checkpoints_root_dir CHECKPOINTS_ROOT_DIR] [--traces_root_dir TRACES_ROOT_DIR]
                         [--iter_threshold] [--real_test]
```

You can use our GBM training tool to generate and test the cache model.
Params:

- `dataset`: source dataset.
- `device`: model load device (for GBM, only `cpu` works).
- `model_fraction`: the proportion of the training set used for training; the default value is 1, indicating that the entire training set will be used.
- `model_config_path`: the model's config file.
- `checkpoints_root_dir`: the directory where checkpoint files are saved after training completes.
- `traces_root_dir`: the directory where the datasets are stored.
- `iter_threshold`: when enabled, iteratively adjusts the binary splitting threshold from 0 to 1 to determine the optimal threshold for binary classification; if disabled, the threshold defaults to 0.5.
- `real_test`: after training completes, the test set will be loaded to evaluate the model's performance.
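The `iter_threshold` idea can be illustrated with a simple sweep over candidate thresholds, scoring each by binary classification accuracy. This is a sketch under our own assumptions (grid granularity, accuracy as the metric); the tool's actual search may differ:

```python
def best_threshold(scores, labels, steps: int = 101):
    """Sweep thresholds in [0, 1] and return (threshold, accuracy) that
    maximizes accuracy of the rule `score >= threshold` against labels."""
    best_t, best_acc = 0.5, -1.0
    for i in range(steps):
        t = i / (steps - 1)
        acc = sum((s >= t) == bool(y) for s, y in zip(scores, labels)) / len(labels)
        if acc > best_acc:  # keep the first threshold achieving the best accuracy
            best_t, best_acc = t, acc
    return best_t, best_acc
```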
Config Example:

```json
{
  "delta_nums": 10,
  "edc_nums": 10,
  "training": {
    "boosting_type": "gbdt",
    "objective": "binary",
    "metric": "l2",
    "learning_rate": 0.01,
    "num_boost_round": 8000,
    "num_leaves": 31,
    "max_depth": 6,
    "subsample": 0.8,
    "colsample_bytree": 0.8
  }
}
```

- `delta_nums` and `edc_nums`: the dimensions of the Delta and EDC features, respectively. For more details, refer to webcachesim2 and Learning Relaxed Belady for Content Distribution Network Caching [7].
- `training`: the training parameters that will be used by LightGBM.
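Roughly, deltas are a page's most recent inter-request gaps and EDCs are exponentially decayed counters of its past accesses at increasing decay horizons. The sketch below is a simplified reading of the feature definitions in [7]; the benchmark's exact formulas (padding values, decay bases) may differ:

```python
def delta_features(access_times, delta_nums: int = 10):
    """Most recent delta_nums inter-request gaps, newest first, padded with -1."""
    gaps = [b - a for a, b in zip(access_times, access_times[1:])]
    gaps = gaps[::-1][:delta_nums]
    return gaps + [-1] * (delta_nums - len(gaps))

def edc_features(access_times, now, edc_nums: int = 10):
    """Exponentially decayed counters: sum of 2^(-age / 2^k) over past accesses,
    one counter per decay horizon k = 0 .. edc_nums-1."""
    return [
        sum(2.0 ** (-(now - t) / (2 ** k)) for t in access_times)
        for k in range(edc_nums)
    ]
```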
```shell
python -m model.parrot [--dataset DATASET] [--device DEVICE]
                       [--model_fraction MODEL_FRACTION] [--model_config_path MODEL_CONFIG_PATH]
                       [--checkpoints_root_dir CHECKPOINTS_ROOT_DIR] [--traces_root_dir TRACES_ROOT_DIR]
```
Params:

- `dataset`: source dataset.
- `device`: model load device (Parrot uses `CUDA_VISIBLE_DEVICES` to set the target device).
- `model_fraction`: the proportion of the training set used for training; the default value is 1, indicating that the entire training set will be used.
- `model_config_path`: the model's config file.
- `checkpoints_root_dir`: the directory where checkpoint files are saved after training completes.
- `traces_root_dir`: the directory where the datasets are stored.
Config Example:

```json
{
  "address_embedder": {
    "embed_dim": 64,
    "max_vocab_size": 5000,
    "type": "dynamic-vocab"
  },
  "cache_line_embedder": "address_embedder",
  "cache_pc_embedder": "none",
  "loss": ["ndcg", "reuse_dist"],
  "lstm_hidden_size": 128,
  "max_attention_history": 30,
  "pc_embedder": {
    "embed_dim": 64,
    "max_vocab_size": 5000,
    "type": "dynamic-vocab"
  },
  "positional_embedder": {
    "embed_dim": 128,
    "type": "positional"
  },
  "sequence_length": 80,
  "lr": 0.001,
  "total_steps": 25000,
  "eval_freq": 5000,
  "save_freq": 5000,
  "batch_size": 32,
  "collection_multiplier": 5,
  "dagger_init": 1,
  "dagger_final": 1,
  "dagger_steps": 200000,
  "dagger_update_freq": 50000
}
```

You can find more details about Parrot and its training in cache_replacement [8].
We also provide some scripts for training and testing (for reference only); you can find them in the `scripts` folder.
[1] Jakub Chłędowski, Adam Polak, Bartosz Szabucki, and Konrad Zolna. 2021. Robust Learning-Augmented Caching: An Experimental Study. arXiv:2106.14693. https://doi.org/10.48550/arXiv.2106.14693
[2] Thodoris Lykouris and Sergei Vassilvitskii. 2021. Competitive Caching with Machine Learned Advice. J. ACM 68, 4, Article 24 (August 2021), 25 pages. https://doi.org/10.1145/3447579
[3] Dhruv Rohatgi. 2020. Near-optimal bounds for online caching with machine-learned advice. In Proceedings of the Thirty-First Annual ACM-SIAM Symposium on Discrete Algorithms (SODA '20). Society for Industrial and Applied Mathematics, USA, 1834–1845.
[4] Karim Abdel Sadek and Marek Eliáš. 2024. Algorithms for Caching and MTS with Reduced Number of Predictions. arXiv preprint arXiv:2404.06280.
[5] Antonios Antoniadis, Joan Boyar, Marek Eliáš, Lene M. Favrholdt, Ruben Hoeksma, Kim S. Larsen, Adam Polak, and Bertrand Simon. 2023. Paging with succinct predictions. In Proceedings of the 40th International Conference on Machine Learning (ICML'23), Vol. 202. JMLR.org, Article 39, 952–968.
[6] Alexander Wei. 2020. Better and Simpler Learning-Augmented Online Caching. arXiv preprint arXiv:2005.13716.
[7] Zhenyu Song, Daniel S. Berger, Kai Li, and Wyatt Lloyd. 2020. Learning relaxed Belady for content distribution network caching. In Proceedings of the 17th USENIX Conference on Networked Systems Design and Implementation (NSDI'20). USENIX Association, USA, 529–544.
[8] Evan Zheran Liu, Milad Hashemi, Kevin Swersky, Parthasarathy Ranganathan, and Junwhan Ahn. 2020. An imitation learning approach for cache replacement. In Proceedings of the 37th International Conference on Machine Learning (ICML'20), Vol. 119. JMLR.org, Article 579, 6237–6247.
If you have any questions regarding the paper, feel free to contact me at pgchen@zju.edu.cn.
```bibtex
@misc{chen2025robustifyinglearningaugmentedcachingefficiently,
  title={Robustifying Learning-Augmented Caching Efficiently without Compromising 1-Consistency},
  author={Peng Chen and Hailiang Zhao and Jiaji Zhang and Xueyan Tang and Yixuan Wang and Shuiguang Deng},
  year={2025},
  eprint={2507.16242},
  archivePrefix={arXiv},
  primaryClass={cs.DS},
  url={https://arxiv.org/abs/2507.16242},
}
```