A curated list of Machine Learning for Systems papers.
Please feel free to open a pull request or an issue to add papers.
| System | Venue | Scenario/Objective | Algorithm | Code |
|---|---|---|---|---|
| Clara | SOSP '21 | SmartNIC Offloading | MLs | Code |
| TraceGen | SOSP '21 | Generate VM Trace | RNN | Code |
| ParM | SOSP '19 | Prediction Serving Systems | NN | Code |
| Resource Central | SOSP '17 | VM Resource Management | Random Forest | |
| Polyjuice | OSDI '21 | Transaction Concurrency Control | EA(RL) | Code |
| FIRM | OSDI '20 | Microservices Scheduler | SVM+RL | Code |
| Ansor | OSDI '20 | Graph-level Optimization for Deep Learning | GBDT | Code |
| Narya | OSDI '20 | VM Failure Mitigation Service | RL | |
| XSTORE | OSDI '20 | RDMA-based Ordered Key-Value Store | LR+NN | Code |
| Bourbon | OSDI '20 | Log-Structured Merge Trees | Piecewise LR | Code |
| LinnOS | OSDI '20 | Flash Storage | NN | Code |
| Harvest VMs | OSDI '20 | VM Scheduling | Random Forest | |
| PredictPower | ATC '21 | Datacenter Power Oversubscription | MLs | |
| CrystalPerf | ATC '21 | Performance Debugging and Reasoning | NN | |
| Habitat | ATC '21 | DNN Performance Modeling | NN | Code |
| GoCC | ATC '21 | Optimistic Concurrency Control for Go Programs | NN | Code |
| Lumi | ATC '21 | Natural Language for Network Management | NER | Code |
| Ayudante | ATC '21 | Assist Persistent Memory Programming | RL | Deleted by the author |
| AUTO | ATC '21 | Congestion Control | RL | |
| HDDse | ATC '20 | Generic Disk Failure Detection System | LSTM | |
| Percival | ATC '20 | In-browser Perceptual Ad Blocking | CNN | Code |
| Reconstructing | ATC '20 | Video Streaming | DT | Code |
| JumpSwitches | ATC '19 | Defense Against Speculative Execution Attacks | DT | |
| CognitiveSSD | ATC '19 | Unstructured Data Retrieval System | CNN | Code |
| RuleRanker | ATC '19 | Learning-augmented Systems | NN | |
| ATAD | ATC '19 | Time Series Anomaly Detection | Transfer & Active Learning | |
| Tributary | ATC '18 | Prediction Elastic Services | LSTM | |
| Mainstream | ATC '18 | Multi-Tenant Video Processing | Transfer Learning | |
| CDEF | ATC '18 | Disk Error Prediction | Random Tree | |
| SLAOrchestrator | ATC '18 | Cloud Data Analytics | NN | |
| Metis | ATC '18 | Tuning Configurations for Cloud Systems | Bayesian | |
| Selecta | ATC '18 | Cloud Storage Configurations for Data Analytics | SVD | |
| Sinan | ASPLOS '21 | Microservices Scheduler | NN | Code |
| Llama | ASPLOS '20 | Memory Allocation for C++ | LSTM | Code |
| FlexTensor | ASPLOS '20 | Graph-level Optimization for Deep Learning | RL | Code |
| Seer | ASPLOS '19 | Microservices Anomaly Detection | LSTM | |
| PES | ISCA '19 | Proactive Event Scheduling | LR | |
| NeuroPlan | SIGCOMM '21 | Network Planning | RL | Code |
| Decima | SIGCOMM '19 | Spark Cluster Job Scheduler | RL | Code |
| Pensieve | SIGCOMM '17 | Video Streaming | RL | Code |
| PerfD | NSDI '21 | Blackbox System Performance Prediction | MLs | Code |
| LRB | NSDI '20 | Content Distribution Network Caching | Imitation Learning | Code |
| Cherrypick | NSDI '17 | Cloud Configurations for Big Data Analytics | Bayesian | |
| Helios | SC '21 | DL Cluster Scheduler & Management | GBDT | |
| RLScheduler | SC '20 | HPC Batch Job Scheduler | RL | Code |
| Metis | SC '20 | Schedule Long-Running Applications in Shared Container | RL | Code |
| RRL | SC '19 | Prediction Serving Systems | RL | Code |
| SmartHarvest | EuroSys '21 | VM Resource Harvesting | LR | |
| OFC | EuroSys '21 | Caching System for FaaS | DT | Code |
| Apichecker | EuroSys '20 | Malware Detection | MLs | Code |
| Optimus | EuroSys '18 | DL Job Scheduler | NNLS | Code |
| Tiramisu | MLSys '21 | Cost Model for Deep Learning Compiler | NN | Code |
| PARIS | SoCC '17 | Choose VM Configurations | Random Forest | |
| Bao | SIGMOD '21 | Query Optimization | Tree CNN | Code |
| QuickSel | SIGMOD '20 | Query-driven Estimation | MLs | Code |
| Lachesis | VLDB '21 | Data Integration and UDF-Centric Analytics System | RL | Code |
| Naru | VLDB '20 | Query Optimization | Autoregressive Models | Code |
| CARDLEARNER | VLDB '18 | Learn Cardinality Model | ML | |
| Aurora | ICML '19 | Internet Congestion Control | RL | Code |
| LSTMDemo | ICML '18 | Memory Prefetching | LSTM | Code |
| Placeto | NeurIPS '19 | Model Parallel Device Placement | RL | Code |
| DL2 | TPDS '21 | DL Job Scheduler | NN+RL | Code |
| MLFS | CoNEXT '20 | DL Job Scheduler | RL | Code |
| MSCN | CIDR '19 | Cardinality Estimation | DL | Code |
| DeepRM | HotNets '16 | Resource Management | RL | Code |
| CDBTune | SIGMOD '19 | Cloud Database Tuning | RL | Code |
| FOOP | arXiv | Join Query Optimization | RL | Code |
| RL-Cache | JSAC '20 | Cache Admission in Edge CDN | RL | Code |
| DeepSketch | FAST '22 | Data Reduction | NN | Code |