SPLAG is an FPGA accelerator for the single-source shortest path (SSSP) problem, featuring:
- A coarse-grained priority queue (CGPQ) that enables high-throughput priority-order graph traversal with a large frontier
- A customized vertex cache (CVC) that reduces off-chip memory access and improves the random-access throughput to read and update vertex data
- Outstanding performance & energy efficiency on a single U280 FPGA
- Up to a 4.9× speedup over state-of-the-art SSSP accelerators
- Up to a 2.6× speedup over 32-thread CPU running at 4.4 GHz
- Up to a 0.9× speedup over an A100 GPU (that has 4.1× power budget and 3.4× HBM bandwidth)
- Could be placed in the 14th position of the Graph 500 benchmark for data intensive applications
- The highest using a single FPGA with only a 45 W power budget
- Open-source and fully parameterized TAPA HLS C++ implementation
- Easily portable to a different FPGA with a different configuration
- TAPA
- AutoBridge
- Xilinx Alveo U280 FPGA and its
xilinx_u280_xdma_201920_3shell platform
git clone https://github.com/UCLA-VAST/splag.git
cd splagmkdir build
cd build
cmake ..
make sssp
./sssp ../data/graph500-scale-5make sssp-hw-xomake sssp-cosimmake SSSP.xilinx_u280_xdma_201920_3.hw_xclbin./sssp -v=3 --bitstream=SSSP.xilinx_u280_xdma_201920_3.hw.xclbin ../data/graph500-scale-5Currently, SPLAG takes as input the same binary edge list format as Graph 500.
Each dataset is stored as two separate files in the same directory, e.g., dataset and dataset.weights.
The dataset file is an array of PackedEdge.
The dataset.weights file is an array of 32-bit float.
The dataset is assumed to be undirected.
See the release page for more datasets and instructions on reproducing the experimental results in the FPGA'22 paper.