CUDA implementation of DBSCAN, inspired by the paper https://www.sciencedirect.com/science/article/pii/S1877050913003438
## Source code
- In `gdbscan_paper.cu` you can find the implementation most similar to the one proposed in the original paper.
- In `gdbscan.cu` you can find my naive implementation, which improves on `gdbscan_paper.cu` by removing some sequential loops from the host in the BFS code.
- In `gdbscan_shifted.cu` there's an attempt to change the memory access pattern in `compute_degrees` and `compute_adjacency_list`. It is a modification of `gdbscan.cu`.
- In `gdbscan_shared.cu` there's an attempt to exploit shared memory in `compute_degrees` and `compute_adjacency_list`. It is a modification of `gdbscan.cu`.
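
The kernel names `compute_degrees` and `compute_adjacency_list` suggest a two-pass graph construction: count each point's neighbors within a radius, take an exclusive prefix sum of the counts to get each point's offset, then fill one flat adjacency array. Below is a minimal CPU sketch of that idea in NumPy, for illustration only. The function bodies, the `eps` parameter name, the brute-force distance matrix, and the choice to exclude a point from its own neighborhood are all assumptions and are not taken from the `.cu` sources.

```python
import numpy as np


def compute_degrees(points, eps):
    """Count, for each point, the other points within distance eps."""
    # Brute-force pairwise squared distances: O(n^2) memory, toy sizes only.
    diff = points[:, None, :] - points[None, :, :]
    dist2 = np.einsum("ijk,ijk->ij", diff, diff)
    within = dist2 <= eps * eps
    # Assumption: a point is not counted as its own neighbor.
    np.fill_diagonal(within, False)
    return within.sum(axis=1), within


def compute_adjacency_list(degrees, within):
    """Build per-point start offsets and a flat adjacency array."""
    # Exclusive prefix sum: index where each point's neighbor run begins.
    starts = np.concatenate(([0], np.cumsum(degrees)[:-1]))
    # np.nonzero scans row by row, so neighbors come out grouped by point.
    adjacency = np.nonzero(within)[1]
    return starts, adjacency


# Three points on a line, 0.5 apart, plus one far-away point.
points = np.array([[0.0, 0.0], [0.5, 0.0], [1.0, 0.0], [5.0, 5.0]])
degrees, within = compute_degrees(points, eps=0.6)
starts, adjacency = compute_adjacency_list(degrees, within)
```

The neighbors of point `i` are then `adjacency[starts[i] : starts[i] + degrees[i]]`, which is the layout a BFS over the graph can walk without any per-point allocation.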

`G-DBSCAN.ipynb` is the notebook run on Google Colab to compare sklearn's DBSCAN against my implementations.

Profiling was conducted with 200,000 10-dimensional points generated by sklearn's `make_blobs`.
The data is stored in `profiling/data.txt`.
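
A dataset of this shape could be produced roughly as follows. This is a sketch, not the script used for profiling: the helper name `generate_points`, the number of centers, the seed, and the text layout (one point per line, whitespace-separated coordinates) are assumptions.

```python
import numpy as np
from sklearn.datasets import make_blobs


def generate_points(path, n_samples=200_000, n_features=10, centers=5, seed=0):
    """Generate blob-shaped points and write them one point per line."""
    # centers and seed are illustrative guesses, not the profiling values.
    points, _ = make_blobs(
        n_samples=n_samples,
        n_features=n_features,
        centers=centers,
        random_state=seed,
    )
    # Assumed layout: whitespace-separated coordinates, one row per point.
    np.savetxt(path, points, fmt="%.6f")
    return points
```

Calling `generate_points("data.txt")` with the defaults writes 200,000 rows of 10 values each; smaller `n_samples` values are handy for quick checks.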
- In `profiling/paper` there are the execution times and the profiling of each kernel for the `gdbscan_paper.cu` source code.
- In `profiling/standard` there are the execution times and the profiling of each kernel for the `gdbscan.cu` source code.
- In `profiling/shifted` there are the execution times and the profiling of each kernel for the `gdbscan_shifted.cu` source code.
- In `profiling/shared` there are the execution times and the profiling of each kernel for the `gdbscan_shared.cu` source code.

There are also the clustering output files, to verify clustering quality against sklearn's implementation.
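
One simple way to compare two clusterings is to check that they group the points identically regardless of which integer each cluster was given. The sketch below assumes the labels have already been loaded into two sequences in the same point order; the helper name `same_partition` is hypothetical, and the format of the output files is not assumed here.

```python
def same_partition(labels_a, labels_b):
    """Return True if both labelings group the points identically,
    ignoring the actual label values."""
    if len(labels_a) != len(labels_b):
        return False
    a_to_b, b_to_a = {}, {}
    for a, b in zip(labels_a, labels_b):
        # setdefault stores the first pairing seen for each label;
        # any later disagreement means the groupings differ.
        if a_to_b.setdefault(a, b) != b or b_to_a.setdefault(b, a) != a:
            return False
    return True
```

An exact match is a strict criterion: DBSCAN border points can legitimately end up in different clusters depending on processing order, so a tolerant score such as `sklearn.metrics.adjusted_rand_score` may be the better fit when small differences are expected.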