Below we describe the steps we took to produce the results in this paper. Some steps rely on the GraphSense API and others on Spark jobs running on a GraphSense cluster; in both cases we also provide the resulting data.
You can run all the cells in the notebooks, or skip some by re-using the data we already fetched, computed and stored.
- python3 + jupyter notebook
- scala
- R
- get the data from Zenodo and place the `data` folder next to the `src` folder
- `pip install -r requirements.txt`
- [optional] an API token for GraphSense
- [optional] GraphSense instance for Spark jobs
Open each notebook and follow the instructions to run the cells, explore the data, and reproduce and check the results.
Collect snapshots of the Lightning Network by using `describegraph` from the LND client and store the data in `data/level_2/` in three CSV files: `channel.csv`, `node.csv`, `ip_address.csv`.
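
As a rough illustration of this step, the sketch below flattens a graph in the shape printed by `lncli describegraph` into rows for the three files. The JSON field names (`nodes`, `edges`, `pub_key`, `alias`, `addresses`, `channel_id`, `chan_point`, `node1_pub`, `node2_pub`, `capacity`) follow LND's output; the CSV column names and the helpers `flatten_graph` and `rows_to_csv` are assumptions made for illustration and may differ from the layout used in the notebooks.

```python
import csv
import io


def flatten_graph(graph):
    """Split a describegraph-style dict into node, channel and IP address rows."""
    nodes, channels, ips = [], [], []
    for n in graph.get("nodes", []):
        nodes.append({"pub_key": n["pub_key"], "alias": n.get("alias", "")})
        for a in n.get("addresses", []):
            ips.append({"pub_key": n["pub_key"], "ip_address": a["addr"]})
    for e in graph.get("edges", []):
        channels.append({
            "channel_id": e["channel_id"],
            "chan_point": e["chan_point"],
            "node1_pub": e["node1_pub"],
            "node2_pub": e["node2_pub"],
            "capacity": int(e["capacity"]),  # LND reports capacity as a string
        })
    return nodes, channels, ips


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Tiny hand-made sample in the shape of describegraph output (not real data).
sample = {
    "nodes": [
        {"pub_key": "02aa", "alias": "alice",
         "addresses": [{"network": "tcp", "addr": "192.0.2.1:9735"}]},
        {"pub_key": "03bb", "alias": "bob", "addresses": []},
    ],
    "edges": [
        {"channel_id": "1", "chan_point": "ab" * 32 + ":0",
         "node1_pub": "02aa", "node2_pub": "03bb", "capacity": "50000"},
    ],
}

nodes, channels, ips = flatten_graph(sample)
node_csv = rows_to_csv(nodes, ["pub_key", "alias"])
```

In practice the input would come from `json.load` on a file produced by `lncli describegraph`, and the CSV text would be written under `data/level_2/`.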
Fetch funding and settlement transactions and addresses starting from the list of channels in channel.csv.
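
In LND, a channel's `chan_point` has the form `funding_txid:output_index`, so the funding transaction of each channel can be read directly from the channel list. The sketch below shows that parsing step only; the helper name `parse_chan_point` is hypothetical, and it assumes `channel.csv` keeps the `chan_point` value unchanged. Finding the settlement transaction (the one that spends the funding output) requires a blockchain lookup, for example through the GraphSense API, which is not shown here.

```python
def parse_chan_point(chan_point):
    """Split 'txid:index' into (funding txid, funding output index)."""
    txid, _, index = chan_point.rpartition(":")
    if not txid or not index.isdigit():
        raise ValueError(f"unexpected chan_point format: {chan_point!r}")
    return txid, int(index)


# Made-up example value, not a real transaction.
funding_txid, funding_vout = parse_chan_point("ab" * 32 + ":1")
```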
Cluster BTC entities based on their interaction with the LN and produce a mapping between entities and components (stars, snakes, collectors, proxies).
Cluster LN nodes based on their aliases and IP information using different metrics.
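
To give an idea of what alias- and IP-based clustering can look like, here is a minimal sketch that merges two nodes when their aliases are similar or when they share an IP address, using a union-find structure. The similarity measure (`difflib.SequenceMatcher`), the `0.85` threshold, and the `cluster_nodes` helper are assumptions for illustration; the notebooks compare several metrics and may use different ones.

```python
from difflib import SequenceMatcher
from itertools import combinations


def cluster_nodes(nodes, threshold=0.85):
    """Cluster nodes given as {pub_key: (alias, set_of_ips)}.

    Two nodes end up in the same cluster if their aliases have a
    similarity ratio >= threshold or if they share at least one IP.
    Returns a list of sets of pub_keys.
    """
    parent = {k: k for k in nodes}

    def find(x):
        # Find the cluster representative, with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(nodes, 2):
        alias_a, ips_a = nodes[a]
        alias_b, ips_b = nodes[b]
        similar = SequenceMatcher(None, alias_a, alias_b).ratio() >= threshold
        if similar or ips_a & ips_b:
            parent[find(a)] = find(b)

    clusters = {}
    for k in nodes:
        clusters.setdefault(find(k), set()).add(k)
    return list(clusters.values())


# Made-up nodes: n1/n2 have near-identical aliases, n3/n4 share an IP.
example = {
    "n1": ("LNBIG.com [lnd-01]", {"192.0.2.1"}),
    "n2": ("LNBIG.com [lnd-02]", {"192.0.2.2"}),
    "n3": ("alice", {"192.0.2.3"}),
    "n4": ("bob", {"192.0.2.3"}),
}
clusters = cluster_nodes(example)
```

The pairwise comparison is quadratic in the number of nodes, which is acceptable for a sketch but would need blocking or indexing on a full network snapshot.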
Link BTC entities to LN nodes with two linking heuristics and compare the results with ground truth data we collected.
First, based on the knowledge of who created each channel, identify entities that own large shares of the capacity. Second, study the attack potential based on the off-chain clustering. This covers griefing attacks, DoS, wormholes, value privacy and relationship anonymity.
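
The capacity-share part of this step amounts to a grouped sum. The sketch below assumes each channel has already been attributed to the entity that funded it, given as `(entity, capacity)` pairs; the `capacity_shares` helper and the input layout are hypothetical and only illustrate the computation.

```python
from collections import defaultdict


def capacity_shares(channels):
    """Return {entity: fraction of total capacity} for (entity, capacity) pairs."""
    totals = defaultdict(int)
    for entity, capacity in channels:
        totals[entity] += capacity
    total = sum(totals.values())
    if total == 0:
        return {}
    return {entity: cap / total for entity, cap in totals.items()}


# Made-up capacities in satoshis.
shares = capacity_shares([("A", 600), ("B", 300), ("A", 100)])
```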
- keyspace used: `btc_transformed_20200909` (last block height: 618857)