Reads transactions from the Bitcoin blockchain into a file representing the transaction graph and performs address clustering on that graph.
The clustering is performed entirely in memory. It takes only about half an hour but requires around 64 GB of memory, depending on the heuristics used.
Requires a C++11 compiler; there are no additional dependencies.
Run make to build all required executables.
Each executable prints detailed usage instructions when run without any parameters. The general workflow is as follows:
- Run bitcoind until it has downloaded the current blockchain (this takes a long time).
- Run ./parsebc on your blockchain data. This program creates the transaction graph binary file.
- Check with ./cattxgraph whether the binary graph file looks good.
- Run ./txperaddr to extract the number of transactions per address. This information is required by some of the heuristics.
- Run ./clusterize on the transaction graph file and the transactions-per-address file to perform the actual clustering.
- Check the results with ./catcluster, or create a histogram of the cluster sizes with ./histcluster.
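
To give a rough idea of what in-memory address clustering involves, here is a minimal sketch using a union-find structure and the common-input heuristic (all input addresses of one transaction are assumed to belong to the same owner). This is an illustration only: this README does not state which heuristics clusterize applies, and the names AddressClusters, TxInputs and clusterByCommonInputs are hypothetical, as is the assumption that addresses have already been mapped to dense integer IDs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical in-memory form of a transaction: only the address IDs that
// appear as its inputs. This is NOT the format written by parsebc.
typedef std::vector<std::uint32_t> TxInputs;

// Disjoint-set (union-find) structure over address IDs 0..n-1.
class AddressClusters {
public:
    explicit AddressClusters(std::size_t n) : parent_(n), size_(n, 1) {
        // Initially every address is its own cluster.
        std::iota(parent_.begin(), parent_.end(), 0u);
    }

    // Returns the representative address of the cluster containing a.
    std::uint32_t find(std::uint32_t a) {
        assert(a < parent_.size());
        while (parent_[a] != a) {
            parent_[a] = parent_[parent_[a]];  // path halving
            a = parent_[a];
        }
        return a;
    }

    // Merges the clusters of a and b (union by size).
    void unite(std::uint32_t a, std::uint32_t b) {
        a = find(a);
        b = find(b);
        if (a == b) return;
        if (size_[a] < size_[b]) std::swap(a, b);
        parent_[b] = a;
        size_[a] += size_[b];
    }

    // Number of addresses in the cluster containing a.
    std::uint32_t clusterSize(std::uint32_t a) { return size_[find(a)]; }

private:
    std::vector<std::uint32_t> parent_;
    std::vector<std::uint32_t> size_;
};

// Common-input heuristic (assumed here for illustration): every input
// address of a transaction is merged into the cluster of the first input.
AddressClusters clusterByCommonInputs(const std::vector<TxInputs>& txs,
                                      std::size_t numAddresses) {
    AddressClusters clusters(numAddresses);
    for (const TxInputs& inputs : txs) {
        for (std::size_t i = 1; i < inputs.size(); ++i) {
            clusters.unite(inputs[0], inputs[i]);
        }
    }
    return clusters;
}
```

Calling clusterByCommonInputs on a list of transactions and the total number of address IDs returns the final clustering; find(a) == find(b) then tells whether two addresses ended up in the same cluster. The structure needs two 32-bit integers per address, which illustrates why keeping everything in memory is fast but memory-hungry at blockchain scale.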