A C library implementing the POST API setup method for general-purpose CPUs and for CUDA and Vulkan compute processors.
Windows 10, macOS or Ubuntu. One or more of the following processors:
- A GPU and drivers with CUDA support (minimum compute capability 5.0, maximum compute capability 8.6), such as a modern Nvidia GPU and Nvidia drivers version R450 or newer.
- A GPU and drivers with Vulkan 1.3 support, such as a modern AMD or Intel GPU or an Apple M1 processor.
- An x86-64 CPU, such as an AMD or Intel CPU.
- Both discrete and on-board GPUs are supported as long as they support the minimum CUDA or Vulkan runtime version.
- We currently provide release binaries and build instructions for Ubuntu 20.04 but the library can be built on other Linux distros for usage on these systems.
- 16 KiB per CUDA core for CUDA
- 4 MiB per compute unit for Vulkan
- 2080 MiB
- For building CUDA support: NVIDIA CUDA Toolkit 11, an NVIDIA GPU with CUDA support, and an NVIDIA driver version R450 or newer.
- For building Vulkan support: Vulkan SDK 1.3 and a GPU with Vulkan 1.3 runtime support.
- Windows 10 Pro.
- Microsoft Visual Studio 2022
- You may also need to install specific versions of the Windows SDK if prompted when building the library for the first time.
- Ubuntu 22.04
- CMake, GCC 7
- Xcode
- Xcode Command Line Dev Tools
- CMake, GCC 7
- Install the latest version of Xcode with the command line dev tools.
- Download the Vulkan 1.3 SDK installer for macOS from https://vulkan.lunarg.com/sdk/home#mac
- Install the Vulkan SDK with the Vulkan installer.
- Change directory to the folder where the SDK is installed (default: `cd $HOME/VulkanSDK/1.3.xxx`) and run the install script with `sudo ./install_vulkan.py`
- Add the Vulkan env vars to your `.bash_profile` file, with the root location set to the SDK directory on your hard drive. For example, if Vulkan SDK 1.3.xxx is installed, the env vars should be set like this:
export VULKAN_SDK_VERSION="1.3.xxx" # Replace xxx with actual version
export VULKAN_ROOT_LOCATION="$HOME/VulkanSDK/1.3.xxx" # adapt to install location on your machine
export VULKAN_SDK="$VULKAN_ROOT_LOCATION/macOS"
export VK_ICD_FILENAMES="$VULKAN_SDK/share/vulkan/icd.d/MoltenVK_icd.json"
export VK_LAYER_PATH="$VULKAN_SDK/share/vulkan/explicit_layers.d"
export PATH="/usr/local/opt/python/libexec/bin:$VULKAN_SDK/bin:$PATH"
export DYLD_LIBRARY_PATH="$DYLD_LIBRARY_PATH:$VULKAN_SDK/lib/"
Default build configuration on Windows and Linux:
SPACEMESHCUDA "Build with CUDA support" default: ON
SPACEMESHVULKAN "Build with Vulkan support" default: ON
Default build configuration on macOS:
SPACEMESHCUDA "Build with CUDA support" default: OFF
SPACEMESHVULKAN "Build with Vulkan support" default: ON
To build the library with full support for both CUDA and Vulkan on Windows or on Linux use a system with an Nvidia GPU and drivers. Otherwise, turn off CUDA support and build for Vulkan only. Building on macOS only supports Vulkan.
- Open the project folder in Visual Studio 2017: `File -> Open -> Folder`.
- Set the `x64-Release` Project Settings.
- Build: `CMake -> Rebuild All`.
- Run test: `CMake -> Debug from Build Folder -> gpu-setup-test.exe`
Configure your build using the default configuration:
cmake -B build
To disable CUDA use:
cmake -B build -DSPACEMESHCUDA=OFF
To disable Vulkan use:
cmake -B build -DSPACEMESHVULKAN=OFF
Build the project:
cmake --build build
Run the tests:
./build/test/gpu-setup-test -t
./build/test/gpu-setup-test -u
./build/test/gpu-setup-test -b
- Since the test app is not notarized, you need to enable it via `spctl --add /path/to/gpu-setup-test` or by right-clicking it and choosing Open.
- Set execute permissions, e.g. `chmod a+x gpu-setup-test`
- Add the test app's path to the dynamic lib search path, e.g. `export DYLD_LIBRARY_PATH=.`
- Set execute permissions, e.g. `chmod a+x gpu-setup-test`
- Add the test app's path to the dynamic lib search path, e.g. `export LD_LIBRARY_PATH=.`
Run from the console to print usage:
$ gpu-setup-test
Usage:
--list or -l print available providers
--benchmark or -b run benchmark
--core or -c test the core library use case
--test or -t run basic test
--test-vector-check run a CPU test and compare with test-vector
--test-pow or -tp test pow computation
--test-leafs-pow or -tlp test pow computation while computing leafs
--unit-tests or -u run unit tests
--integration-tests or -i run integration tests
--label-size or -s <1-256> set label size [1-256]
--labels-count or -n <1-32M> set labels count [up to 32M]
--reference-provider or -r <id> the result of this provider will be used as a reference [default - CPU]
--print or -p print detailed data comparison report for incorrect results
--pow-diff or -d <0-256> count of leading zero bits in target D value [default - 16]
--srand-seed or -ss <unsigned int> set srand seed value for POW test: 0 - use zero id/seed [default], -1 - use random value
--solution-idx or -si <unsigned int> set solution index for POW test: index will be compared to be the found solution for Pow [default - unset]
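The `--pow-diff` option expresses difficulty as a count of leading zero bits in the 256-bit target D. The following is a minimal sketch of how such a target could be built and compared against. It assumes D is a 32-byte big-endian value whose bits after the leading zeros are all set; the library's actual convention may differ. `make_target` and `below_target` are hypothetical helpers, not part of the library.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fill a 32-byte target with `zero_bits` leading zero
   bits followed by one bits. Assumes big-endian byte order (D[0] is the most
   significant byte), which is an assumption of this sketch. */
static void make_target(uint8_t D[32], unsigned zero_bits)
{
    if (zero_bits > 256) {
        zero_bits = 256;                 /* clamp to the documented 0-256 range */
    }
    memset(D, 0xFF, 32);                 /* start with every bit set */
    unsigned full_bytes = zero_bits / 8;
    unsigned extra_bits = zero_bits % 8;
    memset(D, 0x00, full_bytes);         /* whole bytes of leading zeros */
    if (extra_bits != 0) {
        /* clear the top `extra_bits` bits of the next byte */
        D[full_bytes] = (uint8_t)(0xFFu >> extra_bits);
    }
}

/* Returns non-zero when the 32-byte hash is numerically below the target,
   comparing both as big-endian integers. */
static int below_target(const uint8_t hash[32], const uint8_t D[32])
{
    return memcmp(hash, D, 32) < 0;
}
```

Under these assumptions, the default difficulty of 16 yields a target whose first two bytes are zero and whose remaining bytes are 0xFF.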
By default, the library does not detect supported Vulkan GPUs if CUDA GPUs are detected. This behavior can be changed using two environment variables:
SPACEMESH_DUAL_ENABLED
empty or 0 - default behavior
1 - detect Vulkan GPUs even if CUDA GPUs are detected
SPACEMESH_PROVIDERS_DISABLED
empty - default behavior
"cuda" - do not detect CUDA GPUs
"vulkan" - do not detect Vulkan GPUs
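The detection rules above can be summarized as a small decision function. This is only a sketch of the documented behavior, not the library's implementation; `detect_cuda` and `detect_vulkan` are hypothetical names, and the arguments are the raw values of the two environment variables (NULL when unset).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True when `value` is set and equals `expected`. */
static bool str_eq(const char *value, const char *expected)
{
    return value != NULL && strcmp(value, expected) == 0;
}

/* CUDA GPUs are detected unless SPACEMESH_PROVIDERS_DISABLED is "cuda". */
static bool detect_cuda(const char *providers_disabled)
{
    return !str_eq(providers_disabled, "cuda");
}

/* Vulkan GPUs are detected unless disabled, and are skipped by default
   when a CUDA GPU was already found (unless SPACEMESH_DUAL_ENABLED is "1"). */
static bool detect_vulkan(const char *dual_enabled,
                          const char *providers_disabled,
                          bool cuda_gpu_found)
{
    if (str_eq(providers_disabled, "vulkan")) {
        return false;
    }
    if (!cuda_gpu_found) {
        return true;
    }
    return str_eq(dual_enabled, "1");
}
```

A caller would pass `getenv("SPACEMESH_DUAL_ENABLED")` and `getenv("SPACEMESH_PROVIDERS_DISABLED")`, with `cuda_gpu_found` set to false whenever CUDA detection was disabled.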
The library supports multiple compute providers at runtime. For best performance, use the following providers based on your OS and GPU:
OS / GPU | Windows | Linux | macOS |
---|---|---|---|
Nvidia | CUDA | CUDA | Vulkan |
AMD | Vulkan | Vulkan | Vulkan |
Intel | Vulkan | Vulkan | Vulkan |
Apple M1 | Vulkan | Vulkan | Vulkan |
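The table reduces to a single rule: CUDA for Nvidia GPUs on Windows and Linux, Vulkan everywhere else. A minimal sketch of that lookup follows; `preferred_provider` and its string arguments are hypothetical and chosen only for illustration.

```c
#include <string.h>

/* Hypothetical lookup mirroring the table above. `gpu_vendor` and `os` use
   the same spellings as the table's row and column labels. */
static const char *preferred_provider(const char *gpu_vendor, const char *os)
{
    if (strcmp(gpu_vendor, "Nvidia") == 0 && strcmp(os, "macOS") != 0) {
        return "CUDA";
    }
    return "Vulkan";
}
```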
Compute leaves and/or pow solution:
int scryptPositions(
uint32_t provider_id, // POST compute provider ID
const uint8_t *id, // 32 bytes
uint64_t start_position, // e.g. 0
uint64_t end_position, // e.g. 49,999
uint32_t hash_len_bits, // (1...256) for each hash output, the number of prefix bits (not bytes) to copy into the buffer
const uint8_t *salt, // 32 bytes
uint32_t options, // compute leafs and/or compute pow
uint8_t *out, // memory buffer large enough to include hash_len_bits * number of requested hashes
uint32_t N, // scrypt N
uint32_t R, // scrypt r
uint32_t P, // scrypt p
uint8_t *D, // Target D for the POW computation. 256 bits.
uint64_t *idx_solution, // index of output where output < D if POW compute was on. MAX_UINT64 otherwise.
uint64_t *hashes_computed, // The number of hashes computed, should be equal to the number of requested hashes.
uint64_t *hashes_per_sec // Performance
);
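The `out` buffer must hold `hash_len_bits` bits for every requested position. The sketch below shows one way a caller might size it. It assumes `end_position` is inclusive (so 0 to 49,999 is 50,000 hashes) and that the prefixes are packed back to back without padding; both are assumptions, and `out_buffer_bytes` is a hypothetical helper.

```c
#include <stdint.h>

/* Hypothetical helper: bytes needed for `out`, assuming an inclusive
   end_position and tightly packed hash prefixes. Returns 0 for an invalid
   request. Overflow for extremely large ranges is not checked. */
static uint64_t out_buffer_bytes(uint64_t start_position,
                                 uint64_t end_position,
                                 uint32_t hash_len_bits)
{
    if (end_position < start_position ||
        hash_len_bits < 1 || hash_len_bits > 256) {
        return 0;
    }
    uint64_t count = end_position - start_position + 1;
    uint64_t total_bits = count * (uint64_t)hash_len_bits;
    return (total_bits + 7) / 8;         /* round up to whole bytes */
}
```

For example, under these assumptions positions 0 to 49,999 with 8-bit labels need 50,000 bytes, and the same range with 1-bit labels needs 6,250 bytes.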
The API currently supports only the following scrypt N, R and P parameters.
- Supported N values: 1 - 28835
- Supported R values: 1
- Supported P values: 1
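A caller may want to reject unsupported parameters before invoking the API. The check below simply encodes the ranges listed above; `scrypt_params_supported` is a hypothetical helper and not part of the library.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pre-flight check using the ranges listed above:
   N in 1..28835, R == 1, P == 1. */
static bool scrypt_params_supported(uint32_t N, uint32_t R, uint32_t P)
{
    return N >= 1 && N <= 28835 && R == 1 && P == 1;
}
```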
Gets the system's GPU capabilities, e.g. CUDA and/or Vulkan, or NONE:
int stats();
Stops all GPU work and stops filling the passed-in buffer with any more results:
int stop(
uint32_t ms_timeout // timeout in milliseconds
);
Returns non-zero if a stop is in progress:
SPACEMESHAPI int spacemesh_api_stop_inprogress();
Returns POST compute provider info:
SPACEMESHAPI int spacemesh_api_get_providers(
PostComputeProvider *providers, // out providers info buffer, if NULL - returns count of available providers
int max_providers // buffer size
);
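Because passing NULL returns the provider count, a caller can query the count first and then allocate a buffer of the right size. The sketch below illustrates that two-call pattern. The `PostComputeProvider` fields shown are placeholders (the real definition comes from api.h), `fetch_providers` is a hypothetical helper, and `mock_get_providers` stands in for `spacemesh_api_get_providers` so the example runs without the library. It also assumes the second call returns the number of entries written, which the text above does not state.

```c
#include <stdint.h>
#include <stdlib.h>

/* Placeholder type: the real struct is defined in api.h; these fields are
   assumed for illustration only. */
typedef struct {
    uint32_t id;
    char model[256];
} PostComputeProvider;

/* Same shape as spacemesh_api_get_providers. */
typedef int (*get_providers_fn)(PostComputeProvider *providers,
                                int max_providers);

/* Query the count with a NULL buffer, then allocate and fetch.
   Returns a heap array the caller must free, or NULL when nothing is
   available; *count receives the number of entries filled. */
static PostComputeProvider *fetch_providers(get_providers_fn get, int *count)
{
    *count = 0;
    int available = get(NULL, 0);
    if (available <= 0) {
        return NULL;
    }
    PostComputeProvider *buf = calloc((size_t)available, sizeof *buf);
    if (buf == NULL) {
        return NULL;
    }
    int filled = get(buf, available);
    if (filled <= 0) {
        free(buf);
        return NULL;
    }
    *count = filled < available ? filled : available;
    return buf;
}

/* Stand-in for the library call: reports two made-up providers. */
static int mock_get_providers(PostComputeProvider *providers, int max_providers)
{
    static const PostComputeProvider known[2] = {
        {0, "mock GPU"}, {1, "mock CPU"}
    };
    if (providers == NULL) {
        return 2;
    }
    int n = max_providers < 2 ? max_providers : 2;
    for (int i = 0; i < n; i++) {
        providers[i] = known[i];
    }
    return n;
}
```

With the real library, `fetch_providers(spacemesh_api_get_providers, &count)` would be the equivalent call, using the struct from api.h instead of the placeholder.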
- Download the release artifacts for your platform from a GitHub release in this repo, or build the artifacts from source code.
- Copy all artifacts to your project resources directory. The files should be included in your app's runtime resources.
- Include api.h to call the library from your code.
Integration test of the basic library use case in a Spacemesh full node to generate proof of space and find a pow solution:
./build/test/gpu-setup-test -c -n 100663296 -d 20
Disclaimer: these are community-submitted benchmarks which haven't been verified. Your mileage may vary. The library is of alpha quality and is likely to have bugs, and the gpu-post algorithm is likely to change before the release of the Spacemesh 0.2 testnet.
gpu-setup-test -b -n 2000000
Date | Reporter | Release | Compute Provider | OS & CPU | Type | Driver | mh/s |
---|---|---|---|---|---|---|---|
06/21/2021 | Obsidian | v0.1.20 | Geforce RTX 2080ti 11GB @ stock (1350 mhz / 7000 mhz) | Windows 10 Pro v20H2, Build 19042.985, Intel i7-6700K @ 4.6ghz (HT enabled: 4c/8t) | CUDA | NVIDIA 466.11 | 2.56 |
06/22/2021 | Scerbera | v0.1.20 | Geforce RTX 2060 SUPER | Windows 10 | CUDA | NVIDIA 466.11 | 1.7 |
06/22/2021 | Scerbera | v0.1.20 | AMD Radeon Pro WX 7100 | Windows 10 | CUDA | NVIDIA 466.11 | 0.88 |
06/22/2021 | Scerbera | v0.1.20 | RX VEGA 64 - Core Clock 1500 MHz - Memory Clock 960MHz | Intel i7-8700K Windows 10 | Vulkan | Pro 20.Q4 | 0.9 |
06/22/2021 | Scerbera | v0.1.20 | WX7100 - Core Clock 1250MHz - Memory Clock 1700 MHz | Intel i7-8700K Windows 10 | Vulkan | Pro 20.Q4 | 0.87 |
06/28/2021 | cmoetzing | v0.1.20 | MSI GeForce RTX 2060 VENTUS GP OC - Core Clock 1365MHz - Memory Clock 1750 MHz | Ubuntu 20.04 Core i5-11600k | CUDA | NVIDIA 465.19.01 | 1.36 |
06/29/2021 | avive | v0.1.21 | GeForce RTX 3090 | Ubuntu 20.04 | CUDA | Nvidia 460.80 | 4.97 |
06/29/2021 | avive | v0.1.21 | GeForce RTX 3080 | Ubuntu 20.04 | CUDA | Nvidia 460.80 | 4.08 |
06/30/2021 | shanyaa | v0.1.21 | GeForce RTX 3070 @ 1.9 Ghz core, 6.8 Ghz mem | Windows 10 / AMD Ryzen 5800X | CUDA | Nvidia 466.63 | 2.7 |
06/30/2021 | shanyaa | v0.1.21 | GeForce RTX 3070 @ 2 Ghz core, 8.08 Ghz mem | Windows 10 / AMD Ryzen 5800X | CUDA | Nvidia 466.63 | 3.43 |
07/01/2021 | avive | v0.1.21 | Nvidia CMP 30HX | Ubuntu 20.04.2 LTS | CUDA | Nvidia 460.80 | 1.45 |
07/01/2021 | avive | v0.1.21 | GeForce RTX 2060 | Ubuntu 20.04.2 LTS | CUDA | Nvidia 465.27 | 1.56 |
07/01/2021 | shanyaa | v0.1.21 | Intel Iris Xe (integrated graphics) | Windows 10 / Intel core i7 1165G7 | Vulkan | Intel 27.20.100.9565 | 0.28 |
07/03/2021 | neodied | v0.1.21 | Radeon 5700XT @ 1333 MHz core, 1824 MHz mem | Windows 10 / Intel core i7 9700K | Vulkan | AMD Radeon Software 21.6.1 | 1.38 |
07/03/2021 | neodied | v0.1.21 | Radeon 5700XT @ 2016 MHz core, 1748 MHz mem | Windows 10 / Intel core i7 9700K | Vulkan | AMD Radeon Software 21.6.1 | 1.87 |
Scrypt Benchmarks (n=512, r=1, p=1) 1 byte per leaf, batch size leaves per API call.
Date | Reporter | impl | cpu / gpu | Host OS | notes | kh/s | mh/s | x factor over 1 4ghz cpu native thread | x factor over 12 4ghz cpu native threads |
---|---|---|---|---|---|---|---|---|---|
11/19/2019 | ae | go-scrypt | mbp + Intel i9 @ 2.9ghz - 1 core | OS X | go scrypt crypto lib (not scrypt-jane) | 7 | 0.01 | 1 | 1 |
11/19/2019 | ae | sm-scrypt | Ryzen 5 2600x @ 4ghz - 1 core | Windows 10 | scrypt-jane c code | 7 | 0.01 | 1 | 1 |
11/19/2019 | ae | sm-scrypt | Nvidia Geforce RTX 2070 8GB | Windows 10 | pre-optimized prototype | 1,920 | 1.92 | 290 | 24.17 |
11/19/2019 | ae | sm-scrypt | AMD Radeon RX 580 | Windows 10 | pre-optimized prototype | 500 | 0.50 | 76 | 6.29 |
11/19/2019 | ar | sm-scrypt | Nvidia GTX 1060 6G | Windows 10 | pre-optimized prototype | 979 | 0.98 | 148 | 12.32 |
11/19/2019 | ar | sm-scrypt | AMD Radeon 570 4GB | Windows 10 | pre-optimized prototype | 355 | 0.36 | 54 | 4.47 |
11/12/2019 | ae | sm-scrypt | AMD Radeon RX 580 | Windows 10 | optimized prototype | 926 | 0.93 | 140 | 11.65 |
11/12/2019 | ae | sm-scrypt | AMD Radeon RX 580 | Ubuntu 18.04.3 LTS | optimized prototype | 893 | 0.89 | 135 | 11.24 |
11/12/2019 | ae | sm-scrypt | Nvidia Geforce RTX 2070 8GB | Ubuntu 19.10 LTS | optimized prototype | 1,923 | 1.92 | 292 | 24.37 |
01/22/2020 | seagiv | sm-scrypt | Nvidia GTX 1060 6G | Windows 10 | vulkan pre-optimized prototype | 276 | | | |
01/22/2020 | seagiv | sm-scrypt | AMD Radeon 570 4GB | Windows 10 | vulkan pre-optimized prototype | 269 | | | |
01/27/2020 | seagiv | sm-scrypt | Nvidia GTX 1060 6G | Windows 10 | vulkan optimized prototype | 642 | | | |
01/27/2020 | seagiv | sm-scrypt | AMD Radeon 570 4GB | Windows 10 | vulkan optimized prototype | 966 | | | |
01/29/2020 | seagiv | sm-scrypt | AMD Radeon Pro 555x 4GB | macOS 10.14.6 | vulkan optimized prototype | 266 | | | |
01/31/2020 | avive | sm-scrypt | AMD Radeon Pro 560x 4GB | macOS 10.14.6 | vulkan optimized prototype | 406 | | | |
01/31/2020 | avive | sm-scrypt | Intel(R) UHD Graphics 630 1536MB | macOS 10.14.6 | vulkan optimized prototype | 53 | | | |
05/06/2020 | avive | sm-scrypt | AMD Radeon RX 580 | Windows 10 | vulkan optimized prototype | 1,074 | 1.074 | | |
09/08/2020 | avive | sm-scrypt | Nvidia Tesla V 100 (16GB) | Ubuntu 20.04 NVIDIA-SMI 450.51.06 CUDA Version: 11.0 | CUDA optimized prototype | 4,166 | 4.166 | | |
09/08/2020 | avive | sm-scrypt | Nvidia Tesla T4 (16GB) | Ubuntu 20.04 NVIDIA-SMI 450.51.06 CUDA Version: 11.0 | CUDA optimized prototype | 1,252 | 1.252 | | |
09/08/2020 | avive | sm-scrypt | Nvidia Tesla P100-PCIE (32GB) | Ubuntu 20.04 NVIDIA-SMI 450.51.06 CUDA Version: 11.0 | CUDA optimized prototype | 2,083 | 2.083 | | |
09/08/2020 | avive | sm-scrypt | Nvidia Tesla P4 (32GB) | Ubuntu 20.04 NVIDIA-SMI 450.51.06 CUDA Version: 11.0 | CUDA optimized prototype | 757 | 0.757 | | |
04/04/2020 | avive | sm-scrypt | Apple M1 | MacOS 11.2 | vulkan optimized prototype | 214 | 0.214 | | |
04/21/2020 | avive | sm-scrypt | Nvidia RTX 2070 Super, 8GB | Ubuntu 20.04, Driver 460.73.01 | CUDA optimized prototype | 2,038 | 2.038 | | |
The library's performance on a GPU depends on the GPU's CUDA and Vulkan performance. The following benchmarks are available from Geekbench: