test: reduce iterations
The `gpu_fft_consistency()` test takes a long time to run. It runs 24
iterations, doubling the input size each time.

On CI the output looks like this:

```
test domain::tests::gpu_fft_consistency ... [2020-12-17T15:59:02Z INFO  bellperson::gpu::utils] Device: Device { brand: Nvidia, name: "Tesla T4", memory: 15843721216, bus_id: 0, platform: Platform(PlatformId(0x55af1c5ca8b0)), device: Device(DeviceId(0x55af1c5ca980)) }
[2020-12-17T16:00:03Z INFO  bellperson::gpu::fft] FFT: 1 working device(s) selected.
[2020-12-17T16:00:03Z INFO  bellperson::gpu::fft] FFT: Device 0: Tesla T4
Testing FFT for 2 elements...
GPU took 0ms.
CPU (8 cores) took 0ms.
Speedup: xNaN
============================
Testing FFT for 4 elements...
GPU took 0ms.
CPU (8 cores) took 0ms.
Speedup: xNaN
============================
Testing FFT for 8 elements...
GPU took 0ms.
CPU (8 cores) took 0ms.
Speedup: xNaN
============================
Testing FFT for 16 elements...
GPU took 0ms.
CPU (8 cores) took 2ms.
Speedup: xinf
============================
Testing FFT for 32 elements...
GPU took 0ms.
CPU (8 cores) took 2ms.
Speedup: xinf
============================
Testing FFT for 64 elements...
GPU took 0ms.
CPU (8 cores) took 2ms.
Speedup: xinf
============================
Testing FFT for 128 elements...
GPU took 0ms.
CPU (8 cores) took 3ms.
Speedup: xinf
============================
Testing FFT for 256 elements...
GPU took 0ms.
CPU (8 cores) took 4ms.
Speedup: xinf
============================
Testing FFT for 512 elements...
GPU took 0ms.
CPU (8 cores) took 8ms.
Speedup: xinf
============================
Testing FFT for 1024 elements...
GPU took 0ms.
CPU (8 cores) took 16ms.
Speedup: xinf
============================
Testing FFT for 2048 elements...
GPU took 1ms.
CPU (8 cores) took 36ms.
Speedup: x36
============================
Testing FFT for 4096 elements...
GPU took 1ms.
CPU (8 cores) took 74ms.
Speedup: x74
============================
Testing FFT for 8192 elements...
GPU took 1ms.
CPU (8 cores) took 156ms.
Speedup: x156
============================
Testing FFT for 16384 elements...
GPU took 1ms.
CPU (8 cores) took 319ms.
Speedup: x319
============================
Testing FFT for 32768 elements...
GPU took 2ms.
CPU (8 cores) took 656ms.
Speedup: x328
============================
Testing FFT for 65536 elements...
GPU took 3ms.
CPU (8 cores) took 1372ms.
Speedup: x457.33334
============================
Testing FFT for 131072 elements...
GPU took 14ms.
CPU (8 cores) took 2849ms.
Speedup: x203.5
============================
Testing FFT for 262144 elements...
GPU took 22ms.
CPU (8 cores) took 5865ms.
Speedup: x266.5909
============================
Testing FFT for 524288 elements...
GPU took 33ms.
CPU (8 cores) took 12152ms.
Speedup: x368.24243
============================
Testing FFT for 1048576 elements...
GPU took 49ms.
CPU (8 cores) took 25260ms.
Speedup: x515.5102
============================
Testing FFT for 2097152 elements...
GPU took 82ms.
CPU (8 cores) took 52277ms.
Speedup: x637.5244
============================
Testing FFT for 4194304 elements...
GPU took 141ms.
CPU (8 cores) took 106119ms.
Speedup: x752.617
============================
Testing FFT for 8388608 elements...
GPU took 274ms.
CPU (8 cores) took 218919ms.
Speedup: x798.9744
============================
Testing FFT for 16777216 elements...
GPU took 533ms.
CPU (8 cores) took 448308ms.
Speedup: x841.1032
============================
```
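
The `xNaN` and `xinf` entries for the small sizes are consistent with the speedup being a floating-point ratio of two durations measured in whole milliseconds. A minimal sketch, assuming the ratio is computed as `cpu_ms / gpu_ms` in `f32` (the `speedup` helper is hypothetical and not taken from bellperson):

```rust
/// Hypothetical helper: ratio of CPU time to GPU time, both in whole milliseconds.
/// Assumption: the durations are divided as `f32`; this is not bellperson code.
fn speedup(cpu_ms: u64, gpu_ms: u64) -> f32 {
    cpu_ms as f32 / gpu_ms as f32
}

fn main() {
    // (cpu_ms, gpu_ms) pairs taken from the CI log above.
    let samples: [(u64, u64); 4] = [(0, 0), (2, 0), (36, 1), (656, 2)];
    for &(cpu_ms, gpu_ms) in samples.iter() {
        println!("Speedup: x{}", speedup(cpu_ms, gpu_ms));
    }
}
```

Under that assumption, `0 / 0` yields NaN and any positive value divided by `0` yields infinity, so the reported speedup only becomes meaningful once the GPU time reaches at least 1ms.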

The speedup generally grows with the input size, but the last iteration takes over
7 minutes to run. I don't think such a long-running test adds much value, so this
commit reduces the number of iterations and thereby the run time. The longest
iteration now takes 25s.

The same is true for `gpu_multiexp_consistency()`, which is now reduced to a total of
7 iterations (`log_d` from 10 to 16 inclusive), where the longest runs 27s.
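
The iteration counts and largest input sizes follow directly from the loop ranges in the diff. A small sketch that enumerates them, assuming each iteration uses `1 << log_d` elements as the test bodies do (the `sizes` helper is hypothetical and exists only for illustration):

```rust
/// Hypothetical helper: sizes visited by `for log_d in start..=end_inclusive`,
/// where each iteration works on `1 << log_d` elements.
fn sizes(start: u32, end_inclusive: u32) -> Vec<u64> {
    (start..=end_inclusive).map(|log_d| 1u64 << log_d).collect()
}

fn report(name: &str, before: &[u64], after: &[u64]) {
    println!(
        "{}: {} -> {} iterations, largest size {} -> {}",
        name,
        before.len(),
        after.len(),
        before.last().unwrap(),
        after.last().unwrap()
    );
}

fn main() {
    // FFT test: `1..25` excludes 25, so it is the same as `1..=24`; now `1..=20`.
    report("fft", &sizes(1, 24), &sizes(1, 20));
    // Multiexp test: `10..(20 + 1)` is the same as `10..=20`; now `10..=16`.
    report("multiexp", &sizes(10, 20), &sizes(10, 16));
}
```

Writing the bounds as inclusive ranges (`..=`) keeps the upper limit visible in the code, which avoids the off-by-one reading that `1..25` or `MAX_LOG_D + 1` invites.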
vmx committed Dec 18, 2020
1 parent 7ee398a commit d7956c8
Showing 2 changed files with 3 additions and 3 deletions.
2 changes: 1 addition & 1 deletion src/domain.rs
@@ -596,7 +596,7 @@ mod tests {
let log_cpus = worker.log_num_cpus();
let mut kern = gpu::FFTKernel::create(false).expect("Cannot initialize kernel!");

- for log_d in 1..25 {
+ for log_d in 1..=20 {
let d = 1 << log_d;

let elems = (0..d)
4 changes: 2 additions & 2 deletions src/multiexp.rs
@@ -418,7 +418,7 @@ pub fn gpu_multiexp_consistency() {
let _ = env_logger::try_init();
gpu::dump_device_list();

- const MAX_LOG_D: usize = 20;
+ const MAX_LOG_D: usize = 16;
const START_LOG_D: usize = 10;
let mut kern = Some(gpu::LockedMultiexpKernel::<Bls12>::new(MAX_LOG_D, false));
let pool = Worker::new();
@@ -432,7 +432,7 @@ pub fn gpu_multiexp_consistency() {
bases = [bases.clone(), bases.clone()].concat();
}

- for log_d in START_LOG_D..(MAX_LOG_D + 1) {
+ for log_d in START_LOG_D..=MAX_LOG_D {
let g = Arc::new(bases.clone());

let samples = 1 << log_d;

