
CUDA OOM #1

Open

ducha-aiki opened this issue Mar 1, 2024 · 14 comments

Comments

@ducha-aiki

Hi,

The performance is really amazing on the few image pairs I have tried.
However, when I moved to a bigger scene (29 images), it crashes with a CUDA OOM on a 16 GB V100.
Any recommendations on how I can run it?

  File "/home/old-ufo/dev/dust3r/dust3r/cloud_opt/optimizer.py", line 176, in forward
    aligned_pred_i = geotrf(pw_poses, pw_adapt * self._stacked_pred_i)
  File "/home/old-ufo/dev/dust3r/dust3r/utils/geometry.py", line 86, in geotrf
    pts = pts @ Trf[..., :-1, :] + Trf[..., -1:, :]

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.38 GiB. GPU 0 has a total capacity of 15.77 GiB of which 775.88 MiB is free. Including non-PyTorch memory, this process has 15.01 GiB memory in use. Of the allocated memory 13.70 GiB is allocated by PyTorch, and 922.14 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
@jerome-revaud
Contributor

Oh... No easy fix that I can see; we usually run our experiments on A100s with 80 GB, so we never particularly optimized for memory, sorry ^^'
With 80 GB, we could optimize scenes with 200+ images.

@jerome-revaud
Contributor

Maybe @yocabon would have a better idea?

@jerome-revaud
Contributor

One solution, kindly suggested by my colleague Romain Bregier, is to run the global alignment on the CPU. It will be slower, but it won't crash...

@ducha-aiki
Author

Thank you, I will try it.
And on the GPU side: is there a way to use multiple GPUs? I have a server with 8 V100s (8 × 16 GB), not A100s unfortunately.

@yocabon
Contributor

yocabon commented Mar 1, 2024

Hi,
I updated the demo script to expose the "scene_graph" parameter. By default we make all possible pairs, but that explodes when you add many images. Use the "sliding window" or "one reference" method to make fewer pairs; then it should fit in memory.

No, we didn't implement multi-GPU for the inference.
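To see why the scene-graph choice matters so much, here is a standalone sketch of the pair counts for the three strategies mentioned above. The helper names and the window size are illustrative, not dust3r's actual implementation: the point is that the complete graph is quadratic in the number of images, while the other two are linear.

```python
# Sketch: how the scene-graph choice affects the number of image pairs.
# These are hypothetical helpers, not dust3r's make_pairs.

def complete_pairs(n):
    # all unordered pairs: n * (n - 1) / 2 -- quadratic in n
    return [(i, j) for i in range(n) for j in range(i + 1, n)]

def sliding_window_pairs(n, winsize=3):
    # each image paired only with its next `winsize` neighbors -- linear in n
    return [(i, j) for i in range(n) for j in range(i + 1, min(i + 1 + winsize, n))]

def one_reference_pairs(n, ref=0):
    # every image paired with a single reference image -- n - 1 pairs
    return [(ref, j) for j in range(n) if j != ref]

n = 29  # the scene size from this issue
print(len(complete_pairs(n)))        # 406 pairs
print(len(sliding_window_pairs(n)))  # 81 pairs
print(len(one_reference_pairs(n)))   # 28 pairs
```

Since memory for inference and alignment scales with the number of pairs, dropping from 406 to 81 or 28 pairs is what lets the 29-image scene fit on a 16 GB card.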

@ducha-aiki
Author

Oh, that's super useful, thank you!

@nickponline

How do we set the global alignment to run on CPU?

@nickponline

I think maybe this: `scene = global_aligner(output, device="cpu", mode=mode)`

@nickponline

That seems to work ^

@ducha-aiki
Author

@nickponline Just tried 36 images on CPU; now I get a CPU OOM error on a machine with 120 GB of RAM.
Is there a way to reduce the number of points besides using 224×224 resolution?

RuntimeError: [enforce fail at alloc_cpu.cpp:83] err == 0. DefaultCPUAllocator: can't allocate memory: you tried to allocate 2005401600 bytes. Error code 12 (Cannot allocate memory)

@jerome-revaud
Contributor

@ducha-aiki do you have a scene covisibility graph? If so, it would greatly reduce the memory usage. On an A100 with 80 GB, we are able to optimize scenes with 200+ images when we use the 10 nearest neighbors (10-NN) per image.

@jerome-revaud
Contributor

We had this implemented here:

`def make_pairs(imgs, scene_graph='complete', prefilter=None, symmetrize=True):`

but it didn't make it into the final version...

@ducha-aiki
Author

I don't have a covisibility graph, but I can probably run DINOv2 or SALAD to get an estimate. Thank you for the suggestion!
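Building an approximate covisibility graph from global descriptors is straightforward; the sketch below (a hypothetical `knn_pairs` helper, not part of dust3r) keeps the 10 most similar images per image under cosine similarity, which is exactly the 10-NN regime mentioned above. Any image-level embedding (DINOv2, SALAD, ...) can supply the descriptors.

```python
import numpy as np

def knn_pairs(descs, k=10):
    """Build a covisibility-style pair list: for each image, keep its k most
    similar images under cosine similarity of global descriptors.
    `descs` is an (n_images, dim) array, e.g. DINOv2 or SALAD embeddings."""
    d = descs / np.linalg.norm(descs, axis=1, keepdims=True)
    sim = d @ d.T
    np.fill_diagonal(sim, -np.inf)  # never pair an image with itself
    pairs = set()
    for i in range(len(descs)):
        for j in np.argsort(sim[i])[::-1][:k]:
            pairs.add((min(i, int(j)), max(i, int(j))))  # dedupe (i, j) vs (j, i)
    return sorted(pairs)

# With k=10, a 200-image scene yields at most 200 * 10 = 2000 pairs
# instead of 200 * 199 / 2 = 19900 for the complete graph.
rng = np.random.default_rng(0)
pairs = knn_pairs(rng.standard_normal((200, 64)), k=10)
print(len(pairs))
```

The dedup step means mutual neighbors count once, so the final pair count is between n·k/2 and n·k.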

@xuyanging

> Oh... No easy fix that i can see, we usually perform our experiments on A100 with 80GB so we never particularly optimized the memory, sorry ^^' With 80GB, we could optimize scenes with 200+ images.

Good news: I replaced my 3090 laptop with an H100 and it works!
