Dedicated GPUs for time-slicing on multi-GPU setups #628

Closed
joe-schwartz-certara opened this issue Apr 8, 2024 · 5 comments
Labels
lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@joe-schwartz-certara

I'm wondering if there is a simple way to set up a configuration for dedicating a single GPU on a multi-GPU system to time-slicing. For example, my use case is that I have some services which are critical and some which are not and I want to use time-slicing for the non-critical services and leave dedicated GPUs to the critical services.

It seems like this plugin is close to allowing that and I was expecting something like

version: v1
sharing:
  timeSlicing:
    renameByDefault: true
    resources:
    - name: nvidia.com/gpu
      id: 0
      replicas: 10

to select ten time-slicing replicas for the 0-th GPU (as indexed by nvidia-smi), and then request resources for pods via either nvidia.com/gpu.shared (for non-dedicated usage of the 0-th GPU) or nvidia.com/gpu (for dedicated GPU usage). Is this kind of fine-grained control planned for the future, or is there something simple I can do to route only some of the hardware through the sharing part of the plugin?
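
To make the intent concrete, here is a sketch of how two pods would then request the two resource names, assuming renameByDefault: true advertises the replicas as nvidia.com/gpu.shared (pod names and image below are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: non-critical-service        # placeholder name
spec:
  containers:
  - name: app
    image: my-app:latest            # placeholder image
    resources:
      limits:
        nvidia.com/gpu.shared: 1    # one time-sliced replica of the shared GPU
---
apiVersion: v1
kind: Pod
metadata:
  name: critical-service            # placeholder name
spec:
  containers:
  - name: app
    image: my-app:latest            # placeholder image
    resources:
      limits:
        nvidia.com/gpu: 1           # a whole, dedicated GPU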

@frittentheke

I have the exact same question. Looking at the code doing the timeSlicing (a7c5dcf), it's possible to define devices via GPU index, GPU UUID, or even MIG UUID.

But apparently device selection is currently "disabled" via

// Disable renaming / device selection in Sharing.TimeSlicing.Resources

This restriction was there from the beginning, if you look at https://github.com/NVIDIA/k8s-device-plugin/blame/b9fe486d8b7c581e1b144ea31f0d6f6173668601/cmd/gpu-feature-discovery/main.go#L276 when the code was copied over from https://github.com/NVIDIA/gpu-feature-discovery/blob/152fa93619e973043d936f19bf20bb465c1ab289/cmd/gpu-feature-discovery/main.go#L276
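
For illustration, the config that code would parse looks roughly like this; the devices field is my reading of the ReplicatedResource / ReplicatedDevices types linked above, and the plugin currently rejects it because of exactly that restriction:

version: v1
sharing:
  timeSlicing:
    renameByDefault: true
    resources:
    - name: nvidia.com/gpu
      devices: ["0"]      # GPU index; the types also appear to allow GPU or MIG UUIDs
      replicas: 10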

@elezar @ArangoGutierrez @tariq1890 since you contributed (to) this code, may I kindly ask you to elaborate on whether adding the capability to "only" do timeSlicing / create replicas for a subset of GPU or MIG instances is planned?

I myself would love to partition all my GPUs via MIG, but only enable timeSlicing on the MIG instances of the first two.
Not being able to filter even whole GPUs is worse still, as it forces all GPUs in a machine to either do time-slicing or not (other than via node-specific config).

@joe-schwartz-certara (Author)

@frittentheke I am still bouncing around ideas on how to do the kind of fine-grained GPU access control that you and I both need. I discovered that you can override the envvar assignment from the plugin by just setting

        - name: NVIDIA_VISIBLE_DEVICES
          value: <comma separated list of the exact GPU uuids that you want the pod to use>

And if you use the same UUID(s) for two different pod specs, the applications will share the selected GPU with no problems. My lack of problems with this oversubscription method without time-slicing is probably due to the nature of the applications I'm running (they both claim all the VRAM they will need as soon as they start up), but I'm still worried that this deployment strategy has some unknown issues, since I'm basically just ignoring the plugin entirely.
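
For anyone wanting to try the same thing, a minimal sketch of what I mean (the UUID is a placeholder you would take from nvidia-smi -L, and there is deliberately no nvidia.com/gpu request, which is exactly the "ignoring the plugin" caveat above):

apiVersion: v1
kind: Pod
metadata:
  name: oversubscribed-service              # placeholder name
spec:
  containers:
  - name: app
    image: my-app:latest                    # placeholder image
    env:
    - name: NVIDIA_VISIBLE_DEVICES
      # Placeholder UUID from `nvidia-smi -L`; give two pods the same UUID
      # and they will share that GPU.
      value: "GPU-11111111-2222-3333-4444-555555555555"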

As has been mentioned before, the feature proposed in https://docs.google.com/document/d/1BNWqgx_SmZDi-va_V31v3DnuVwYnF2EmN7D-O_fB6Oo/edit#heading=h.bxuci8gx6hna does exactly what we want. But we have to wait...

I will also comment that another, hacky workaround is to use the "whole GPU" MIG partition (i.e. on an 80GB A100, nvidia.com/mig-7g.80gb is the whole GPU), set only some of the node's GPUs to that partition, and then select only those partitioned GPUs for time-slicing. I still foresee problems if you need even finer control, e.g. where you have applications a, b, c, and d, and a+b can share a GPU, as can c+d, but c+a cannot (a scenario where a and c have large GPU requirements but b and d are small). The way Kubernetes scheduling works, you cannot guarantee that your resources will be allocated as a+b and c+d instead of some other combination.
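
Roughly, that workaround config would look like the sketch below (assuming A100-80GB GPUs, the mixed MIG strategy, and that only the GPUs you want to time-slice carry the 7g.80gb profile; the replica count is arbitrary):

version: v1
sharing:
  timeSlicing:
    renameByDefault: true
    resources:
    # Only GPUs partitioned into the "whole GPU" MIG profile are replicated;
    # unpartitioned GPUs keep being advertised as plain nvidia.com/gpu.
    - name: nvidia.com/mig-7g.80gb
      replicas: 4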

@frittentheke

I suppose the relatively new Dynamic Resource Allocation (https://kubernetes.io/docs/concepts/scheduling-eviction/dynamic-resource-allocation/) is what will eventually solve this issue of dedicating resources to be claimed by a workload.
NVIDIA is apparently working on a DRA driver for their GPU resources: https://github.com/NVIDIA/k8s-dra-driver
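
For context, claiming a GPU through DRA looked roughly like this under the alpha API that driver targeted at the time (the API version, class name, and object names here are modelled on the k8s-dra-driver examples and may well have changed since, so treat this purely as a sketch):

apiVersion: resource.k8s.io/v1alpha2
kind: ResourceClaimTemplate
metadata:
  name: single-gpu                          # placeholder name
spec:
  spec:
    resourceClassName: gpu.nvidia.com       # class installed by the NVIDIA DRA driver
---
apiVersion: v1
kind: Pod
metadata:
  name: dra-gpu-pod                         # placeholder name
spec:
  resourceClaims:
  - name: gpu
    source:
      resourceClaimTemplateName: single-gpu
  containers:
  - name: app
    image: my-app:latest                    # placeholder image
    resources:
      claims:
      - name: gpu                           # the GPU is claimed per workload, not per node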


github-actions bot commented Nov 5, 2024

This issue is stale because it has been open 90 days with no activity. This issue will be closed in 30 days unless new comments are made or the stale label is removed.

github-actions bot added the lifecycle/stale label Nov 5, 2024

github-actions bot commented Dec 5, 2024

This issue was automatically closed due to inactivity.

github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) Dec 5, 2024