
[BUG] FLUX model components fail to utilize available GPU memory (6.8GB/16GB used) with t5xxl/vae/clip falling back to CPU #199

Open
WpythonW opened this issue Oct 30, 2024 · 1 comment
Labels
🐞 bug Something isn't working

Comments

@WpythonW

Issue Description

I'm experiencing inefficient GPU utilization with the FLUX model components. While the main FLUX model runs on the GPU (6.8GB VRAM), the other components (t5xxl-q4_0, ae-fp16, clip_l-fp16) appear to run on the CPU even though 9.2GB of GPU memory remains free.

Expected behavior: All model components should utilize available GPU memory for optimal performance.
Actual behavior:

  • Main FLUX model uses GPU (6.8GB VRAM)
  • t5xxl-q4_0, ae-fp16, and clip_l-fp16 components appear to run on CPU
  • Getting warnings: WARNING: Behavior may be unexpected when allocating 0 bytes for ggml_calloc!
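As a sanity check, here is a minimal sketch of how the memory figures above can be confirmed. It assumes the pynvml bindings are available (pip install nvidia-ml-py); this is not part of the Nexa SDK:

import pynvml

# One-shot NVML query of device 0 (the Tesla P100 in this environment).
pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)
mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
print(f"VRAM used:  {mem.used / 1024**3:.1f} GiB")
print(f"VRAM free:  {mem.free / 1024**3:.1f} GiB")
print(f"VRAM total: {mem.total / 1024**3:.1f} GiB")
pynvml.nvmlShutdown()

Run while txt2img is executing, this reports the 6.8GB used / 9.2GB free split described above.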

Steps to Reproduce

  1. Install Nexa SDK with CUDA support (run in a Kaggle/Jupyter cell, hence the leading !):
!CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" pip install nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
  2. Run the following code:
from nexa.gguf import NexaImageInference

model_path = "FLUX.1-schnell:q4_0"
inference = NexaImageInference(
    model_path=model_path,
    wtype="q4_0",            # weight quantization type
    num_inference_steps=5,
    width=1024,
    height=1024,
    guidance_scale=1.5,
    random_seed=42
)
img = inference.txt2img("A sunset over a mountain range")
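To make the symptom measurable, the txt2img call can be wrapped with a timer and a background thread that samples peak VRAM via pynvml (again an assumption on my side, not SDK functionality; it reuses the inference object from step 2):

import threading, time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)
peak = 0
stop = threading.Event()

def sample_vram():
    # Poll used VRAM every 0.5s and remember the peak.
    global peak
    while not stop.is_set():
        peak = max(peak, pynvml.nvmlDeviceGetMemoryInfo(handle).used)
        time.sleep(0.5)

threading.Thread(target=sample_vram, daemon=True).start()
t0 = time.time()
img = inference.txt2img("A sunset over a mountain range")
stop.set()
print(f"txt2img took {time.time() - t0:.0f}s, peak VRAM {peak / 1024**3:.1f} GiB")
pynvml.nvmlShutdown()

A peak well below 16 GiB during a slow text-encoding phase would confirm that t5xxl is not being offloaded.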

OS

Linux (Kaggle environment)

Python Version

3.10

Nexa SDK Version

nexaai-0.0.9.0

GPU (if using one)

NVIDIA Tesla P100 (16GB VRAM)

@WpythonW added the 🐞 bug (Something isn't working) label on Oct 30, 2024
@WpythonW changed the title from [BUG] <title> to [BUG] FLUX model components fail to utilize available GPU memory (6.8GB/16GB used) with t5xxl/vae/clip falling back to CPU on Oct 30, 2024
@zhiyuan8
Contributor

zhiyuan8 commented Nov 2, 2024

Thank you for reaching out with your report!

We are actively addressing the inefficient GPU utilization of FLUX model components: we are working to offload more of them to the GPU and will fix this soon.
