WpythonW changed the title from "[BUG] <title>" to "[BUG] FLUX model components fail to utilize available GPU memory (6.8GB/16GB used) with t5xxl/vae/clip falling back to CPU" on Oct 30, 2024.
We are actively addressing inefficient GPU utilization for the FLUX model components. We are working on offloading more of the FLUX components to the GPU and will ship a fix soon.
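For context: GGUF FLUX pipelines of this kind are typically driven by stable-diffusion.cpp underneath, whose context options already include per-component placement switches for exactly these parts. Below is a sketch using the community stable-diffusion-cpp-python bindings; the package, file names, and parameter values are assumptions about the underlying engine, not the Nexa SDK API.

```python
# pip install stable-diffusion-cpp-python  (assumed binding; illustrative only)
from stable_diffusion_cpp import StableDiffusion

sd = StableDiffusion(
    diffusion_model_path="flux1-schnell-q4_0.gguf",  # main FLUX model (the part that does land on GPU)
    t5xxl_path="t5xxl-q4_0.gguf",
    clip_l_path="clip_l-fp16.gguf",
    vae_path="ae-fp16.gguf",
    # If a wrapper sets these to True (e.g. as a low-VRAM default), the text
    # encoders and the VAE stay on the CPU even when plenty of VRAM is free.
    keep_clip_on_cpu=False,
    keep_vae_on_cpu=False,
)
```

If the SDK currently hard-codes such low-VRAM defaults, exposing them as user-facing options would let users with 16GB cards opt back into full GPU placement.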
Issue Description
I'm seeing inefficient GPU utilization with the FLUX model components. The main FLUX model runs on the GPU (6.8GB VRAM), but the other components (t5xxl-q4_0, ae-fp16, clip_l-fp16) appear to run on the CPU, even though 9.2GB of GPU memory is still free.

Expected behavior: all model components should use the available GPU memory for optimal performance.

Actual behavior:
- The main FLUX model uses the GPU (6.8GB VRAM).
- The t5xxl-q4_0, ae-fp16, and clip_l-fp16 components appear to run on the CPU.
- The log shows: WARNING: Behavior may be unexpected when allocating 0 bytes for ggml_calloc!
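A zero-byte ggml_calloc would be consistent with a backend buffer being created for a device that received no tensors, i.e. the component living entirely on the other device. For reference, used/free VRAM can be checked from Python via NVML to attribute each load step to the GPU; the sketch below assumes the nvidia-ml-py package, and the load call in the middle is a placeholder.

```python
# pip install nvidia-ml-py  (provides the pynvml module)
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # the Tesla P100 is device 0 here

def report_vram(stage: str) -> None:
    """Print used/free VRAM so each component load can be attributed to the GPU."""
    info = pynvml.nvmlDeviceGetMemoryInfo(handle)
    print(f"{stage}: used={info.used / 2**30:.2f} GiB, free={info.free / 2**30:.2f} GiB")

report_vram("before loading")
# ... load the FLUX model and its t5xxl/clip_l/ae components here ...
report_vram("after loading")  # only ~6.8 GiB used; ~9.2 GiB still free

pynvml.nvmlShutdown()
```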
Steps to Reproduce
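The report does not include exact commands; below is a minimal sketch of the kind of invocation involved, assuming the NexaImageInference class from the SDK's GGUF interface. The model identifier and the generation call are illustrative, not taken verbatim from the report.

```python
from nexa.gguf import NexaImageInference

# Illustrative model identifier; the report pairs the FLUX model with
# t5xxl-q4_0, ae-fp16, and clip_l-fp16 companion files.
inference = NexaImageInference(model_path="FLUX.1-schnell:q4_0")

# Hypothetical generation call; watching nvidia-smi while it runs shows
# only ~6.8GB of the 16GB VRAM in use.
inference.txt2img("a lighthouse at dusk, photorealistic")
```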
OS
Linux (Kaggle environment)
Python Version
3.10
Nexa SDK Version
nexaai-0.0.9.0
GPU (if using one)
NVIDIA Tesla P100 (16GB VRAM)