memory preset results in UnboundLocalError #43

Open
aa956 opened this issue Nov 27, 2024 · 1 comment

@aa956
aa956 commented Nov 27, 2024

local-gemma version 0.2.0, installed with pipx on Linux, NVIDIA RTX 3090:

user:~$ local-gemma --model="27b" --preset="memory" "What is the capital of Germany?"
Traceback (most recent call last):
  File "/home/user/.local/bin/local-gemma", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/user/.local/share/pipx/venvs/local-gemma/lib/python3.11/site-packages/local_gemma/cli.py", line 178, in main
    if spare_memory / 1e9 > 5:
       ^^^^^^^^^^^^
UnboundLocalError: cannot access local variable 'spare_memory' where it is not associated with a value
user:~$ local-gemma --model="9b" --preset="memory" "What is the capital of Germany?"
Traceback (most recent call last):
  File "/home/user/.local/bin/local-gemma", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/user/.local/share/pipx/venvs/local-gemma/lib/python3.11/site-packages/local_gemma/cli.py", line 178, in main
    if spare_memory / 1e9 > 5:
       ^^^^^^^^^^^^
UnboundLocalError: cannot access local variable 'spare_memory' where it is not associated with a value
user:~$ local-gemma --model="2b" --preset="memory" "What is the capital of Germany?"

Loading model with the following characteristics:
- Model name: google/gemma-2-2b-it
- Assistant model name: None
- Device: cuda
- Default data type: torch.bfloat16
- Optimization preset: memory
- Generation arguments: {'do_sample': True, 'temperature': 0.7}
- Base prompt: None

Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:01<00:00,  1.88it/s]
The 'max_batch_size' argument of HybridCache is deprecated and will be removed in v4.46. Use the more precisely named 'batch_size' argument instead.
The capital of Germany is **Berlin**. 

user:~$ nvidia-smi
Wed Nov 27 17:48:18 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3090        On  | 00000000:01:00.0 Off |                  N/A |
|  0%   39C    P8              29W / 220W |     10MiB / 24576MiB |      1%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1380      G   /usr/lib/xorg/Xorg                            4MiB |
+---------------------------------------------------------------------------------------+

@brianN0KZ

This defect is hard to understand. From what I see in the code, no preset other than "auto" could ever have worked as that is the only case that defines the spare_memory variable.
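
For illustration, here is a minimal sketch of the failure mode described above. It is not the actual local_gemma/cli.py code: the function names, arguments, and return values are hypothetical, and only the `spare_memory / 1e9 > 5` check is taken from the traceback. It assumes the variable is assigned on the "auto" path only and then read unconditionally.

```python
def pick_preset_buggy(preset, free_bytes):
    """Hypothetical reproduction: spare_memory is bound on one branch only."""
    if preset == "auto":
        spare_memory = free_bytes
    # Read on every path; raises UnboundLocalError unless preset == "auto".
    if spare_memory / 1e9 > 5:
        return "exact"
    return "memory"


def pick_preset_fixed(preset, free_bytes):
    """One possible fix: bind the name up front and guard the read."""
    spare_memory = None
    if preset == "auto":
        spare_memory = free_bytes
    if spare_memory is not None and spare_memory / 1e9 > 5:
        return "exact"
    # An explicitly requested preset is returned unchanged.
    return preset if preset != "auto" else "memory"


if __name__ == "__main__":
    try:
        pick_preset_buggy("memory", 24e9)
    except UnboundLocalError as exc:
        print(f"buggy version failed: {exc}")
    print(pick_preset_fixed("memory", 24e9))
```

Whether the right fix is to initialise the variable, compute it for every preset, or move the check inside the "auto" branch depends on what the maintainers intended, which the traceback alone does not show.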

Is this abandonware?
