
How to configure LiteLLM to run the benchmark with a watsonx.ai model? #3023

Open
ameza13 opened this issue Jan 27, 2025 · 2 comments
ameza13 commented Jan 27, 2025

Issue

Hello,

I followed the steps from benchmark/README.md to run the benchmark.py script inside the container.

As I want to run the benchmark with watsonx/ibm/granite-3-8b-instruct, I exported my watsonx.ai credentials inside the container:

WATSONX_URL=XXXXXXXXXXXXXXX
WATSONX_APIKEY=XXXXXXXXXXXXXXX
WATSONX_PROJECT_ID=XXXXXXXXXXXXXXX

The script is able to pick them up: I modified benchmark.py to print the contents of these environment variables and everything looks good. However, when I execute:

./benchmark/benchmark.py api-key-test-run-12 --model watsonx/ibm/granite-3-8b-instruct --edit-format whole --threads 1 --exercises-dir polyglot-benchmark --num-tests 1

I get the error ValueError: API key is required from the litellm library.

litellm.APIConnectionError: API key is required
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/litellm/main.py", line 2572, in completion
    response = watsonx_chat_completion.completion(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/litellm/llms/watsonx/chat/handler.py", line 46, in completion
    headers = watsonx_chat_transformation.validate_environment(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/litellm/llms/watsonx/common_utils.py", line 187, in validate_environment
    token = _generate_watsonx_token(api_key=api_key, token=token)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/litellm/llms/watsonx/common_utils.py", line 75, in _generate_watsonx_token
    token = generate_iam_token(api_key)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/litellm/llms/watsonx/common_utils.py", line 43, in generate_iam_token
    raise ValueError("API key is required")
ValueError: API key is required
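One thing worth ruling out is an environment-variable spelling mismatch: as an assumption to verify against your installed LiteLLM version, the watsonx handler may look the key up under a different name than the one exported. A minimal sanity-check sketch, run inside the benchmark container, could look like this (both WATSONX_APIKEY and WATSONX_API_KEY spellings are hypothetical candidates):

```python
import os

# Print which watsonx-related variables this Python process actually sees.
# Both WATSONX_APIKEY and WATSONX_API_KEY spellings are checked, since the
# expected name may differ between LiteLLM versions (an assumption).
def visible_watsonx_vars():
    names = [
        "WATSONX_URL",
        "WATSONX_APIKEY",
        "WATSONX_API_KEY",
        "WATSONX_PROJECT_ID",
    ]
    return {n: ("set" if os.environ.get(n) else "missing") for n in names}

if __name__ == "__main__":
    for name, status in visible_watsonx_vars().items():
        print(f"{name}: {status}")
```

If one spelling shows "missing" while the other is "set", exporting the key under both names is a cheap experiment.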

My watsonx credentials are valid, as a basic litellm connection script (like the one below) works perfectly.

import os
from litellm import completion

os.environ["WATSONX_URL"] = "<value>"
os.environ["WATSONX_APIKEY"] = "<value>"
os.environ["WATSONX_PROJECT_ID"] = "<value>"

response = completion(
  model="watsonx/ibm/granite-3-8b-instruct",
  api_key="<value>",
  messages=[
        {"role": "system", "content": "You are such a good LLM."},
        {"role": "user", "content": "Tell me a joke about dogs."},
    ],
)

# Parse the completion response
if "choices" in response:
    completion_text = response["choices"][0]["message"]["content"]  # Extract the text
    print("Generated Completion:")
    print(completion_text)
else:
    print("Error: Response does not contain 'choices'")

Am I missing something?

Version and model info

No response

@Bourhano
Copy link

Bourhano commented Jan 28, 2025

Same issue, simple reproduction on litellm:

from litellm import completion

model = "ibm/granite-3-8b-instruct"  # the model under test

response = completion(
  model="watsonx/" + model,
  messages=[{"content": "what is your favorite colour?", "role": "user"}],
)
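One workaround to try (an assumption, not a confirmed fix) is to bypass LiteLLM's environment-variable discovery and pass the credentials to completion() explicitly. litellm's completion accepts an api_key argument; the "project_id" keyword name below is an assumption and should be checked against your LiteLLM version:

```python
import os

# Hypothetical workaround: build the credential kwargs ourselves instead of
# relying on LiteLLM's environment-variable discovery. The "project_id"
# keyword name is an assumption; verify it against your LiteLLM version.
def watsonx_credentials():
    return {
        "api_key": os.environ["WATSONX_APIKEY"],
        "api_base": os.environ["WATSONX_URL"],
        "project_id": os.environ["WATSONX_PROJECT_ID"],
    }

def run():
    # Imported lazily so the helper above is usable without litellm installed.
    from litellm import completion
    return completion(
        model="watsonx/ibm/granite-3-8b-instruct",
        messages=[{"content": "what is your favorite colour?", "role": "user"}],
        **watsonx_credentials(),
    )
```

If this succeeds where the env-var path fails, it would narrow the problem down to how the benchmark's environment is propagated to LiteLLM.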

with credentials loaded as environment variables.


ameza13 commented Jan 28, 2025

Hello @paul-gauthier, I saw your comments about watsonx config for the aider tool in issue #642. Do you have any advice/pointers to run the benchmark with watsonx models? Specifically watsonx/ibm/granite-3-8b-instruct.

Thanks,
