
Question about a custom MII task #131

Open · pablogranolabar opened this issue Dec 31, 2022 · 3 comments

pablogranolabar commented Dec 31, 2022

Hello DeepSpeed-MII Team,

I have a project using MII with GPT-J, and one of the requirements is the ability to pass return_dict_in_generate. So far I have gone through the process flow and have modified the following:

- server_client.py (mii_query_handle)
- modelresponse_server.py (GeneratorReply())
- modelresponse.proto (MultiStringReply)

However, I am getting a "Method not implemented" exception on the gRPC channel after deploying the model.
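For reference, here is roughly the extra output I am trying to get back to the client; a minimal sketch on the plain transformers side (illustrative only, not my MII changes):

```python
# Sketch of the extra fields return_dict_in_generate exposes in plain
# transformers -- the values a modified MultiStringReply would need to carry.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")

inputs = tokenizer("DeepSpeed is", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=8,
    return_dict_in_generate=True,  # return a ModelOutput, not a bare tensor
    output_scores=True,            # per-step logits, one tensor per new token
)
# outputs.sequences holds the token ids; outputs.scores holds the logits.
print(outputs.sequences.shape, len(outputs.scores))
```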

So, a couple of questions if anyone has the knowledge:

  1. Is this the right (or preferred) approach for extending the text generation task to return additional information from the deployed HF model?
  2. How is proto compilation handled with MII? Is recompilation a standard part of the install when modelresponse.proto is modified, or do I have to compile the protos manually?

TIA!

mrwyattii (Contributor) commented

  1. You are on the right path for customizing the gRPC/proto responses. In the future, we want to expand our implementation to allow arbitrary kwargs to be passed in the query (currently only basic types like int, float, etc. are supported) and to allow more return values.
  2. You will need to re-compile the protos if you made modifications. You should be able to do this by running https://github.com/microsoft/DeepSpeed-MII/blob/main/mii/grpc_related/proto/build_script.sh (see the sketch after this list for a manual equivalent).
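For what it's worth, a "Method not implemented" gRPC error is typically a sign that the client and server stubs are out of sync, i.e. the proto was edited but not recompiled. If you want to run the compilation by hand, it boils down to invoking grpc_tools.protoc; a rough sketch (the paths here are illustrative, check the build script for the exact ones):

```python
# Rough manual equivalent of the build script (paths are illustrative).
# Requires grpcio-tools: pip install grpcio-tools
from grpc_tools import protoc

protoc.main([
    "grpc_tools.protoc",
    "-I.",                   # proto search path
    "--python_out=.",        # generates modelresponse_pb2.py
    "--grpc_python_out=.",   # generates modelresponse_pb2_grpc.py
    "modelresponse.proto",
])
```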

Let me know if this solves your problems!

pablogranolabar (Author) commented

Thank you @mrwyattii

If only these limited data types are currently supported, what's the best route for model debugging? One of the issues I am having with preparing a PR is that I haven't yet figured out an elegant method of tracing and debugging the model itself through the remote gRPC deployment. Any pointers on getting more visibility into model exception reporting? Should exceptions be passed over gRPC as well, like the additional kwarg outputs?
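For example, would something like this generic pattern be the right direction? (A hypothetical sketch using only the standard grpc servicer context API; `_run_model` is a stand-in for the actual inference call, not MII code.)

```python
# Hypothetical sketch: surfacing server-side exceptions to the client via
# standard gRPC status details (not current MII code).
import traceback
import grpc

class ModelResponseServicer:
    """Stand-in for the generated modelresponse servicer base class."""

    def GeneratorReply(self, request, context):
        try:
            return self._run_model(request)  # hypothetical inference helper
        except Exception:
            # Send the full traceback back as the gRPC status details so the
            # client sees more than a truncated error message.
            context.abort(grpc.StatusCode.INTERNAL, traceback.format_exc())
```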

mrwyattii (Contributor) commented

Unfortunately, gRPC truncates error messages on the server side, which makes debugging difficult. We plan to add a non-gRPC deployment type very soon that should address this issue. For now, I would recommend testing your model with just DeepSpeed-Inference. Check out how we call deepspeed.init_inference in MII and in the DeepSpeed unit tests; this should help you get the full stack trace and error message when testing new models.
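Something along these lines; a minimal sketch (the exact init_inference arguments may differ between DeepSpeed versions, so treat the kwargs here as illustrative):

```python
# Minimal standalone DeepSpeed-Inference test for GPT-J, outside of
# MII/gRPC, so any failure surfaces with a full Python stack trace.
import torch
import deepspeed
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")

# Same entry point MII calls under the hood.
engine = deepspeed.init_inference(
    model,
    mp_size=1,                        # tensor-parallel degree
    dtype=torch.float16,
    replace_with_kernel_inject=True,  # use DeepSpeed's fused kernels
)

inputs = tokenizer("DeepSpeed is", return_tensors="pt").to("cuda")
output = engine.module.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```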
