How to update the original model parameters after calling make_functional? #280
Comments
After looking a bit in the source code I've found …
Couldn't you do …? (Also, you might want to try functorch.jacrev instead of torch.autograd.functional.jacobian -- it may be faster)
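(The snippet in the comment above did not survive extraction. A minimal sketch of the kind of copy-back being discussed, assuming `model` is the original module and relying on the parameter ordering confirmed further down the thread; the names here are illustrative, not from the original comment.)

```python
import torch
import torch.nn as nn
from functorch import make_functional

model = nn.Linear(3, 1)
fmodel, params = make_functional(model)

# `new_params` stands in for params updated by some functional optimizer step.
new_params = tuple(p - 0.1 * torch.ones_like(p) for p in params)

# Copy the updated functional parameters back into the original module.
# This relies on model.parameters() yielding tensors in the same order as
# the tuple returned by make_functional.
with torch.no_grad():
    for p, new_p in zip(model.parameters(), new_params):
        p.copy_(new_p)
```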
Is model.parameters() guaranteed to return parameters in the same order as make_functional? If that is the case then I can surely do this; however, I would like to ask that this be documented as behaviour on which one can rely.
Thank you very much
Yes
Yes, we should document this
Thank you very much again for all this work.
@trenta3 out of curiosity, what are you using make_functional for?
I'm currently using make_functional as well as other functorch APIs, in particular jvp and jacrev, to write more complex optimizers that also need second-order information about a neural network, which is unmanageable to do in plain PyTorch. If I may add one request: a thing I miss is the ability to "lazily" compute parts of the Hessian, such as its diagonal, without the full memory (and compute) cost of materializing the whole Hessian.
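(Not from the thread, but as an illustration of the point above: a Hessian-vector product can already be computed lazily with functorch by composing jvp over grad, and a stochastic estimate of the Hessian diagonal can be built from it with Hutchinson's trick. This is a sketch on a toy objective, not an official recipe.)

```python
import torch
from functorch import grad, jvp

def f(x):
    # A toy scalar objective standing in for the network loss.
    return (x.sin() * x).sum()

def hvp(func, x, v):
    # Forward-over-reverse: jvp of the gradient function gives H @ v
    # without ever materializing the full Hessian.
    return jvp(grad(func), (x,), (v,))[1]

x = torch.randn(5)

# Hutchinson-style estimate of the Hessian diagonal:
# diag(H) ≈ E[v * (H v)] for Rademacher vectors v.
n_samples = 100
diag_est = torch.zeros_like(x)
for _ in range(n_samples):
    v = torch.randint(0, 2, x.shape).to(x.dtype) * 2 - 1
    diag_est += v * hvp(f, x, v)
diag_est /= n_samples
print(diag_est)
```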
Hi! Thanks a lot for building this awesome functorch! I have the same issue. I'm using … Hope I can get a nicer way to achieve this with a good tutorial. Thanks!
@kxhit thank you for your feedback. Could you give a little more context about why you want to update each original model's state_dict?
@zou3519 Hi, thanks for your quick reply. In my case, I'm training many tiny networks and need the up-to-date weights of each network every few steps, so I need to assign the batched weights back to the original models frequently.
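(A sketch of one way to do that copy-back for an ensemble of tiny models; it assumes the stacked-parameter layout produced by functorch.combine_state_for_ensemble, which may or may not match the setup above.)

```python
import torch
import torch.nn as nn
from functorch import combine_state_for_ensemble

models = [nn.Linear(4, 2) for _ in range(8)]
fmodel, stacked_params, buffers = combine_state_for_ensemble(models)

# ... suppose `stacked_params` has been updated by a vmapped training step ...

# Write each slice of the stacked parameters back into its original model.
# This assumes each model's .parameters() order matches the order in which
# combine_state_for_ensemble stacked them.
with torch.no_grad():
    for i, model in enumerate(models):
        for p, stacked in zip(model.parameters(), stacked_params):
            p.copy_(stacked[i])
```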
As per the title, I find that updating the tensors pointed to by the params returned by make_functional does not update the real parameters in the original model. Is there a way to do this? I find that it would be extremely useful for implementing optimization algorithms in a way that is closer to their mathematical description.
To provide more context, I add an example script of what standard gradient descent would look like written this way:
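(The script itself is not reproduced above; the following is a minimal sketch in its spirit, with made-up layer sizes and data, showing the behaviour described below.)

```python
import torch
import torch.nn as nn
from functorch import make_functional, grad

torch.manual_seed(0)
model = nn.Linear(2, 1)
fmodel, params = make_functional(model)

x = torch.randn(32, 2)
y = torch.randn(32, 1)

def loss_fn(params):
    return ((fmodel(params, x) - y) ** 2).mean()

lr = 0.1
for step in range(5):
    grads = grad(loss_fn)(params)
    with torch.no_grad():
        # Update the tensors returned by make_functional in place ...
        for p, g in zip(params, grads):
            p.sub_(lr * g)
    # ... yet the loss of the *original* module does not move, because the
    # module's own parameters are not the same tensors as `params`.
    print(step, ((model(x) - y) ** 2).mean().item())
```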
Executing the script shows that the original model's parameters are not updated, since the loss doesn't change.