Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better handling for Python subclasses of types that use nb::new_() #859

Merged
merged 3 commits into from
Jan 27, 2025

Conversation

oremanj
Copy link
Contributor

@oremanj oremanj commented Jan 15, 2025

In Python, __new__ takes a first argument indicating the type of object it should create, which allows one to derive from classes that use __new__ and obtain reasonable semantics. nanobind's nb::new_() wrapper currently ignores this argument, with somewhat surprising results:

// C++
struct Base { ... };
nb::class_<Base>(m, "Base").def(nb::new_(...))...;

# Python
class Derived(mod.Base): pass

>>> type(Derived()) is mod.Base
True

This PR makes that case work more like it does in Python, so that Derived() produces an instance of Derived. This is possible safely because both Base and Derived believe they wrap the same C++ type. Since we can't easily intervene before the Base object is created, we use a call policy to intervene afterwards, and return an object of type Derived that wraps a Base* without ownership. The original Base object's lifetime is maintained via a keep-alive from the returned wrapper.

This feature requires allowing a call policy's postcall hook to modify the return value, which was not previously supported. I chose to implement this by allowing the postcall hook to take the return value via PyObject*&; then a passed PyObject* can initialize either that or the documented handle. I didn't document this new capability, because it's somewhat obscure and easy to mess up the reference counting; I figure anyone that really needs it will be able to figure it out. An alternative if you don't want to add this ability to call policies in general would be to define a new function attribute just for the new-fixup case, but that felt more invasive to me.

In Python, `__new__` takes a first argument indicating the type of object it should create, which allows one to derive from classes that use `__new__` and obtain reasonable semantics. nanobind's `nb::new_()` wrapper currently ignores this argument, with somewhat surprising results:
```
// C++
struct Base { ... };
nb::class_<Base>(m, "Base").def(nb::new_(...))...;

class Derived(mod.Base): pass

>>> type(Derived()) is mod.Base
True
```

This PR makes that case work more like it does in Python, so that `Derived()` produces an instance of `Derived`. This is possible safely because both `Base` and `Derived` believe they wrap the same C++ type. Since we can't easily intervene before the `Base` object is created, we use a call policy to intervene afterwards, and return an object of type `Derived` that wraps a `Base*` without ownership. The original `Base` object's lifetime is maintained via a keep-alive from the returned wrapper. There is not much actual code, but it's subtle so the comments are pretty long.

This feature requires allowing a call policy's postcall hook to modify the return value, which was not previously supported. I chose to implement this by allowing the postcall hook to take the return value via `PyObject*&`; then a passed `PyObject*` can initialize either that or to the documented `handle`. I didn't document this new capability, because it's somewhat obscure and easy to mess up the reference counting; I figure anyone that really needs it will be able to figure it out. An alternative if you don't want to add this ability to call policies in general would be to define a new function attribute just for the new-fixup case, but that felt more invasive to me.
@wjakob
Copy link
Owner

wjakob commented Jan 16, 2025

I'm am concerned about the implications of nb::new_. Initially, this was a feature in the category: "cool -- why not, since it is easy to support".

Since the original commit, there have been several commits to work around limitations and address corner cases. With #861 and #859 on top, the complexity is now reaching a level that I feel is starting to be unreasonable for this project given its "nano" scope.

My first reaction is that it would be better to explain the limitations of the current feature rather than making it fully general as you propose to do in these two commits.

@oremanj
Copy link
Contributor Author

oremanj commented Jan 16, 2025

Thanks for the feedback. I personally lean pretty far in the direction of taking on additional implementation complexity in order to handle all the cases rather than just the common ones, but I acknowledge that's not always the correct tradeoff, and a lot of the problems solved by these two PRs are theoretical.

The impetus for this PR #859 (the actual problem I'm trying to solve) was some code that did:

class SomeType(cppmodule.SomeBaseType):
    # ... various helper methods here, no virtual overrides or anything ...

# ... code that instantiates SomeType objects and calls the helper methods ...

When cppmodule.SomeBaseType changed from using __init__-style construction to __new__-style construction due to some refactoring in the C++ code, we were surprised to find that SomeType() no longer returned an instance of SomeType. That's different from what one would expect based on the pure-Python semantics (after class MyStr(str): pass, MyStr() still produces a MyStr object, even though str is initialized by __new__) so I figured there might be some appetite for a fix.

#861 only showed up as a problem in a particularly tortured test case that I wrote for #859, not in any real code, so I've closed it. The support in #859 for cases where __new__ returns an instance of a derived class (you tried to construct an Animal but the result was a Cat) adds quite a bit of complexity and is not needed by the case that I actually ran into, so I'll upload a version without that. If you still think it's too complex, I can go in the direction of documenting limitations instead. I do hope you might be willing to accept at least the two-line change to nb_attr.h (which I'm happy to write as a separate PR with docs/tests) so that anyone who does need the return type fixup implemented here has the option of implementing it using public API.

@wjakob wjakob force-pushed the master branch 2 times, most recently from 98def19 to 92d9cb3 Compare January 27, 2025 01:29
@wjakob
Copy link
Owner

wjakob commented Jan 27, 2025

Thank you @oremanj!

@wjakob wjakob merged commit 534fd8c into wjakob:master Jan 27, 2025
31 checks passed
@oremanj oremanj deleted the new-returntype branch January 27, 2025 17:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants