Better handling for Python subclasses of types that use nb::new_() #859

oremanj · 2025-01-15T18:20:39Z

In Python, __new__ takes a first argument indicating the type of object it should create, which allows one to derive from classes that use __new__ and obtain reasonable semantics. nanobind's nb::new_() wrapper currently ignores this argument, with somewhat surprising results:

// C++
struct Base { ... };
nb::class_<Base>(m, "Base").def(nb::new_(...))...;

# Python
class Derived(mod.Base): pass

>>> type(Derived()) is mod.Base
True

This PR makes that case work more like it does in Python, so that Derived() produces an instance of Derived. This is possible safely because both Base and Derived believe they wrap the same C++ type. Since we can't easily intervene before the Base object is created, we use a call policy to intervene afterwards, and return an object of type Derived that wraps a Base* without ownership. The original Base object's lifetime is maintained via a keep-alive from the returned wrapper.

This feature requires allowing a call policy's postcall hook to modify the return value, which was not previously supported. I chose to implement this by allowing the postcall hook to take the return value via PyObject*&; then a passed PyObject* can initialize either that or the documented handle. I didn't document this new capability, because it's somewhat obscure and easy to mess up the reference counting; I figure anyone that really needs it will be able to figure it out. An alternative if you don't want to add this ability to call policies in general would be to define a new function attribute just for the new-fixup case, but that felt more invasive to me.

In Python, `__new__` takes a first argument indicating the type of object it should create, which allows one to derive from classes that use `__new__` and obtain reasonable semantics. nanobind's `nb::new_()` wrapper currently ignores this argument, with somewhat surprising results: ``` // C++ struct Base { ... }; nb::class_<Base>(m, "Base").def(nb::new_(...))...; class Derived(mod.Base): pass >>> type(Derived()) is mod.Base True ``` This PR makes that case work more like it does in Python, so that `Derived()` produces an instance of `Derived`. This is possible safely because both `Base` and `Derived` believe they wrap the same C++ type. Since we can't easily intervene before the `Base` object is created, we use a call policy to intervene afterwards, and return an object of type `Derived` that wraps a `Base*` without ownership. The original `Base` object's lifetime is maintained via a keep-alive from the returned wrapper. There is not much actual code, but it's subtle so the comments are pretty long. This feature requires allowing a call policy's postcall hook to modify the return value, which was not previously supported. I chose to implement this by allowing the postcall hook to take the return value via `PyObject*&`; then a passed `PyObject*` can initialize either that or to the documented `handle`. I didn't document this new capability, because it's somewhat obscure and easy to mess up the reference counting; I figure anyone that really needs it will be able to figure it out. An alternative if you don't want to add this ability to call policies in general would be to define a new function attribute just for the new-fixup case, but that felt more invasive to me.

wjakob · 2025-01-16T03:43:27Z

I'm am concerned about the implications of nb::new_. Initially, this was a feature in the category: "cool -- why not, since it is easy to support".

Since the original commit, there have been several commits to work around limitations and address corner cases. With #861 and #859 on top, the complexity is now reaching a level that I feel is starting to be unreasonable for this project given its "nano" scope.

My first reaction is that it would be better to explain the limitations of the current feature rather than making it fully general as you propose to do in these two commits.

oremanj · 2025-01-16T07:55:33Z

Thanks for the feedback. I personally lean pretty far in the direction of taking on additional implementation complexity in order to handle all the cases rather than just the common ones, but I acknowledge that's not always the correct tradeoff, and a lot of the problems solved by these two PRs are theoretical.

The impetus for this PR #859 (the actual problem I'm trying to solve) was some code that did:

class SomeType(cppmodule.SomeBaseType):
    # ... various helper methods here, no virtual overrides or anything ...

# ... code that instantiates SomeType objects and calls the helper methods ...

When cppmodule.SomeBaseType changed from using __init__-style construction to __new__-style construction due to some refactoring in the C++ code, we were surprised to find that SomeType() no longer returned an instance of SomeType. That's different from what one would expect based on the pure-Python semantics (after class MyStr(str): pass, MyStr() still produces a MyStr object, even though str is initialized by __new__) so I figured there might be some appetite for a fix.

#861 only showed up as a problem in a particularly tortured test case that I wrote for #859, not in any real code, so I've closed it. The support in #859 for cases where __new__ returns an instance of a derived class (you tried to construct an Animal but the result was a Cat) adds quite a bit of complexity and is not needed by the case that I actually ran into, so I'll upload a version without that. If you still think it's too complex, I can go in the direction of documenting limitations instead. I do hope you might be willing to accept at least the two-line change to nb_attr.h (which I'm happy to write as a separate PR with docs/tests) so that anyone who does need the return type fixup implemented here has the option of implementing it using public API.

wjakob · 2025-01-27T12:12:51Z

Thank you @oremanj!

oremanj added 2 commits January 16, 2025 01:36

Simplify implementation, document limitations

de5192d

Tweak docs

ae7e061

wjakob force-pushed the master branch 2 times, most recently from 98def19 to 92d9cb3 Compare January 27, 2025 01:29

wjakob merged commit 534fd8c into wjakob:master Jan 27, 2025
31 checks passed

oremanj deleted the new-returntype branch January 27, 2025 17:26

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Better handling for Python subclasses of types that use nb::new_() #859

Better handling for Python subclasses of types that use nb::new_() #859

oremanj commented Jan 15, 2025 •

edited

Loading

wjakob commented Jan 16, 2025 •

edited

Loading

oremanj commented Jan 16, 2025 •

edited

Loading

wjakob commented Jan 27, 2025

Better handling for Python subclasses of types that use nb::new_() #859

Better handling for Python subclasses of types that use nb::new_() #859

Conversation

oremanj commented Jan 15, 2025 • edited Loading

wjakob commented Jan 16, 2025 • edited Loading

oremanj commented Jan 16, 2025 • edited Loading

wjakob commented Jan 27, 2025

oremanj commented Jan 15, 2025 •

edited

Loading

wjakob commented Jan 16, 2025 •

edited

Loading

oremanj commented Jan 16, 2025 •

edited

Loading