-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split pybind11 into headers and cpp files to speedup compilation #2445
base: master
Are you sure you want to change the base?
Conversation
I'm happy to help with the CMake setup if this continues. :) |
@@ -17,32 +17,30 @@ class options { | |||
public: | |||
|
|||
// Default RAII constructor, which leaves settings as they currently are. | |||
options() : previous_state(global_state()) {} | |||
options(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still an interesting discussion to be had: where do we draw the line. I.e., even in "normal" C++ code, split up as usual over headers and cpp-files, this constructor might have been declared here.
I guess this won't be the part adding compilation overhead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same with the other functions, actually. This might be a part that's good to keep as is, and to have inline even in a build with source files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this particular case, it is not immediately obvious if the function is trivial or not, because global_state()
could in theory expand to a huge inlined function (although this is unlikely since large functions are generally not inlined). A better rule might be to keep in headers only constructor that perform only trivial assignments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaving aside the organization of the files (cpp
files shouldn't be in include
, I expect? I don't know how other projects do this), the build system (luckily we have @henryiii for this, willing to help out, so I'm not worried about this :-) ), and some bikeshedding on names (PYBIND11_DECLARATIONS_ONLY
is better, but still not a fan; if I think of something better, I'll shout, but this is not urgent in the current context), I think this all looks pretty good :-)
The more I'm making PRs and compiling (and running) pybind11's test suite, the more I think I really want this PR to go in ;-)
The one thing we should still discuss is what inline declarations are still fine to keep in the headers. I'm thinking of empty constructors (maybe they should become = default;
actually), or things as simple as setValue(int value) { m_value = value; }
. Also in non-header-only projects, these are often left in the definition of a class in the header, no? So it would be worth testing out how much they influence the time it takes to compile; I'm hoping it'd be negligible.
cpp_function(); | ||
cpp_function(std::nullptr_t); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here. This might just stay as it is. The real gain comes from things like initialize_generic
, right?
Btw, compilation is failing because |
I think |
Sounds good to me! That should solve the circular dependency problem and the |
e4e60a5
to
5bde1c3
Compare
Ah, I found the problem, I had renamed HEADERS_ONLY to DECLARATIONS_ONLY in the last moment before pushing but forgot to do it in the CMakeLists.txt. With the correct name
Passing I would not recommend splitting yet another file to reduce maintenance overhead. |
Thanks, Henry, I might need it later on! Does the offer extends to quantum field theory help as well? Just kidding! 😆 I'm now at the point of trying to understand why can't we just use partial differential equations like Dirac's equation as done in other proper branches of physics for the model/whatever it is that you use instead 😆 One day, one day.
OK, I'll try to leave all empty contructors/trivial getters/setters in and check if the compilation time problem is solved (I expect like you that it will). |
OK, the cmake help might be needed now XD The new failures are for |
I don't like the idea of having an Better separation reduces maintenance overhead, too. |
I do think I agree with @henryiii, here. Also, fmtlib instantiates some templates in the But we can do this step by step and try this out only later, afaic, if you prefer first continuing as you started. |
OK, I'm going to do another round of coding now :-)
About .cpp in include, yes, splitting a file is the only solution, so if you guys think it's worth if for that (I don't, but bikeshedding), I'll do it, no problem. About saving ifdefs however, I don't think any of the current boilerplate would be reduced by that change, but rather slightly increased. The current boilerplate is: include/*.h:
include/*.cpp
detail/common.h
and you must compile with -DPYBIND11_DECLARATIONS_ONLY=1. What I understand, is that if I created a src/options.cpp: a new file with two new lines added:
.h: we still need to conditionally include the implementation on that header as before (now would be .inl.h: same as the old .cpp, but we can remove the detail/common.h: we still need And you must still compile with
but that further increases the boilerplate on several files. Let me know if I got anything above wrong/if there's a better way. |
523fad4
to
ca375c8
Compare
OK, I've converted all files to .cpp now, and previous testing state is unchanged. If after my last comment you still feel it is better to split in -inl.h + .cpp, I'll do it on the next push, which will be needed because every new master commit is a merge conflict, yay! As before About keeping small functions on the header, I've been very strict and kept only:
and not for other things that might be considered trivial, like |
I'm still in favor of
This PR probably should focus on the procedure, structure, and extra parts (like CMake), as we may need to pause all PRs for a time while this actually finalizes and goes in. Might be a good time to close all open PRs since they will all need serious rebasing, since all code has moved (basically). I'm assuming this would be a 3.0 level feature? |
OK, I'll split like that in the next one unless other people argue otherwise 👍 b) by compiler guards do you mean c) do you have any example in mind? Notably, what could we add to the .cpp file that would not break the headerness-only of the library? As things are looking now (this changed a bit from my previous comments after some thought), in the inl.h, the .cpp file will contain exactly one line:
Just to clarify, I was a bit anxious to do the full conversion to ensure that it was still going to work with gem5 and have the speedup, but I checked that now and it worked :-) Like you said, we should do the review here, and then I'll do a rebase after. But if other things can be paused, I wont complain 😁 One problem is that the automated tests don't seem to run if there are conflicts? But like I said, the next hopefully last "test blocker" is the |
The automated tests build on the PR merged with the branch it's targeting. So if there are conflicts, it can't run. You can manually run form your fork, though, just click "actions" in your fork, then the workflow to run (CI), and there will be a manual trigger dropdown there. |
b) I think, especially as we simplify this (I think pybind11 will actually have a few nice simplifications and dropped predeclarations when this is in place), that not having include guards on files you are including is likely dangerous. If we do the inl.h route, it's also easy to fuze the compile if there are files that should compile together. There's a significant memory/time overhead in setting up a compile unit, so being able to list multiple |
Also, CMake assume .cpp are built, so you'll have to loop over all the included files and force them to be considered headers to get it to work properly in header-only mode if you put .cpp's that are really optionally compiled headers. |
Added basic CMake support. You'll need to fix |
84b0a49
to
b25310b
Compare
Thanks so much for the cmake update @henryiii !!! I squashed with my commit, and I also had to add a new commit to make things work with gem5 with missing But when
Maybe I messed something up, but the cmake changes appear to be like you made them. Can you double check? |
b25310b
to
4878c88
Compare
OK, I split the second unrelated commit to #2476 now. |
1b25476
to
258ab9b
Compare
As discussed at pybind#2322, the split is opt-in, and has been observed to speed up the compilation of the gem5 project by around 40% from 25 minutes to 15 minutes. If the PYBIND11_DECLARATIONS_ONLY is not defined, then the .cpp files are included in the .h files, and everything works header-only as was the case before this commit. If PYBIND11_DECLARATIONS_ONLY=1, then only declarations are made visible from, the header files, and the user must the user must compile the pybind11 .cpp files, also using PYBIND11_DECLARATIONS_ONLY=1, into objects and add link those into the final link. This commit also updates CMakeLists to automate building those object files. This commit has been tested as follows. First, the original build and pytest (without PYBIND11_DECLARATIONS_ONLY) work as before: mkdir build cmake .. make -j `nproc` make pytest TODO: if this commit gets traction, I will try to add a test for the PYBIND11_DECLARATIONS_ONLY version too, otherwise it will likely break. I'd just re-run the entire pybind with PYBIND11_DECLARATIONS_ONLY as well every time. The commit has also been tested with the following minimal manual example: class_test.cpp ``` struct ClassTest { ClassTest(const std::string &name) : name(name) {} void setName(const std::string &name_) { name = name_; } const std::string &getName() const { return name; } std::string name; }; struct ClassTestDerived : ClassTest { ClassTestDerived(const std::string &name, const std::string &name2) : ClassTest(name), name2(name2) {} std::string getName2() { return name + name2 + "2"; } std::string name2; }; namespace py = pybind11; PYBIND11_MODULE(class_test, m) { m.doc() = "pybind11 example plugin"; py::class_<ClassTest>(m, "ClassTest") .def(py::init<const std::string &>()) .def("setName", &ClassTest::setName) .def("getName", &ClassTest::getName) .def_readwrite("name", &ClassTest::name); py::class_<ClassTestDerived, ClassTest>(m, "ClassTestDerived") .def(py::init<const std::string &, const std::string &>()) .def("getName2", &ClassTestDerived::getName2) .def_readwrite("name", &ClassTestDerived::name); } ``` class_test_main.py ``` import class_test my_class_test = class_test.ClassTest('abc'); assert(my_class_test.getName() == 'abc') my_class_test.setName('012') assert(my_class_test.name == '012') my_class_test_derived = class_test.ClassTestDerived('abc', 'def'); assert(my_class_test_derived.getName2() == 'abcdef2') ``` without PYBIND11_DECLARATIONS_ONLY (`python3-config --cflags`) is opened up and hacked to point to the custom pybind11 source: ``` g++ \ -save-temps \ -I/usr/include/python3.8 \ -I/home/ciro/git/pybind11/include \ -Wno-unused-result \ -Wsign-compare \ -g \ -fdebug-prefix-map=/build/python3.8-fKk4GY/python3.8-3.8.2=. \ -specs=/usr/share/dpkg/no-pie-compile.specs \ -fstack-protector \ -Wformat \ -Werror=format-security \ -DNDEBUG \ -g \ -fwrapv \ -O3 \ -Wall \ -shared \ -std=c++11 \ -fPIC class_test.cpp \ -o class_test`python3-config --extension-suffix` \ `python3-config --libs` \ ; ./class_test_main.py ``` with PYBIND11_DECLARATIONS_ONLY: ``` g++ \ -DPYBIND11_DECLARATIONS_ONLY=1 \ -save-temps \ -I/usr/include/python3.8 \ -I/home/ciro/git/pybind11/include \ -Wno-unused-result \ -Wsign-compare \ -g \ -fdebug-prefix-map=/build/python3.8-fKk4GY/python3.8-3.8.2=. \ -specs=/usr/share/dpkg/no-pie-compile.specs \ -fstack-protector \ -Wformat \ -Werror=format-security \ -DNDEBUG \ -g \ -fwrapv \ -O3 \ -Wall \ -shared \ -std=c++11 \ -fPIC class_test.cpp \ /home/ciro/git/pybind11/build/libpybind11.a \ -o class_test`python3-config --extension-suffix` \ `python3-config --libs` \ ; ./class_test_main.py ```
73b6804
to
94a24c9
Compare
OK, after hitting my head a bit more and trying a bunch of stuff, I managed to get Doxygen and therefore all tests passing: https://github.com/cirosantilli2/pybind11/actions/runs/278383781 Ideally we should also add one test run with the split enabled, and even better, also add one But yeah, CMake 😨 😨 😨 Let me know how you want to proceed. |
Ping 🖖 |
Sorry for the delay, this has been and will be on hold for a bit while we work on getting 2.6.0 out. Once that is out, then hopefully several of us will have more time to help with this. We should be able to do a test with and a test without, it should be easy to turn on and off for testing. |
Ahh, thanks, I didn't know about the 2.6.0 release coming, I'll wait for that then. |
Ping while waiting for a compilation :-) |
Happy new year ping! 🥳 🥳 🥳 🥳 🥳 🥳 🥳 |
Happy new year, @cirosantilli2! :-) I'll ping along ;-) @henryiii, what about picking this up once 2.6.2 has been released? Is there anything stopping us from pushing this into 2.7.0? |
We can make it 2.7.0 or 2.8.0. Personally I'm going to prioritize CLI11's transition first, and then hopefully learn from what I find there. But that shouldn't stop anyone else from working on this! |
For context, the 2.6.2 release should happen "soon"; after that I think the codebase is fairly open, except for maybe a small handful of waiting PR? |
There will always be some PR waiting, but after running tests again and again and again, in the context of that Valgrind PR, I'm yearning for a faster build! |
Oooops, I had made some email setting changes, and didn't get notified after my ping! Should be fixed now. When the time arrives, just say NOW, and I'll jump on it and rebase 💯 |
Hi! My name is Gabe, and I'm a main developer/maintainer on the gem5 project. We are anxiously (but also patiently!) waiting for this change to go in, since it will be a major improvement in our build overhead for our tooling, etc, and also a major quality of life improvement for people who have to build our project often, like myself :-). |
Friendly ping :) May I ask if there is a timeline for this PR to get accepted? |
We will be having a maintainers meeting tomorrow and this is one of our talking points. :) So hopefully will know more after that. |
Now that we have the Python 2 code removal and global clang-format PRs out of the way, we are interested in tackling both runtime and compilation performance of pybind11 and are happy to discuss compilation performance improvements such as this. |
This is great @Skylion007. Do you have a timeline? Let us know what we can do to help you in the process |
@Skylion007 any update on a roadmap for this? I'm also interested in getting this to work and could help if needed. |
@royjacobson We would welcome a new version of this PR. The main issue is that it's gotten way too stale and needs to be updated. There would also likely have to be a similar PR to address the smart holder branch as well. |
Also ping @giactra if you are interested in helping out. |
Sounds great! I would like to get the CMake + doxygen changes committed with one sample TU to make sure they work, and then proceed to split the remaining headers in subsequent PRs. |
@Skylion007 @royjacobson Yes happy to help. |
Edit: I updated it to all files. But some time passed, so a massive rebase will be needed, which I'll do when I get the greenlight from contribs that the unrebased patch is the approach that we want.
TODO Fo now, only pybind11.h and options.h have been split. If maintainers
feel that this is going in the right direction, I will update this commit
to split all files.
As discussed at #2322, the split
is opt-in, and has been observed to speed up the compilation of the gem5
project by around 40% from 25 minutes to 15 minutes.
If the PYBIND11_DECLARATIONS_ONLY is not defined, then the .cpp files are
included in the .h files, and everything works header-only as was the case
before this commit.
If PYBIND11_DECLARATIONS_ONLY=1, then only declarations are made visible
from, the header files, and the user must the user must compile the
pybind11 .cpp files, also using PYBIND11_DECLARATIONS_ONLY=1, into objects
and add link those into the final link. This commit also updates CMakeLists
to automate building those object files.
This commit has been tested as follows.
First, the original build and pytest (without PYBIND11_DECLARATIONS_ONLY)
work as before:
TODO: if this commit gets traction, I will try to add a test for the
PYBIND11_DECLARATIONS_ONLY version too, otherwise it will likely break.
I'd just re-run the entire pybind with PYBIND11_DECLARATIONS_ONLY as well
every time.
The commit has also been tested with the following minimal manual example:
class_test.cpp
class_test_main.py
without PYBIND11_DECLARATIONS_ONLY (
python3-config --cflags
) is opened upand hacked to point to the custom pybind11 source:
with PYBIND11_DECLARATIONS_ONLY: