-
Notifications
You must be signed in to change notification settings - Fork 781
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gtsam doesn't build on i386 or armhf #1605
Comments
I need to spin up an armhf box first, thanks for reporting! Also, do you also build the Python wrapper? Currently I think they need to be built together. |
Hi. Thanks for replying
I need to spin up an armhf box first, thanks for reporting!
I see the same behavior on i686 and armhf, and since you probably have
an amd64 box, you can more easily reproduce the i686 failure. If you
have an amd64 install of debian, install the i686 cross-compiler:
sudo apt install g++-i686-linux-gnu
And then you can compile the offending file as noted in the bug report.
But tell it to use the i686 compiler. I see this:
***@***.***:~/debianstuff/gtsam$ i686-linux-gnu-g++ \
-DBOOST_ALL_NO_LIB \
-DBOOST_ATOMIC_DYN_LINK \
-DBOOST_CHRONO_DYN_LINK \
-DBOOST_DATE_TIME_DYN_LINK \
-DBOOST_FILESYSTEM_DYN_LINK \
-DBOOST_PROGRAM_OPTIONS_DYN_LINK \
-DBOOST_REGEX_DYN_LINK \
-DBOOST_SERIALIZATION_DYN_LINK \
-DBOOST_SYSTEM_DYN_LINK \
-DBOOST_THREAD_DYN_LINK \
-DBOOST_TIMER_DYN_LINK \
-DNDEBUG \
-I"." \
-I"CppUnitLite" \
-isystem /usr/include/eigen3 \
-fstack-protector-strong \
-Wformat \
-Werror=format-security \
-Wno-deprecated-declarations \
-Wdate-time \
-D_FORTIFY_SOURCE=2 \
-DNDEBUG \
-Wall \
-fPIC \
-Wreturn-local-addr \
-Werror=return-local-addr \
-Wreturn-type \
-Werror=return-type \
-Wformat \
-Werror=format-security \
-Wsuggest-override \
-Wno-unused-local-typedefs \
-o /tmp/tst.o \
-c "gtsam/linear/tests/testSparseEigen.cpp"
In file included from ./gtsam/base/FastSet.h:28,
from ./gtsam/inference/Key.h:22,
from ./gtsam/inference/Ordering.h:23,
from ./gtsam/inference/EliminateableFactorGraph.h:26,
from ./gtsam/linear/GaussianFactorGraph.h:24,
from gtsam/linear/tests/testSparseEigen.cpp:21:
./gtsam/base/Testable.h: In instantiation of 'bool gtsam::assert_equal(const V&, const V&, double) [with V = int]':
gtsam/linear/tests/testSparseEigen.cpp:44:3: required from here
./gtsam/base/Testable.h:99:26: error: incomplete type 'gtsam::traits<int>' used in nested name specifier
99 | if (traits<V>::Equals(actual,expected, tol))
| ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
./gtsam/base/Testable.h:102:21: error: incomplete type 'gtsam::traits<int>' used in nested name specifier
102 | traits<V>::Print(expected,"expected:\n");
| ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
./gtsam/base/Testable.h:103:21: error: incomplete type 'gtsam::traits<int>' used in nested name specifier
103 | traits<V>::Print(actual,"actual:\n");
| ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~
Also, do you also build the Python wrapper? Currently I think they
need to be built together.
I do, yes. This creates other problems, but I have workarounds. Once all
the things build on all the arches, I'll submit more bug reports to
hopefully ingest the fixes into your tree.
|
@dkogan I think you only need to change the |
Thanks much. Appears to work. Doing another upload; let's see how it does... |
OK, this issue appears to be fixed, but now the tests fail on armhf: https://buildd.debian.org/status/fetch.php?pkg=gtsam&arch=armhf&ver=4.2%7E9%2Bdfsg-5&stamp=1692186176&raw=0 The other arches still have problems too: https://buildd.debian.org/status/package.php?p=gtsam I'm going to have to come back to this later. If you want to work on some of these issues in the meantime, that would be great. The package tree is here: https://buildd.debian.org/status/package.php?p=gtsam |
@dkogan I think we currently only support 64-bit platforms, is this supported on Debian? |
Fan Jiang ***@***.***> writes:
@dkogan I think we currently only support 64-bit platforms, is this
supported on Debian?
What is "support"? There's no reason gtsam should be arch-specific: it's
a tool to solve a math problem.
The 10 arches on top of the buildd page are the ones that must be
functional for this to be included in Debian:
https://buildd.debian.org/status/package.php?p=gtsam
The build issues are varied. armel needs wants to be linked with
-latomic. I asked for that in the build, but apparently only some parts
of the build respected that setting. This is almost certainly a bug in
the gtsam build system:
https://buildd.debian.org/status/fetch.php?pkg=gtsam&arch=armel&ver=4.2%7E9%2Bdfsg-5&stamp=1692172353&raw=0
armhf has failing tests:
https://buildd.debian.org/status/fetch.php?pkg=gtsam&arch=armhf&ver=4.2%7E9%2Bdfsg-5&stamp=1692186176&raw=0
mips64el has some linker problem I haven't debugged yet:
https://buildd.debian.org/status/fetch.php?pkg=gtsam&arch=mips64el&ver=4.2%7E9%2Bdfsg-5&stamp=1692184823&raw=0
mipsel also wants -latomic and apparently the debug symbols are too
large to fit into its ELF segments:
https://buildd.debian.org/status/fetch.php?pkg=gtsam&arch=mipsel&ver=4.2%7E9%2Bdfsg-5&stamp=1692192262&raw=0
And so on. None of these are inherent limitations in gtsam. It's all
fixable, but will take some time. I'll get back to it eventually, but if
you want to work on it in the meantime, you can.
|
Hi. It looks like the failure mode on i686 and armhf is the same one: the build completes, but the tests fail: https://buildd.debian.org/status/fetch.php?pkg=gtsam&arch=i386&ver=4.2%7E9%2Bdfsg-5&stamp=1692266835&raw=0 Since you don't need any new hardware on i686, can I get you to look into that? You'll be far more effective at debugging this than I. Thanks. |
@dkogan I can look at it this week |
Tagging @jlblancoc for this is a Debian-related issue |
First, thanks a lot @dkogan for finally being brave enough to move this forward! :-) All these errors are caused by assumptions and/or never-happened-yet situations about the different sizes of "int", "long", "size_t" in those "uncommon" archs, it's not related directly with Debian packaging or flags per se... |
PS: The comment above was for build errors. For failing tests, I cannot see a clear reason from the buildd logs. In the past, reasons for tests to fail in other projects on "uncommon" archs have been:
It's still a long track ahead, but it's worth it! 👍 |
Hi. Yes. You could run valgrind or use the Debian build infrastructure
and so on. But let's start with i686. This doesn't require anything that
we all don't have already, and should be very simple to reproduce. I
don't have the cycles currently. Can one of you please look at it?
Thanks.
|
I think I know what is the issue, it's because we expect Key to be the same as Eigen::Index which is 32 bit in some archs. I would say now we have a failure case for #1522 |
Fan Jiang ***@***.***> writes:
I think I know what is the issue, it's because we expect Key to be the
same as Eigen::Index which is 32 bit in some archs. I would say now we
have a failure case for #1522
That looks like a good candidate. Are the patches in that PR more or
less done? Should I try applying them to the Debian builds?
|
@dkogan Yes that would be a good test, let's apply and see what happens |
The patches in #1522 fixed most of the test failures but not all. I just tried on armhf, and there's one failure still:
Yall should test this all yourselves I think. I would start with i686. I suggest using a Debian install on your native arch (presumably amd64), crossing to i686. This is trivial to set up, and works without emulation. Let me know if you need help. |
If you can root-cause it and suggest a fix/PR that I think would be the most helpful. Might be some cross-platform non-determinism that we have not yet encountered. |
This has a lot of gtsam-specific complexity, and you can debug and fix this much faster than I can. Building on i686 is trivial for you to do (as described above), and you should start there. Let me know if you need help setting this up. |
This seems to be just an order issue in the comparison of Bayes tree, these two trees are equivalent. @dkogan I think you can mask this test in i386 first and we can work on a fix to the comparison |
Fan Jiang ***@***.***> writes:
This seems to be just an order issue in the comparison of Bayes tree,
these two trees are equivalent. @dkogan I think you can mask this test
in i386 first and we can work on a fix to the comparison
OK. So to be clear I should:
- Apply the 3 patches in #1522
- Disable this one test that failed with armhf
Yes? I'm certain there will be other issues, but this should get us
closer.
|
Yes :) |
Alright, I tried it. Build logs are here: https://buildd.debian.org/status/package.php?p=gtsam Click on the red "Build-Attempted" to get the log for each particular arch. Notes:
The other arches either succeeded, or haven't been built yet. Can somebody please take a look? Thanks |
|
Yep, seems like abuse of |
OK. Once some form of #1625 is merged, I'll pull that into my patch stack. And now that all the builds are done, I can see that some other arches failed in different ways. I think I fixed what mips64el is complaining about. Can somebody please look at the test failures for hppa and x32: |
@dkogan It's near conference deadline recently so no bandwidth for me, but after ICRA sub deadline I can look into the Python failures. |
Fan Jiang ***@***.***> writes:
@dkogan It's near conference deadline recently so no bandwidth for me,
but after ICRA sub deadline I can look into the Python failures.
I'm not in a rush at all. Is the ICRA deadline coming? I should do
something too...
|
#1625 was just merged, so this round of i386 issues should be fixed. Thanks! The armel, armhf ones would be unaffected though. |
Today our automated packaging broke because of the introduction of libcephes-gtsam.so into the development branch. As the name suggests, it's a private build of libcephes since no debian package exists. |
Hi. If you're including more dependencies you didn't write, it would be great if you had a clear switch in the code to avoid using them. |
Hi. I pushed updated the Debian package to the new 4.2.0 release. And the 32-bit architectures still fail tests. For i386: https://buildd.debian.org/status/fetch.php?pkg=gtsam&arch=i386&ver=4.2.0%2Bdfsg-1&stamp=1704797861&raw=0 The first failure:
This looks to me like it couldn't run the test code, not that it failed some threshold. Can one of you please take a look? Thanks. |
@varunagrawal Is libcephes a header-only library? If so maybe it does not need to be built at all. We can make all function inline and remove the lib. @dkogan You are right, this error is potentially a wrong dynamic_cast. There is a recent issue #1685 about similar stuff. I can look into this error this week. |
Unfortunately it isn't and I don't have the bandwidth to go deep into updating it to be header-only (especially since there is no test suite). |
From a search this is the only place using cephes, is it correct? |
Yes, at least for now. |
I am not sure I understand what the issue with Cephes is here. @dkogan's latest comment seems to be referring to something else. |
Just thinking about how can we make packaging easier :) |
Sorry I think I somewhat hijacked this issue, which was about gtsam not building on i386/armhf. The point I'm raising about the introduction of libcephes is in so far relevant that it may go counter to @dkogan 's efforts to make gtsam an acceptable Debian package (which the present issue is also concerned with). It was my understanding that shipping binaries of other libraries along with gtsam is less than ideal, see |
The two seem orthogonal? The reasoning for including cephes is to enable functionality (specifically of the GNC optimizer), so sacrificing functionality for the sake of packaging seems to be akin to getting a poor gift but really expensive wrapping paper. There is a single function call but behind the scenes, multiple files and functions are required to make it work. That being said, if there was an easier way to enable GNC without Cephes, I would do it. Unfortunately, re-implementing code provided by Cephes in a manner that keeps up with the quality of GTSAM is out of the question for me since I have been working on it by myself without any help for many months, and I have other things on my plate. Unless someone is willing to step up and help with development, it feels a bit disingenuous to ask for changes to make others lives easier while simultaneously making mine harder. |
Hi. I pushed gtsam into Debian, and I see (among other things) that it doesn't
build on armhf and i386. Can one of you please take a look? It'll take yall much
less time to detangle this than it would take me.
Here's the i386 build log: https://buildd.debian.org/status/fetch.php?pkg=gtsam&arch=i386&ver=4.2%7E9%2Bdfsg-4&stamp=1692035308&raw=0
The punchline is that when building
gtsam/linear/tests/testSparseEigen.cpp
this happens:
It's reproducible standalone, without cmake like this:
Thanks much
The text was updated successfully, but these errors were encountered: