Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make -j all (build using multiple jobs) may not work #54

Open
weslleyspereira opened this issue Mar 7, 2022 · 7 comments
Open

make -j all (build using multiple jobs) may not work #54

weslleyspereira opened this issue Mar 7, 2022 · 7 comments

Comments

@weslleyspereira
Copy link
Collaborator

PR #53 fixed the dependencies of the ScaLAPACK libraries on BLACS. It should now be safe to use make -j lib, i.e., to use multiple jobs to build. However, make -j all may still fail as related in PR #53.

@prj-
Copy link
Contributor

prj- commented Mar 8, 2022

Upon further inspection, make -j lib is still buggy, probably because of this:

$ cat SRC/Makefile
[...]
single: $(SLASRC) $(SCLAUX) $(ALLAUX)
	$(ARCH) $(ARCHFLAGS) ../$(SCALAPACKLIB) $(SLASRC) $(SCLAUX) \
	$(ALLAUX)
	$(RANLIB) ../$(SCALAPACKLIB)

complex: $(CLASRC) $(SCLAUX) $(ALLAUX)
	$(ARCH) $(ARCHFLAGS) ../$(SCALAPACKLIB) $(CLASRC) $(SCLAUX) \
	$(ALLAUX)
	$(RANLIB) ../$(SCALAPACKLIB)

double: $(DLASRC) $(DZLAUX) $(ALLAUX)
	$(ARCH) $(ARCHFLAGS) ../$(SCALAPACKLIB) $(DLASRC) $(DZLAUX) \
	$(ALLAUX)
	$(RANLIB) ../$(SCALAPACKLIB)

complex16: $(ZLASRC) $(DZLAUX) $(ALLAUX)
	$(ARCH) $(ARCHFLAGS) ../$(SCALAPACKLIB) $(ZLASRC) $(DZLAUX) \
	$(ALLAUX)
	$(RANLIB) ../$(SCALAPACKLIB)
[...]

The target files are being overwritten by the different rules, and one may end up with errors such as:

/usr/bin/ar: ../libscalapack.a: error reading blacs_grid_.oo: file truncated
/usr/bin/ar: ../libscalapack.a: error reading free_handle_.o: file truncated

@prj-
Copy link
Contributor

prj- commented Mar 8, 2022

Just FYI, one functioning approach (at least that's what's being done when building ScaLAPACK in PETSc) is available in this commit: https://bitbucket.org/petsc/pkg-scalapack/commits/bf5290f5d002fe43a6fc59f83730ff5141860380. But I don't know if you'd be willing to merge that in @langou?

@weslleyspereira
Copy link
Collaborator Author

Hi @prj-. I agree, there is a problem there. Thanks for taking your time to explain this!

I am sorry but I am not sure I understood your suggestion for the use of https://bitbucket.org/petsc/pkg-scalapack/commits/bf5290f5d002fe43a6fc59f83730ff5141860380. Are you suggesting we call $(ARCH) $(ARCHFLAGS) $(SCALAPACKLIB) ... only once for all precisions? Or is it something else?

Using a single call of $(ARCH) $(ARCHFLAGS) $(SCALAPACKLIB) ... is a good way to go in my opinion. That is also working on LAPACK (https://github.com/Reference-LAPACK/lapack/blob/a7c0392a190998f550cbf064f711b9575c8410cf/SRC/Makefile)

ALLOBJ = $(SLASRC) $(DLASRC) $(DSLASRC) $(CLASRC) $(ZLASRC) $(ZCLASRC) \
   $(SCLAUX) $(DZLAUX) $(ALLAUX)

ifdef USEXBLAS
ALLXOBJ = $(SXLASRC) $(DXLASRC) $(CXLASRC) $(ZXLASRC)
endif

ifdef BUILD_DEPRECATED
DEPRECATED = $(DEPRECSRC)
endif

.PHONY: all
all: $(LAPACKLIB)

LAPACKLIB_DEPS := $(ALLOBJ)

ifdef USEXBLAS
  LAPACKLIB_DEPS += $(ALLXOBJ)
endif

ifdef BUILD_DEPRECATED
  LAPACKLIB_DEPS += $(DEPRECATED)
endif

$(LAPACKLIB): $(LAPACKLIB_DEPS)
	$(AR) $(ARFLAGS) $@ $^
	$(RANLIB) $@

With this kind of approach, I think we would still be able to build for a single precision type.

@prj-
Copy link
Contributor

prj- commented Mar 12, 2022

Are you suggesting we call $(ARCH) $(ARCHFLAGS) $(SCALAPACKLIB) ... only once for all precisions? Or is it something else?

I'm now realizing that the trick there is indeed to have a single rule for all precisions, which may not be acceptable to you.

With this kind of approach, I think we would still be able to build for a single precision type.

Right, I think this file https://bitbucket.org/petsc/pkg-scalapack/src/bf5290f5d002fe43a6fc59f83730ff5141860380/Makefile.objs could be parsed to split objects among the different precisions. (not sure whether this is really needed) Then, the Makefile would match the one of LAPACK indeed.

@mgates3
Copy link

mgates3 commented Mar 13, 2022

Using a single ar cr is the right thing to do, rather than multiple ar cr updating the same file. In terms of correctness and speed, I advocate moving to a non-recursive Makefile, where the top-level Makefile can see the whole dependency DAG. See "Recursive Make Considered Harmful". A non-recursive Makefile could be either a single Makefile for the entire project (SLATE uses this approach), or a top-level Makefile that includes Makefiles for each sub-directory (MAGMA uses this approach), which has the same logical effect. With this non-recursive structure, linking a shared library (.so) is easy, too.

Here's an example non-recursive Makefile structure. I'm sure there are many more subtleties in ScaLAPACK's Makefiles.

There are ways to write it such that running make in, say, the BLACS directory compiles only the BLACS, but I suspect this is overkill for ScaLAPACK.

The use of make single double complex complex16 seems inherently problematic for parallelism. It defines 4 targets that all write via ar cr to the same library. Better would be to define 4 variables that select the appropriate files to go into $(lib_obj). E.g., from the example repo:

single ?= 1
...
ifeq ($(single),1)
    lib_obj += $(dir1_s_obj)
endif

@mgates3
Copy link

mgates3 commented Mar 14, 2022

Here's a take on implementing this in ScaLAPACK, so far just the BLACS library. If this seems like a good direction, can @weslleyspereira or someone do the other directories? Or I can, but may not have time immediately. I use this myself and have seen several people burned by make -j failing, so it seems useful to fix.
https://github.com/mgates3/scalapack/tree/non-recursive-make
mgates3@d51c709?diff=split

@weslleyspereira
Copy link
Collaborator Author

Thanks for the complete answers, @mgates3 !! I also don't know if I have time to work on it now, but I will keep your solution in my radar.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants