Extreme efficiency enhancements #151

Draft
wants to merge 97 commits into
base: master
f3e08e2
Implement adaptive rcut_low for cell list overlap detection with lj.
rwsmith7531 Jul 21, 2022
896f6f5
Disable charge correction to adaptive rminsq when charge interactions…
rwsmith7531 Jul 21, 2022
904fab1
Move precalculation of rcut_lowsq.
rwsmith7531 Jul 21, 2022
31a559f
Correctly compute pair minimum qq.
rwsmith7531 Aug 10, 2022
4a15546
Add more detailed Widom insertion output.
rwsmith7531 Aug 10, 2022
2c0b799
Merge branch 'type_pair_rmin' into timed_adaptive_rmin
rwsmith7531 Aug 10, 2022
e943784
Initialize t_cpu to zero prior to parallel section in widom_insert
rwsmith7531 Aug 16, 2022
2c8ec51
Let type max charge be negative and let type minimum charge be positive.
rwsmith7531 Aug 17, 2022
f4fb8a9
Merge branch 'widom_species_timing' into timed_adaptive_rmin
rwsmith7531 Aug 17, 2022
44da93a
Merge branch 'type_pair_rmin' into timed_adaptive_rmin
rwsmith7531 Aug 17, 2022
415042c
Add cell lists for neighbor-finding.
rwsmith7531 Aug 19, 2022
3a38beb
Merge branch 'cbmc_cell_list' into cbmc_cell_list_merge
rwsmith7531 Aug 22, 2022
62b7e98
Fix CBMC cell list bugs.
rwsmith7531 Sep 15, 2022
575b111
Merge branch 'cbmc_cell_list' into cbmc_cell_list_merge
rwsmith7531 Sep 15, 2022
0955297
Estimate appropriate value for Umax to use.
rwsmith7531 Sep 19, 2022
16445ae
Fix CBMC dihedral sampling.
rwsmith7531 Sep 21, 2022
a2a2dfc
Merge branch 'fix_dihedral_selection' into Eij_max_estimation
rwsmith7531 Sep 21, 2022
efd9d41
Flag overlapping cbmc dihedral trials as overlap.
rwsmith7531 Oct 18, 2022
d03ee55
Update test examples that were failing because they were originally g…
rwsmith7531 Oct 21, 2022
e0ad108
Update examples corresponding to the corrected tests.
rwsmith7531 Oct 21, 2022
15f50d4
Add support for reading .xtc trajectory files.
rwsmith7531 Dec 13, 2022
5713224
Merge branch 'fix_dihedral_selection' into Eij_max_estimation
rwsmith7531 Jan 16, 2023
6eabb4f
Add atom pair energy table for intermolecular CBMC trial energies.
rwsmith7531 Jan 17, 2023
a96cd3e
Fix bugs with atompair_nrg_table.
rwsmith7531 Jan 19, 2023
650a294
Add atompair rminsq table feature with limited rminsq resolution.
rwsmith7531 Feb 2, 2023
26eca17
Move some memory allocation from stack to heap to avoid stack buffer …
rwsmith7531 Feb 7, 2023
a58cf8d
Enable custom tolerance list for atompair rminsq table creation.
rwsmith7531 Feb 9, 2023
8c277a2
Merge branch 'read_xtc' into atompair_rmin_xtc
rwsmith7531 Feb 9, 2023
2a55824
Include libgmxfort and libxdrfile and update Makefiles.
rwsmith7531 Feb 15, 2023
e17516b
Add ability to choose whether some large private arrays are kept on t…
rwsmith7531 Feb 20, 2023
821da98
Make recommended Eij_max estimation optional.
rwsmith7531 Feb 21, 2023
e9f7f62
Stop using Eij_ind_ubound before it is set.
rwsmith7531 Feb 23, 2023
3fbeffe
Add documentation for estimating and using atom pair and atom type pa…
rwsmith7531 Feb 23, 2023
228be17
Add documentation for atompair energy table.
rwsmith7531 Feb 27, 2023
1bbc456
Add documentation for trajectory reader changes, including xtc reading.
rwsmith7531 Feb 27, 2023
17ccecb
Merge branch 'read_xtc' into atompair_rmin_xtc
rwsmith7531 Feb 27, 2023
9c63e2a
Merge branch 'atompair_rmin' into atompair_rmin_xtc
rwsmith7531 Feb 27, 2023
a483438
Make cell lists compatible with triclinic boxes.
rwsmith7531 Mar 7, 2023
be091d6
Vectorize Widom nonbonded intermolecular energy calculation
rwsmith7531 Mar 24, 2023
51c6188
Assert vector pointers are contiguous.
rwsmith7531 Mar 24, 2023
c0435b7
Implement first working version of vectorized Widom intermolecular en…
rwsmith7531 Mar 30, 2023
20b077b
Include previously missing division by zero for charge_cut energy cal…
rwsmith7531 Mar 30, 2023
eec2510
Merge branch 'widom_pair_nrg_table' into atompair_rmin
rwsmith7531 Mar 30, 2023
6785e18
Merge branch 'atompair_rmin' into atompair_rmin_xtc
rwsmith7531 Mar 30, 2023
48ec01a
Merge branch 'atompair_rmin_xtc' into triclinic_cell_list
rwsmith7531 Mar 30, 2023
e8db1df
Merge branch 'triclinic_cell_list' into vectorization
rwsmith7531 Mar 30, 2023
e279a3e
Fix new bug in rsq_min binning in widom_insert.
rwsmith7531 Mar 31, 2023
9007cab
Implement first vectorized version that provides speed boost
rwsmith7531 Apr 7, 2023
e5b5b82
Add missing jcharge_coul vector range and add assume_aligned directives.
rwsmith7531 Apr 11, 2023
e2a3c6d
Rearrange atom coordinate arrays for better vectorization.
rwsmith7531 Apr 13, 2023
e31bd0a
Remove references to j_atoms.
rwsmith7531 Apr 14, 2023
abdd9eb
Implement first working version of vectorized and gathered cell list …
rwsmith7531 Apr 19, 2023
bcf271a
Improve vectorization of Widom insertion Ewald reciprocal energy calc…
rwsmith7531 Apr 20, 2023
5913915
Merge branch 'vectorization' into gathered_cell_list
rwsmith7531 Apr 20, 2023
da038f1
WIP successfully compiles.
rwsmith7531 May 3, 2023
68a5a97
First working version
rwsmith7531 May 5, 2023
e1f576f
First version that works with multiple threads
rwsmith7531 May 5, 2023
4ef7722
Enhanced vectorization that only works with one thread unless ensembl…
rwsmith7531 May 25, 2023
afab71b
Enhanced vectorization that works with multiple threads.
rwsmith7531 Jun 6, 2023
7359aaf
Merge branch 'gathered_cell_list' into nonwidom_vectorization.
rwsmith7531 Jun 6, 2023
dd98c82
Use undamped shifted force method for cbmc electrostatics.
rwsmith7531 Jul 4, 2023
15851f3
Fix bug preventing correct Emax usage and estimation for systems with…
rwsmith7531 Jul 31, 2023
e3d881b
Remove temporary debugging output.
rwsmith7531 Jul 31, 2023
6df5a61
Update tests to involve adaptive rmin and atom pair-specific overlap …
rwsmith7531 Aug 1, 2023
571c5f0
Fix bug preventing xyz and H files specified with new syntax from ope…
rwsmith7531 Aug 4, 2023
fb9e1f8
Make lammpstrjconvert.py not center the box by default. Add/update e…
rwsmith7531 Aug 4, 2023
d937b06
Update Makefiles to not require pkg-config.
rwsmith7531 Aug 9, 2023
324d38f
Update Widom examples to match the corresponding tests in the test su…
rwsmith7531 Aug 9, 2023
1c05037
Merge branch 'master' into atompair_rmin_xtc to resolve PR branch con…
rwsmith7531 Aug 9, 2023
75790c3
Include example input for adaptive and specific overlap radii and ene…
rwsmith7531 Aug 9, 2023
fb7dcd1
Add GCMC and GEMC tests using CBMC energy table.
rwsmith7531 Aug 9, 2023
0a1233f
Change CMake settings to keep full RPATH when installing gmxfort.
rwsmith7531 Aug 11, 2023
fc96649
Correct executable name in Makefile.gfortran.openMP
rwsmith7531 Aug 11, 2023
652ca79
Clarify some parts of the documentation.
rwsmith7531 Aug 14, 2023
adbaaee
Attempt to fix linking on MacOS by setting CMake policy CMP0042 NEW.
rwsmith7531 Aug 14, 2023
f7b8d93
Attempt to set CMake policy CMP0042 NEW.
rwsmith7531 Aug 14, 2023
98de8a4
Replace '=' with ',' when specifying linker flag -rpath to be compati…
rwsmith7531 Aug 14, 2023
98c6e1e
Merge branch 'atompair_rmin_xtc' into nonwidom_vectorization
rwsmith7531 Aug 15, 2023
c11e389
Add apparently working bitcell overlap detection and CBMC cell neighb…
rwsmith7531 Sep 12, 2023
058e6b9
Add support for RB torsion dihedrals and convert other dihedral types…
rwsmith7531 Sep 21, 2023
0c7d121
Move all 'none' type dihedrals to the end of dihedral_list.
rwsmith7531 Sep 21, 2023
355cf26
Change 'pentuple angle' to 'quintuple angle' in explanation comment.
rwsmith7531 Sep 21, 2023
098477d
Merge branch 'RB_torsions' into bit_cell_overlap
rwsmith7531 Sep 21, 2023
89a3ea2
Refactor fragment_placement, enabling bitcell overlap detection and m…
rwsmith7531 Oct 13, 2023
b139e9f
Merge branch 'cbmc_sf' into bit_cell_overlap and make various improve…
rwsmith7531 Nov 15, 2023
2011ae9
Correct the triclinic functionality of cell list related enhancements…
rwsmith7531 Dec 18, 2023
7ce03f5
Implement cavity biasing and several optimizations.
rwsmith7531 Mar 11, 2024
8bee4d9
Fix minor timing bug and make changes necessary to compile with gfort…
rwsmith7531 Mar 11, 2024
2727efe
Remove some unused subroutines in energy_routines.f90.
rwsmith7531 Mar 11, 2024
db25d46
Rename load_next_frame.f90 to trajectory_reader_routines.f90
rwsmith7531 Mar 15, 2024
69389c1
Repair molecules when reading wrapped trajectories and optimize cell …
rwsmith7531 Mar 21, 2024
b7ba1ff
Compare read cell matrix this_length against orig_length instead of l…
rwsmith7531 Mar 21, 2024
be7a1b9
Optimize Ewald summation code.
rwsmith7531 Apr 18, 2024
b5bd87c
Make further enhancements to Ewald summations.
rwsmith7531 Apr 19, 2024
201a106
Merge branch 'master' into extreme_efficiency_enhancements
rwsmith7531 Apr 22, 2024
7887319
Merge branch 'improved_ewald_setup' into extreme_efficiency_enhancements
rwsmith7531 May 3, 2024
51784a9
Detect SIMD vector size and alignment from compiler options and adjus…
rwsmith7531 May 10, 2024
Add apparently working bitcell overlap detection and CBMC cell neighbor list.

Members of gathered overlap cells and cell neighbor lists are now filtered by proximity.
The CBMC cell list option would now be more appropriately called a cell neighbor list method, since
the possible neighbors for a cell are now gathered and filtered by proximity.  CBMC cells are now
the same size as overlap cells; the gathering algorithm just searches more cells to capture all possible
neighbors.  Trial insertion of the first fragment in CBMC is now largely vectorized.  CBMC dihedral trials are not yet vectorized,
but applying vectorization and bitcell overlap detection to them should be fairly straightforward.

Dimension padding currently assumes a vector size no greater than 256 bits (the size of AVX2 vector registers).  If
we want Cassandra to support AVX-512, changes are needed, since 512-bit vectors would violate the alignment
assumptions made in some ifort compiler directives.

While intermolecular CBMC energy estimation is vectorized when used
with CBMC cell neighbor lists, it can apparently still be slightly slower than computing the energy directly,
most likely due to slower memory access for the very large precomputed energy table.  I left it as an option anyway because
it may be faster for more expensive force fields.

Some cheap WRITE statements used for debugging are still present in the code
and should probably be removed to avoid excessive verbosity, especially to STDOUT.

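As a rough illustration of the 256-bit padding assumption mentioned in this commit message (a sketch only, not code from this commit; `padded_length` and its arguments are hypothetical names), an array dimension can be rounded up to a whole number of SIMD lanes. The last check shows why a length padded for 256-bit vectors is not necessarily valid for 512-bit ones:

```fortran
PROGRAM pad_dimension_sketch
  ! Hypothetical helper, for illustration only; Cassandra's actual padding
  ! logic is not shown in this diff.
  IMPLICIT NONE
  INTEGER, PARAMETER :: dp_bits = 64   ! bits in one double precision element

  WRITE(*,'(A,I0)') '10 doubles padded for 256-bit vectors: ', padded_length(10, dp_bits, 256)
  WRITE(*,'(A,I0)') '10 doubles padded for 512-bit vectors: ', padded_length(10, dp_bits, 512)

  ! Self-checks: 4 lanes per 256-bit vector, 8 lanes per 512-bit vector
  IF (padded_length(10, dp_bits, 256) /= 12) ERROR STOP 'unexpected 256-bit padding'
  IF (padded_length(10, dp_bits, 512) /= 16) ERROR STOP 'unexpected 512-bit padding'
  ! A length padded for 256 bits (12) is not a multiple of the 512-bit lane count (8)
  IF (MOD(padded_length(10, dp_bits, 256), 512/dp_bits) == 0) ERROR STOP 'expected a mismatch'

CONTAINS

  PURE INTEGER FUNCTION padded_length(n, elem_bits, vec_bits)
    ! Round n up to the next multiple of the number of elements per vector
    INTEGER, INTENT(IN) :: n, elem_bits, vec_bits
    INTEGER :: lanes
    lanes = vec_bits / elem_bits
    padded_length = ((n + lanes - 1) / lanes) * lanes
  END FUNCTION padded_length

END PROGRAM pad_dimension_sketch
```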
Repeating an old simulation (from before this commit) with the same seeds and simulation options will not give identical
results, even with a single thread, because CBMC insertion trial positions are now calculated from the random numbers
differently; for example, the fractional COM coordinate is now rranf() - 0.5 instead of 0.5 - rranf().

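A minimal sketch of why the sign convention alone changes the trajectory: both expressions map a uniform deviate onto an interval of width one centred on zero, but each value is the mirror image of the other. The intrinsic `RANDOM_NUMBER` stands in for Cassandra's `rranf()` here, which is an assumption of this example.

```fortran
PROGRAM com_sign_convention
  ! RANDOM_NUMBER is a stand-in for Cassandra's rranf(); illustration only.
  IMPLICIT NONE
  INTEGER, PARAMETER :: DP = SELECTED_REAL_KIND(15, 307)
  INTEGER, PARAMETER :: ntrials = 5
  REAL(DP) :: u(ntrials), s_old(ntrials), s_new(ntrials)
  INTEGER :: i

  CALL RANDOM_NUMBER(u)          ! uniform deviates in [0,1)

  s_old = 0.5_DP - u             ! convention before this commit: (-0.5, 0.5]
  s_new = u - 0.5_DP             ! convention after this commit:  [-0.5, 0.5)

  DO i = 1, ntrials
     WRITE(*,'(A,F9.6,2(A,F10.6))') 'u = ', u(i), '  old = ', s_old(i), '  new = ', s_new(i)
  END DO

  ! Same distribution, but every coordinate is mirrored about zero, so the
  ! same random number stream places the trial COM at a different position.
  IF (ANY(ABS(s_old + s_new) > 1.0E-14_DP)) ERROR STOP 'values are not mirror images'
  IF (ANY(ABS(s_new) > 0.5_DP)) ERROR STOP 'fractional coordinate out of range'
END PROGRAM com_sign_convention
```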
Restricted insertion trial coordinates are now generated within the inner volume on the first attempt, rather than
being generated anywhere in the box and regenerated if they fall outside the inner volume, as
was done previously; this process is now vectorized.

Widom insertions will no longer ever be restricted, even if the inserted
species is designated for restricted GCMC insertions.  It's likely this was never a problem for anyone, but this fix ensures
it won't be a problem in the future.  If restricted Widom insertions are ever allowed, additional changes will be needed
to do them properly.
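The idea of generating restricted trial coordinates directly inside the inner volume, instead of rejecting and regenerating, can be sketched as follows. This is not the code from this commit: it assumes a spherical inner region of hypothetical radius `r_inner` centred on the origin, and it uses the intrinsic `RANDOM_NUMBER` in place of Cassandra's generator. Whole-array expressions are used so that every trial is produced in one pass with no retry loop.

```fortran
PROGRAM direct_sphere_sampling
  ! Assumptions: spherical restricted region, radius r_inner (hypothetical),
  ! RANDOM_NUMBER as a stand-in for the simulation's random number generator.
  IMPLICIT NONE
  INTEGER, PARAMETER :: DP = SELECTED_REAL_KIND(15, 307)
  INTEGER, PARAMETER :: ntrials = 1000
  REAL(DP), PARAMETER :: twopi = 6.283185307179586_DP
  REAL(DP), PARAMETER :: r_inner = 5.0_DP
  REAL(DP), DIMENSION(ntrials) :: u1, u2, u3, r, cos_theta, sin_theta, phi, x, y, z

  CALL RANDOM_NUMBER(u1)
  CALL RANDOM_NUMBER(u2)
  CALL RANDOM_NUMBER(u3)

  ! Uniform in volume: radial density is proportional to r**2, so r = R*u**(1/3)
  r = r_inner * u1**(1.0_DP/3.0_DP)
  ! Uniform direction: cos(theta) uniform in [-1,1), phi uniform in [0,2*pi)
  cos_theta = 2.0_DP*u2 - 1.0_DP
  sin_theta = SQRT(MAX(0.0_DP, 1.0_DP - cos_theta*cos_theta))
  phi = twopi*u3

  x = r*sin_theta*COS(phi)
  y = r*sin_theta*SIN(phi)
  z = r*cos_theta

  ! Every trial lies inside the region on the first attempt
  IF (ANY(x*x + y*y + z*z > r_inner*r_inner*(1.0_DP + 1.0E-12_DP))) &
     ERROR STOP 'trial generated outside the inner volume'

  ! For points uniform in volume, the mean of (r/R)**3 should be near 0.5
  WRITE(*,'(A,F8.4)') 'mean (r/R)**3 = ', SUM((r/r_inner)**3)/ntrials
END PROGRAM direct_sphere_sampling
```

With rejection sampling, the number of random numbers consumed per accepted trial varies, which makes the loop harder to vectorize; the direct mapping uses a fixed three deviates per trial.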
rwsmith7531 committed Sep 12, 2023
commit c11e3898f83cb7dcab1345538c72406cb8456b97
16 changes: 12 additions & 4 deletions Src/atompair_nrg_table_routines.f90
@@ -127,7 +127,8 @@ SUBROUTINE Allocate_Atompair_tables
 wsolute_maxind = wsolute_nextbase
 IF (precalc_atompair_nrg) THEN
 ALLOCATE(typepair_nrg_table(atompair_nrg_res,0:solvent_ntypes,0:solute_ntypes,nbr_boxes))
-ALLOCATE(atompair_nrg_table(atompair_nrg_res,solvent_nextbase,solute_nextbase,nbr_boxes))
+ALLOCATE(atompair_nrg_table(atompair_nrg_res+1,solvent_nextbase,solute_nextbase,nbr_boxes))
+ALLOCATE(atompair_nrg_table_reduced(0:(atompair_nrg_res+1)*solvent_nextbase-1,solute_nextbase,nbr_boxes))
 typepair_nrg_table = 0.0_DP
 END IF
 IF (est_atompair_rminsq) THEN
@@ -163,6 +164,8 @@ SUBROUTINE Create_Atompair_Nrg_table
 nsolutes = 0
 nsolvents = 0
 rsq_step = (MAXVAL(rcut_cbmcsq)-rcut_lowsq)/atompair_nrg_res
+inv_rsq_step = 1.0_DP/rsq_step
+inv_rsq_step_sp = REAL(inv_rsq_step,SP)
 rsq_shifter = rcut_lowsq - rsq_step
 DO i = 1, atompair_nrg_res
 rsq_lb_vector(i) = rsq_shifter + rsq_step*i
@@ -241,24 +244,29 @@ SUBROUTINE Create_Atompair_Nrg_table
 !$OMP END PARALLEL

 !$OMP WORKSHARE
-atompair_nrg_table = typepair_nrg_table(:,solvent_typeindvec,solute_typeindvec,:)
+atompair_nrg_table(1:atompair_nrg_res,:,:,:) = typepair_nrg_table(:,solvent_typeindvec,solute_typeindvec,:)
 !$OMP END WORKSHARE

 !$OMP PARALLEL DEFAULT(SHARED)
 !$OMP DO COLLAPSE(3) SCHEDULE(STATIC)
 DO ibox = 1, nbr_boxes
 DO ti_solute = 1, solute_maxind
 DO ti_solvent = 1, solvent_maxind
-atompair_nrg_table(:,ti_solvent,ti_solute,ibox) = &
-atompair_nrg_table(:,ti_solvent,ti_solute,ibox) + &
+atompair_nrg_table(1:atompair_nrg_res,ti_solvent,ti_solute,ibox) = &
+atompair_nrg_table(1:atompair_nrg_res,ti_solvent,ti_solute,ibox) + &
 f2(:,ibox)*cfqq(ti_solvent,ti_solute)
 END DO
 END DO
 END DO
 !$OMP END DO
+!$OMP WORKSHARE
+atompair_nrg_table(atompair_nrg_res+1,:,:,:) = 0.0
+atompair_nrg_table_reduced = REAL(RESHAPE(atompair_nrg_table, SHAPE(atompair_nrg_table_reduced)),SP)
+!$OMP END WORKSHARE
 !$OMP END PARALLEL
+


 END SUBROUTINE Create_Atompair_Nrg_table

 SUBROUTINE Setup_Atompair_tables
59 changes: 56 additions & 3 deletions Src/create_nonbond_table.f90
@@ -76,6 +76,7 @@ SUBROUTINE Create_Nonbond_Table
 REAL(DP) :: sixbycut, eps, sigma, negsigsq, negsigbyr2, rterm, rterm2

 !******************************************************************************
+l_zerotype_present = .FALSE.
 IF (verbose_log) THEN
 WRITE(logunit,*)
 WRITE(logunit,'(A)') 'Nonbond tables'
@@ -145,6 +146,7 @@ SUBROUTINE Create_Nonbond_Table
 ELSE
 ! atom has no atom_type
 nonbond_list(ia,is)%atom_type_number = 0
+l_zerotype_present = .TRUE.
 ENDIF
 ! Get maximum and minimum charge for atom type
 IF (repeat_type) THEN
@@ -216,6 +218,8 @@ SUBROUTINE Create_Nonbond_Table
 ALLOCATE(vdw_param5_table(0:nbr_atomtypes,0:nbr_atomtypes), Stat=AllocateStatus)
 ALLOCATE(ppvdwp_table(0:nbr_atomtypes,0:nbr_atomtypes,5,nbr_boxes))
 ALLOCATE(ppvdwp_table2(5,0:nbr_atomtypes,0:nbr_atomtypes,nbr_boxes))
+ALLOCATE(ppvdwp_table_sp(0:nbr_atomtypes,0:nbr_atomtypes,5,nbr_boxes))
+ALLOCATE(ppvdwp_table2_sp(5,0:nbr_atomtypes,0:nbr_atomtypes,nbr_boxes))

 IF (AllocateStatus .NE. 0) THEN
 err_msg = ''
@@ -690,6 +694,8 @@ SUBROUTINE Create_Nonbond_Table
 vdw_param2_table ** vdw_param4_table
 ppvdwp_table(:,:,1,ibox) = ppvdwp_table(:,:,1,ibox) * &
 vdw_param2_table ** vdw_param3_table
+l_nonuniform_exponents = ANY(vdw_param3_table(1:,1:) .NE. vdw_param3_table(1,1)) &
+.OR. ANY (vdw_param4_table(1:,1:) .NE. vdw_param4_table(1,1))
 ppvdwp_table(:,:,3,ibox) = vdw_param3_table * -0.5_DP
 ppvdwp_table(:,:,4,ibox) = vdw_param4_table * -0.5_DP
 IF (int_vdw_sum_style(ibox) == vdw_cut_shift) THEN
@@ -709,8 +715,55 @@ SUBROUTINE Create_Nonbond_Table
 !shape2 = shape1(order2) ! wrong

 ppvdwp_table2 = RESHAPE(ppvdwp_table, SHAPE(ppvdwp_table2), ORDER=order2)
-
-max_rmin = DSQRT(MAXVAL(rminsq_table))
-sp_rminsq_table = REAL(rminsq_table,SP)
+ppvdwp_table2_sp = REAL(ppvdwp_table2,SP)
+ppvdwp_table_sp = REAL(ppvdwp_table,SP)
+
+IF (calc_rmin_flag) THEN
+max_rmin = DSQRT(MAXVAL(rminsq_table))
+sp_rminsq_table = REAL(rminsq_table,SP)
+ALLOCATE(atomtype_max_rminsq(0:nbr_atomtypes))
+ALLOCATE(atomtype_min_rminsq(0:nbr_atomtypes))
+ALLOCATE(atomtype_max_rminsq_sp(0:nbr_atomtypes))
+atomtype_max_rminsq = MAXVAL(rminsq_table(:, &
+which_true_from_zero(l_wsolute_atomtype(),nbr_atomtypes+1)),2)
+atomtype_min_rminsq = MINVAL(rminsq_table(:, &
+which_true_from_zero(l_wsolute_atomtype(),nbr_atomtypes+1)),2)
+atomtype_max_rminsq_sp = REAL(atomtype_max_rminsq,SP)
+box_list%ideal_bitcell_length = SQRT(MAXVAL(atomtype_min_rminsq)) / 28.0_DP ! RHS scalar LHS vector with one element per box
+solvents_or_types_maxind = nbr_atomtypes+1
+ELSE
+box_list%ideal_bitcell_length = rcut_lowsq / 28.0_DP
+solvents_or_types_maxind = 0
+END IF
+CONTAINS
+FUNCTION l_wsolute_atomtype()
+LOGICAL, DIMENSION(0:nbr_atomtypes) :: l_wsolute_atomtype
+INTEGER :: is
+l_wsolute_atomtype = .FALSE.
+DO is = 1, nspecies
+IF (species_list(is)%l_wsolute) THEN
+DO ia = 1, natoms(is)
+l_wsolute_atomtype(nonbond_list(ia,is)%atom_type_number) = .TRUE.
+END DO
+END IF
+END DO
+END FUNCTION l_wsolute_atomtype
+FUNCTION which_true_from_zero(lvec,nl)
+INTEGER :: nl
+LOGICAL, DIMENSION(0:nl-1) :: lvec
+INTEGER :: nt
+INTEGER, DIMENSION(COUNT(lvec)) :: which_true_from_zero
+INTEGER :: i, tcount
+nt = COUNT(lvec)
+i = 0
+tcount = 0
+DO WHILE (tcount < nt)
+IF (lvec(i)) THEN
+tcount = tcount + 1
+which_true_from_zero(tcount) = i
+END IF
+i = i + 1
+END DO
+END FUNCTION which_true_from_zero

 END SUBROUTINE Create_Nonbond_Table