Commit 7b26baf

Add documentation and usage examples.
1 parent 3f4ea99 commit 7b26baf

7 files changed: +794 -0 lines changed

CMakeLists.txt

+1
@@ -8,3 +8,4 @@ include(CTest)
 
 add_subdirectory(src)
 add_subdirectory(test)
+add_subdirectory(example)

README.md

+4
@@ -6,3 +6,7 @@ derived type that describes the mapping of an array's index set to processes.
 The mapping allows for overlap between processes, and provides collective
 array gather procedures and scatter reduction procedures associated with that
 overlap.
+
+Documentation for the module is in the `doc` directory.
+
+Usage examples are in the `example` directory.
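
As a quick illustration of how the module is used (this sketch is not part of the commit), the following program outlines the basic lifecycle: create an `index_map` with a local block size, distribute a global array from rank 0, and collate it back. The calls and the block-size computation mirror those in `example/disk-fv-mpi.f90` added by this commit; the argument shapes and the `local_size` component are assumptions taken from that example rather than from the module's documented interface, which lives in `doc/index_map_type.md`.

```fortran
program index_map_sketch
  !! Illustrative sketch only; the calls used here (init, distribute, collate)
  !! and the local_size component follow the usage in example/disk-fv-mpi.f90.
  use mpi
  use index_map_type
  implicit none
  integer, parameter :: r8 = selected_real_kind(15)
  integer, parameter :: NTOT = 100   ! global number of indices (arbitrary for this sketch)
  type(index_map) :: imap
  real(r8), allocatable :: u(:), u_local(:)
  integer :: ierr, rank, nproc, bsize, n
  call MPI_Init(ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nproc, ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  bsize = NTOT/nproc                                ! nearly equal block sizes,
  if (rank < NTOT - bsize*nproc) bsize = bsize + 1  ! as in disk-fv-mpi.f90
  call imap%init(MPI_COMM_WORLD, bsize)             ! map the index set 1..NTOT onto the ranks
  n = merge(NTOT, 0, rank == 0)                     ! the global array exists only on rank 0
  allocate(u(n), u_local(imap%local_size))
  if (rank == 0) u = 1.0_r8
  call imap%distribute(u, u_local)                  ! scatter blocks of U to the ranks
  call imap%collate(u_local, u)                     ! gather the blocks back onto rank 0
  call MPI_Finalize(ierr)
end program
```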

doc/index_map_type.md

+377
Large diffs are not rendered by default.

example/CMakeLists.txt

+7
@@ -0,0 +1,7 @@
# Want test modules in the build directory.
#unset(CMAKE_Fortran_MODULE_DIRECTORY)

add_executable(disk-fv-serial disk-fv-serial.f90)

add_executable(disk-fv-mpi disk-fv-mpi.f90)
target_link_libraries(disk-fv-mpi common MPI::MPI_Fortran)

example/README.md

+25
@@ -0,0 +1,25 @@
## Example Programs

This directory contains example programs that illustrate the use of the
`index_map_type` Fortran module.

### Finite Volume Solution of the Heat Equation on a Unit Disk

The program `disk-fv-mpi.f90` solves the heat equation $`u_t = \Delta u`$ on
the unit disk subject to zero boundary conditions. It uses a simple finite
volume discretization on a regular 2D Cartesian mesh that contains the unit
disk. Only those cells whose center is contained in the unit disk are
included in the problem. The unknowns are cell-centered values of $`u`$.
Simple first-order forward Euler time stepping is used to solve from a
uniform initial condition to a final time. Domain decomposition is used to
parallelize the computation. The cells are numbered according to the usual
order of the elements of a rank-2 array (but only for those included in
the problem) and then partitioned into approximately equal blocks, one per
MPI rank. There is a single indirect indexing array that maps a cell index
to the cell indices of its neighboring cells. For comparison, the program
`disk-fv-serial.f90` is a serial version of the solver. Aside from the initial
setup of the parallel decomposition of the problem, the only difference
between the MPI-parallel solver and the serial solver is a single call to
`index_map%gather_offp` to update the off-process elements of the local
unknown array with values from their corresponding on-process elements on
neighboring processes -- a parallel halo exchange operation.
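
To make that last point concrete, here is the core of the time-step loop, excerpted (with added comments) from `disk-fv-mpi.f90` shown further below; the serial solver runs the same loop with the `gather_offp` call removed.

```fortran
do step = 1, nstep
  ! Halo exchange: refresh the off-process elements of U_LOCAL with the
  ! current values from the ranks that own them. This is the only
  ! communication inside the time-step loop.
  call cell_map%gather_offp(u_local(1:))
  u_prev = u_local
  do j = 1, cell_map%onp_size  ! update on-process cells only
    u_local(j) = u_prev(j) + c*(sum(u_prev(cnhbr_local(:,j))) - 4*u_prev(j))
  end do
end do
```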

example/disk-fv-mpi.f90

+207
@@ -0,0 +1,207 @@
!! DISK_FV_MPI
!!
!! This example program solves the heat equation \(u_t = \Delta u\) on the
!! unit disk subject to 0 boundary conditions. It uses a simple finite volume
!! discretization on a regular 2D Cartesian mesh that contains the unit disk.
!! Only those cells whose center is contained in the unit disk are included
!! in the problem. The unknowns are cell-centered values of \(u\). Simple
!! first-order forward Euler time stepping is used to solve from a uniform
!! initial condition to a final time. Domain decomposition is used to
!! parallelize the computation. The cells are numbered according to the usual
!! order of the elements of a rank-2 array (but only for those included in
!! the problem) and then partitioned into approximately equal blocks, one per
!! MPI rank. For reference, see the serial implementation disk-fv-serial.f90.
!!
!! Copyright 2022 Neil N. Carlson <[email protected]>
!!
!! Permission is hereby granted, free of charge, to any person obtaining a
!! copy of this software and associated documentation files (the "Software"),
!! to deal in the Software without restriction, including without limitation
!! the rights to use, copy, modify, merge, publish, distribute, sublicense,
!! and/or sell copies of the Software, and to permit persons to whom the
!! Software is furnished to do so, subject to the following conditions:
!!
!! The above copyright notice and this permission notice shall be included
!! in all copies or substantial portions of the Software.
!!
!! THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
!! IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
!! FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
!! THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
!! LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
!! FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
!! DEALINGS IN THE SOFTWARE.
!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

program disk_fv_mpi

  use, intrinsic :: iso_fortran_env, only: i8 => int64, r8 => real64
  use mpi
  use index_map_type
  implicit none

  integer, parameter :: NZ = 101 ! number of zones in each dimension
  integer :: ierr, nproc, rank, bsize, n, ncell, j, step, nstep, nstep0
  integer, allocatable :: mask(:,:), cnhbr(:,:), cnhbr_local(:,:)
  real(r8), allocatable :: u(:), u_local(:), u_prev(:)
  type(index_map) :: cell_map
  real(r8) :: dt, dx, c, tfinal
  integer(i8) :: t1, t2, rate

  call MPI_Init(ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nproc, ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)

  !! The MASK array marks grid cells inside the unit disk with a unique
  !! positive index. Other cells are marked with a 0. It is sized to include
  !! a border of cells around the actual grid. NCELL is the number of cells
  !! contained in the unit disk and the number of cell-centered unknowns.
  if (rank == 0) then
    allocate(mask(0:NZ+1,0:NZ+1))
    call unit_disk_mask(mask, ncell)
  end if
  call MPI_Bcast(ncell, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)

  !! Create a mapping of the cell index set onto NPROC processes with
  !! nearly equal block sizes.
  bsize = ncell/nproc
  if (rank < ncell-bsize*nproc) bsize = bsize + 1
  call cell_map%init(MPI_COMM_WORLD, bsize)

  !! Create the indirect indexing array CNHBR. CNHBR(:,j) are the indices
  !! of the 4 cells adjacent to cell j, or 0 if no neighbor. CNHBR_LOCAL is
  !! the corresponding distributed indexing array with local index values.
  n = merge(ncell, 0, rank==0)
  allocate(cnhbr(4,n))
  if (rank == 0) call fill_cnhbr(mask, cnhbr)
  call cell_map%localize_index_array(cnhbr, cell_map, cnhbr_local)

  !! U is the global array of cell-centered unknowns, and U_LOCAL is the
  !! corresponding distributed array. For convenience, an extra element
  !! is prepended to the arrays for the artificial index 0 to serve as a
  !! place to hold the boundary value.
  allocate(u(0:n), u_local(0:cell_map%local_size))
  u = 1 ! initial conditions
  call cell_map%distribute(u(1:), u_local(1:))
  u_local(0) = 0 ! boundary value

  !! Time step out to time TFINAL using the explicit first order Euler
  !! method with a fixed stable time step DT.
  dx = 1.0_r8/NZ
  dt = 0.25_r8*dx**2
  c = dt/dx**2

  tfinal = 0.05_r8
  nstep = ceiling(tfinal/dt)
  tfinal = nstep*dt

  nstep0 = max(1, nstep/10) ! warm-up: start timing on this step

  allocate(u_prev, mold=u_local)
  call MPI_Barrier(MPI_COMM_WORLD, ierr) ! for timing purposes
  do step = 1, nstep
    if (step == nstep0) call system_clock(t1)
    call cell_map%gather_offp(u_local(1:))
    u_prev = u_local
    do j = 1, cell_map%onp_size
      u_local(j) = u_prev(j) + c*(sum(u_prev(cnhbr_local(:,j))) - 4*u_prev(j))
    end do
  end do
  call MPI_Barrier(MPI_COMM_WORLD, ierr) ! for timing purposes
  call system_clock(t2, rate)

  !! Gather the distributed solution onto rank 0 and output.
  call cell_map%collate(u_local(1:), u(1:))
  if (rank == 0) then
    write(*,'(a,es10.4,a)') 'Solution at t=', tfinal, ' written to out.vtk; visualize with paraview.'
    u(0) = 0 ! boundary value
    call vtk_plot(mask, u)
    write(*,'(g0,a,i0,a)') (10**6)*real(t2-t1)/real(rate)/(nstep-nstep0+1), &
        ' µsec per time step; ', ncell/nproc, ' cells per process'
  end if

  call MPI_Finalize(ierr)

contains

  !! Generate a grid mask array for a unit disk. A mask element corresponding
  !! to a grid cell whose center lies outside the unit disk is set to zero.
  !! Other elements are assigned sequential integers starting with 1 according
  !! to the normal traversal order of a rank-2 array. The number of grid cells
  !! so contained by the unit disk is returned in N.

  subroutine unit_disk_mask(mask, n)
    integer, intent(out) :: mask(0:,0:), n
    real :: dx, dy, x, y
    integer :: i, j
    mask = 0
    dx = 2.0/NZ
    dy = 2.0/NZ
    n = 0
    y = -1 + dy/2
    do j = 1, NZ
      x = -1 + dx/2
      do i = 1, NZ
        if (x**2 + y**2 <= 1) then
          n = n + 1
          mask(i,j) = n
        end if
        x = x + dx
      end do
      y = y + dy
    end do
  end subroutine

  !! Fill the cell neighbor indirect indexing array using the mask array to
  !! obtain the cell indices of the neighbors (or 0 where there is no neighbor).

  subroutine fill_cnhbr(mask, cnhbr)
    integer, intent(in) :: mask(0:,0:)
    integer, intent(out) :: cnhbr(:,:)
    integer :: n, i, j
    n = 0
    do j = 1, NZ
      do i = 1, NZ
        n = mask(i,j)
        if (n > 0) then
          cnhbr(1,n) = mask(i,j-1)
          cnhbr(2,n) = mask(i+1,j)
          cnhbr(3,n) = mask(i,j+1)
          cnhbr(4,n) = mask(i-1,j)
        end if
      end do
    end do
  end subroutine

  !! Write a VTK plot file of the solution U. The solution is embedded in a
  !! rectangular grid with boundary values (0) for grid cells outside the
  !! masked region.

  subroutine vtk_plot(mask, u)
    integer, intent(in) :: mask(0:,0:)
    real(r8), intent(in) :: u(0:)
    integer :: lun, i, j
    real, allocatable :: u_plot(:,:)
    open(newunit=lun,file='out.vtk')
    write(lun,'("# vtk DataFile Version 3.0")')
    write(lun,'("example solution")')
    write(lun,'("ASCII")')
    write(lun,'("DATASET STRUCTURED_POINTS")')
    write(lun,'("DIMENSIONS",3(1x,i0))') NZ+1, NZ+1, 1
    write(lun,'("ORIGIN 0 0 0")')
    write(lun,'("SPACING",3(1x,g0))') 1.0/NZ, 1.0/NZ, 1
    write(lun,'("CELL_DATA ",i0)') NZ**2
    write(lun,'("SCALARS u float 1")')
    write(lun,'("LOOKUP_TABLE default")')
    allocate(u_plot(NZ,NZ))
    do j = 1, NZ
      do i = 1, NZ
        u_plot(i,j) = u(mask(i,j))
      end do
    end do
    write(lun,'(g0)') u_plot
    close(lun)
  end subroutine

end program
