Tags: ilyapopov/amgcl
v1.4.0

* The codebase uses the C++11 standard.
* The `amgcl::mpi::amg` preconditioner is implemented! It works much better than subdomain deflation, which is now considered deprecated. See [`examples/mpi/mpi_amg.cpp`](https://github.com/ddemidov/amgcl/blob/master/examples/mpi/mpi_amg.cpp) and some benchmarks in https://doi.org/10.1134/S1995080219050056 (https://arxiv.org/abs/1811.05704).
* Added wrappers for the [Scotch](https://www.labri.fr/perso/pelegrin/scotch/) and [ParMetis](http://glaros.dtc.umn.edu/gkhome/metis/parmetis/overview) partitioning libraries.
* The runtime interface has been refactored. Instead of

  ```cpp
  typedef amgcl::make_solver<
      amgcl::runtime::amg<Backend>,
      amgcl::runtime::iterative_solver<Backend>
      > Solver;
  ```

  one should now write

  ```cpp
  typedef amgcl::make_solver<
      amgcl::amg<
          Backend,
          amgcl::runtime::coarsening::wrapper,
          amgcl::runtime::relaxation::wrapper
          >,
      amgcl::runtime::solver::wrapper<Backend>
      > Solver;
  ```

  This makes it possible to reuse the same `amgcl::amg` implementation for both the compile-time and runtime interfaces, and greatly reduces compilation time and memory requirements for the library.
* Got rid of as many [Boost](https://www.boost.org/) dependencies as possible in favor of C++11. Currently, only the runtime interface depends on Boost, as it uses `boost::property_tree::ptree` for defining runtime parameters. It should be possible to use the compile-time interface completely Boost-free. This also means that one should replace all uses of `boost::tie` and `boost::make_tuple` in amgcl-related code with `std::tie` and `std::make_tuple`. For example:

  ```cpp
  Solver solve(std::tie(n, ptr, col, val));
  std::tie(iters, error) = solve(f, x);
  ```
* Provide the `amgcl::backend::bytes()` function, which returns (sometimes approximately) the amount of memory allocated for an amgcl object. `std::string amgcl::backend::human_readable_memory(size_t)` converts bytes to a human-readable size string (e.g. 1024 is converted to `1 K`).
* Support for mixed-precision computations (where the iterative solver and the preconditioner use different precisions).
* MPI versions of CPR preconditioners. Support for statically-sized matrices in global preconditioners of CPR. Support for mixed precision in global and pressure-specific parts of CPR.
* Helper functions for the computation of rigid body modes (for use as near null space vectors in structural problems) from 2D or 3D coordinates.
* Improvements for the Schur pressure correction preconditioner.
* Added Richardson and PreOnly (only apply the preconditioner once) iterative solvers. The latter is intended for use as a nested solver in composite preconditioners, such as Schur pressure correction.
* `epetra_map` was moved to the `adapter` namespace.
* The Eigen backend was split into an adapter part and a backend part.
* `amgcl::make_scaling_solver` has been replaced with the `amgcl::scaled_problem` adapter.
* Added tutorials to the documentation.
* Many bug fixes and minor improvements.
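The `boost::tie` to `std::tie` migration mentioned in the v1.4.0 notes can be tried out without amgcl at all. The following is a minimal sketch using only the standard library; `fake_solve` and `iterations_of` are hypothetical names invented for this illustration and stand in for a solver call that returns an (iterations, error) pair.

```cpp
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical stand-in for a solver call: returns (iterations, error)
// as a std::tuple built with std::make_tuple.
std::tuple<std::size_t, double> fake_solve(const std::vector<double> &rhs) {
    // For illustration only: pretend the iteration count equals the system size.
    return std::make_tuple(rhs.size(), 1e-9);
}

// Unpack the returned tuple into existing variables with std::tie,
// the C++11 replacement for boost::tie.
std::size_t iterations_of(const std::vector<double> &rhs) {
    std::size_t iters = 0;
    double error = 0.0;
    std::tie(iters, error) = fake_solve(rhs);
    (void)error; // not needed in this sketch
    return iters;
}
```

Code that previously included Boost.Tuple headers only for `tie` and `make_tuple` should need nothing beyond `<tuple>` after the switch.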
1.3.99

* The code base uses the C++11 standard.
* The `amgcl::mpi::amg` preconditioner is implemented! It works much better than subdomain deflation, which is now considered deprecated. See [`examples/mpi/mpi_amg.cpp`](https://github.com/ddemidov/amgcl/blob/master/examples/mpi/mpi_amg.cpp) and some benchmarks in https://doi.org/10.1134/S1995080219050056 (https://arxiv.org/abs/1811.05704).
* Added wrappers for the [Scotch](https://www.labri.fr/perso/pelegrin/scotch/) and [ParMetis](http://glaros.dtc.umn.edu/gkhome/metis/parmetis/overview) partitioning libraries.
* The runtime interface has been refactored. Instead of

  ```cpp
  typedef amgcl::make_solver<
      amgcl::runtime::amg<Backend>,
      amgcl::runtime::iterative_solver<Backend>
      > Solver;
  ```

  one should now write

  ```cpp
  typedef amgcl::make_solver<
      amgcl::amg<
          Backend,
          amgcl::runtime::coarsening::wrapper,
          amgcl::runtime::relaxation::wrapper
          >,
      amgcl::runtime::solver::wrapper<Backend>
      > Solver;
  ```

  This makes it possible to reuse the same `amgcl::amg` implementation for both the compile-time and runtime interfaces, and greatly reduces compilation time and memory requirements for the library.
* Got rid of as many [Boost](https://www.boost.org/) dependencies as possible in favor of C++11. Currently, only the runtime interface depends on Boost, as it uses `boost::property_tree::ptree` for defining runtime parameters. It should be possible to use the compile-time interface completely Boost-free. This also means that one should replace all uses of `boost::tie` and `boost::make_tuple` in amgcl-related code with `std::tie` and `std::make_tuple`. For example:

  ```cpp
  Solver solve(std::tie(n, ptr, col, val));
  std::tie(iters, error) = solve(f, x);
  ```
* Provide the `amgcl::backend::bytes()` function, which returns (sometimes approximately) the amount of memory allocated for an amgcl object. `std::string amgcl::backend::human_readable_memory(size_t)` converts bytes to a human-readable size string (e.g. 1024 is converted to `1 K`).
* Initial support for mixed-precision computations (where the iterative solver and the preconditioner use different precisions).
* MPI versions of CPR preconditioners. Support for statically-sized matrices in global preconditioners of CPR. Support for mixed precision in global and pressure-specific parts of CPR.
* `epetra_map` was moved to the `adapter` namespace.
* The Eigen backend was split into an adapter part and a backend part.
* `amgcl::make_scaling_solver` has been replaced with the `amgcl::scaled_problem` adapter.

This is marked as a pre-release because the documentation still needs to be updated.
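To make the byte-to-string conversion described for `human_readable_memory` concrete, here is a standalone sketch of the idea. It is not amgcl's implementation: the function name `human_readable`, the suffix set, and the exact number formatting are assumptions chosen for this illustration.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Illustrative re-implementation (NOT the library's code): divide by 1024
// until the value drops below 1024, then append the matching unit suffix.
std::string human_readable(std::size_t bytes) {
    static const char *suffix[] = {"B", "K", "M", "G", "T"};
    double value = static_cast<double>(bytes);
    int unit = 0;
    while (unit < 4 && value >= 1024.0) {
        value /= 1024.0;
        ++unit;
    }
    std::ostringstream out;
    out << value << " " << suffix[unit]; // default stream formatting
    return out.str();
}
```

With this sketch, `human_readable(1024)` yields `1 K`, matching the example in the release notes; the real function may format the number differently.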
1.2.0

* Change the default value of `smoothed_aggregation.aggr.eps_strong` from 0 to 0.08. This should work better for anisotropic cases.
* The pressure mask may be set with a pattern in the Schur pressure correction preconditioner.
* When using `async_setup`, allow the initialization thread to exit early if the solution has already converged.
* Stable implementation of the inner product in the OpenMP backend. This makes the solution deterministic for a fixed number of OpenMP threads.
* Support a non-zero initial condition in BiCGStab(L).
* Switch the implementation of BiCGStab(L) to Fokkema's version [1].
* Support both left and right preconditioning in BiCGStab, BiCGStab(L), GMRES, CG.
* Improve performance/scalability of `mpi::subdomain_deflation`.
* Minor bug fixes and improvements.

[1] Fokkema, Diederik R. Enhanced implementation of BiCGstab (l) for solving linear systems of equations. Universiteit Utrecht. Mathematisch Instituut, 1996.
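One common way to obtain an inner product that is reproducible for a fixed degree of parallelism is to give each worker a fixed contiguous chunk and to combine the per-chunk partial sums in a fixed order. The sketch below shows that idea serially; it is an assumption about the general technique, not a description of amgcl's OpenMP code, and `chunked_inner_product` is a hypothetical name.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Split [0, n) into `nchunks` contiguous chunks, accumulate one partial sum
// per chunk, then add the partials in chunk order.
double chunked_inner_product(const std::vector<double> &x,
                             const std::vector<double> &y,
                             std::size_t nchunks) {
    const std::size_t n = std::min(x.size(), y.size());
    if (nchunks == 0) nchunks = 1;
    std::vector<double> partial(nchunks, 0.0);

    // Each chunk could be handled by its own thread: it only writes partial[c].
    for (std::size_t c = 0; c < nchunks; ++c) {
        const std::size_t beg = n * c / nchunks;
        const std::size_t end = n * (c + 1) / nchunks;
        double sum = 0.0;
        for (std::size_t i = beg; i < end; ++i) sum += x[i] * y[i];
        partial[c] = sum;
    }

    // Fixed-order final reduction, independent of which chunk finished first.
    double total = 0.0;
    for (double p : partial) total += p;
    return total;
}
```

Because the chunk boundaries and the order of the final additions depend only on `n` and `nchunks`, repeated calls with the same chunk count round identically. Changing the chunk count may change the rounding, which is consistent with the "fixed number of threads" caveat in the note above.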
1.1.0

* Improve profiling: allow users to configure profiling operations.
* Implement `adapter::reorder` for matrices and vectors. This makes it possible to transparently apply Cuthill-McKee reordering to the system matrix and RHS before solving.
* Improve performance of the Schur pressure correction preconditioner by (optionally) approximating the inverse of the `Kuu` matrix with its inverted diagonal.
* Use power iteration to estimate the spectral radius in `smoothed_aggregation`. This improves the convergence rate at the cost of setup time. The total time is usually improved, but may suffer on GPGPU backends.
* Added the IDR(s) iterative solver (http://ta.twi.tudelft.nl/nw/users/gijzen/IDR.html).
* Improve performance and scalability of the `mpi::subdomain_deflation` preconditioner.
* Support matrix-free solution with `mpi::subdomain_deflation`.
* Provide `amgcl::put(ptree p, string s)`, where `s` has the `key=value` format. This makes parsing of command line parameters easier.
* Add shared and distributed memory benchmarks to the documentation.
* Add Clang and OSX tests on Travis-CI.
* Minor bug fixes and improvements.
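The power iteration mentioned in the 1.1.0 notes can be illustrated on a small dense matrix. This is a generic textbook sketch, not amgcl's code: it assumes a symmetric matrix with a single dominant eigenvalue, and the names `Matrix` and `spectral_radius_estimate` are made up for the example.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Estimate the spectral radius of a small dense matrix by power iteration.
double spectral_radius_estimate(const Matrix &A, int iterations) {
    const std::size_t n = A.size();
    // Start from a normalized vector of equal entries.
    std::vector<double> v(n, n ? 1.0 / std::sqrt(static_cast<double>(n)) : 0.0);
    std::vector<double> w(n, 0.0);
    double rho = 0.0;
    for (int it = 0; it < iterations; ++it) {
        // w = A * v
        for (std::size_t i = 0; i < n; ++i) {
            w[i] = 0.0;
            for (std::size_t j = 0; j < n; ++j) w[i] += A[i][j] * v[j];
        }
        // Since ||v|| = 1, the estimate ||A v|| / ||v|| is simply ||w||.
        double norm = 0.0;
        for (double wi : w) norm += wi * wi;
        norm = std::sqrt(norm);
        if (norm == 0.0) return 0.0;
        rho = norm;
        // Normalize for the next step.
        for (std::size_t i = 0; i < n; ++i) v[i] = w[i] / norm;
    }
    return rho;
}
```

Each iteration costs one matrix-vector product, which is where the extra setup time noted above comes from; a handful of iterations is often enough for a usable estimate, though the number needed depends on the eigenvalue gap.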
1.0.0

* Implemented OpenMP versions of incomplete LU smoothers (`ilu0`, `iluk`, `ilut`), and got rid of the now obsolete `parallel_ilu0` smoother. The parallel algorithm is based on the level scheduling approach and is automatically selected when there are four or more OpenMP threads.
* Reimplemented the multicolor Gauss-Seidel smoother, merged the new implementation with `gauss_seidel`, and got rid of the obsolete `multicolor_gauss_seidel` smoother. The parallel algorithm is based on the level scheduling approach and is automatically selected when there are four or more OpenMP threads.
* Code cleanup, minor improvements and bug fixes.
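Level scheduling, as referenced in the 1.0.0 notes, groups the rows of a triangular system so that all rows in one level depend only on rows from earlier levels and can therefore be processed in parallel. The sketch below computes such levels for a lower-triangular pattern in CSR form; it illustrates the general technique only, and `schedule_levels` is a hypothetical helper, not part of amgcl.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Assign each row of a sparse lower-triangular pattern (CSR: ptr, col) to a
// level. Row i depends on row j whenever there is an entry (i, j) with j < i.
std::vector<int> schedule_levels(const std::vector<std::size_t> &ptr,
                                 const std::vector<std::size_t> &col) {
    const std::size_t n = ptr.empty() ? 0 : ptr.size() - 1;
    std::vector<int> level(n, 0);
    for (std::size_t i = 0; i < n; ++i) {
        int lev = 0;
        for (std::size_t k = ptr[i]; k < ptr[i + 1]; ++k) {
            const std::size_t j = col[k];
            // A row sits one level above the deepest row it depends on.
            if (j < i) lev = std::max(lev, level[j] + 1);
        }
        level[i] = lev;
    }
    return level;
}
```

For a 4x4 pattern where row 2 references row 0 and row 3 references row 2, rows 0 and 1 land in level 0, row 2 in level 1, and row 3 in level 2. A parallel sweep would then process the levels in order and the rows within each level concurrently.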
0.9.0

* Use NUMA-friendly internal data structures. This shows measurable speed-up on NUMA systems.
* Allow asynchronous amg setup. The `amgcl::amg` constructor starts the setup process in a new thread. As soon as the constructor returns, the instance is ready to be used as a preconditioner. Initially it is just a single-level smoother, but as the new (coarser) levels are constructed, they are put to use. In the case of GPGPU backends, this should make it possible to overlap work between the host CPU doing the setup and the compute device doing the solution. In some cases a 2x speedup of the overall solution has been achieved.
* Allow limiting the number of amg levels, thus supporting the use of relaxation for coarse solves.
* Rewrite lgmres and fgmres in terms of Givens rotations, which should work better with complex problems, see ddemidov#34.
* Use a new, more effective sparse matrix format in the VexCL backend and allow non-scalar values to be used with the backend.
* Modernize cmake scripts. Provide the `amgcl::amgcl` imported target, so that users may just

  ```cmake
  find_package(amgcl)
  add_executable(myprogram myprogram.cpp)
  target_link_libraries(myprogram amgcl::amgcl)
  ```

  to build a program using amgcl. The imported target brings the necessary compile and link options automatically.
* Replace boost.python with [pybind11](https://github.com/pybind/pybind11) and improve the python interface.
* Unify example codes for different backends.
* Minor improvements and bug fixes.
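The Givens rotations that the 0.9.0 notes mention for lgmres and fgmres zero out one entry of a two-element vector at a time, which is how GMRES-type solvers typically reduce their Hessenberg matrix to triangular form. The following is a real-valued sketch only (the complex case referenced above needs conjugates and is not covered); `givens` and `apply_givens` are illustrative names, not amgcl functions.

```cpp
#include <cmath>
#include <utility>

// Compute (c, s) such that the rotation [c s; -s c] maps (a, b) to (r, 0).
std::pair<double, double> givens(double a, double b) {
    if (b == 0.0) return std::make_pair(1.0, 0.0); // nothing to eliminate
    const double r = std::hypot(a, b);
    return std::make_pair(a / r, b / r);
}

// Apply the rotation in place to a pair of entries (x, y).
void apply_givens(double c, double s, double &x, double &y) {
    const double tx = c * x + s * y;
    const double ty = -s * x + c * y;
    x = tx;
    y = ty;
}
```

Applying the rotation computed from `(3, 4)` to that same pair gives approximately `(5, 0)`: the second entry is eliminated and the first carries the norm of the pair.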
* Implemented LGMRES solver ("Loose" GMRES, [BaJM05]_).
* Implemented FGMRES solver (Flexible GMRES, [Saad03]_).
* Performance improvements in components using QR decomposition (spai1, aggregation with null-space provided).
* Provided python examples.
* Minor bug fixes and improvements.
* **Updated two-stage preconditioners**
  - Replaced the SIMPLE preconditioner with Schur complement pressure correction.
  - Implemented an MPI version of the Schur complement pressure correction preconditioner.
  - Made the CPR preconditioner actually work.
* **Breaking changes**
  - Renamed `amgcl/amgcl.hpp` -> `amgcl/amg.hpp`.
* Added support for complex and non-scalar values for coefficients/unknowns.
* Implemented the ILU(k) smoother/preconditioner.
* Added an MPI block-preconditioner wrapper.
* Introduced `mpi::make_solver`.
* Allow using `make_solver` as a preconditioner.
* Improve matrix-matrix product performance for many-core CPUs.
* Provide MatrixMarket format I/O functions.
* Generalize the profiler class, provide a couple of custom performance counters.
* Improve coarse solution performance on GPGPU backends.
* Allow checking for consistency of runtime parameters.
* Move documentation to http://amgcl.readthedocs.org.
* Various bug fixes and improvements.