Object Pascal unit interface to CBLAS, ATLAS, OpenBLAS, MKL, NVBLAS, ...
Algebraic vector and matrix operations are easy to implement, but the straightforward implementation is not the most performant one.
Optimized algebra libraries can speed up these operations by as much as 40x. They take advantage of handcrafted assembly code, the CPU L1/L2 caches and the parallelism of GPUs.
This unit provides bindings to libraries that expose the CBLAS interface.
The main requirements during the implementation of the CBLAS unit were:
- Reuse of the CBLAS C interface.
- Dynamic linking.
- Runtime selection of the library: the unit shall be able to select the library during execution.
- Hide the Windows/Linux/OSX details.
- FreePascal / Delphi
The CBLAS procedure that I use most is dgemm, and I want to compare the different implementations for the kind of matrix that I use (medium size, 4000 x 10000).
In my humble experience, the main point of friction when writing an application is the graphical user interface. The Object Pascal RAD/VCL environment excels at writing those user interfaces. It lets you build GUI software really quickly. For someone used to it, productivity in Lazarus/Delphi surpasses that of any "modern" GUI development platform.
If asked to define in a single word why I use Pascal for some project, that word is `Productivity`. The IDEs (Lazarus or Delphi) boost productivity in such a way that the typical CRUD application can be done in no time.
In addition, one of the strengths of the Pascal environment is its integration with other languages and libraries.
This unit was written in 40 minutes, using h2pas and vim macros. The friction of interacting with foreign languages is almost negligible.
The cblas unit has been developed and tested with Free Pascal.
I'm open to adapting it to Delphi, if a kind soul provides me with a license :). Nevertheless, if you do the adaptation work, I will be more than happy to merge your changes.
This unit has not been tested with Delphi (I haven't used Delphi in the last 12 years).
The cblas unit has been tested on the following operating systems.
Library | FreeBSD | Ubuntu | SUSE | Windows 10 | OSX |
---|---|---|---|---|---|
Netlib BLAS | X | ||||
OpenBLAS | X | X | X | ||
ATLAS | X | X | |||
Intel MKL | X | ||||
NVBlas | |||||
CLBlas |
The cblas unit has been tested on the following CPUs.
Library | AMD/Intel 64 | AMD/Intel 32 | ARM 64 | ARM 32 |
---|---|---|---|---|
Netlib BLAS | X | |||
OpenBLAS | X | X | ||
ATLAS | X | X | ||
Intel MKL | X | |||
NVBlas | ||||
CLBlas |
The CBLAS libraries for Windows are available as pre-built binaries, but they have some dependencies. This section describes how to use them.
OpenBLAS pre-built binaries can be found on SourceForge. The pre-built binaries depend on other DLLs (libgcc_s_seh-1.dll, libgfortran-3.dll, libquadmath-0.dll). Those DLLs can be found in mingw64_dll.zip, which is also in the OpenBLAS SourceForge repository. Note that the contents of mingw64_dll.zip are the ones found in an installation of mingw64.
The MKL exports the CBLAS interface in mkl_rt.dll. Internally it depends on other libraries that are found in the redist directories. All of the DLLs shall be accessible via the Dynamic-Link Library Search Order. That means either the DLLs are copied to the executable path or the redist directories are appended to the PATH environment variable.
The MKL DLLs are found in:
- compilers_and_libraries_2018.2.185\windows\redist\intel64_win\compiler
- compilers_and_libraries_2018.2.185\windows\redist\intel64_win\mkl
The header used to create cblas.pas was the netlib cblas.h.
The unit follows the same approach as the SQLite unit. Every function is a variable that is assigned an address using LoadLibrary + GetProcedureAddress, so all of them are loaded explicitly. With this approach it is possible to load the shared library dynamically.
Declaring the functions as external with a library name is avoided because it binds the library implicitly, leaving no control over which library gets loaded.
The naming of the CBLAS functions has been preserved: the unit uses the same names as the CBLAS headers.
The library to use is selected in InitializeCBLAS. Called without arguments it loads the default library; if you want to use another library or a non-default name, call InitializeCBLAS with the name of that library.
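The explicit-loading approach can be sketched with the Free Pascal `dynlibs` unit. This is a minimal illustration, not the actual code of cblas.pas: the type name `TCblasDgemm`, the default file name `libopenblas.<suffix>` and the command-line override are assumptions made for the example. The numeric constants are the enum values from the netlib cblas.h.

```pascal
program LoadCblasSketch;
{$mode objfpc}{$H+}

uses
  dynlibs, ctypes;

const
  // Enum values as defined in the netlib cblas.h.
  CblasRowMajor = 101;
  CblasNoTrans  = 111;

type
  // Hypothetical name: Pascal view of the C prototype of cblas_dgemm.
  TCblasDgemm = procedure(Order, TransA, TransB: cint;
                          M, N, K: cint;
                          alpha: cdouble; A: pcdouble; lda: cint;
                          B: pcdouble; ldb: cint;
                          beta: cdouble; C: pcdouble; ldc: cint); cdecl;

var
  Lib: TLibHandle;
  Dgemm: TCblasDgemm;
  LibName: string;
  A: array[0..3] of cdouble = (1, 2, 3, 4);  // 2 x 2, row-major
  B: array[0..3] of cdouble = (1, 0, 0, 1);  // 2 x 2 identity
  C: array[0..3] of cdouble = (0, 0, 0, 0);  // 2 x 2 result
begin
  // Assumption: OpenBLAS under its usual file name. Pass another file
  // name as the first command-line argument to try a different library.
  LibName := 'libopenblas.' + SharedSuffix;
  if ParamCount >= 1 then
    LibName := ParamStr(1);

  Lib := LoadLibrary(LibName);
  if Lib = NilHandle then
  begin
    WriteLn('Could not load ', LibName);
    Exit;
  end;

  // Resolve the symbol by name instead of relying on "external".
  Dgemm := TCblasDgemm(GetProcedureAddress(Lib, 'cblas_dgemm'));
  if Assigned(Dgemm) then
  begin
    // C := 1.0 * A * B + 0.0 * C
    Dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
          2, 2, 2, 1.0, @A[0], 2, @B[0], 2, 0.0, @C[0], 2);
    WriteLn(C[0]:0:1, ' ', C[1]:0:1);
    WriteLn(C[2]:0:1, ' ', C[3]:0:1);
  end
  else
    WriteLn('cblas_dgemm not found in ', LibName);

  UnloadLibrary(Lib);
end.
```

Because the function is an ordinary procedural variable, the same program can be pointed at a different CBLAS implementation at run time without recompiling.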
Take care of the PATH variable that will be used to locate the libraries. The same rules as LoadLibrary will apply.
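Sample code using the default initialization.
InitializeCBLAS;
...
{ C := alpha*A*B + beta*C, with A (m x k), B (k x n) and C (m x n) in row-major order }
cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
            m, n, k,   { m, n, k }
            1,         { alpha }
            A, k,      { A, lda }
            B, n,      { B, ldb }
            1,         { beta }
            C, n       { C, ldc }
);
...
ReleaseCBLAS;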
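To make clear what the dgemm call computes, here is a plain Pascal reference of the same operation, C := alpha*A*B + beta*C, for row-major matrices without transposition. It is only a sketch for checking results on small inputs; the procedure name `NaiveDgemmRowMajor` is made up for this example and is not part of the unit.

```pascal
program NaiveDgemm;
{$mode objfpc}{$H+}

// Reference triple loop: C := alpha*A*B + beta*C.
// A is m x k, B is k x n, C is m x n, all stored row-major;
// lda, ldb and ldc are the row lengths of A, B and C.
procedure NaiveDgemmRowMajor(m, n, k: Integer; alpha: Double;
  const A: array of Double; lda: Integer;
  const B: array of Double; ldb: Integer;
  beta: Double; var C: array of Double; ldc: Integer);
var
  i, j, p: Integer;
  acc: Double;
begin
  for i := 0 to m - 1 do
    for j := 0 to n - 1 do
    begin
      acc := 0.0;
      for p := 0 to k - 1 do
        acc := acc + A[i * lda + p] * B[p * ldb + j];
      C[i * ldc + j] := alpha * acc + beta * C[i * ldc + j];
    end;
end;

var
  A: array[0..5] of Double = (1, 2, 3, 4, 5, 6);     // 2 x 3
  B: array[0..5] of Double = (7, 8, 9, 10, 11, 12);  // 3 x 2
  C: array[0..3] of Double = (1, 1, 1, 1);           // 2 x 2
begin
  // Same argument layout as the cblas_dgemm sample: m=2, n=2, k=3.
  NaiveDgemmRowMajor(2, 2, 3, 1.0, A, 3, B, 2, 1.0, C, 2);
  WriteLn(C[0]:0:1, ' ', C[1]:0:1);  // prints 59.0 65.0
  WriteLn(C[2]:0:1, ' ', C[3]:0:1);  // prints 140.0 155.0
end.
```

Since beta is 1, the previous contents of C are added to the product; pass 0 for beta to overwrite C instead.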
In case your library is not libopenblas, the InitializeCBLAS procedure shall be called with the libraries and dependencies to load. Take care to use the right suffix (.so, .dll, .dylib) on Linux, Windows and OSX respectively.
Library | Initialization |
---|---|
netlib cblas | InitializeCBLAS(['libblas.so'], 'libcblas.so'); |
openblas | InitializeCBLAS([], 'libopenblas.so'); |
ATLAS | InitializeCBLAS([], 'libcblas.so.3'); |
Intel MKL | InitializeCBLAS([], 'mkl_rt.dll'); |
NVBLAS | TBD |
CLBlas | TBD |
The netlib cblas depends on the Fortran libblas. The dependency array shall include all libraries that the CBLAS library depends on. Otherwise there will be unresolved symbols.
All BLAS functions assume that the matrix and vector data are stored contiguously in memory. This implies for example that a matrix cannot be represented as a vector of vectors. It needs to be represented as a block of N x M contiguous elements in memory. Moreover, vectors and matrices of complex numbers must be stored such that the real and imaginary parts of a given element are contiguous in memory.
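In Pascal terms, this rules out `array of array of Double`, where every row is a separate heap allocation, in favour of one flat block. The following sketch illustrates the difference; the names `Flat`, `Jagged` and `TComplexDouble` are chosen for this example only.

```pascal
program ContiguousStorage;
{$mode objfpc}{$H+}

type
  // Interleaved complex element: the real part is directly followed
  // by the imaginary part (hypothetical type name).
  TComplexDouble = packed record
    re, im: Double;
  end;

const
  Rows = 2;
  Cols = 3;

var
  Flat: array of Double;             // one block of Rows*Cols elements
  Jagged: array of array of Double;  // one separate heap block per row
  i, j: Integer;
begin
  SetLength(Flat, Rows * Cols);
  SetLength(Jagged, Rows, Cols);

  for i := 0 to Rows - 1 do
    for j := 0 to Cols - 1 do
    begin
      // Element (i, j) of the flat matrix lives at offset i*Cols + j.
      Flat[i * Cols + j] := 10 * (i + 1) + (j + 1);
      Jagged[i][j] := Flat[i * Cols + j];
    end;

  // The flat block can be handed to BLAS as @Flat[0].
  WriteLn('Flat[1*Cols + 2] = ', Flat[1 * Cols + 2]:0:0);

  // Byte distance between the starts of row 0 and row 1. For the flat
  // block it is exactly Cols*SizeOf(Double); for the jagged array it
  // depends on the heap manager, so BLAS cannot walk it as one matrix.
  WriteLn('flat row stride   : ', PtrInt(@Flat[Cols]) - PtrInt(@Flat[0]));
  WriteLn('jagged row stride : ', PtrInt(@Jagged[1][0]) - PtrInt(@Jagged[0][0]));

  // Two doubles per complex element, stored back to back.
  WriteLn('complex element size: ', SizeOf(TComplexDouble));
end.
```

A static two-dimensional array such as `array[0..1, 0..2] of Double` is also stored as one contiguous block, so it can be passed in the same way.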
For matrices in mathematical notation, the first index usually indicates the row, and the second indicates the column, e.g., given a matrix A, a_1_2 is in its first row and second column.
There are two ways of placing such a matrix in memory: either store the rows one after another, or store the columns one after another. These two variants are called row-major and column-major order. CBLAS supports both equally well.
Row-major order (C, Pascal).
Address | Access | Value |
---|---|---|
0 | A[0][0] | a_1_1 |
1 | A[0][1] | a_1_2 |
2 | A[0][2] | a_1_3 |
3 | A[1][0] | a_2_1 |
4 | A[1][1] | a_2_2 |
5 | A[1][2] | a_2_3 |
Column-major order.
Address | Access | Value |
---|---|---|
0 | A[0][0] | a_1_1 |
1 | A[1][0] | a_2_1 |
2 | A[0][1] | a_1_2 |
3 | A[1][1] | a_2_2 |
4 | A[0][2] | a_1_3 |
5 | A[1][2] | a_2_3 |
Pascal (like C) uses row-major order. Fortunately, the CBLAS library allows the selection of the matrix order.
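The two address tables follow from simple index formulas. The sketch below prints both mappings for a 2 x 3 matrix with zero-based indices; the function names `RowMajorIndex` and `ColMajorIndex` are illustrative only.

```pascal
program MajorOrder;
{$mode objfpc}{$H+}

// Offset of element (i, j), both zero-based, when rows are stored
// one after another. ACols is the number of columns.
function RowMajorIndex(i, j, ACols: Integer): Integer;
begin
  Result := i * ACols + j;
end;

// Offset of element (i, j) when columns are stored one after another.
// ARows is the number of rows.
function ColMajorIndex(i, j, ARows: Integer): Integer;
begin
  Result := j * ARows + i;
end;

const
  Rows = 2;
  Cols = 3;

var
  i, j: Integer;
begin
  WriteLn('Access   row-major  column-major');
  for i := 0 to Rows - 1 do
    for j := 0 to Cols - 1 do
      WriteLn('A[', i, '][', j, ']  ',
              RowMajorIndex(i, j, Cols):9, '  ',
              ColMajorIndex(i, j, Rows):12);
end.
```

With CblasRowMajor the leading dimension passed to CBLAS is the number of columns of the stored matrix; with CblasColMajor it is the number of rows.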
The library uses the fpmake tool to compile and install.
$ git clone https://github.com/clairvoyant/cblas
$ cd cblas
$ fpc fpmake
$ ./fpmake build
$ ./fpmake install
If your Free Pascal installation is not in the standard location, you need to point fpmake to the global unit directory.
The syntax is similar to the one below; change the globalunitdir path to your installation path.
$ ./fpmake build --globalunitdir=/usr/lib64/fpc/3.0.4/
There are some unit tests to verify the environment.
$ ./tests/tests --format=plain --all
- Test: in 32-bit environments. Verification has been performed on 64-bit architectures only, so there may be issues in 32-bit environments.
- Test: with Delphi
- Test: nvblas in Linux/Windows
- Test: clblas in Linux/Windows
- Publish the benchmarks.
- Choosing the optimal BLAS and LAPACK library, Tobias Wittwer, 2008
- Foad Sojoodi Farimani, Curated list of cblas resources
- Column vs Row major order