Overview
Many things can be realized very elegantly and efficiently on a computer today, thanks to progress in software and programming languages. One thing that cannot be done elegantly on a computer is computing itself, at least not computing fast.
High-performance computing (HPC) relies to a large extent on highly tuned numeric libraries. Suppose we want to multiply two matrices, i.e. compute A = B * C. We can then call a library that runs at over 90 percent of peak performance. All we need to write is something like this:
    int m = num_rows(A), n = num_cols(A), k = num_cols(B),
        lda = A.get_ldim(), ldb = B.get_ldim(), ldc = C.get_ldim();
    double alpha = 1.0, beta = 0.0;
    char b_trans = 'N', c_trans = 'N';
    _dgemm(&b_trans, &c_trans, &m, &n, &k, &alpha, &B[0][0], &ldb,
           &C[0][0], &ldc, &beta, &A[0][0], &lda);
MTL4 allows you to write A = B * C and uses BLAS internally if it is available. Otherwise, it provides its own C++ implementation, which is also reasonably fast (we usually reached about 60 percent of peak performance).
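For comparison, here is a minimal sketch of the corresponding MTL4 program (assuming the boost/numeric/mtl/mtl.hpp header and the dense2D type as used in the library's examples; the matrix sizes and fill values are only illustrative):

    #include <iostream>
    #include <boost/numeric/mtl/mtl.hpp>

    int main()
    {
        const unsigned n = 4;
        mtl::dense2D<double> A(n, n), B(n, n), C(n, n);

        // Fill B and C with some arbitrary values
        for (unsigned i = 0; i < n; ++i)
            for (unsigned j = 0; j < n; ++j) {
                B[i][j] = i + j + 1.0;
                C[i][j] = (i == j ? 2.0 : 0.0);
            }

        A = B * C;   // uses BLAS if enabled, otherwise MTL4's own kernel
        std::cout << "A is\n" << A << "\n";
        return 0;
    }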
Returning to the expression A = B * C: it can express any product of sparse and dense matrices, and the library dispatches to the appropriate algorithm. Moreover, the same expression can also represent a matrix-vector product if A and C are column vectors (one would probably choose lower-case names then).
In fact, x = y * z can represent four different operations (see the sketch after this list):
- matrix product;
- matrix-vector product;
- scalar times matrix; or
- scalar times vector.
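A sketch of these four cases in one program (again assuming MTL4's dense2D and dense_vector types; the variable names and fill values are ours). The operation performed is selected from the operand types at compile time:

    #include <boost/numeric/mtl/mtl.hpp>

    int main()
    {
        mtl::dense2D<double>      A(3, 3), B(3, 3), C(3, 3);
        mtl::dense_vector<double> v(3), w(3);
        double                    alpha = 2.0;

        // Fill B, C, and w with simple values
        for (unsigned i = 0; i < 3; ++i) {
            w[i] = 1.0;
            for (unsigned j = 0; j < 3; ++j) {
                B[i][j] = 1.0;
                C[i][j] = (i == j ? 1.0 : 0.0);
            }
        }

        A = B * C;       // matrix product
        v = B * w;       // matrix-vector product
        A = alpha * B;   // scalar times matrix
        v = alpha * w;   // scalar times vector
        return 0;
    }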
This video demonstrates our team's work on phase-field crystal growth, developed with MTL4.
To learn more about MTL4, have a look at our Tutorial.