Many things can be realized on a computer elegantly and efficiently today, thanks to progress in software and programming languages. One thing that still resists doing elegantly on a computer is computing itself, at least not computing fast.

High-performance computing (HPC) is to a large extent driven by a few highly tuned numeric libraries. Suppose we want to multiply two matrices, i.e. compute A = B * C. There are libraries for this that run at over 90 percent of peak performance. We only need to write something like this:

        // dgemm_ computes its last matrix argument: A = alpha * B * C + beta * A
        int m= num_rows(B), n= num_cols(C), k= num_cols(B),
            ldb= B.get_ldim(), ldc= C.get_ldim(), lda= A.get_ldim();
        double alpha= 1.0, beta= 0.0;
        char b_trans= 'N', c_trans= 'N';
        dgemm_(&b_trans, &c_trans, &m, &n, &k, &alpha, &B[0][0], &ldb,
               &C[0][0], &ldc, &beta, &A[0][0], &lda);

MTL4 allows you to write A = B * C and uses BLAS internally if available. Otherwise, it provides an implementation in C++ that is also reasonably fast (we usually reached 60 percent of peak).
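To make the contrast concrete, here is a minimal sketch of what such a portable C++ fallback boils down to. The `Dense` type and the loop order are illustrations only, not MTL4's actual implementation; a tuned kernel would add blocking, unrolling, and vectorization on top of this:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal dense matrix in row-major storage (illustration only, not MTL4's type).
struct Dense {
    std::size_t rows, cols;
    std::vector<double> data;
    Dense(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}
    double&       operator()(std::size_t i, std::size_t j)       { return data[i * cols + j]; }
    const double& operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Naive triple-loop product A = B * C.
Dense operator*(const Dense& B, const Dense& C) {
    assert(B.cols == C.rows);
    Dense A(B.rows, C.cols);
    for (std::size_t i = 0; i < B.rows; ++i)
        for (std::size_t k = 0; k < B.cols; ++k)   // i-k-j order favors row-major access
            for (std::size_t j = 0; j < C.cols; ++j)
                A(i, j) += B(i, k) * C(k, j);
    return A;
}
```

With this operator in place, the user-facing expression is exactly `Dense A = B * C;` — the same notation as with the tuned BLAS backend, only slower.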

Returning to the expression A = B * C: it can express any product of sparse and dense matrices, and the library dispatches to the appropriate algorithm. Moreover, the expression can also represent a matrix-vector product if A and C are column vectors (one would probably choose lower-case names then).

In fact, x = y * z can represent four different operations:

  • matrix product;
  • matrix-vector product;
  • scalar times matrix; or
  • scalar times vector.
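The dispatch behind this works through ordinary C++ overload resolution: the compiler picks the right `operator*` from the static types of `y` and `z`. A minimal sketch with two of the four cases, using plain `std::vector` stand-ins rather than MTL4's own types:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;
using Matrix = std::vector<Vector>;   // row-major matrix stand-in, not MTL4's type

// scalar times vector
Vector operator*(double a, const Vector& v) {
    Vector r(v.size());
    for (std::size_t i = 0; i < v.size(); ++i)
        r[i] = a * v[i];
    return r;
}

// matrix-vector product
Vector operator*(const Matrix& M, const Vector& v) {
    Vector r(M.size(), 0.0);
    for (std::size_t i = 0; i < M.size(); ++i)
        for (std::size_t j = 0; j < v.size(); ++j)
            r[i] += M[i][j] * v[j];
    return r;
}
```

Either way, the user writes `x = y * z`; which operation runs is decided at compile time from the operand types, with no run-time overhead for the dispatch.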

As a demonstration of MTL4 in practice, our team has used it for simulations of phase-field crystal growth; see the accompanying video.



To learn more about MTL4, have a look at our Tutorial.

© 2021 SimuNova UG