How linear expressions get evaluated

With respect to BLAS there are three levels of abstraction in FLENS

  1. High-level interface: here you code expressions like z = a + 2.5*b + c;
    However, you still should know what you are doing. Spend some thoughts about:
    • Can this be evaluated without a temporary object? That is important as FLENS will not create temporaries by itself!
    • Can I write x = A*x or should I do code it as t = x; x = A*t?
      What is the right thing depends on the matrix type. There is are example below showing either can make sense!
  2. The FLENS to BLAS interface: these are functions like mv that are tightly related to a corresponding BLAS function. But they have fewer arguments and are easier to use. This is analogously to the Fortran 90 BLAS interface!
  3. The low-level BLAS bindings: This is more or less the same a CBLAS. But the functions are overloaded with respect to the element type.

Evaluation logic on the high-level

In the following lower-case letters denote vectors and upper-case letters denote matrices.


Evaluation of expressions using the FLENS to BLAS interface

BLAS provides functions like gemv, gbmv, symv (and so on) to compute matrix-vector products. In FLENS there are corresponding functions mv overloaded for each low-level BLAS function. This has two advantages:

  1. the mv functions requires fewer arguments and is simpler to use. For example, you write
      mv(Trans, 2., A, x, 0., z);
    
    instead of
      gemv(ColMajor, Trans,
           A.numRows(), A.numCols(), 2., A.data(), A.leadingDimension(),
           x.data(), x.stride(),
           0., z.data(), z.stride());
    
  2. It allows to implement a simple mechanism that will map an expression like
      z = 1.5*a + B*x + 3*transpose(A)*c;
    
    onto subsequent calls to BLAS functions.

So here a brief (and incomplete) overview of the FLENS to BLAS interface:

  1. Level 1 BLAS:
    • FLENS can therefore evaluate
        z  = 1.5*a;
      
      by calling
        copy(a, z);
        scal(1.5, a);
      
    • And analogously the expression
        z += 4*b;
      
      can be evaluated by calling
        axpy(4, b, z);
      
    • Note that axpy requires for that ! That is why
        z = x + z;
      
      which is equivalent to
        z  = x;
        z += z;
      
      which is equivalent to
        copy(x,z);
        axpy(1, z, z);
      
      will result in an assertion failure!
      Instead you code this as
        z += x;
      
      which is anyway more expressive.
  2. Level 2 BLAS:
    • FLENS can therefore evaluate
        z  = 1.5*A*x;
      
      by calling
        mv(NoTrans, 1.5, A, x, 0, z);
      
      and
        z  += 1.5*A*x;
      
      by calling
        mv(NoTrans, 1.5, A, x, 1, z);
      
    • Whether
        x = A*x;
      
      can be evaluated or not depends on the matrix type of A and on what the underlying BLAS implementation allows. For example
      1. if A is a general matrix then BLAS does not allow to compute mv(Trans, 1, A, x, 0, x). So this will result in an assertion failure.
      2. if A is a triangular matrix then BLAS does allow to compute mv(Trans, 1, A, x, 0, x)! Actually BLAS requires in this case that the matrix vector product is of the form x = A*x, i.e. both vectors are identical. Hence, if A is triangular then y = A*x would fail!
      3. if you do not like this: just implement your own version of mv providing your desired functionality!
  3. Level 3 BLAS:

    BLAS requires that either A or B is of type GeMatrix. But as usual in FLENS: If you provide another version of mm you can extend this functionality.

    • Analogously to Level 1 and Level 2 BLAS the expression
        C = 1.5*A*transpose(B);
      
      gets evaluated by calling
        mm(NoTrans, Trans, 1.5, A, B, 0, C);
      
    • And
        C += 1.5*A*transpose(B);
      
      gets evaluated by calling
        mm(NoTrans, Trans, 1.5, A, B, 1, C);