How linear expressions get evaluated
With respect to BLAS there are three levels of abstraction in FLENS
-
High-level interface: here you code expressions like z = a + 2.5*b + c;
However, you still should know what you are doing. Spend some thoughts about:
- Can this be evaluated without a temporary object? That is important as FLENS will not
create temporaries by itself!
- Can I write x = A*x or should I do code it as t = x; x = A*t?
What is the right thing depends on the matrix type. There is are example below showing either can make sense!
- The FLENS to BLAS interface: these are functions like mv that are tightly related to
a corresponding BLAS function. But they have fewer arguments and are easier to use. This is analogously
to the Fortran 90 BLAS interface!
- The low-level BLAS bindings: This is more or less the same a CBLAS. But the functions are overloaded
with respect to the element type.
Evaluation logic on the high-level
In the following lower-case letters denote vectors and upper-case letters denote matrices.
-
Sums of vectors like
are evaluated equivalently to
-
The same holds if summands are themselves expressions like in
z = 1.5*a + B*x + 3*transpose(A)*c;
this is equivalent to
z = 1.5*a;
z += B*x
z += 3*transpose(A)*c;
- Vector expressions like
are evaluated equivalently to
-
The same applies to matrix expression.
Evaluation of expressions using the FLENS to BLAS interface
BLAS provides functions like gemv, gbmv, symv (and so on) to compute matrix-vector products. In FLENS there
are corresponding functions mv overloaded for each low-level BLAS function. This has two advantages:
- the mv functions requires fewer arguments and is simpler to use. For example, you write
mv(Trans, 2., A, x, 0., z);
instead of
gemv(ColMajor, Trans,
A.numRows(), A.numCols(), 2., A.data(), A.leadingDimension(),
x.data(), x.stride(),
0., z.data(), z.stride());
-
It allows to implement a simple mechanism that will map an expression like
z = 1.5*a + B*x + 3*transpose(A)*c;
onto subsequent calls to BLAS functions.
So here a brief (and incomplete) overview of the FLENS to BLAS interface:
- Level 1 BLAS:
- FLENS can therefore evaluate
by calling
copy(a, z);
scal(1.5, a);
-
And analogously the expression
can be evaluated by calling
-
Note that axpy requires for
that
! That is why
which is equivalent to
which is equivalent to
copy(x,z);
axpy(1, z, z);
will result in an assertion failure!
Instead you code this as
which is anyway more expressive.
- Level 2 BLAS:
- FLENS can therefore evaluate
by calling
mv(NoTrans, 1.5, A, x, 0, z);
and
by calling
mv(NoTrans, 1.5, A, x, 1, z);
-
Whether
can be evaluated or not depends on the matrix type of A and on what the underlying BLAS implementation
allows. For example
- if A is a general matrix then BLAS does not allow to compute mv(Trans, 1, A, x, 0, x). So
this will result in an assertion failure.
- if A is a triangular matrix then BLAS does allow to compute mv(Trans, 1, A, x, 0, x)!
Actually BLAS requires in this case that the matrix vector product is of the form x = A*x, i.e. both
vectors are identical. Hence, if A is triangular then y = A*x would fail!
- if you do not like this: just implement your own version of mv providing your desired functionality!
- Level 3 BLAS:
BLAS requires that either A or B is of type GeMatrix. But as usual in FLENS: If you
provide another version of mm you can extend this functionality.
-
Analogously to Level 1 and Level 2 BLAS the expression
gets evaluated by calling
mm(NoTrans, Trans, 1.5, A, B, 0, C);
- And
gets evaluated by calling
mm(NoTrans, Trans, 1.5, A, B, 1, C);