Some BLAS Benchmarks

For the following benchmarks, two different BLAS implementations were used as backends for FLENS: ATLAS and MKL.

The benchmarks were performed using the Benchmark for Templated Libraries (BTL) project. The BTL interface used for these benchmarks can be downloaded here (this includes the raw benchmark data and the compile flags used, which are basically -O3 -DNDEBUG).

The benchmarks also include a comparison with uBLAS. The uBLAS examples do not utilize any external BLAS implementation; they use the generic BLAS implementation that ships with uBLAS instead.

So what is the purpose of these tests?

  1. To demonstrate that FLENS gives you the full power of the underlying BLAS implementation. In some cases ATLAS provides better results, in others MKL. This suggests implementing BLAS wrappers in FLENS that decide which underlying BLAS implementation to use depending on matrix/vector sizes, the type of linear algebra operation, and so on.
  2. I do not want to bash uBLAS, but the tests show that in some cases you really have to use a native BLAS implementation to get good performance. And you can in fact use bindings to native BLAS implementations in uBLAS. But let's be honest: you don't really wanna compare these bindings with the FLENS high-level interface for BLAS, do you?
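
The wrapper idea from point 1 could look roughly like the sketch below. This is only an illustration and not FLENS code: the names Backend, chooseBackend, dotAtlas, dotMkl and kCrossover are hypothetical, the crossover size is a made-up placeholder rather than a measured value, and the two backend functions are plain loops standing in for calls into the real libraries. Which library is actually faster at which size would have to be read off the benchmark data.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical backend tags; not part of FLENS.
enum class Backend { Atlas, Mkl };

// Placeholder crossover size (assumption). A real value would be
// taken from benchmark results for the operation in question.
constexpr std::size_t kCrossover = 256;

// Pick a backend from the problem size alone. The direction of the
// choice (small -> ATLAS, large -> MKL) is an arbitrary assumption.
inline Backend chooseBackend(std::size_t n)
{
    return (n < kCrossover) ? Backend::Atlas : Backend::Mkl;
}

// Stand-in for a dot product routed to ATLAS (here: a plain loop).
inline double dotAtlas(const std::vector<double> &x,
                       const std::vector<double> &y)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sum += x[i] * y[i];
    }
    return sum;
}

// Stand-in for a dot product routed to MKL (here: the same plain loop).
inline double dotMkl(const std::vector<double> &x,
                     const std::vector<double> &y)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sum += x[i] * y[i];
    }
    return sum;
}

// High-level entry point: the caller never names a backend.
// Assumes x and y have the same length.
inline double dot(const std::vector<double> &x,
                  const std::vector<double> &y)
{
    switch (chooseBackend(x.size())) {
        case Backend::Atlas: return dotAtlas(x, y);
        case Backend::Mkl:   return dotMkl(x, y);
    }
    return 0.0; // not reached; keeps compilers quiet
}
```

A caller would simply write dot(x, y), and the choice of backend stays an internal detail that can be retuned per operation type without touching user code.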