A C++ adapter API for integrating vectorized components in a scalar workflow.
Many FLOP-intensive algorithms could profit from the vector pipelines of modern processors, but in practice they do not, because they lack vectorizable inner loops. The project idea is to implement and benchmark a generic vector flow service that integrates non-intrusively with arbitrary data processing frameworks and exposes algorithms to the higher-level event loop of these frameworks.
This library aims to provide an API and a set of examples demonstrating the transformation of a scalar algorithm into a vectorized one. Depending on how much the algorithm intrinsically gains from SIMD vectorization and better data caching, the benefits can far outweigh the overheads introduced by the extra data transformations.
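
To make the scalar-to-vector transformation concrete, here is a minimal sketch of the general pattern: gather scalar records into a structure-of-arrays buffer, run one contiguous loop the compiler can auto-vectorize, and scatter the results back. It does not use this library's API. The names `Particle`, `ParticleBasket`, `gather`, `compute_momentum`, `scatter`, and `process_vectorized` are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical scalar record, as an event loop might hand it over one at a time.
struct Particle {
    float px, py, pz;  // inputs
    float p;           // output: momentum magnitude
};

// Hypothetical structure-of-arrays buffer: one contiguous array per field.
struct ParticleBasket {
    std::vector<float> px, py, pz, p;
    std::size_t size() const { return px.size(); }
};

// Scalar reference: one record per call, nothing for the vector units to do.
inline float momentum_scalar(const Particle& t) {
    return std::sqrt(t.px * t.px + t.py * t.py + t.pz * t.pz);
}

// Gather: copy scalar records into the SoA layout (this is the extra overhead).
inline ParticleBasket gather(const std::vector<Particle>& in) {
    ParticleBasket b;
    b.px.reserve(in.size());
    b.py.reserve(in.size());
    b.pz.reserve(in.size());
    for (const Particle& t : in) {
        b.px.push_back(t.px);
        b.py.push_back(t.py);
        b.pz.push_back(t.pz);
    }
    b.p.resize(in.size());
    return b;
}

// Kernel: a simple loop over contiguous floats with no cross-iteration
// dependencies, which compilers can typically auto-vectorize when optimizing.
inline void compute_momentum(ParticleBasket& b) {
    const std::size_t n = b.size();
    const float* px = b.px.data();
    const float* py = b.py.data();
    const float* pz = b.pz.data();
    float* p = b.p.data();
    for (std::size_t i = 0; i < n; ++i) {
        p[i] = std::sqrt(px[i] * px[i] + py[i] * py[i] + pz[i] * pz[i]);
    }
}

// Scatter: write the results back into the scalar records.
inline void scatter(const ParticleBasket& b, std::vector<Particle>& out) {
    assert(out.size() == b.size());
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i].p = b.p[i];
    }
}

// Adapter entry point: the caller keeps its scalar data model.
inline void process_vectorized(std::vector<Particle>& particles) {
    ParticleBasket basket = gather(particles);
    compute_momentum(basket);
    scatter(basket, particles);
}
```

A caller would fill a `std::vector<Particle>` from its event loop and call `process_vectorized` on it; afterwards each `p` field should match what `momentum_scalar` returns for the same record. The `gather` and `scatter` copies are the extra data transformations mentioned above, so the approach only pays off when the kernel does enough work per element to amortize them.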