    Uint<->Float conversion is quite expensive compared to
    Int<->Float (2-2.5 times slower). This is because of the extra
    code that handles the top (sign) bit of the number. Discarding this
    bit by converting Uint->Int first gives a huge speedup.
    Now the vector version of the composition is 1.8-8.7 times faster
    than the old version (weighted: 3.2 times).
    Many thanks to Matthias Kretz for pointing this out!
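
    A minimal scalar sketch of the trick, for illustration only. The
    helper name alphaToFloat and the 8-bit-alpha-in-the-top-byte pixel
    layout are assumptions, not code from this commit. It relies on
    the value being known to fit in a signed 32-bit int, so the
    Uint->Int cast loses nothing:

```cpp
#include <cstdint>

// Hypothetical helper (not from the actual sources): extracts the top byte
// of a packed 32-bit pixel and converts it to float.
inline float alphaToFloat(std::uint32_t pixel)
{
    // After the shift the value is in 0..255, so the top bit is always clear.
    const std::uint32_t alpha = pixel >> 24;

    // Cast to a signed int first: the value is unchanged, but the compiler
    // may now use the signed Int->Float conversion instead of the unsigned one.
    const std::int32_t alphaSigned = static_cast<std::int32_t>(alpha);

    return static_cast<float>(alphaSigned);
}
```

    The same idea applies to the vector types: cast the unsigned
    vector to the signed one before converting to float, as long as
    every lane is known to be below 2^31.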
kis_composition_benchmark.cpp 17.9 KB