Thorsten Zachmann authored
Added optimized version of the alpha darken composite op for the RGBF32 colorspace.

Added tests to compare the performance and results of the new implementation against the legacy one. The diff needed in the test comparison is due to the fact that the compiler calculates 1.0/255.0 and multiplies by the result instead of dividing by 255.0 for the mask.

Here are the results of the benchmark on my Intel i5-2520M CPU:

QDEBUG : KisCompositionBenchmark::testRgbF32CompositeAlphaDarkenLegacy() krita.general: Testing Composite Op: "alphadarken" ( "Legacy" )
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeAlphaDarkenLegacy() krita.general: "Aligned Mask SrcRand DstRand" RESULT: 67 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeAlphaDarkenLegacy() krita.general: "DstUnalig Mask SrcRand DstRand" RESULT: 68 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeAlphaDarkenLegacy() krita.general: "SrcUnalig Mask SrcRand DstRand" RESULT: 69 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeAlphaDarkenLegacy() krita.general: "Unaligned Mask SrcRand DstRand" RESULT: 66 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeAlphaDarkenLegacy() krita.general: "Aligned NoMask SrcRand DstRand" RESULT: 33 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeAlphaDarkenLegacy() krita.general: "Aligned NoMask SrcZero DstRand" RESULT: 32 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeAlphaDarkenLegacy() krita.general: "Aligned NoMask SrcUnit DstRand" RESULT: 32 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeAlphaDarkenLegacy() krita.general: "Aligned NoMask SrcRand DstZero" RESULT: 31 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeAlphaDarkenLegacy() krita.general: "Aligned NoMask SrcZero DstZero" RESULT: 28 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeAlphaDarkenLegacy() krita.general: "Aligned NoMask SrcUnit DstZero" RESULT: 31 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeAlphaDarkenLegacy() krita.general: "Aligned NoMask SrcRand DstUnit" RESULT: 32 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeAlphaDarkenLegacy() krita.general: "Aligned NoMask SrcZero DstUnit" RESULT: 32 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeAlphaDarkenLegacy() krita.general: "Aligned NoMask SrcUnit DstUnit" RESULT: 33 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeAlphaDarkenOptimized() krita.general: "Aligned Mask SrcRand DstRand" RESULT: 12 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeAlphaDarkenOptimized() krita.general: "DstUnalig Mask SrcRand DstRand" RESULT: 12 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeAlphaDarkenOptimized() krita.general: "SrcUnalig Mask SrcRand DstRand" RESULT: 16 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeAlphaDarkenOptimized() krita.general: "Unaligned Mask SrcRand DstRand" RESULT: 16 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeAlphaDarkenOptimized() krita.general: "Aligned NoMask SrcRand DstRand" RESULT: 9 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeAlphaDarkenOptimized() krita.general: "Aligned NoMask SrcZero DstRand" RESULT: 9 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeAlphaDarkenOptimized() krita.general: "Aligned NoMask SrcUnit DstRand" RESULT: 14 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeAlphaDarkenOptimized() krita.general: "Aligned NoMask SrcRand DstZero" RESULT: 9 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeAlphaDarkenOptimized() krita.general: "Aligned NoMask SrcZero DstZero" RESULT: 9 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeAlphaDarkenOptimized() krita.general: "Aligned NoMask SrcUnit DstZero" RESULT: 9 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeAlphaDarkenOptimized() krita.general: "Aligned NoMask SrcRand DstUnit" RESULT: 9 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeAlphaDarkenOptimized() krita.general: "Aligned NoMask SrcZero DstUnit" RESULT: 10 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeAlphaDarkenOptimized() krita.general: "Aligned NoMask SrcUnit DstUnit" RESULT: 9 msec

This is a speedup by a factor of roughly 3 to 6 (about 2.3 for "Aligned NoMask SrcUnit DstRand").
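
The mask scaling difference mentioned above can be illustrated with a small standalone sketch. It is not code from this commit: the names `scaleByDivision`, `scaleByReciprocal`, `MaskScaleDiff` and `compareMaskScaling` are hypothetical, and it assumes an 8-bit mask scaled to a 32-bit float. It compares dividing by 255.0 with multiplying by a precomputed 1.0/255.0 for every mask value and reports how far the two can drift apart, which is the kind of small deviation a result comparison may need to tolerate.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: scale an 8-bit mask value to [0, 1] by dividing.
inline float scaleByDivision(std::uint8_t mask)
{
    return static_cast<float>(mask) / 255.0f;
}

// Hypothetical helper: scale by multiplying with a precomputed reciprocal.
// The reciprocal is rounded once and the product is rounded again, so the
// result may differ from the division in the last bit.
inline float scaleByReciprocal(std::uint8_t mask)
{
    const float reciprocal = 1.0f / 255.0f;
    return static_cast<float>(mask) * reciprocal;
}

struct MaskScaleDiff {
    int mismatches;    // number of mask values where the two results differ
    float maxAbsDiff;  // largest absolute difference observed
};

// Compare both scaling variants over all 256 possible mask values.
inline MaskScaleDiff compareMaskScaling()
{
    MaskScaleDiff result{0, 0.0f};
    for (int value = 0; value <= 255; ++value) {
        const std::uint8_t mask = static_cast<std::uint8_t>(value);
        const float diff = std::fabs(scaleByDivision(mask) - scaleByReciprocal(mask));
        assert(diff >= 0.0f);  // fabs never yields a negative value
        if (diff != 0.0f) {
            ++result.mismatches;
        }
        if (diff > result.maxAbsDiff) {
            result.maxAbsDiff = diff;
        }
    }
    return result;
}
```

Calling `compareMaskScaling()` and printing `mismatches` and `maxAbsDiff` shows whether the two variants agree bit for bit on a given compiler and set of flags; any difference stays on the order of one float rounding step, so a small epsilon in the comparison is enough to absorb it.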
fb013b7e