
udsentry_api_comparison_benchmark: update

Méven Car requested to merge work/meven/update_udsentry_benchmark into master

Add a new candidate, TwoVectorKindEntry, which uses two vectors, one for numbers and one for strings. This saves memory (provided the reserved capacity is close to the ideal case) while also being faster.
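
To illustrate the idea, here is a minimal sketch of a two-vector entry written with the standard library only. It is not the code from this merge request: the class name TwoVectorSketch, its member names, the field ids and the byte accounting are all assumptions made for illustration. The sizeBytes/capacityBytes pair only mirrors the "size" versus "space used" distinction in the benchmark output; how the benchmark itself computes those figures is not shown here.

```cpp
// Illustrative sketch only: TwoVectorSketch and its members are hypothetical
// names, not the TwoVectorKindEntry code from the benchmark.
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

class TwoVectorSketch
{
    struct NumberField {
        std::uint32_t field;
        long long value;
    };
    struct StringField {
        std::uint32_t field;
        std::string value;
    };

public:
    // Reserving close to the final field counts keeps capacity near size.
    void reserve(std::size_t numberCount, std::size_t stringCount)
    {
        m_numbers.reserve(numberCount);
        m_strings.reserve(stringCount);
    }

    void insertNumber(std::uint32_t field, long long value)
    {
        m_numbers.push_back(NumberField{field, value});
    }

    void insertString(std::uint32_t field, std::string value)
    {
        m_strings.push_back(StringField{field, std::move(value)});
    }

    // Linear search, which is fine for a handful of fields.
    long long numberValue(std::uint32_t field, long long defaultValue = 0) const
    {
        for (const NumberField &entry : m_numbers) {
            if (entry.field == field) {
                return entry.value;
            }
        }
        return defaultValue;
    }

    // Returns an empty string when the field is absent.
    std::string stringValue(std::uint32_t field) const
    {
        for (const StringField &entry : m_strings) {
            if (entry.field == field) {
                return entry.value;
            }
        }
        return std::string();
    }

    // Bytes occupied by stored elements (string heap data is not counted).
    std::size_t sizeBytes() const
    {
        return m_numbers.size() * sizeof(NumberField) + m_strings.size() * sizeof(StringField);
    }

    // Bytes allocated by both vectors, including unused reserved slots.
    std::size_t capacityBytes() const
    {
        return m_numbers.capacity() * sizeof(NumberField) + m_strings.capacity() * sizeof(StringField);
    }

private:
    std::vector<NumberField> m_numbers;
    std::vector<StringField> m_strings;
};

// Example usage: field ids 1 and 2 are arbitrary placeholders.
inline void exampleUsage()
{
    TwoVectorSketch entry;
    entry.reserve(1, 1);
    entry.insertNumber(1, 4096);
    entry.insertString(2, "file.txt");
    assert(entry.numberValue(1) == 4096);
    assert(entry.stringValue(2) == "file.txt");
    assert(entry.sizeBytes() <= entry.capacityBytes());
}
```

Keeping numbers out of the string vector means a numeric field does not pay for an unused string member, which is one plausible source of the memory saving. An over-generous reserve would widen the gap between sizeBytes() and capacityBytes().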

Remove the old KDE3-era code path.

These are the benchmark changes from !1261.

To run

# The next command may be needed to be allowed to use perf:
$ echo "0" | sudo tee /proc/sys/kernel/perf_event_paranoid

$ cd ~/kde/build/kio
$ ninja && ./bin/udsentry_api_comparison_benchmark -perf -iterations 10
$ ninja && ./bin/udsentry_api_comparison_benchmark -callgrind -iterations 10 # needs valgrind
perf results

On a Ryzen 5900X:

./bin/udsentry_api_comparison_benchmark -perf -iterations 100
[0/2] Re-checking globbed directories...
[2/2] Generating mo...
********* Start testing of UdsEntryBenchmark *********
Config: Using QtTest library 6.6.3, Qt 6.6.3 (x86_64-little_endian-lp64 shared (dynamic) release build; by GCC 13.2.0), ubuntu 23.10
PASS   : UdsEntryBenchmark::initTestCase()
PASS   : UdsEntryBenchmark::testAnotherFill()
RESULT : UdsEntryBenchmark::testAnotherFill():
     1,049.59 nsecs per iteration (total: 104,960, iterations: 100)
     4,376.14 CPU cycles per iteration, 4,17 GHz (total: 437,615, iterations: 100)
     8,628.07 instructions per iteration, 1,972 instr/cycle (total: 862,808, iterations: 100)
     1,339.95 branch instructions per iteration, 1,28 G/sec (total: 133,995, iterations: 100)
PASS   : UdsEntryBenchmark::testTwoVectorKindEntryFill()
RESULT : UdsEntryBenchmark::testTwoVectorKindEntryFill():
     653.00 nsecs per iteration (total: 65,300, iterations: 100)
     2,689.26 CPU cycles per iteration, 4,12 GHz (total: 268,927, iterations: 100)
     5,570.07 instructions per iteration, 2,071 instr/cycle (total: 557,008, iterations: 100)
     864.95 branch instructions per iteration, 1,32 G/sec (total: 86,495, iterations: 100)
PASS   : UdsEntryBenchmark::testTwoVectorsFill()
RESULT : UdsEntryBenchmark::testTwoVectorsFill():
     2,170.09 nsecs per iteration (total: 217,010, iterations: 100)
     9,173.75 CPU cycles per iteration, 4,23 GHz (total: 917,375, iterations: 100)
     17,942.08 instructions per iteration, 1,956 instr/cycle (total: 1,794,208, iterations: 100)
     3,021.94 branch instructions per iteration, 1,39 G/sec (total: 302,195, iterations: 100)
PASS   : UdsEntryBenchmark::testUDSEntryHSFill()
RESULT : UdsEntryBenchmark::testUDSEntryHSFill():
     1,411.50 nsecs per iteration (total: 141,150, iterations: 100)
     5,930.85 CPU cycles per iteration, 4,2 GHz (total: 593,085, iterations: 100)
     13,569.07 instructions per iteration, 2,288 instr/cycle (total: 1,356,908, iterations: 100)
     2,147.94 branch instructions per iteration, 1,52 G/sec (total: 214,795, iterations: 100)
PASS   : UdsEntryBenchmark::testAnotherV2Fill()
RESULT : UdsEntryBenchmark::testAnotherV2Fill():
     2,296.50 nsecs per iteration (total: 229,650, iterations: 100)
     9,715.17 CPU cycles per iteration, 4,23 GHz (total: 971,517, iterations: 100)
     17,435.08 instructions per iteration, 1,795 instr/cycle (total: 1,743,508, iterations: 100)
     2,492.94 branch instructions per iteration, 1,09 G/sec (total: 249,295, iterations: 100)
PASS   : UdsEntryBenchmark::testAnotherCompare()
RESULT : UdsEntryBenchmark::testAnotherCompare():
     1,085.20 nsecs per iteration (total: 108,520, iterations: 100)
     4,536.81 CPU cycles per iteration, 4,18 GHz (total: 453,682, iterations: 100)
     11,269.09 instructions per iteration, 2,484 instr/cycle (total: 1,126,909, iterations: 100)
     1,801.95 branch instructions per iteration, 1,66 G/sec (total: 180,195, iterations: 100)
PASS   : UdsEntryBenchmark::testTwoVectorKindEntryCompare()
RESULT : UdsEntryBenchmark::testTwoVectorKindEntryCompare():
     1,045.20 nsecs per iteration (total: 104,520, iterations: 100)
     4,363.25 CPU cycles per iteration, 4,17 GHz (total: 436,325, iterations: 100)
     9,881.09 instructions per iteration, 2,265 instr/cycle (total: 988,109, iterations: 100)
     1,601.95 branch instructions per iteration, 1,53 G/sec (total: 160,195, iterations: 100)
PASS   : UdsEntryBenchmark::testAnotherV2Compare()
RESULT : UdsEntryBenchmark::testAnotherV2Compare():
     1,340.20 nsecs per iteration (total: 134,020, iterations: 100)
     5,626.68 CPU cycles per iteration, 4,2 GHz (total: 562,668, iterations: 100)
     13,329.09 instructions per iteration, 2,369 instr/cycle (total: 1,332,909, iterations: 100)
     1,909.95 branch instructions per iteration, 1,43 G/sec (total: 190,995, iterations: 100)
PASS   : UdsEntryBenchmark::testTwoVectorsCompare()
RESULT : UdsEntryBenchmark::testTwoVectorsCompare():
     1,232.20 nsecs per iteration (total: 123,220, iterations: 100)
     5,165.19 CPU cycles per iteration, 4,19 GHz (total: 516,520, iterations: 100)
     12,781.09 instructions per iteration, 2,474 instr/cycle (total: 1,278,109, iterations: 100)
     2,121.94 branch instructions per iteration, 1,72 G/sec (total: 212,195, iterations: 100)
PASS   : UdsEntryBenchmark::testUDSEntryHSCompare()
RESULT : UdsEntryBenchmark::testUDSEntryHSCompare():
     1,475.90 nsecs per iteration (total: 147,590, iterations: 100)
     6,206.39 CPU cycles per iteration, 4,21 GHz (total: 620,640, iterations: 100)
     14,399.09 instructions per iteration, 2,320 instr/cycle (total: 1,439,909, iterations: 100)
     2,129.94 branch instructions per iteration, 1,44 G/sec (total: 212,995, iterations: 100)
PASS   : UdsEntryBenchmark::testAnotherApp()
RESULT : UdsEntryBenchmark::testAnotherApp():
     1,543.00 nsecs per iteration (total: 154,300, iterations: 100)
     6,493.47 CPU cycles per iteration, 4,21 GHz (total: 649,348, iterations: 100)
     12,887.09 instructions per iteration, 1,985 instr/cycle (total: 1,288,709, iterations: 100)
     2,007.95 branch instructions per iteration, 1,3 G/sec (total: 200,795, iterations: 100)
PASS   : UdsEntryBenchmark::testTwoVectorKindEntryApp()
RESULT : UdsEntryBenchmark::testTwoVectorKindEntryApp():
     1,152.29 nsecs per iteration (total: 115,230, iterations: 100)
     4,823.13 CPU cycles per iteration, 4,19 GHz (total: 482,313, iterations: 100)
     9,213.09 instructions per iteration, 1,910 instr/cycle (total: 921,309, iterations: 100)
     1,437.95 branch instructions per iteration, 1,25 G/sec (total: 143,795, iterations: 100)
PASS   : UdsEntryBenchmark::testAnotherV2App()
RESULT : UdsEntryBenchmark::testAnotherV2App():
     2,860.50 nsecs per iteration (total: 286,050, iterations: 100)
     12,125.95 CPU cycles per iteration, 4,24 GHz (total: 1,212,596, iterations: 100)
     21,795.09 instructions per iteration, 1,797 instr/cycle (total: 2,179,509, iterations: 100)
     3,125.94 branch instructions per iteration, 1,09 G/sec (total: 312,595, iterations: 100)
PASS   : UdsEntryBenchmark::testTwoVectorsApp()
RESULT : UdsEntryBenchmark::testTwoVectorsApp():
     2,684.50 nsecs per iteration (total: 268,450, iterations: 100)
     11,374.04 CPU cycles per iteration, 4,24 GHz (total: 1,137,404, iterations: 100)
     22,381.09 instructions per iteration, 1,968 instr/cycle (total: 2,238,109, iterations: 100)
     3,734.94 branch instructions per iteration, 1,39 G/sec (total: 373,495, iterations: 100)
PASS   : UdsEntryBenchmark::testUDSEntryHSApp()
RESULT : UdsEntryBenchmark::testUDSEntryHSApp():
     1,928.09 nsecs per iteration (total: 192,810, iterations: 100)
     8,139.77 CPU cycles per iteration, 4,22 GHz (total: 813,978, iterations: 100)
     18,063.09 instructions per iteration, 2,219 instr/cycle (total: 1,806,309, iterations: 100)
     2,815.94 branch instructions per iteration, 1,46 G/sec (total: 281,595, iterations: 100)
QDEBUG : UdsEntryBenchmark::testspaceUsed() 13FrankUDSEntry  memory used "size:336 space used:336"
QDEBUG : UdsEntryBenchmark::testspaceUsed() 15AnotherUDSEntry  memory used "size:344 space used:344"
QDEBUG : UdsEntryBenchmark::testspaceUsed() 17AnotherV2UDSEntry  memory used "size:344 space used:344"
QDEBUG : UdsEntryBenchmark::testspaceUsed() 18TwoVectorKindEntry  memory used "size:224 space used:256"
QDEBUG : UdsEntryBenchmark::testspaceUsed() 10UDSEntryHS  memory used "size:264 space used:2056"
PASS   : UdsEntryBenchmark::testspaceUsed()
perf on Ryzen 5900 - release build
$ export CMAKE_BUILD_TYPE="release"
$ ninja && ./bin/udsentry_api_comparison_benchmark -perf -iterations 1
[0/2] Re-checking globbed directories...
[2/2] Generating mo...
********* Start testing of UdsEntryBenchmark *********
Config: Using QtTest library 6.6.3, Qt 6.6.3 (x86_64-little_endian-lp64 shared (dynamic) release build; by GCC 13.2.0), ubuntu 23.10
PASS   : UdsEntryBenchmark::initTestCase()
PASS   : UdsEntryBenchmark::testAnotherFill()
RESULT : UdsEntryBenchmark::testAnotherFill():
     8,330 nsecs per iteration (total: 8,330, iterations: 1)
     24,588 CPU cycles per iteration, 2,95 GHz (total: 24,588, iterations: 1)
     15,071 instructions per iteration, 0,613 instr/cycle (total: 15,071, iterations: 1)
     2,721 branch instructions per iteration, 327 M/sec (total: 2,721, iterations: 1)
PASS   : UdsEntryBenchmark::testTwoVectorKindEntryFill()
RESULT : UdsEntryBenchmark::testTwoVectorKindEntryFill():
     7,430 nsecs per iteration (total: 7,430, iterations: 1)
     21,577 CPU cycles per iteration, 2,9 GHz (total: 21,577, iterations: 1)
     12,013 instructions per iteration, 0,557 instr/cycle (total: 12,013, iterations: 1)
     2,246 branch instructions per iteration, 302 M/sec (total: 2,246, iterations: 1)
PASS   : UdsEntryBenchmark::testAnotherV2Fill()
RESULT : UdsEntryBenchmark::testAnotherV2Fill():
     9,380 nsecs per iteration (total: 9,380, iterations: 1)
     29,542 CPU cycles per iteration, 3,15 GHz (total: 29,542, iterations: 1)
     23,878 instructions per iteration, 0,808 instr/cycle (total: 23,878, iterations: 1)
     3,874 branch instructions per iteration, 413 M/sec (total: 3,874, iterations: 1)
PASS   : UdsEntryBenchmark::testTwoVectorsFill()
RESULT : UdsEntryBenchmark::testTwoVectorsFill():
     9,230 nsecs per iteration (total: 9,230, iterations: 1)
     29,186 CPU cycles per iteration, 3,16 GHz (total: 29,186, iterations: 1)
     24,385 instructions per iteration, 0,836 instr/cycle (total: 24,385, iterations: 1)
     4,403 branch instructions per iteration, 477 M/sec (total: 4,403, iterations: 1)
PASS   : UdsEntryBenchmark::testUDSEntryHSFill()
RESULT : UdsEntryBenchmark::testUDSEntryHSFill():
     8,360 nsecs per iteration (total: 8,360, iterations: 1)
     25,923 CPU cycles per iteration, 3,1 GHz (total: 25,923, iterations: 1)
     19,862 instructions per iteration, 0,766 instr/cycle (total: 19,862, iterations: 1)
     3,507 branch instructions per iteration, 419 M/sec (total: 3,507, iterations: 1)
PASS   : UdsEntryBenchmark::testAnotherCompare()
RESULT : UdsEntryBenchmark::testAnotherCompare():
     8,410 nsecs per iteration (total: 8,410, iterations: 1)
     25,725 CPU cycles per iteration, 3,06 GHz (total: 25,725, iterations: 1)
     17,713 instructions per iteration, 0,689 instr/cycle (total: 17,713, iterations: 1)
     3,183 branch instructions per iteration, 378 M/sec (total: 3,183, iterations: 1)
PASS   : UdsEntryBenchmark::testTwoVectorKindEntryCompare()
RESULT : UdsEntryBenchmark::testTwoVectorKindEntryCompare():
     8,130 nsecs per iteration (total: 8,130, iterations: 1)
     24,469 CPU cycles per iteration, 3,01 GHz (total: 24,469, iterations: 1)
     16,325 instructions per iteration, 0,667 instr/cycle (total: 16,325, iterations: 1)
     2,983 branch instructions per iteration, 367 M/sec (total: 2,983, iterations: 1)
PASS   : UdsEntryBenchmark::testAnotherV2Compare()
RESULT : UdsEntryBenchmark::testAnotherV2Compare():
     8,310 nsecs per iteration (total: 8,310, iterations: 1)
     25,246 CPU cycles per iteration, 3,04 GHz (total: 25,246, iterations: 1)
     19,773 instructions per iteration, 0,783 instr/cycle (total: 19,773, iterations: 1)
     3,291 branch instructions per iteration, 396 M/sec (total: 3,291, iterations: 1)
PASS   : UdsEntryBenchmark::testTwoVectorsCompare()
RESULT : UdsEntryBenchmark::testTwoVectorsCompare():
     8,060 nsecs per iteration (total: 8,060, iterations: 1)
     24,132 CPU cycles per iteration, 2,99 GHz (total: 24,132, iterations: 1)
     19,225 instructions per iteration, 0,797 instr/cycle (total: 19,225, iterations: 1)
     3,503 branch instructions per iteration, 435 M/sec (total: 3,503, iterations: 1)
PASS   : UdsEntryBenchmark::testUDSEntryHSCompare()
RESULT : UdsEntryBenchmark::testUDSEntryHSCompare():
     8,070 nsecs per iteration (total: 8,070, iterations: 1)
     24,178 CPU cycles per iteration, 3 GHz (total: 24,178, iterations: 1)
     20,543 instructions per iteration, 0,850 instr/cycle (total: 20,543, iterations: 1)
     3,467 branch instructions per iteration, 430 M/sec (total: 3,467, iterations: 1)
PASS   : UdsEntryBenchmark::testAnotherApp()
RESULT : UdsEntryBenchmark::testAnotherApp():
     8,560 nsecs per iteration (total: 8,560, iterations: 1)
     26,351 CPU cycles per iteration, 3,08 GHz (total: 26,351, iterations: 1)
     19,331 instructions per iteration, 0,734 instr/cycle (total: 19,331, iterations: 1)
     3,389 branch instructions per iteration, 396 M/sec (total: 3,389, iterations: 1)
PASS   : UdsEntryBenchmark::testTwoVectorKindEntryApp()
RESULT : UdsEntryBenchmark::testTwoVectorKindEntryApp():
     8,100 nsecs per iteration (total: 8,100, iterations: 1)
     24,387 CPU cycles per iteration, 3,01 GHz (total: 24,387, iterations: 1)
     15,657 instructions per iteration, 0,642 instr/cycle (total: 15,657, iterations: 1)
     2,819 branch instructions per iteration, 348 M/sec (total: 2,819, iterations: 1)
PASS   : UdsEntryBenchmark::testAnotherV2App()
RESULT : UdsEntryBenchmark::testAnotherV2App():
     9,860 nsecs per iteration (total: 9,860, iterations: 1)
     31,884 CPU cycles per iteration, 3,23 GHz (total: 31,884, iterations: 1)
     28,239 instructions per iteration, 0,886 instr/cycle (total: 28,239, iterations: 1)
     4,507 branch instructions per iteration, 457 M/sec (total: 4,507, iterations: 1)
PASS   : UdsEntryBenchmark::testTwoVectorsApp()
RESULT : UdsEntryBenchmark::testTwoVectorsApp():
     9,440 nsecs per iteration (total: 9,440, iterations: 1)
     30,212 CPU cycles per iteration, 3,2 GHz (total: 30,212, iterations: 1)
     28,825 instructions per iteration, 0,954 instr/cycle (total: 28,825, iterations: 1)
     5,116 branch instructions per iteration, 542 M/sec (total: 5,116, iterations: 1)
PASS   : UdsEntryBenchmark::testUDSEntryHSApp()
RESULT : UdsEntryBenchmark::testUDSEntryHSApp():
     8,850 nsecs per iteration (total: 8,850, iterations: 1)
     27,638 CPU cycles per iteration, 3,12 GHz (total: 27,638, iterations: 1)
     24,207 instructions per iteration, 0,876 instr/cycle (total: 24,207, iterations: 1)
     4,153 branch instructions per iteration, 469 M/sec (total: 4,153, iterations: 1)
QDEBUG : UdsEntryBenchmark::testspaceUsed() 13FrankUDSEntry  memory used "size:336 space used:336"
QDEBUG : UdsEntryBenchmark::testspaceUsed() 15AnotherUDSEntry  memory used "size:344 space used:344"
QDEBUG : UdsEntryBenchmark::testspaceUsed() 17AnotherV2UDSEntry  memory used "size:344 space used:344"
QDEBUG : UdsEntryBenchmark::testspaceUsed() 18TwoVectorKindEntry  memory used "size:224 space used:256"
QDEBUG : UdsEntryBenchmark::testspaceUsed() 10UDSEntryHS  memory used "size:264 space used:2056"
PASS   : UdsEntryBenchmark::testspaceUsed()
PASS   : UdsEntryBenchmark::cleanupTestCase()
Totals: 18 passed, 0 failed, 0 skipped, 0 blacklisted, 34ms
********* Finished testing of UdsEntryBenchmark *********
callgrind results
ninja && ./bin/udsentry_api_comparison_benchmark -callgrind -iterations 100
[0/2] Re-checking globbed directories...
[2/2] Generating mo...
********* Start testing of UdsEntryBenchmark *********
Config: Using QtTest library 6.6.3, Qt 6.6.3 (x86_64-little_endian-lp64 shared (dynamic) release build; by GCC 13.2.0), ubuntu 23.10
PASS   : UdsEntryBenchmark::initTestCase()
PASS   : UdsEntryBenchmark::testAnotherFill()
RESULT : UdsEntryBenchmark::testAnotherFill():
     8,563.61 instruction reads per iteration (total: 856,361, iterations: 100)
PASS   : UdsEntryBenchmark::testTwoVectorKindEntryFill()
RESULT : UdsEntryBenchmark::testTwoVectorKindEntryFill():
     5,505.60 instruction reads per iteration (total: 550,561, iterations: 100)
PASS   : UdsEntryBenchmark::testTwoVectorsFill()
RESULT : UdsEntryBenchmark::testTwoVectorsFill():
     17,877.61 instruction reads per iteration (total: 1,787,761, iterations: 100)
PASS   : UdsEntryBenchmark::testUDSEntryHSFill()
RESULT : UdsEntryBenchmark::testUDSEntryHSFill():
     13,361.69 instruction reads per iteration (total: 1,336,169, iterations: 100)
PASS   : UdsEntryBenchmark::testAnotherV2Fill()
RESULT : UdsEntryBenchmark::testAnotherV2Fill():
     17,370.61 instruction reads per iteration (total: 1,737,061, iterations: 100)
PASS   : UdsEntryBenchmark::testAnotherCompare()
RESULT : UdsEntryBenchmark::testAnotherCompare():
     11,204.62 instruction reads per iteration (total: 1,120,462, iterations: 100)
PASS   : UdsEntryBenchmark::testTwoVectorKindEntryCompare()
RESULT : UdsEntryBenchmark::testTwoVectorKindEntryCompare():
     9,816.62 instruction reads per iteration (total: 981,662, iterations: 100)
PASS   : UdsEntryBenchmark::testAnotherV2Compare()
RESULT : UdsEntryBenchmark::testAnotherV2Compare():
     13,264.62 instruction reads per iteration (total: 1,326,462, iterations: 100)
PASS   : UdsEntryBenchmark::testTwoVectorsCompare()
RESULT : UdsEntryBenchmark::testTwoVectorsCompare():
     12,716.62 instruction reads per iteration (total: 1,271,662, iterations: 100)
PASS   : UdsEntryBenchmark::testUDSEntryHSCompare()
RESULT : UdsEntryBenchmark::testUDSEntryHSCompare():
     14,034.62 instruction reads per iteration (total: 1,403,462, iterations: 100)
PASS   : UdsEntryBenchmark::testAnotherApp()
RESULT : UdsEntryBenchmark::testAnotherApp():
     12,822.62 instruction reads per iteration (total: 1,282,262, iterations: 100)
PASS   : UdsEntryBenchmark::testTwoVectorKindEntryApp()
RESULT : UdsEntryBenchmark::testTwoVectorKindEntryApp():
     9,148.62 instruction reads per iteration (total: 914,862, iterations: 100)
PASS   : UdsEntryBenchmark::testAnotherV2App()
RESULT : UdsEntryBenchmark::testAnotherV2App():
     21,730.61 instruction reads per iteration (total: 2,173,062, iterations: 100)
PASS   : UdsEntryBenchmark::testTwoVectorsApp()
RESULT : UdsEntryBenchmark::testTwoVectorsApp():
     22,316.61 instruction reads per iteration (total: 2,231,662, iterations: 100)
PASS   : UdsEntryBenchmark::testUDSEntryHSApp()
RESULT : UdsEntryBenchmark::testUDSEntryHSApp():
     17,709.47 instruction reads per iteration (total: 1,770,947, iterations: 100)
QDEBUG : UdsEntryBenchmark::testspaceUsed() 13FrankUDSEntry  memory used "size:336 space used:336"
QDEBUG : UdsEntryBenchmark::testspaceUsed() 15AnotherUDSEntry  memory used "size:344 space used:344"
QDEBUG : UdsEntryBenchmark::testspaceUsed() 17AnotherV2UDSEntry  memory used "size:344 space used:344"
QDEBUG : UdsEntryBenchmark::testspaceUsed() 18TwoVectorKindEntry  memory used "size:224 space used:256"
QDEBUG : UdsEntryBenchmark::testspaceUsed() 10UDSEntryHS  memory used "size:264 space used:2056"
PASS   : UdsEntryBenchmark::testspaceUsed()
PASS   : UdsEntryBenchmark::cleanupTestCase()
Totals: 18 passed, 0 failed, 0 skipped, 0 blacklisted, 1044ms
********* Finished testing of UdsEntryBenchmark *********
perf results on laptop
./bin/udsentry_api_comparison_benchmark -perf -iterations 10
********* Start testing of UdsEntryBenchmark *********
Config: Using QtTest library 6.6.2, Qt 6.6.2 (x86_64-little_endian-lp64 shared (dynamic) release build; by GCC 13.2.1 20230801), arch unknown
PASS   : UdsEntryBenchmark::initTestCase()
PASS   : UdsEntryBenchmark::testAnotherFill()
RESULT : UdsEntryBenchmark::testAnotherFill():
     1,247.7 nsecs per iteration (total: 12,478, iterations: 10)
     438.6 CPU cycles per iteration, 352 MHz (total: 4,387, iterations: 10)
     581.0 instructions per iteration, 1,324 instr/cycle (total: 5,810, iterations: 10)
     94.2 branch instructions per iteration, 75,6 M/sec (total: 943, iterations: 10)
PASS   : UdsEntryBenchmark::testTwoVectorKindEntryFill()
RESULT : UdsEntryBenchmark::testTwoVectorKindEntryFill():
     1,226.2 nsecs per iteration (total: 12,262, iterations: 10)
     508.1 CPU cycles per iteration, 414 MHz (total: 5,081, iterations: 10)
     765.0 instructions per iteration, 1,506 instr/cycle (total: 7,650, iterations: 10)
     154.3 branch instructions per iteration, 126 M/sec (total: 1,543, iterations: 10)
PASS   : UdsEntryBenchmark::testTwoVectorsFill()
RESULT : UdsEntryBenchmark::testTwoVectorsFill():
     1,557.5 nsecs per iteration (total: 15,576, iterations: 10)
     1,378.5 CPU cycles per iteration, 885 MHz (total: 13,785, iterations: 10)
     1,960.0 instructions per iteration, 1,422 instr/cycle (total: 19,600, iterations: 10)
     391.3 branch instructions per iteration, 251 M/sec (total: 3,913, iterations: 10)
PASS   : UdsEntryBenchmark::testUDSEntryHSFill()
RESULT : UdsEntryBenchmark::testUDSEntryHSFill():
     1,986.2 nsecs per iteration (total: 19,862, iterations: 10)
     2,475.5 CPU cycles per iteration, 1,25 GHz (total: 24,755, iterations: 10)
     3,054.1 instructions per iteration, 1,234 instr/cycle (total: 30,542, iterations: 10)
     623.2 branch instructions per iteration, 314 M/sec (total: 6,233, iterations: 10)
PASS   : UdsEntryBenchmark::testAnotherV2Fill()
RESULT : UdsEntryBenchmark::testAnotherV2Fill():
     1,380.7 nsecs per iteration (total: 13,808, iterations: 10)
     837.6 CPU cycles per iteration, 607 MHz (total: 8,376, iterations: 10)
     1,200.0 instructions per iteration, 1,433 instr/cycle (total: 12,000, iterations: 10)
     181.3 branch instructions per iteration, 131 M/sec (total: 1,813, iterations: 10)
PASS   : UdsEntryBenchmark::testAnotherCompare()
RESULT : UdsEntryBenchmark::testAnotherCompare():
     1,341.7 nsecs per iteration (total: 13,417, iterations: 10)
     764.1 CPU cycles per iteration, 570 MHz (total: 7,641, iterations: 10)
     689.0 instructions per iteration, 0,902 instr/cycle (total: 6,890, iterations: 10)
     215.3 branch instructions per iteration, 160 M/sec (total: 2,153, iterations: 10)
PASS   : UdsEntryBenchmark::testTwoVectorKindEntryCompare()
RESULT : UdsEntryBenchmark::testTwoVectorKindEntryCompare():
     1,266.2 nsecs per iteration (total: 12,662, iterations: 10)
     613.2 CPU cycles per iteration, 484 MHz (total: 6,132, iterations: 10)
     611.0 instructions per iteration, 0,996 instr/cycle (total: 6,110, iterations: 10)
     188.4 branch instructions per iteration, 149 M/sec (total: 1,884, iterations: 10)
PASS   : UdsEntryBenchmark::testAnotherV2Compare()
RESULT : UdsEntryBenchmark::testAnotherV2Compare():
     1,369.4 nsecs per iteration (total: 13,694, iterations: 10)
     914.6 CPU cycles per iteration, 668 MHz (total: 9,146, iterations: 10)
     1,025.0 instructions per iteration, 1,121 instr/cycle (total: 10,250, iterations: 10)
     243.3 branch instructions per iteration, 178 M/sec (total: 2,433, iterations: 10)
PASS   : UdsEntryBenchmark::testTwoVectorsCompare()
RESULT : UdsEntryBenchmark::testTwoVectorsCompare():
     1,342.0 nsecs per iteration (total: 13,420, iterations: 10)
     864.8 CPU cycles per iteration, 644 MHz (total: 8,649, iterations: 10)
     914.2 instructions per iteration, 1,057 instr/cycle (total: 9,142, iterations: 10)
     264.3 branch instructions per iteration, 197 M/sec (total: 2,643, iterations: 10)
PASS   : UdsEntryBenchmark::testUDSEntryHSCompare()
RESULT : UdsEntryBenchmark::testUDSEntryHSCompare():
     1,347.0 nsecs per iteration (total: 13,470, iterations: 10)
     870.3 CPU cycles per iteration, 646 MHz (total: 8,704, iterations: 10)
     1,218.0 instructions per iteration, 1,399 instr/cycle (total: 12,180, iterations: 10)
     182.3 branch instructions per iteration, 135 M/sec (total: 1,823, iterations: 10)
PASS   : UdsEntryBenchmark::testAnotherApp()
RESULT : UdsEntryBenchmark::testAnotherApp():
     1,400.7 nsecs per iteration (total: 14,007, iterations: 10)
     972.0 CPU cycles per iteration, 694 MHz (total: 9,720, iterations: 10)
     1,193.7 instructions per iteration, 1,228 instr/cycle (total: 11,938, iterations: 10)
     213.3 branch instructions per iteration, 152 M/sec (total: 2,133, iterations: 10)
PASS   : UdsEntryBenchmark::testTwoVectorKindEntryApp()
RESULT : UdsEntryBenchmark::testTwoVectorKindEntryApp():
     1,446.4 nsecs per iteration (total: 14,464, iterations: 10)
     1,135.7 CPU cycles per iteration, 785 MHz (total: 11,358, iterations: 10)
     1,331.7 instructions per iteration, 1,173 instr/cycle (total: 13,318, iterations: 10)
     258.3 branch instructions per iteration, 179 M/sec (total: 2,583, iterations: 10)
PASS   : UdsEntryBenchmark::testAnotherV2App()
RESULT : UdsEntryBenchmark::testAnotherV2App():
     1,541.5 nsecs per iteration (total: 15,416, iterations: 10)
     1,298.4 CPU cycles per iteration, 842 MHz (total: 12,984, iterations: 10)
     1,812.0 instructions per iteration, 1,396 instr/cycle (total: 18,120, iterations: 10)
     292.3 branch instructions per iteration, 190 M/sec (total: 2,923, iterations: 10)
PASS   : UdsEntryBenchmark::testTwoVectorsApp()
RESULT : UdsEntryBenchmark::testTwoVectorsApp():
     1,818.4 nsecs per iteration (total: 18,184, iterations: 10)
     2,011.2 CPU cycles per iteration, 1,11 GHz (total: 20,112, iterations: 10)
     2,586.0 instructions per iteration, 1,286 instr/cycle (total: 25,860, iterations: 10)
     521.2 branch instructions per iteration, 287 M/sec (total: 5,213, iterations: 10)
PASS   : UdsEntryBenchmark::testUDSEntryHSApp()
RESULT : UdsEntryBenchmark::testUDSEntryHSApp():
     2,148.3 nsecs per iteration (total: 21,483, iterations: 10)
     2,831.4 CPU cycles per iteration, 1,32 GHz (total: 28,314, iterations: 10)
     3,681.0 instructions per iteration, 1,300 instr/cycle (total: 36,810, iterations: 10)
     713.2 branch instructions per iteration, 332 M/sec (total: 7,133, iterations: 10)
QDEBUG : UdsEntryBenchmark::testspaceUsed() 13FrankUDSEntry  memory used "size:336 space used:336"
QDEBUG : UdsEntryBenchmark::testspaceUsed() 15AnotherUDSEntry  memory used "size:344 space used:344"
QDEBUG : UdsEntryBenchmark::testspaceUsed() 17AnotherV2UDSEntry  memory used "size:344 space used:344"
QDEBUG : UdsEntryBenchmark::testspaceUsed() 18TwoVectorKindEntry  memory used "size:224 space used:256"
QDEBUG : UdsEntryBenchmark::testspaceUsed() 10UDSEntryHS  memory used "size:264 space used:2056"
PASS   : UdsEntryBenchmark::testspaceUsed()
PASS   : UdsEntryBenchmark::cleanupTestCase()
Totals: 18 passed, 0 failed, 0 skipped, 0 blacklisted, 30ms
********* Finished testing of UdsEntryBenchmark *********
  • Another is the current implementation of UDSEntry.
  • TwoVectorKindEntry is the proposed new method.

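In the Ryzen 5900X run with 100 iterations, TwoVectorKindEntry is faster than Another in all three tests: 653 ns vs 1,050 ns per iteration for Fill, 1,045 ns vs 1,085 ns for Compare, and 1,152 ns vs 1,543 ns for App. It is also more efficient memory-wise: 256 bytes of space used vs 344. If this is confirmed on other systems, I will revive !1261.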
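
As a rough check of that comparison, the ratios can be computed from the figures reported in the first Ryzen 5900X run (100 iterations). The helper names below are hypothetical. The input numbers are copied from the perf and testspaceUsed output in this description, so they describe that single run only.

```cpp
// Hypothetical helpers for comparing the reported figures; inputs are taken
// from the Ryzen 5900X perf run with 100 iterations.
#include <cassert>
#include <cstdio>

// Ratio above 1 means the candidate needed less time than the baseline.
inline double speedup(double baselineNs, double candidateNs)
{
    return baselineNs / candidateNs;
}

// Fraction of the baseline's memory that the candidate does not use.
inline double memorySaved(double baselineBytes, double candidateBytes)
{
    return 1.0 - candidateBytes / baselineBytes;
}

inline void printSummary()
{
    // nsecs per iteration: Another (baseline) vs TwoVectorKindEntry (candidate)
    std::printf("Fill speedup:    %.2f\n", speedup(1049.59, 653.00));
    std::printf("Compare speedup: %.2f\n", speedup(1085.20, 1045.20));
    std::printf("App speedup:     %.2f\n", speedup(1543.00, 1152.29));
    // "space used" in bytes: AnotherUDSEntry vs TwoVectorKindEntry
    std::printf("Memory saved:    %.1f%%\n", 100.0 * memorySaved(344.0, 256.0));
}
```

The gain is largest for Fill and smallest for Compare, and a single run on one machine is not enough to generalize from.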

On my laptop (8 × Intel® Core™ i7-8550U CPU @ 1.80GHz; see the laptop perf results above), the two are very close, and depending on the run either TwoVectorKindEntry or Another can be ahead.

Edited by Méven Car
