kdevplatform/serialization/referencecounting.h · 9031ee87a9869b0b4ea97a2f41a32c9a7ec91782 · Christoph Roick / kdevelop

Reimplement DUChainReferenceCounting: QMap => array · 9031ee87
Igor Kushnir authored Nov 06, 2020
The old DUChainReferenceCounting implementation is incorrect in many
places. But my assertions and tests in
kdevelop/kdevelop!190 indicate
that the wrong code was simply never executed even with global rather
than thread_local data. All overlapping ranges were always completely
equal (both the start and the size). The buggy range merging branch was
never taken.

I haven't encountered more than 2 different ranges at a time with
thread_local duchainReferenceCounting. So there is no need for a QMap or
a boost::icl::interval_map in DUChainReferenceCounting class. A simple
unsorted array of ranges should be perfect. The array size of 3 should
be safe enough.

Use a simple statically allocated array rather than QVector or
std::vector to safely eliminate a long-standing intentional memory leak
(new QMap without a matching delete).

Add a second parameter - unsigned size - to
disableDUChainReferenceCounting() in order to support nested reference
counting enabling/disabling with equal start but different size values.
While I haven't encountered any different overlapping requests,
supporting them doesn't significantly complicate the code. So why not?

Clean up DUChainReferenceCounting code:
* remove almost all reinterpret_cast uses;
* improve const correctness;
* char* => std::byte*;
* remove "DUChainReferenceCounting" suffixes from member function names.

Even with DUChainReferenceCounting::maxIntervalCount = 2, this change
does not break any kdevelop, kdev-python, kdev-php, kdev-ruby, kdev-css,
kdev-upload, kdev-verapp, kdev-krazy2, kdev-valgrind or kdev-xdebug
tests. No other maintained KDevelop plugin includes referencecounting.h
even indirectly, that is, none of the remaining maintained plugins
includes duchain, serialization or language headers.

This implementation simplification substantially speeds up
BenchItemRepository::shouldDoReferenceCounting() with both data rows.
But for some reason it slightly slows down
BenchIndexedString::bench_destroy(). The following table compares
performance of the previous, current and considered alternative
implementations in the affected benchmarks. The numbers denote
milliseconds per iteration.

version\benchmark       qhash   create  destroy shouldDo(-) shouldDo(+)
previous commit         1.2     149     103     212         318
inline f-l static       1.2     149     107     106         196
count == 0 -> false     1.2     148     106     106         212
Q_LIKELY(count == 0)    1.2     147     107     106         282
extern variable         1.4     153     128     636         671
inline variable         1.2     149     100     318         388
* return count != 0;    1.1     148     102     106         106
* return false;         0.87    141      67      70         <test fails>

        Legend
Benchmarks:
qhash       - BenchIndexedString::bench_qhashIndexedString()
create      - BenchIndexedString::bench_create()
destroy     - BenchIndexedString::bench_destroy()
shouldDo(-) - BenchItemRepository::shouldDoReferenceCounting(disabled)
shouldDo(+) - BenchItemRepository::shouldDoReferenceCounting(enabled)

Versions:
previous commit         - the version just before this commit
inline f-l static       - this commit
count == 0 -> false     - this commit with `if (count==0) return false;`
                          inserted at the beginning of
                          DUChainReferenceCounting::shouldDo()
Q_LIKELY(count == 0)    - same as "count == 0 -> false", but with
                          `Q_LIKELY` added in: `Q_LIKELY(count==0)`
extern variable         - this commit, but with extern thread_local
                          variable in the header instead of instance()
inline variable         - this commit, but with inline thread_local
                          variable in the header instead of instance()
* <code>                - this commit with <code> as a one-line wrong
                          implementation of
                          DUChainReferenceCounting::shouldDo()

The versions marked with '*' are incorrect. The table includes them to
show how much DUChainReferenceCounting affects each benchmark's speed.

Removing `Q_DISABLE_COPY_MOVE(DUChainReferenceCounting)` and
`DUChainReferenceCounting() = default;` has no effect on the benchmarks.

The 52216a13 commit message contains the
results of these same benchmarks at other versions of the code.

Each number in this and the older commit message's table is the minimum,
not the average, value of milliseconds per iteration after at least 10
runs of each benchmark. These minimum values were very consistent,
repeated many times. Usually the minimum value was also the most
frequent value. So even the slightest differences in performance are
unlikely to be caused by random fluctuations. Though a few tiny
variations are so strange that they most likely *are* spurious.

The two "count == 0" optimization attempts slightly speed up creating
and destroying IndexedString but significantly slow down the synthetic
shouldDoReferenceCounting(enabled) benchmark.

The "inline variable" version somewhat speeds up destroying
IndexedString but dramatically slows down both data rows of the
synthetic shouldDoReferenceCounting() benchmark. This inconsistent
effect on performance is surprising. As is the fact that the
"extern variable" version is by far the slowest across the board...
9031ee87