-
Igor Kushnir authored
The old DUChainReferenceCounting implementation is incorrect in many places. But my assertions and tests in kdevelop/kdevelop!190 indicate that the wrong code was simply never executed even with global rather than thread_local data. All overlapping ranges were always completely equal (both the start and the size). The buggy range merging branch was never taken. I haven't encountered more than 2 different ranges at a time with thread_local duchainReferenceCounting. So there is no need for a QMap or a boost::icl::interval_map in DUChainReferenceCounting class. A simple unsorted array of ranges should be perfect. The array size of 3 should be safe enough. Use a simple statically allocated array rather than QVector or std::vector to safely eliminate a long-standing intentional memory leak (new QMap without a matching delete). Add a second parameter - unsigned size - to disableDUChainReferenceCounting() in order to support nested reference counting enabling/disabling with equal start but different size values. While I haven't encountered any different overlapping requests, supporting them doesn't significantly complicate the code. So why not? Clean up DUChainReferenceCounting code: * remove almost all reinterpret_cast uses; * improve const correctness; * char* => std::byte*; * remove "DUChainReferenceCounting" suffixes from member function names. Even with DUChainReferenceCounting::maxIntervalCount = 2, this change does not break any kdevelop, kdev-python, kdev-php, kdev-ruby, kdev-css, kdev-upload, kdev-verapp, kdev-krazy2, kdev-valgrind or kdev-xdebug tests. No other maintained KDevelop plugin includes referencecounting.h even indirectly, that is, none of the remaining maintained plugins includes duchain, serialization or language headers. This implementation simplification substantially speeds up BenchItemRepository::shouldDoReferenceCounting() with both data rows. But for some reason it slightly slows down BenchIndexedString::bench_destroy(). The following table compares performance of the previous, current and considered alternative implementations in the affected benchmarks. The numbers denote milliseconds per iteration. version\benchmark qhash create destroy shouldDo(-) shouldDo(+) previous commit 1.2 149 103 212 318 inline f-l static 1.2 149 107 106 196 count == 0 -> false 1.2 148 106 106 212 Q_LIKELY(count == 0) 1.2 147 107 106 282 extern variable 1.4 153 128 636 671 inline variable 1.2 149 100 318 388 * return count != 0; 1.1 148 102 106 106 * return false; 0.87 141 67 70 <test fails> Legend Benchmarks: qhash - BenchIndexedString::bench_qhashIndexedString() create - BenchIndexedString::bench_create() destroy - BenchIndexedString::bench_destroy() shouldDo(-) - BenchItemRepository::shouldDoReferenceCounting(disabled) shouldDo(+) - BenchItemRepository::shouldDoReferenceCounting(enabled) Versions: previous commit - the version just before this commit inline f-l static - this commit count == 0 -> false - this commit with `if (count==0) return false;` inserted at the beginning of DUChainReferenceCounting::shouldDo() Q_LIKELY(count == 0) - same as "count == 0 -> false", but with `Q_LIKELY` added in: `Q_LIKELY(count==0)` extern variable - this commit, but with extern thread_local variable in the header instead of instance() inline variable - this commit, but with inline thread_local variable in the header instead of instance() * <code> - this commit with <code> as a one-line wrong implementation of DUChainReferenceCounting::shouldDo() The versions marked with '*' are incorrect. The table includes them to show how much DUChainReferenceCounting affects each benchmark's speed. Removing `Q_DISABLE_COPY_MOVE(DUChainReferenceCounting)` and `DUChainReferenceCounting() = default;` has no effect on the benchmarks. The 52216a13 commit message contains the results of these same benchmarks at other versions of the code. Each number in this and the older commit message's table is the minimum, not the average, value of milliseconds per iteration after at least 10 runs of each benchmark. These minimum values were very consistent, repeated many times. Usually the minimum value was also the most frequent value. So even the slightest differences in performance are unlikely to be caused by random fluctuations. Though a few tiny variations are so strange that they most likely *are* spurious. The two "count == 0" optimization attempts slightly speed up creating and destroying IndexedString but significantly slow down the synthetic shouldDoReferenceCounting(enabled) benchmark. The "inline variable" version somewhat speeds up destroying IndexedString but dramatically slows down both data rows of the synthetic shouldDoReferenceCounting() benchmark. This inconsistent effect on performance is surprising. As is the fact that the "extern variable" version is by far the slowest across the board...
9031ee87