Skip to content
  • Igor Kushnir's avatar
    Reimplement DUChainReferenceCounting: QMap => array · 9031ee87
    Igor Kushnir authored
    The old DUChainReferenceCounting implementation is incorrect in many
    places. But my assertions and tests in
    kdevelop/kdevelop!190 indicate
    that the wrong code was simply never executed even with global rather
    than thread_local data. All overlapping ranges were always completely
    equal (both the start and the size). The buggy range merging branch was
    never taken.
    
    I haven't encountered more than 2 different ranges at a time with
    thread_local duchainReferenceCounting. So there is no need for a QMap or
    a boost::icl::interval_map in DUChainReferenceCounting class. A simple
    unsorted array of ranges should be perfect. The array size of 3 should
    be safe enough.
    
    Use a simple statically allocated array rather than QVector or
    std::vector to safely eliminate a long-standing intentional memory leak
    (new QMap without a matching delete).
    
    Add a second parameter - unsigned size - to
    disableDUChainReferenceCounting() in order to support nested reference
    counting enabling/disabling with equal start but different size values.
    While I haven't encountered any different overlapping requests,
    supporting them doesn't significantly complicate the code. So why not?
    
    Clean up DUChainReferenceCounting code:
    * remove almost all reinterpret_cast uses;
    * improve const correctness;
    * char* => std::byte*;
    * remove "DUChainReferenceCounting" suffixes from member function names.
    
    Even with DUChainReferenceCounting::maxIntervalCount = 2, this change
    does not break any kdevelop, kdev-python, kdev-php, kdev-ruby, kdev-css,
    kdev-upload, kdev-verapp, kdev-krazy2, kdev-valgrind or kdev-xdebug
    tests. No other maintained KDevelop plugin includes referencecounting.h
    even indirectly, that is, none of the remaining maintained plugins
    includes duchain, serialization or language headers.
    
    This implementation simplification substantially speeds up
    BenchItemRepository::shouldDoReferenceCounting() with both data rows.
    But for some reason it slightly slows down
    BenchIndexedString::bench_destroy(). The following table compares
    performance of the previous, current and considered alternative
    implementations in the affected benchmarks. The numbers denote
    milliseconds per iteration.
    
    version\benchmark       qhash   create  destroy shouldDo(-) shouldDo(+)
    previous commit         1.2     149     103     212         318
    inline f-l static       1.2     149     107     106         196
    count == 0 -> false     1.2     148     106     106         212
    Q_LIKELY(count == 0)    1.2     147     107     106         282
    extern variable         1.4     153     128     636         671
    inline variable         1.2     149     100     318         388
    * return count != 0;    1.1     148     102     106         106
    * return false;         0.87    141      67      70         <test fails>
    
            Legend
    Benchmarks:
    qhash       - BenchIndexedString::bench_qhashIndexedString()
    create      - BenchIndexedString::bench_create()
    destroy     - BenchIndexedString::bench_destroy()
    shouldDo(-) - BenchItemRepository::shouldDoReferenceCounting(disabled)
    shouldDo(+) - BenchItemRepository::shouldDoReferenceCounting(enabled)
    
    Versions:
    previous commit         - the version just before this commit
    inline f-l static       - this commit
    count == 0 -> false     - this commit with `if (count==0) return false;`
                              inserted at the beginning of
                              DUChainReferenceCounting::shouldDo()
    Q_LIKELY(count == 0)    - same as "count == 0 -> false", but with
                              `Q_LIKELY` added in: `Q_LIKELY(count==0)`
    extern variable         - this commit, but with extern thread_local
                              variable in the header instead of instance()
    inline variable         - this commit, but with inline thread_local
                              variable in the header instead of instance()
    * <code>                - this commit with <code> as a one-line wrong
                              implementation of
                              DUChainReferenceCounting::shouldDo()
    
    The versions marked with '*' are incorrect. The table includes them to
    show how much DUChainReferenceCounting affects each benchmark's speed.
    
    Removing `Q_DISABLE_COPY_MOVE(DUChainReferenceCounting)` and
    `DUChainReferenceCounting() = default;` has no effect on the benchmarks.
    
    The 52216a13 commit message contains the
    results of these same benchmarks at other versions of the code.
    
    Each number in this and the older commit message's table is the minimum,
    not the average, value of milliseconds per iteration after at least 10
    runs of each benchmark. These minimum values were very consistent,
    repeated many times. Usually the minimum value was also the most
    frequent value. So even the slightest differences in performance are
    unlikely to be caused by random fluctuations. Though a few tiny
    variations are so strange that they most likely *are* spurious.
    
    The two "count == 0" optimization attempts slightly speed up creating
    and destroying IndexedString but significantly slow down the synthetic
    shouldDoReferenceCounting(enabled) benchmark.
    
    The "inline variable" version somewhat speeds up destroying
    IndexedString but dramatically slows down both data rows of the
    synthetic shouldDoReferenceCounting() benchmark. This inconsistent
    effect on performance is surprising. As is the fact that the
    "extern variable" version is by far the slowest across the board...
    9031ee87