Skip to content

Store startlines in the buffer instead of block

Waqar Ahmed requested to merge work/startlines-faster into master

With large documents, wrapping or unwrapping a line can become really slow. This might not have been a very big problem in the past but now we have multiple cursors which can trigger a lot of wraps/unwraps quickly e.g., when killing lines. Similarly a script that creates new lines or removes lines becomes really slow with large documents.

The primary reason for the slowness is that we traverse the blocks and fix the startLine for each block. But since the startLine is stored inside the block and the block is dynamically allocated we get really poor locality and things become super slow.

With this change the startLines are stored contigously in the buffer. The algorithm for fixing the startLines is also simplified to just adding or removing a given value from the startLine. This not only provides better locality but also makes it very easy for the compiler to optimize the loop with e.g., simd.

I tried two approaches when changing this. This is the second one. The first one was to store both startLine and number of lines in the block.

struct { int startLine; int lineCount; };

This gives us better locality in other places e.g., blockForLine(). However, after checking the code I realized that it

  1. made the code a bit more complex and confusing as lineCount was being stored in two places
  2. blockForLine is already pretty fast and usually it doesn't traverse more than 1 or 2 elements.
  3. There aren't many places where we need both startLine and lineCount. The only other place I can think of is when we fixup a movingRange.

Thus I went with the simpler apporach.

Performance wise, with 1 file which had ~24K blocks. Trigerring a killLine() with multiple cursors before this change took around ~28 seconds. It now takes ~4 seconds.

One other thing to note is that TextBlock has a new m_blockIndex member. This member basically tells the index of the block in KateTextBuffer::m_blocks. We still have to update m_blockIndex every time we modify m_blocks but that is not performance critical as we split or merge the block once in a while. However, I think it is possible to remove this field with some effort, the block doesn't really need to know to its own index. The only place this is used is KateTextBlock::startLine()

Merge request reports

Loading