Instead of serializing/deserializing text columns row by row, convert the...
Description
Instead of serializing/deserializing text columns row by row, convert the text elements to a byte array (separated by zero characters) and store it base64-encoded in the XML.
The performance gain is 2-3x for many text columns, and it can be significantly larger when the text structure is simpler and the base64 encoding is faster.
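As an illustration of the idea only (not the code in this merge request), the round trip can be sketched in Python. The function names `encode_text_column` and `decode_text_column` are hypothetical, UTF-8 is an assumed encoding, and the sketch writes a zero byte after every cell (terminator style) so that an empty column and a column with one empty cell stay distinguishable; the actual implementation may differ in all of these points.

```python
import base64


def encode_text_column(values):
    """Pack all cells of a text column into one base64 string.

    Assumption: cells are UTF-8 encoded and each one is followed by a zero byte.
    """
    for v in values:
        if "\x00" in v:
            # A zero character inside a cell would be mistaken for a cell boundary.
            raise ValueError("cell contains the zero separator character")
    raw = b"".join(v.encode("utf-8") + b"\x00" for v in values)
    return base64.b64encode(raw).decode("ascii")


def decode_text_column(payload):
    """Reverse encode_text_column: base64-decode, then split on zero bytes."""
    raw = base64.b64decode(payload)
    if not raw:
        return []
    # Every cell ends with a zero byte, so the last piece of the split is empty.
    return [part.decode("utf-8") for part in raw.split(b"\x00")[:-1]]


if __name__ == "__main__":
    column = ["English", "", "Français"]
    payload = encode_text_column(column)
    assert decode_text_column(payload) == column
```

The whole column becomes a single text node in the XML, so the per-row element overhead of the old approach disappears; the price is that the payload is no longer human-readable.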
Performance benchmarks, measured on M3 Pro, NEW vs OLD:
Dataset "Movies database" with 5043 rows:
- column "movie_imdb_link": save - 37ms vs 120ms, load - 1ms vs 4ms
- column "plot_keywords": save - 76ms vs 96ms, load - 1ms vs 7ms
- column "language": save - 4ms vs 75ms, load - 0ms vs 3ms
- column "country": save - 5ms vs 64ms, load - 0ms vs 3ms
Dataset "Lightning strikes" with 3401012 rows:
- column "center_point_geom": save - 33434ms vs 65624ms, load - 939ms vs 2852ms
Conformity
-
Changelog entry -
Unit tests -
Update INSTALL -
Downport
Edited by Alexander Semke