
Jobs: when scope is UID-based, split it to multiple batches

Daniel Vrátil requested to merge work/dvratil/job-batch-queries into master

This is more or less a workaround for known limitations of the SQL engines we use. Each of them limits the number of bound parameters or the overall size of the SQL query string. We can hit those limits when querying a multitude of singular items through a single job (e.g. the Indexing agent discovering unindexed Items in a Collection and attempting to fetch them all with a single ItemFetchJob, potentially tens of thousands of Items, or a user selecting and deleting a very large number of emails in KMail). Even though the SQL engines' limits are 32k or more, we intentionally set the transaction size to 10'000 items: this is unlikely to be reached in common scenarios, and it gives us near-absolute certainty that we will never hit the SQL limits.

While the best place to handle this would be in the database code, the way our database abstraction works right now makes that very hard to implement. It's easier to handle it on the client side instead, by splitting the requested scope into multiple smaller batches and processing each batch with a separate FetchItemsCommand. This way we know for sure that the SQL queries on the Server won't hit any limits, and it remains completely transparent to the user that their thousands of Items were retrieved through multiple fetches instead of one.

While in theory the same problem applies to Collections and Tags, realistically Items are the only entities affected, since nobody has tens of thousands of Collections or Tags...
