Add the ability to configure Tesseract
Is your feature request related to a problem? Please describe
Sometimes we need to adjust Tesseract settings. @LiteracyFanatic:
For example, the following parameters are of particular interest for CJK languages, especially when written vertically. Unfortunately automatic script and orientation detection is only supported by the legacy engine. The newer LSTM models usually give better results but require manually specifying the orientation.
# Tells tesseract whether the script is horizontal (6) or vertical (5)
tessedit_pageseg_mode
# Don't add spaces between characters
preserve_interword_spaces
# Toggling these options on or off may improve the output in some circumstances
paragraph_text_based
textord_old_baselines
lstm_use_matrix
Describe the solution you'd like
@LiteracyFanatic said:
typical use cases often involve using multiple config files to switch a number of parameters at once - basically setting up profiles for different types of text.
I agree with this point. If you really need this, we can add the ability to set a configuration path, as was initially implemented in #232. If the profiles do not need to be changed often, it will be more convenient to store the options right inside the application settings. You decide.
Regardless of which option we choose, it would be convenient to be able to configure such parameters directly in Crow. We could display a QTableWidget (with two columns) in the preferences, where users could set parameters and their values, which would then be written to a file. This will be a little easier to implement if the parameters are stored in the application settings, because of QSettings :) If we save them in a separate file for Tesseract, we will need to write an additional class similar to QSettings (with setValue).
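To give a rough idea of the size of that additional class, here is a minimal sketch in plain C++ (no Qt, so it can be tried in isolation). The name TesseractConfig and all of its methods are hypothetical and exist neither in Crow nor in Tesseract. The only thing taken from Tesseract is the config file format, which is one "name value" pair per line.

```cpp
#include <cassert>
#include <fstream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical QSettings-like holder for Tesseract parameters.
// Not part of Crow or Tesseract; a sketch for discussion only.
class TesseractConfig
{
public:
    // Store or overwrite a parameter, mirroring the role of QSettings::setValue.
    void setValue(const std::string &key, const std::string &value)
    {
        assert(!key.empty()); // a parameter must have a name
        m_values[key] = value;
    }

    // Return the stored value, or defaultValue if the key is not set.
    std::string value(const std::string &key, const std::string &defaultValue = "") const
    {
        const auto it = m_values.find(key);
        return it == m_values.end() ? defaultValue : it->second;
    }

    // Forget a parameter; does nothing if the key is not set.
    void remove(const std::string &key)
    {
        m_values.erase(key);
    }

    // Serialize as one "name value" pair per line, sorted by name.
    std::string toString() const
    {
        std::ostringstream out;
        for (const auto &entry : m_values)
            out << entry.first << ' ' << entry.second << '\n';
        return out.str();
    }

    // Write the parameters to a file; returns false if writing failed.
    bool save(const std::string &path) const
    {
        std::ofstream file(path);
        if (!file)
            return false;
        file << toString();
        return static_cast<bool>(file);
    }

private:
    std::map<std::string, std::string> m_values; // ordered by parameter name
};
```

Each row of the table would map to one setValue call, for example setValue("tessedit_pageseg_mode", "5") for vertical text, followed by a single save with the chosen file path. A real implementation would probably use QString and QFile instead of the standard library types.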
I don't really like the idea of exporting some options separately. I think it's worth exporting all the options consistently as I described above.
Alternatively, we can support both approaches: if the path is not specified, save the parameters in the application settings; if it is specified, save them in a separate file. I'm not sure this is needed, because it looks complicated but not very useful.
@LiteracyFanatic, what do you think? I need your opinion on this.