Make country name detection substring matching more strict

This fixes KCountry::fromName("Turkey") returning Italy with iso-codes 4.16.0.

What happens here is that "Turkey" is no longer a full name match (that would be "Türkiye" now), so we end up in the substring match case, which is not supposed to find anything for this input.

However, Vietnamese has surprisingly short country names and relies heavily on diacritics, which our normalization strips away. So "Ý" (Italy) turns into "y", which is a suffix of "Turkey".

The substring matching is supposed to handle cases like "United States" vs "United States of America"; it makes no sense for individual letters. So add a minimum size threshold here.
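As a rough illustration of the idea (not the actual KCountry code), the check could look like the sketch below. The function name `isSubstringMatch`, the constant `kMinSubstringLength` and its value of 3 are assumptions made for this example, and both strings are assumed to be already normalized (lowercased, diacritics stripped).

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical threshold: known names shorter than this are never
// considered for substring matching. The value 3 is an assumption.
constexpr std::size_t kMinSubstringLength = 3;

// Returns true if `known` (a normalized country name from the data set)
// is a prefix or suffix of `input` (the normalized user-supplied name).
// Both arguments are assumed to be normalized already.
bool isSubstringMatch(std::string_view input, std::string_view known)
{
    // Reject names that are too short to be meaningful, e.g. "y" from "Ý".
    if (known.size() < kMinSubstringLength || known.size() > input.size()) {
        return false;
    }
    const bool isPrefix = input.substr(0, known.size()) == known;
    const bool isSuffix = input.substr(input.size() - known.size()) == known;
    return isPrefix || isSuffix;
}
```

With this guard, `isSubstringMatch("turkey", "y")` is rejected because "y" is below the threshold, while `isSubstringMatch("united states of america", "united states")` still matches.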

That reasoning doesn't hold for scripts with a much higher information density per character, but for those the substring matching doesn't really apply anyway.

Edited by Volker Krause
