-
Korean Hangul can be represented in Unicode either as precomposed Hangul syllables, or as sequences of alphabetic components called Jamo. Syllables should occupy 2 cells (there are halfwidth variants at U+FFA0..U+FFDF). A fully decomposed syllable consists of an initial jamo (choseong, the leading consonant, which may be the filler U+115F), a medial jamo (jungseong, the vowel, which may be the filler U+1160), and an optional final jamo (jongseong, the trailing consonant). Old Korean can have more than one of each of those. In any case, to make the total width 2, we assign width 2 to choseong and width 0 to jungseong and jongseong. Absent a context-aware wcswidth, this will still break for Old Korean syllables that have more than one leading-consonant jamo.

This aligns with glibc:

commit 7a79e321c6f85b204036c33d85f6b2aa794e7c76
Author: Thorsten Glaser <tg@mirbsd.de>
Date:   Fri Jul 14 14:02:50 2017 +0200

    Refresh generated charmap data and ChangeLog

    [BZ #21750]
    * charmaps/UTF-8: Refresh.

diff --git a/localedata/ChangeLog b/localedata/ChangeLog
index 04ef5ad071..9e05b4a652 100644
--- a/localedata/ChangeLog
+++ b/localedata/ChangeLog
@@ -1,3 +1,17 @@
+2017-07-14 Thorsten Glaser <tg@mirbsd.de>
+
+    [BZ #21750]
+    * charmaps/UTF-8: Refresh.
+    * unicode-gen/utf8_gen.py (U+00AD): Set width to 1.
+    * unicode-gen/utf8_gen.py (U+1160..U+11FF): Set width to 0.
+    * unicode-gen/utf8_gen.py (U+3248..U+324F): Set width to 2.
+    * unicode-gen/utf8_gen.py (U+4DC0..U+4DFF): Likewise.
+    * unicode-gen/utf8_gen.py: Treat category Me and Mn as combining.
+    [BZ #19852]
+    * unicode-gen/utf8_gen.py: Process EastAsianWidth lines before
+    UnicodeData lines so the latter have precedence; remove hack
+    to group output by EastAsianWidth ranges.
+
[ ... snip ...]

commit 6e540caa21616d5ec5511fafb22819204525138e
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Tue Jun 16 08:29:40 2020 +0200

    Set width of JUNGSEONG/JONGSEONG characters from UD7B0 to UD7FB to 0 [BZ #26120]

    Reviewed-by: Carlos O'Donell <carlos@redhat.com>

diff --git a/localedata/charmaps/UTF-8 b/localedata/charmaps/UTF-8
index 14c5d4fa33..8cce47cd97 100644
--- a/localedata/charmaps/UTF-8
+++ b/localedata/charmaps/UTF-8
@@ -48920,6 +48920,8 @@ WIDTH
 <UABE8> 0
 <UABED> 0
 <UAC00>...<UD7A3> 2
+<UD7B0>...<UD7C6> 0
+<UD7CB>...<UD7FB> 0
 <UF900>...<UFA6D> 2
 <UFA70>...<UFAD9> 2
 <UFB1E> 0
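
The width rule described above (2 for choseong, 0 for jungseong and jongseong) can be sketched as a small range table. This is only an illustration: the names HANGUL_WIDTHS, hangul_width and naive_width are hypothetical and are not taken from glibc or any particular wcwidth implementation. The U+A960..U+A97C row (Hangul Jamo Extended-A, additional choseong) is not mentioned in the text above and is included here as an assumption about how a complete table would treat it. Counting every non-Hangul character as width 1 is likewise a simplification.

```python
# Illustrative sketch only; table and function names are hypothetical.

HANGUL_WIDTHS = [
    (0x1100, 0x115F, 2),  # choseong (leading consonants), incl. filler U+115F
    (0x1160, 0x11FF, 0),  # jungseong and jongseong, incl. filler U+1160
    (0xA960, 0xA97C, 2),  # assumption: Hangul Jamo Extended-A choseong
    (0xAC00, 0xD7A3, 2),  # precomposed Hangul syllables
    (0xD7B0, 0xD7C6, 0),  # jungseong (range from glibc BZ #26120)
    (0xD7CB, 0xD7FB, 0),  # jongseong (range from glibc BZ #26120)
    (0xFFA0, 0xFFDF, 1),  # halfwidth variants
]


def hangul_width(cp):
    """Return the cell width for a Hangul code point, or None if not covered."""
    for lo, hi, width in HANGUL_WIDTHS:
        if lo <= cp <= hi:
            return width
    return None


def naive_width(text):
    """Context-free sum of per-character widths.

    Simplification: anything outside the table counts as width 1.
    """
    total = 0
    for ch in text:
        width = hangul_width(ord(ch))
        total += 1 if width is None else width
    return total


# Precomposed syllable and its full decomposition both come out as 2 cells.
print(naive_width("\uD55C"))              # U+D55C
print(naive_width("\u1112\u1161\u11AB"))  # choseong + jungseong + jongseong

# Two leading consonants in a row are counted as 2 + 2, which is the
# Old Korean breakage that only a context-aware wcswidth could avoid.
print(naive_width("\u1107\u1109\u1161"))
```

Because the sum is context-free, the last call reports 4 cells for what should render as a single 2-cell syllable block.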
cfff2326