I use ICU with the Lithuanian (lt_LT) locale. The alphabet for this language is as follows: a ą b c č d e ę ė <...> v z ž
However, when sorting, the ICU collator treats, for example, a and ą (a with ogonek) as equivalent, so a list of Lithuanian words is sorted as follows:
a, ą, ab, aba, abadas, <...>, b, ba, <...>
whereas the expected result is:
a, ab, aba, abadas, <...>, ą, <...>, b, ba, <...>
The same thing happens with other "accented" letters (e, ę, ė; z, ž; etc.).
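To make the expected order concrete, here is a minimal sketch in plain C++ (no ICU involved) that ranks each letter by its position in the alphabet, so that ą is a letter of its own that sorts after every word starting with a. The names kAlphabet, letterRank, expectedLess and sortExpected are hypothetical helpers written for this illustration. It assumes lowercase input and a source file saved as UTF-8, and it only describes the order I expect, not how ICU works internally.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Lowercase Lithuanian alphabet in its conventional order (32 letters).
// Hypothetical helper for this illustration; assumes a UTF-8 source file.
static const std::u32string kAlphabet = U"aąbcčdeęėfghiįyjklmnoprsštuųūvzž";

// Position of a letter in kAlphabet; any other code point sorts after all letters.
std::size_t letterRank(char32_t c) {
    const std::size_t pos = kAlphabet.find(c);
    return pos == std::u32string::npos ? kAlphabet.size() : pos;
}

// True if lhs sorts strictly before rhs when every letter is a distinct
// letter ranked by alphabet position (a shorter prefix sorts first).
bool expectedLess(const std::u32string& lhs, const std::u32string& rhs) {
    return std::lexicographical_compare(
        lhs.begin(), lhs.end(), rhs.begin(), rhs.end(),
        [](char32_t a, char32_t b) { return letterRank(a) < letterRank(b); });
}

// Returns a copy of the words sorted in the expected order.
std::vector<std::u32string> sortExpected(std::vector<std::u32string> words) {
    std::sort(words.begin(), words.end(), expectedLess);
    return words;
}
```

With this comparison, expectedLess(U"aa", U"ą") is true and expectedLess(U"ą", U"aa") is false, which is the opposite of what the collator reports for the same pair.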
A more specific test case: running source/samples/coll/coll -locale lt_LT -source ą -target aa decides "source is less than target" when it is not (see coll.cpp if you need to).
Is this behavior expected? Is it a bug or a feature? If it is a feature, how can I prevent the ICU collator from treating similar letters as equal?