Hi!
I recently discovered that the case-insensitive modes of the CompareStringEx, CompareStringOrdinal, FindNLSStringEx and
FindStringOrdinal functions don't work correctly with some characters of the Deseret alphabet (Unicode range U+10400..U+1044F).
For example, CompareStringEx with NORM_IGNORECASE for U+10400 and U+10428 (Deseret Capital and Small Letter Long I) returns
CSTR_LESS_THAN, although CSTR_EQUAL is expected. Please review the code sample below.
I wrote a similar test in JavaScript (based on "localeCompare") and also checked my assumption on the "ICU Unicode String Comparison"
page. Both results indicate that these strings are equal when case is ignored.
Is this a bug in the Windows API?
Environment: Windows 10 20H2 (build 19042.1165), but I see the same behavior on other Windows releases.
Thanks!
#include <Windows.h>
#include <cstdio>
int main()
{
    // U+10400 (Deseret Capital Letter Long I)
    wchar_t const left[] = { 0xD801, 0xDC00, 0x0000 };
    // U+10428 (Deseret Small Letter Long I)
    wchar_t const right[] = { 0xD801, 0xDC28, 0x0000 };

    // Prints 1 (CSTR_LESS_THAN). Expected 2 (CSTR_EQUAL).
    printf("%d\r\n", CompareStringEx(LOCALE_NAME_INVARIANT, NORM_IGNORECASE, left, -1, right, -1, NULL, NULL, 0));
    // Prints 1 (CSTR_LESS_THAN). Expected 2 (CSTR_EQUAL).
    printf("%d\r\n", CompareStringOrdinal(left, -1, right, -1, TRUE));
    // Prints -1 (not found). Expected 0.
    printf("%d\r\n", FindNLSStringEx(LOCALE_NAME_INVARIANT, FIND_FROMSTART | NORM_IGNORECASE, left, -1, right, -1, NULL, NULL, NULL, 0));
    // Prints -1 (not found). Expected 0.
    printf("%d\r\n", FindStringOrdinal(FIND_FROMSTART, left, -1, right, -1, TRUE));

    return 0;
}