strcmp returns unexpected results

I thought strcmp was supposed to return a positive number if the first string was greater than the second. But this program

#include <stdio.h>
#include <string.h>

int main()
{
    char A[] = "A";
    char Aumlaut[] = "Ä";
    printf("%i\n", A[0]);
    printf("%i\n", Aumlaut[0]);
    printf("%i\n", strcmp(A, Aumlaut));
    return 0;
}

prints 65, -61 and -1.

Why? Is there something I'm missing out on?
I thought that maybe saving the file as UTF-8 was affecting things, since the Ä consists of two chars there. But saving in an 8-bit encoding and making sure that both strings have length 1 does not help; the end result is the same.
What am I doing wrong?

I'm using GCC 4.3 under 32-bit Linux, in case that matters.

+3
6 answers

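strcmp and the other string functions compare the characters as unsigned chars, per 7.24.4, paragraph 1 (7.21.4 in C99):

The sign of a nonzero value returned by the comparison functions memcmp, strcmp, and strncmp is determined by the sign of the difference between the values of the first pair of characters (both interpreted as unsigned char) that differ in the objects being compared.

(Emphasis mine.)

So even though plain char is signed on your platform, as your output shows, the comparison is done on the values as unsigned chars.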
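
To make the quoted rule concrete, here is a minimal sketch of a comparison that follows it. Note that my_strcmp is a hypothetical name used for illustration, not how any particular libc implements strcmp, and the example call assumes Ä is stored as the UTF-8 bytes 0xC3 0x84.

```c
#include <string.h>

/* Hypothetical reference implementation, for illustration only:
   walks both strings and reads every byte as unsigned char. */
int my_strcmp(const char *s1, const char *s2)
{
    const unsigned char *p1 = (const unsigned char *)s1;
    const unsigned char *p2 = (const unsigned char *)s2;

    /* Advance while the bytes match and s1 has not ended. */
    while (*p1 != '\0' && *p1 == *p2) {
        ++p1;
        ++p2;
    }

    /* Both operands are in 0..UCHAR_MAX, so the difference fits in an int. */
    return (int)*p1 - (int)*p2;
}
```

With this, my_strcmp("A", "\xC3\x84") computes 65 - 195, which is negative whether or not plain char is signed, matching the sign that strcmp reports.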

+1

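strcmp and the other string functions are not UTF-aware. On most POSIX machines, C/C++ char data is UTF-8 internally, which makes most things "just work" when reading and writing UTF. But the functions in string.h are not UTF-aware. strcmp just compares the raw byte values.

If you need UTF-aware string comparison, take a look at IBM's ICU, the International Components for Unicode.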

+2

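In 8-bit extended ASCII, 'A' == 65 and 'Ä' is -61 when it is read as a signed char rather than an unsigned char. The value of 'Ä' is greater than 2^7-1, the largest value a signed 8-bit char can hold, so it wraps around to a negative number.

However, 'Ä' read as an unsigned char (which is what strcmp does) is 195. So strcmp is effectively comparing 65 with 195, and it correctly returns -1.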
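
The arithmetic can be checked with a small sketch. The helper name byte_as_unsigned is made up for this example, and the numbers assume 8-bit chars and the UTF-8 lead byte 0xC3 for Ä.

```c
/* Hypothetical helper: the value strcmp effectively works with
   for a byte that is stored in a plain (possibly signed) char. */
int byte_as_unsigned(char c)
{
    /* Converting to unsigned char maps -61 to 256 - 61 = 195
       when char is 8 bits wide. */
    return (unsigned char)c;
}
```

Here byte_as_unsigned('A') is 65 and byte_as_unsigned((char)-61) is 195, so the difference of the first differing bytes is 65 - 195, a negative number.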

+1

strcmp() treats the characters as unsigned ASCII values. So your A-umlaut is not char -61, it is char 195 (or maybe 196, if my arithmetic is off).

+1

man strcmp:

The strcmp() function compares the two strings s1 and s2. It returns
an integer less than, equal to, or greater than zero if s1 is found,
respectively, to be less than, to match, or be greater than s2.
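
Since the man page only promises a value less than, equal to, or greater than zero, portable code should test the sign rather than compare against -1 or 1. A minimal sketch of that, where compare_sign is a hypothetical helper name and not part of string.h:

```c
#include <string.h>

/* Hypothetical helper: reduces strcmp's result to -1, 0 or 1,
   so callers never depend on the exact magnitude returned. */
int compare_sign(const char *s1, const char *s2)
{
    int r = strcmp(s1, s2);
    return (r > 0) - (r < 0);
}
```

For the strings in the question (assuming Ä is stored as the UTF-8 bytes 0xC3 0x84), compare_sign("A", "\xC3\x84") is -1 even on a libc whose strcmp returns some other negative value.
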
0

To do string processing correctly in C when the input contains characters outside ASCII, you should use the standard library's wide-character facilities for strings and input/output. Your program should be:

#include <wchar.h>
#include <stdio.h>

int main()
{
    wchar_t A[] = L"A";
    wchar_t Aumlaut[] = L"Ä";
    wprintf(L"%i\n", (int)A[0]);        /* cast: wchar_t need not be int */
    wprintf(L"%i\n", (int)Aumlaut[0]);
    wprintf(L"%i\n", wcscmp(A, Aumlaut));
    return 0;
}

and then it will give the correct results (GCC 4.6.3). You do not need a special library.

-1
