Char size in .NET is not what I expected?

The size of char is 2 bytes (per MSDN):

sizeof(char)  //2

test:

char[] c = new char[1] {'a'};

Encoding.UTF8.GetByteCount(c) //1 ?

Why is the value equal to 1?

(Of course, if c holds a Unicode char like 'ש', it shows 2, as it should.)

Is 'a' not a .NET char?

+3
3 answers

This is because 'a' takes only one byte to encode in UTF-8.

Encoding.UTF8.GetByteCount(c) will tell you how many bytes are required to encode the given character array in UTF-8. See the documentation for Encoding.GetByteCount for details. This is completely separate from how wide the char type is internally in .NET.

Each character with a code point below 128 (i.e., U+0000 to U+007F) takes a single byte to encode in UTF-8.

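Other characters take 2, 3 or even 4 bytes in UTF-8. (Values above U+1FFFFF would take 5 or 6 bytes under the original UTF-8 scheme, but they are not part of Unicode, which ends at U+10FFFF.)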

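Note that the only characters that take 4 bytes in UTF-8 cannot be held in a single char anyway. A char is a UTF-16 code unit, and any Unicode code point above U+FFFF requires two UTF-16 code units (a surrogate pair) to represent it.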
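To make the byte widths above concrete, here is a minimal sketch that prints, for a few sample characters, the number of UTF-16 code units (chars) and the number of UTF-8 bytes. The class name Utf8WidthDemo, the helper Utf8Bytes and the sample characters are chosen for illustration only and do not come from the question.

```csharp
using System;
using System.Text;

static class Utf8WidthDemo
{
    // Hypothetical helper: number of bytes needed to encode the text in UTF-8.
    public static int Utf8Bytes(string text) => Encoding.UTF8.GetByteCount(text);

    static void Main()
    {
        // Illustrative samples: 'a', 'ש' (U+05E9), '€' (U+20AC), and U+1F600,
        // which lies above U+FFFF and therefore needs a surrogate pair.
        string[] samples = { "a", "\u05E9", "\u20AC", "\U0001F600" };

        foreach (string s in samples)
        {
            // s.Length counts UTF-16 code units (chars), not code points.
            int codePoint = char.ConvertToUtf32(s, 0);
            Console.WriteLine(
                $"U+{codePoint:X4}: {s.Length} char(s), {Utf8Bytes(s)} UTF-8 byte(s)");
        }
    }
}
```

On a standard .NET runtime this should report 1, 2, 3 and 4 UTF-8 bytes respectively, with the last sample occupying two chars rather than one.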

+13

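The reason is that, internally, .NET represents characters as UTF-16, where each char occupies 2 bytes. In UTF-8, on the other hand, a character occupies 1 byte if it is among the first 128 code points (which, incidentally, coincide with ASCII) and 2 or more bytes otherwise.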

+3

That comparison is unfair. The page you mention says:

The char keyword is used to declare a Unicode character.

Try:

Encoding.Unicode.GetByteCount(c)
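
As a rough sketch of why this is the fairer comparison, the snippet below encodes the question's array with both encodings. Encoding.Unicode is .NET's UTF-16 (little-endian) encoding, so its byte count lines up with sizeof(char). The class name EncodingComparison and the helper ByteCounts are hypothetical names used only for this example.

```csharp
using System;
using System.Text;

static class EncodingComparison
{
    // Hypothetical helper: byte counts for the same chars under UTF-16 and UTF-8.
    public static (int Utf16, int Utf8) ByteCounts(char[] chars) =>
        (Encoding.Unicode.GetByteCount(chars), Encoding.UTF8.GetByteCount(chars));

    static void Main()
    {
        char[] c = { 'a' };
        var (utf16, utf8) = ByteCounts(c);

        Console.WriteLine($"sizeof(char) = {sizeof(char)}");   // 2
        Console.WriteLine($"UTF-16: {utf16}, UTF-8: {utf8}");  // UTF-16: 2, UTF-8: 1
    }
}
```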
+3