Char size in .NET is not as expected?

Question

The size of char is 2 bytes (according to MSDN):

sizeof(char)  //2

Test:

char[] c = new char[1] {'a'};

Encoding.UTF8.GetByteCount(c) //1 ?

Why is the value equal to 1?

(Of course, if c holds a non-ASCII character such as 'ש', it shows 2, as expected.)

Is 'a' not a .NET char?

+3

c# .net char .net-4.0

Royi Namir May 10 '12 at 19:19

3 answers

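The reason is that, internally, .NET represents characters as UTF-16, where each character typically occupies 2 bytes. In UTF-8, on the other hand, each character occupies 1 byte if it is among the first 128 code points (which, incidentally, coincide with ASCII) and 2 or more bytes beyond that.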

+3

Douglas May 10 '12 at 19:22

That comparison is unfair. The page you mention says:

The char keyword is used to declare a Unicode character.

Try:

Encoding.Unicode.GetByteCount(c)
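
To make the comparison concrete, here is a minimal sketch that prints the in-memory size of a char next to the byte counts for the same array in UTF-8 and UTF-16. The class name ByteCountDemo is only illustrative; the array is the one from the question.

```csharp
using System;
using System.Text;

class ByteCountDemo
{
    static void Main()
    {
        char[] c = { 'a' };

        // In-memory size of one char: a 16-bit UTF-16 code unit.
        Console.WriteLine(sizeof(char));                     // 2

        // Bytes needed to encode the array in each encoding.
        Console.WriteLine(Encoding.UTF8.GetByteCount(c));    // 1
        Console.WriteLine(Encoding.Unicode.GetByteCount(c)); // 2
    }
}
```

Encoding.Unicode is .NET's name for little-endian UTF-16, which is why its count matches sizeof(char) for this character.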

+3

Wiktor Zychla May 10 '12 at 19:23

Jon Skeet · Accepted Answer · 2012-05-10T19:20:57+0000

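That's because 'a' takes only one byte to encode in UTF-8.

Encoding.UTF8.GetByteCount(c) tells you how many bytes it takes to encode the given character array in UTF-8. See the documentation for Encoding.GetByteCount for details. That is entirely separate from how wide the char type is internally in .NET.

Every character with a code point below 128 (i.e., U+0000 to U+007F) takes a single byte to encode in UTF-8.

Other characters take 2, 3 or even 4 bytes in UTF-8. (Values above U+1FFFFF would take 5 or 6 bytes to encode, but they are not part of Unicode, which currently ends at U+10FFFF, and they probably never will be.)

Note that the only characters that take 4 bytes to encode in UTF-8 cannot be stored in a single char anyway. A char is a UTF-16 code unit, and any Unicode code point above U+FFFF requires two UTF-16 code units, forming a surrogate pair, to represent it.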
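
The widths described above can be checked with a short sketch along these lines. The sample strings are assumptions chosen for illustration (only 'ש' comes from the question), and the Utf8Widths class and its Utf8Bytes helper are hypothetical names, not part of any library.

```csharp
using System;
using System.Text;

static class Utf8Widths
{
    // Number of bytes the string occupies once encoded as UTF-8.
    public static int Utf8Bytes(string s) => Encoding.UTF8.GetByteCount(s);

    static void Main()
    {
        // One code point per string: U+0061, U+05E9, U+20AC, U+1F600.
        string[] samples = { "a", "\u05E9", "\u20AC", "\U0001F600" };

        foreach (string s in samples)
        {
            int codePoint = char.ConvertToUtf32(s, 0);
            Console.WriteLine(
                $"U+{codePoint:X4}: {s.Length} char(s), " +
                $"{Utf8Bytes(s)} UTF-8 byte(s), " +
                $"{Encoding.Unicode.GetByteCount(s)} UTF-16 byte(s)");
        }
        // U+0061: 1 char(s), 1 UTF-8 byte(s), 2 UTF-16 byte(s)
        // U+05E9: 1 char(s), 2 UTF-8 byte(s), 2 UTF-16 byte(s)
        // U+20AC: 1 char(s), 3 UTF-8 byte(s), 2 UTF-16 byte(s)
        // U+1F600: 2 char(s), 4 UTF-8 byte(s), 4 UTF-16 byte(s)
    }
}
```

The last sample is the only one whose Length is 2: it lies above U+FFFF, so it is stored as a surrogate pair of two chars, and it is also the only one that needs 4 bytes in UTF-8.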
