How to get the correct number of characters in .NET, Java and Sql Server? (read this on Google Chrome)

Given this line

HELLO𝄞水

Legend: http://en.wikipedia.org/wiki/UTF-16

𝄞 is 4 bytes, 水 is 2 bytes

A PostgreSQL database (UTF-8) returns the correct length, 7:

select length('HELLO𝄞水');

I noticed that both .NET and Java return 8:

Console.WriteLine("HELLO𝄞水".Length);

System.out.println("HELLO𝄞水".length());

And Sql Server also returns 8:

SELECT LEN('HELLO𝄞水');

.NET, Java and Sql Server return the correct string length when the string contains no variable-length Unicode character. They all return 6 for:

  HELLO水

They return 7 for a string with a variable-length character, which is incorrect:

  HELLO𝄞

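.NET, Java and Sql Server use UTF-16. It seems that their way of counting the length of a UTF-16 string is broken. Or is this mandated by UTF-16? UTF-16 is variable-length, just like UTF-8. So why is UTF-16 (or is it the fault of .NET, Java and SQL Server?) unable to count the length of a string correctly, the way UTF-8 does?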


Python returns 12, and I do not know how to interpret why it returns 12. That may be a separate topic, though.

len("HELLO𝄞水")
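One possible reading of the 12, assuming the value came from a Python 2 byte string encoded as UTF-8: `len` would then count bytes, and 5 + 4 + 3 = 12. A minimal Python 3 sketch, not part of the original post, that shows the three different counts for the same string:

```python
# HELLO𝄞水, written with escapes so the source does not depend on font support
s = "HELLO\U0001D11E\u6C34"

code_points = len(s)                           # a Python 3 str counts code points
utf8_bytes = len(s.encode("utf-8"))            # 5 ASCII bytes + 4 for 𝄞 + 3 for 水
utf16_units = len(s.encode("utf-16-le")) // 2  # two bytes per UTF-16 code unit

print(code_points, utf8_bytes, utf16_units)  # 7 12 8
```

The 8 matches what .NET, Java and Sql Server report, which suggests they count UTF-16 code units rather than characters.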

The question is: how do I get the correct number of characters in .NET, Java and Sql Server?

By the way, I could not post this with Firefox, so I used Google Chrome. Firefox does not display variable-length Unicode characters.


Java:

String s = "HELLO𝄞水";
System.out.println(s.codePointCount(0, s.length())); // 7
System.out.println(s.length()); // 8
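
To illustrate where the two numbers come from, here is a rough Python sketch (not Java's actual implementation) that counts code points in a sequence of UTF-16 code units by skipping the second half of each surrogate pair. The helper names `utf16_units` and `count_code_points` are made up for this example, and well-formed input is assumed.

```python
import struct

def utf16_units(text):
    """Return the UTF-16 code units of text as a list of ints."""
    raw = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(raw) // 2), raw))

def count_code_points(units):
    """Count code points, assuming well-formed UTF-16 (no lone surrogates)."""
    # A low surrogate (0xDC00-0xDFFF) is the second unit of a pair, so skip it.
    return sum(1 for u in units if not 0xDC00 <= u <= 0xDFFF)

units = utf16_units("HELLO\U0001D11E\u6C34")  # HELLO𝄞水
print(len(units))                # 8, like s.length() in Java
print(count_code_points(units))  # 7, like s.codePointCount(0, s.length())
```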

In C# (and probably SQL and Java), it returns the number of Char elements in the string.

String.Length

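The Length property returns the number of Char objects in this instance, not the number of Unicode characters. The reason is that a Unicode character might be represented by more than one Char. Use the System.Globalization.StringInfo class to work with each Unicode character instead of each Char.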
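
As a sketch of why one Unicode character can need two Char values: UTF-16 stores every code point above U+FFFF as a surrogate pair. The Python function below (the name `to_surrogates` is hypothetical) applies the standard UTF-16 formula to U+1D11E (𝄞).

```python
def to_surrogates(code_point):
    """Split a code point above U+FFFF into its UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point fits in one UTF-16 code unit or is out of range")
    offset = code_point - 0x10000    # 20-bit value
    high = 0xD800 + (offset >> 10)   # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)  # bottom 10 bits
    return high, low

high, low = to_surrogates(0x1D11E)
print(hex(high), hex(low))  # 0xd834 0xdd1e

# Cross-check against Python's own UTF-16 encoder.
assert "\U0001D11E".encode("utf-16-be") == bytes([0xD8, 0x34, 0xDD, 0x1E])
```

水 (U+6C34) is below U+10000, so it takes a single code unit, which is why only 𝄞 inflates the reported length.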


.Net: String.Length

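The Length property returns the number of Char objects in this instance, not the number of Unicode characters. The reason is that a Unicode character might be represented by more than one Char. Use the System.Globalization.StringInfo class to work with each Unicode character instead of each Char.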

You can get that count with the StringInfo class.

using System;
using System.Globalization;

String s = "HELLO𝄞水";
Console.WriteLine(s);
Console.WriteLine("Count of char: {0:d}", s.Length);  // UTF-16 code units

StringInfo info = new StringInfo(s);
Console.WriteLine("Count of Unicode characters: {0:d}", info.LengthInTextElements);  // text elements

Output:

HELLO𝄞水
Count of char: 8
Count of Unicode characters: 7
