Given this string:
HELLO𝄞水
Legend: http://en.wikipedia.org/wiki/UTF-16
𝄞 is 4 bytes in UTF-16 (a surrogate pair)
水 is 2 bytes in UTF-16
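As a quick check of the legend above, here is a minimal Python 3 sketch that measures how many bytes each of the two characters takes. The helper name `encoded_size` is made up for this illustration, and the UTF-8 figures are included only for comparison.

```python
def encoded_size(ch: str, encoding: str) -> int:
    """Return the number of bytes `ch` occupies in the given encoding."""
    return len(ch.encode(encoding))


for ch in ("𝄞", "水"):
    # "utf-16-le" is used instead of "utf-16" so no 2-byte BOM is counted.
    utf16 = encoded_size(ch, "utf-16-le")
    utf8 = encoded_size(ch, "utf-8")
    print(f"U+{ord(ch):04X}: {utf16} bytes in UTF-16, {utf8} bytes in UTF-8")
```

𝄞 (U+1D11E) lies outside the Basic Multilingual Plane, so UTF-16 needs two 16-bit code units for it, while 水 (U+6C34) fits in one.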
A PostgreSQL database (UTF-8) returns the correct length, 7:
select length('HELLO𝄞水');
I noticed that both .NET and Java return 8:
Console.WriteLine("HELLO𝄞水".Length);
System.out.println("HELLO𝄞水".length());
SQL Server also returns 8:
SELECT LEN(N'HELLO𝄞水');
.NET, Java, and SQL Server return the correct string length when no character in the string needs a surrogate pair. They all return 6 for:
HELLO水
They return 7 when the string contains a character that UTF-16 encodes as a surrogate pair, which is wrong (the expected length is 6):
HELLO𝄞
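The numbers above are consistent with those platforms counting UTF-16 code units rather than code points. The following Python 3 sketch reproduces both counts; the function names are hypothetical, and it assumes (as documented for .NET's `Length` and Java's `length()`) that the reported value is the number of 16-bit code units.

```python
def utf16_code_units(text: str) -> int:
    """Count 16-bit UTF-16 code units (a surrogate pair counts as 2)."""
    # Each code unit is 2 bytes; "utf-16-le" avoids counting a BOM.
    return len(text.encode("utf-16-le")) // 2


def code_points(text: str) -> int:
    """Count Unicode code points (a Python 3 str is a sequence of code points)."""
    return len(text)


for sample in ("HELLO𝄞水", "HELLO水", "HELLO𝄞"):
    print(sample, utf16_code_units(sample), code_points(sample))
```

The two counts agree for HELLO水 and differ by one for each string containing 𝄞, which matches the 8 versus 7 and 7 versus 6 discrepancies described above.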
.NET, Java Sql Server UTF-16. , UTF-16 . UTF-16? UTF-16 , UTF-8. UTF-16 ( .NET, Java, SQL Server -?) , UTF-8?
Python returns 12:
len("HELLO𝄞水")
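One plausible explanation for the value 12, offered as an assumption rather than a confirmed diagnosis: in Python 2 a plain string literal is a byte string, so `len` counts bytes, and the UTF-8 encoding of this string is exactly 12 bytes long. The Python 3 sketch below shows that arithmetic; `utf8_byte_count` is a name invented for this example.

```python
def utf8_byte_count(text: str) -> int:
    """Return the number of bytes in the UTF-8 encoding of `text`."""
    return len(text.encode("utf-8"))


s = "HELLO𝄞水"
# 5 ASCII letters (1 byte each) + 𝄞 (4 bytes) + 水 (3 bytes)
print(utf8_byte_count(s))
# In Python 3, len() on a str counts code points instead.
print(len(s))
```

Under Python 3, `len(s)` on the same literal counts code points and so agrees with the PostgreSQL result.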
, .NET, Java Sql Server? , .
, Firefox. Google Chrome. Firefox Unicodes .