This is very similar to the question "PHP: convert Unicode code point to UTF-8".
If you can, work directly with the full code point (a character that takes 4 bytes in UTF-8):
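$src = "Hello \u0001f60e";
// Turn each \uXXXX... escape into a hexadecimal HTML entity such as &#x0001f60e;
$replaced = preg_replace("/\\\\u([0-9A-F]{1,8})/i", "&#x$1;", $src);
// Let mbstring decode the entity into the UTF-8 character
$result = mb_convert_encoding($replaced, "UTF-8", "HTML-ENTITIES");
echo "Result is [$result] and string length is ".mb_strlen($result);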
Output:
Result is [Hello 😎] and string length is 10
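The same conversion can be sketched without the HTML-ENTITIES pseudo-encoding, whose use in mbstring is deprecated as of PHP 8.2. This is only a minimal sketch, assuming PHP 7.2 or later (for mb_chr); the function name decodeCodePointEscapes is hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper (not from the original answer): replaces \uXXXX...
// escapes of 1 to 8 hex digits with the UTF-8 encoding of that code point.
function decodeCodePointEscapes(string $src): string
{
    return preg_replace_callback(
        '/\\\\u([0-9A-F]{1,8})/i',
        static function (array $m): string {
            // hexdec() turns the captured hex digits into a number
            $char = mb_chr((int) hexdec($m[1]), 'UTF-8');
            // mb_chr() returns false for invalid code points; keep the escape as-is then
            return $char === false ? $m[0] : $char;
        },
        $src
    );
}

$result = decodeCodePointEscapes('Hello \u0001f60e');
// Prints the string, its length in characters, and its length in bytes
echo $result, ' ', mb_strlen($result, 'UTF-8'), ' ', strlen($result), "\n";
// → Hello 😎 7 10
```

With the encoding passed explicitly, mb_strlen counts 7 characters, while the string occupies 10 bytes. Like the pattern above, the {1,8} quantifier is greedy, so hex digits that directly follow an escape are treated as part of it.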
If the input contains UTF-16 surrogate pairs instead:
$src = "Hello "."\ud83d\ude0e";
// Each surrogate becomes its own entity: &#xd83d;&#xde0e;
$replaced = preg_replace("/\\\\u([0-9A-F]{1,4})/i", "&#x$1;", $src);
// Decode the entities to UTF-16 so the two surrogates end up side by side as a pair...
$result = mb_convert_encoding($replaced, "UTF-16", "HTML-ENTITIES");
// ...then convert the UTF-16 string to UTF-8
$result = mb_convert_encoding($result, 'utf-8', 'utf-16');
echo "Result is [$result] and string length is ".mb_strlen($result)."\n";
// Dump the resulting bytes as comma-separated hex
$resultInHex = bin2hex($result);
$resultSeparated = implode(', ', str_split($resultInHex, 2));
echo "in hex: ".$resultSeparated;
Output:
Result is [Hello 😎] and string length is 10
in hex: 48, 65, 6c, 6c, 6f, 20, f0, 9f, 98, 8e
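The surrogate pair can also be combined arithmetically instead of round-tripping through UTF-16. The sketch below uses the standard UTF-16 formula, code point = 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00), and assumes PHP 7.2 or later for mb_chr; the function name decodeSurrogateEscapes is hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper (not from the original answer): decodes \uXXXX escapes,
// merging a high surrogate followed by a low surrogate into one code point.
function decodeSurrogateEscapes(string $src): string
{
    // Pass 1: high surrogate (D800-DBFF) immediately followed by low surrogate (DC00-DFFF)
    $src = preg_replace_callback(
        '/\\\\u(D[89AB][0-9A-F]{2})\\\\u(D[C-F][0-9A-F]{2})/i',
        static function (array $m): string {
            $high = hexdec($m[1]);
            $low = hexdec($m[2]);
            $codePoint = 0x10000 + (($high - 0xD800) << 10) + ($low - 0xDC00);
            return mb_chr($codePoint, 'UTF-8');
        },
        $src
    );

    // Pass 2: remaining four-digit escapes from the Basic Multilingual Plane
    return preg_replace_callback(
        '/\\\\u([0-9A-F]{4})/i',
        static function (array $m): string {
            $char = mb_chr((int) hexdec($m[1]), 'UTF-8');
            // A lone surrogate is not a valid code point; leave its escape untouched
            return $char === false ? $m[0] : $char;
        },
        $src
    );
}

$result = decodeSurrogateEscapes('Hello \ud83d\ude0e');
echo bin2hex($result), "\n";
// → 48656c6c6f20f09f988e
```

For \ud83d\ude0e this gives 0x10000 + (0x3D << 10) + 0x20E = 0x1F60E, the same bytes f0 9f 98 8e as in the hex dump above.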
And as for the question "Why Java?": Java uses UTF-16 internally.