wchar_t and encoding

If I want to convert a string, say char *xmlbuffer, to UTF-16, do I need to convert the type to wchar_t * before encoding to UTF-16? And does the type have to be char * before encoding to UTF-8?

How are wchar_t and char connected with UTF-8, UTF-16, UTF-32 or other transformation formats?

Thanks in advance for your help!

+5
3 answers

No, you do not need to change data types.

As for wchar_t, the standard states that

Type wchar_t is a distinct type whose values can represent distinct codes for all members of the largest extended character set specified among the supported locales.

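However, it does not say which encoding wchar_t must use; that is implementation-defined. So, for example, given

auto s = L"foo";

you cannot make any portable assumption about the value of the expression *s.

A std::string, on the other hand, can be used as an opaque sequence of bytes holding text in whatever transformation format you choose. Just keep in mind that the standard library's string operations work on those bytes, not on characters.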
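To illustrate the "opaque sequence of bytes" point, here is a minimal sketch. The helper names utf8_code_points and sample_utf8 are made up for this example, and the counting logic assumes the input is already valid UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts Unicode code points in a UTF-8 byte sequence
// by skipping continuation bytes (bit pattern 10xxxxxx).
// Assumption: the input is valid UTF-8; nothing is validated here.
std::size_t utf8_code_points(const std::string& bytes) {
    std::size_t count = 0;
    for (unsigned char c : bytes) {
        if ((c & 0xC0) != 0x80) {
            ++count;  // lead byte or ASCII byte starts a new code point
        }
    }
    return count;
}

// "café" with the final letter spelled out as its two UTF-8 bytes, so the
// example does not depend on the encoding of the source file.
inline std::string sample_utf8() {
    return std::string("caf\xC3\xA9");
}
```

Here sample_utf8().size() reports 5, the number of bytes, while utf8_code_points(sample_utf8()) reports 4. std::string itself neither knows nor cares which encoding the bytes are in.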

+4

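iconv is a POSIX function that can take care of the conversion step. Call iconv_open to specify the encodings, for example UTF-8 input and UTF-16 output. Then, with the descriptor returned by iconv_open, call iconv (passing your input buffer and output buffer). When you are done, call iconv_close on the descriptor returned by iconv_open to release its resources.

Check your system's documentation to see which encodings iconv supports and how they are named (i.e. what to pass to iconv_open). For example, iconv on some systems expects "utf-8", on others "UTF8", and so on.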
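The iconv_open / iconv / iconv_close sequence could look roughly like the sketch below. The function name utf8_to_utf16le is hypothetical, and the encoding names "UTF-8" and "UTF-16LE" are the spellings glibc accepts, which is an assumption about your platform. On some systems you may also need to link with -liconv.

```cpp
#include <iconv.h>

#include <cerrno>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts UTF-8 bytes to UTF-16LE bytes with iconv.
// The result is returned as raw bytes (two or four per code point).
std::string utf8_to_utf16le(const std::string& in) {
    // Note the argument order: target encoding first, source encoding second.
    iconv_t cd = iconv_open("UTF-16LE", "UTF-8");
    if (cd == (iconv_t)-1) {
        throw std::runtime_error("iconv_open failed: encoding name not supported?");
    }

    std::string out;
    char buf[256];
    // iconv takes char** for the input even though it only reads from it.
    char* src = const_cast<char*>(in.data());
    std::size_t src_left = in.size();

    while (src_left > 0) {
        char* dst = buf;
        std::size_t dst_left = sizeof buf;
        const std::size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
        const int err = errno;  // save before any other call can change it
        out.append(buf, sizeof buf - dst_left);
        // E2BIG only means the output buffer filled up, so loop again.
        if (rc == (std::size_t)-1 && err != E2BIG) {
            iconv_close(cd);
            throw std::runtime_error("iconv failed: invalid or incomplete input");
        }
    }

    iconv_close(cd);
    return out;
}
```

Asking for "UTF-16LE" rather than plain "UTF-16" fixes the byte order and avoids a byte order mark in the output. Whether that is what you want depends on the consumer of the data.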

Windows does not provide iconv; instead it has its own UTF conversion functions: MultiByteToWideChar and WideCharToMultiByte.

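// Requires <windows.h> and <string>.

// UTF-8 to UTF-16
std::string input = ...
// First call: pass no output buffer to get the required length in wchar_t units.
int utf16len = MultiByteToWideChar(CP_UTF8, 0, input.c_str(), static_cast<int>(input.size()),
                                   NULL, 0);
std::wstring output(utf16len, L'\0');
// Second call: convert into the buffer that is now the right size.
MultiByteToWideChar(CP_UTF8, 0, input.c_str(), static_cast<int>(input.size()),
                    &output[0], static_cast<int>(output.size()));

// UTF-16 to UTF-8 (a separate snippet: input and output are declared again)
std::wstring input = ...
// Same two-call pattern: query the length in bytes, then convert.
int utf8len = WideCharToMultiByte(CP_UTF8, 0, input.c_str(), static_cast<int>(input.size()),
                                  NULL, 0, NULL, NULL);
std::string output(utf8len, '\0');
WideCharToMultiByte(CP_UTF8, 0, input.c_str(), static_cast<int>(input.size()),
                    &output[0], static_cast<int>(output.size()), NULL, NULL);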
+4

The size of wchar_t depends on the compiler, so its relation to the different Unicode formats will vary. For example, it is 16 bits with MSVC on Windows and typically 32 bits with GCC on Linux.
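One way to see this on your own compiler is to count how many code units a single character outside the Basic Multilingual Plane takes in each kind of literal. This is only a sketch: the names wchar_bits and code_units are made up for the example, and it assumes a C++11 or later compiler, where u"..." and U"..." literals are char16_t and char32_t strings.

```cpp
#include <climits>
#include <cstddef>

// Width of wchar_t in bits for the current compiler and target.
constexpr std::size_t wchar_bits = sizeof(wchar_t) * CHAR_BIT;

// Hypothetical helper: number of code units in a string literal,
// not counting the terminating null.
template <typename CharT, std::size_t N>
constexpr std::size_t code_units(const CharT (&)[N]) {
    return N - 1;
}

// U+1F600 lies outside the Basic Multilingual Plane.
constexpr std::size_t utf16_units = code_units(u"\U0001F600");  // surrogate pair
constexpr std::size_t utf32_units = code_units(U"\U0001F600");  // one unit
constexpr std::size_t wide_units  = code_units(L"\U0001F600");  // depends on wchar_t

static_assert(utf16_units == 2, "a char16_t literal needs a surrogate pair here");
static_assert(utf32_units == 1, "a char32_t literal holds the code point directly");
```

With a 32-bit wchar_t, wide_units comes out as 1, matching the UTF-32 case; with a 16-bit wchar_t it comes out as 2, matching UTF-16. If you need a fixed width regardless of platform, char16_t and char32_t are the more predictable choice.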

+1