String literal for basic_string<unsigned char>

When it comes to internationalization and Unicode, I'm an idiotic American programmer. Here's the deal.

#include <string>
using namespace std;

typedef basic_string<unsigned char> ustring;

int main()
{
    static const ustring my_str = "Hello, UTF-8!"; // <== error here
    return 0;
}

This causes an unexpected complaint:

cannot convert from 'const char [14]' to 'std::basic_string<_Elem>'

Perhaps I had the wrong amount of coffee today. How do I fix it? Can I keep the basic structure:

ustring something = {insert magic incantation here};

?

+2

3 answers

Narrow string literals are defined to be arrays of const char, and there are no unsigned string literals [1], so you will need a cast:

ustring s = reinterpret_cast<const unsigned char*>("Hello, UTF-8");

Of course, you can put this long thing in an inline function:

inline const unsigned char *uc_str(const char *s){
  return reinterpret_cast<const unsigned char*>(s);
}

ustring s = uc_str("Hello, UTF-8");
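For reference, here is a minimal sketch of the cast approach as a complete translation unit. The names ustring and uc_str come from the answer above; make_greeting is a hypothetical helper added only for illustration. It assumes your standard library supplies a usable char_traits<unsigned char>, which the standard does not guarantee.

```cpp
#include <cassert>
#include <string>

// Assumption: the standard only requires char_traits for the built-in
// character types, so basic_string<unsigned char> depends on the library
// shipping a generic char_traits. Some newer library releases may not.
typedef std::basic_string<unsigned char> ustring;

// Reinterpret a narrow C string as unsigned bytes.
// Nothing is copied or re-encoded here; only the pointer type changes.
inline const unsigned char* uc_str(const char* s) {
    return reinterpret_cast<const unsigned char*>(s);
}

// Hypothetical helper: builds a ustring from a narrow literal.
inline ustring make_greeting() {
    ustring s = uc_str("Hello, UTF-8!");  // same shape as the question's line
    return s;
}

// Small self-check showing the expected byte count.
inline void check_greeting() {
    assert(make_greeting().size() == 13);  // 13 characters, terminator excluded
}
```

The literal has type const char[14], which matches the compiler message in the question: 13 characters plus the terminating null.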

Or you can just use basic_string<char> and get away with it 99.9% of the time when you work with UTF-8.
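To illustrate the plain std::string route, here is a rough sketch that stores UTF-8 bytes in a std::string and counts code points. The function name utf8_code_points is hypothetical, and the code assumes the input is already valid UTF-8; it does no validation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Count code points by skipping UTF-8 continuation bytes (bit pattern 10xxxxxx).
// Assumes valid UTF-8 input; malformed sequences are not detected.
inline std::size_t utf8_code_points(const std::string& s) {
    std::size_t count = 0;
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        // char may be signed, so convert to unsigned char before bit tests.
        const unsigned char byte = static_cast<unsigned char>(s[i]);
        if ((byte & 0xC0) != 0x80) {
            ++count;  // lead byte or ASCII byte starts a new code point
        }
    }
    return count;
}

// Small self-check: "café" with é encoded as the two bytes C3 A9.
inline void check_utf8_count() {
    const std::string s = "caf\xC3\xA9";
    assert(s.size() == 5);               // five bytes
    assert(utf8_code_points(s) == 4);    // four code points
}
```

The only place unsigned bytes matter here is the bit test, and a local static_cast covers that without changing the string type.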

[1] Unless char is unsigned, but whether it is or not is implementation-defined, blah, blah.

+4

A few helper functions can do the conversion:

inline ustring convert(const std::string& sys_enc) {
  return ustring( sys_enc.begin(), sys_enc.end() );
}

template< std::size_t N >
inline ustring convert(const char (&array)[N]) {
  // N counts the terminating '\0', so stop one short of array+N
  return ustring( array, array+N-1 );
}

inline ustring convert(const char* pstr) {
  return ustring( reinterpret_cast<const ustring::value_type*>(pstr) );
}

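Note that these helpers only copy bytes without transcoding, so the result is valid UTF-8 only if the input is already UTF-8, which includes pure ASCII.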
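As a quick check of the helper approach, here is a sketch of two convert overloads with a small self-test. One detail worth watching in an array overload is that N counts the terminating null, so this version copies N - 1 characters. The check_convert function is hypothetical, and the code again assumes the library provides char_traits<unsigned char>.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Assumption: the standard library supplies a generic char_traits that
// works for unsigned char; this is not guaranteed by the standard.
typedef std::basic_string<unsigned char> ustring;

// Copy the bytes of a std::string one by one; no re-encoding takes place.
inline ustring convert(const std::string& sys_enc) {
    return ustring(sys_enc.begin(), sys_enc.end());
}

// String literals bind here. N includes the terminating '\0',
// so the copied range ends at array + N - 1.
template <std::size_t N>
inline ustring convert(const char (&array)[N]) {
    return ustring(array, array + N - 1);
}

// Hypothetical self-check for both overloads.
inline void check_convert() {
    assert(convert("Hello").size() == 5);           // terminator not stored
    const ustring u = convert(std::string("caf\xC3\xA9"));
    assert(u.size() == 5);                          // bytes, not code points
    assert(u[3] == 0xC3);                           // byte value preserved
}
```

Because the elements are unsigned char, a comparison like u[3] == 0xC3 works directly, which is the main convenience of an unsigned byte string over std::string.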

+1

Make your life easier by using a UTF-8 string library such as http://utfcpp.sourceforge.net/, or switch to std::wstring and use UTF-16. You might also be interested in this other Stack Overflow question: C++ strings: UTF-8 or 16-bit encoding?

0