getBytes() with UTF-8 does not work for upper-case German umlauts

For development, I use ResourceBundle to read a UTF-8 encoded properties file (I set that encoding in the file's properties in Eclipse) directly from the resources directory in the IDE (native2ascii is used on the way to production), for example:

menu.file.open.label=&Öffnen...
label.btn.add.name=&Hinzufügen
label.btn.remove.name=&Löschen

Since this causes character encoding problems with non-ASCII characters, I thought I would be fine with this:

ResourceBundle resourceBundle = ResourceBundle.getBundle("messages", Locale.getDefault());
String value = resourceBundle.getString(key);
value = new String(value.getBytes(), "UTF-8");

Well, this works for the lower-case German umlauts, but not for the upper-case ones, and ß does not work either. Here is the value read by getString(key) and the value after converting it with new String(value.getBytes(), "UTF-8"):

&LÃ¶schen => &Löschen
&HinzufÃ¼gen => &Hinzufügen

&Ã?ber => &??ber
&SchlieÃ?en => &Schlie??en
&Ã?ffnen... => &??ffnen...

The last three should be:

&Ã?ber => &Über
&SchlieÃ?en => &Schließen
&Ã?ffnen... => &Öffnen...

I suppose I'm not too far from the truth, but what am I missing here?



The following setup runs native2ascii automatically from within Eclipse.

  • An Ant-style build.xml:

    <?xml version="1.0" encoding="UTF-8"?>
    <project>
        <property name="dir.resources" value="src/main/resources" />
        <property name="dir.target" value="bin/main" />
    
        <target name="native-to-ascii">
            <delete dir="${dir.target}" includes="**/*.properties" />
            <native2ascii src="${dir.resources}" dest="${dir.target}" includes="**/*.properties" />
        </target>
    </project>
    

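    The target first deletes the previously converted .properties files from the target directory and then runs native2ascii on the files in the resources directory.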

  • In Eclipse, open the project properties, go to "Builders", click "New...", and choose "Ant Builder"
  • On the "Main" tab, set "Buildfile" to the Ant script and "Base Directory" to ${project_loc}
  • On the "Targets" tab, select the native-to-ascii target for the build kinds that should run it
  • If needed, adjust the settings on the "JRE" tab.
  • Finally, check the position of the new builder relative to the Java Builder (use "Up"/"Down" to move it)
  • In the Java Build Path, exclude **/*.properties from the source folder (src/main/resources here)

That's it. The properties files are now converted to ASCII during the build. For example, ü becomes \u00fc.
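As a rough illustration of the conversion described above, the sketch below escapes non-ASCII characters the way the ü example suggests and then loads the result with java.util.Properties, which understands such escapes. The class and method names (AsciiEscapeDemo, escapeNonAscii, loadValue) are made up for this example; it is not the native2ascii implementation, only a minimal approximation of its effect.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class AsciiEscapeDemo {

    // Replace every char outside 7-bit ASCII with a backslash, 'u' and four hex digits.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    // Properties.load turns the escapes back into the original characters.
    static String loadValue(String propertiesText, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // The u-umlaut is written as an escape so the source file itself stays ASCII.
        String line = "label.btn.add.name=&Hinzuf\u00fcgen";
        String escaped = escapeNonAscii(line);
        System.out.println(escaped);
        System.out.println(line.endsWith(loadValue(escaped, "label.btn.add.name")));
    }
}
```

Because the escaped file contains only ASCII, it reads the same no matter which encoding the loader assumes.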


Basically, String.getBytes() uses the platform default encoding, and you are then decoding those bytes as if they were UTF-8.

If you want a round trip through UTF-8, you could use:

// Should be a round-trip
value = new String(value.getBytes("UTF-8"), "UTF-8");

... but encoding to UTF-8 and decoding straight back gives you the same string, so it is pointless.


EDIT: Okay, so you are using native2ascii. In that case, tell it which encoding the input file uses:

native2ascii -encoding UTF-8 input.properties output.properties
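The same principle, naming the file's real encoding at the point where bytes become characters, can be sketched in plain Java. The example below is only an illustration with made-up names (PropertiesEncodingDemo, viaInputStream, viaUtf8Reader); it compares Properties.load(InputStream), which always decodes as ISO 8859-1, with Properties.load(Reader), where the Reader is told to use UTF-8.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class PropertiesEncodingDemo {

    // load(InputStream) decodes the bytes as ISO 8859-1, whatever the file really is.
    static String viaInputStream(byte[] fileBytes, String key) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(fileBytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    // load(Reader) leaves decoding to the Reader, which is explicitly UTF-8 here.
    static String viaUtf8Reader(byte[] fileBytes, String key) {
        Properties props = new Properties();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(fileBytes), StandardCharsets.UTF_8)) {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // Stand-in for a UTF-8 encoded properties file; the O-umlaut is written as an escape.
        String expected = "&\u00d6ffnen...";
        byte[] file = ("menu.file.open.label=" + expected + "\n").getBytes(StandardCharsets.UTF_8);
        System.out.println(expected.equals(viaInputStream(file, "menu.file.open.label")));
        System.out.println(expected.equals(viaUtf8Reader(file, "menu.file.open.label")));
    }
}
```

Only the second variant returns the intended text, because only there does the decoder match the encoding the file was saved in.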

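A few points:

  • A Java String is UTF-16 internally (a sequence of 16-bit char values).
  • new String(value.getBytes(), "UTF-8"); - this encodes the string with the platform default charset (which can lose characters) and then decodes the bytes as UTF-8; it only works by accident.
  • .properties files are read as ISO 8859-1 (Properties does this, and so does ResourceBundle before Java 9).
  • System.out is another place where things can go wrong (a PrintStream encodes the UTF-16 string for output; if the console expects a different encoding, the text looks garbled.)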
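To make the points above concrete, here is a minimal sketch that reproduces the effect in isolation. It assumes the UTF-8 bytes of the file were decoded as ISO 8859-1 and that the platform default charset is windows-1252; the question does not say which default is in use, so that part is an assumption. The class and method names are invented for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class MojibakeDemo {

    // Assumed platform default charset; the question does not state it.
    static final Charset ASSUMED_DEFAULT = Charset.forName("windows-1252");

    // What the loader effectively does: UTF-8 bytes decoded as ISO 8859-1.
    static String misread(String original) {
        return new String(original.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // The repair attempted in the question, with the default charset made explicit.
    static String lossyRepair(String misread) {
        return new String(misread.getBytes(ASSUMED_DEFAULT), StandardCharsets.UTF_8);
    }

    // Re-encoding with the charset that was actually used for decoding restores the bytes.
    static String losslessRepair(String misread) {
        return new String(misread.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Non-ASCII letters are written as escapes so the source file stays ASCII.
        String[] words = {"L\u00f6schen", "\u00dcber", "Schlie\u00dfen"};
        for (String word : words) {
            String raw = misread(word);
            System.out.println(word
                    + ": lossy=" + word.equals(lossyRepair(raw))
                    + ", lossless=" + word.equals(losslessRepair(raw)));
        }
    }
}
```

Under these assumptions the second UTF-8 byte of ö, ü and ä falls in a range that windows-1252 and ISO 8859-1 share, so it survives the lossy path. The second byte of Ö, Ü and ß decodes to a C1 control character that windows-1252 cannot encode, so getBytes() replaces it with '?' and the information is lost.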


You are encoding the text with a different encoding from the one you decode it with.

Try using the same character set for encoding and decoding instead.

value = new String(value.getBytes("UTF-8"), "UTF-8");

String s = "ßßßßß";
s += s.toUpperCase();
s = new String(s.getBytes("UTF-8"), "UTF-8");
System.out.println(s);

prints

ßßßßßSSSSSSSSSS