ZipInputStream(InputStream, Charset) decodes the ZipEntry file name incorrectly

Java 7 was supposed to fix the long-standing problem of unpacking zip archives whose entry names are encoded in a character set other than UTF-8. This is done through the constructor ZipInputStream(InputStream, Charset). So far, so good: when I explicitly set the ISO-8859-1 character set, I can unzip an archive whose file names contain umlauts.

But here's the problem: when I iterate over the stream with ZipInputStream.getNextEntry(), the returned entries have incorrect special characters in their names. In my case, the umlaut "ü" is replaced by a "?" character, which is obviously wrong. Does anyone know how to fix this? Apparently ZipEntry ignores the Charset of its underlying ZipInputStream. This looks like another zip-related JDK bug, but I may be doing something wrong too.

...
zipStream = new ZipInputStream(
    new BufferedInputStream(new FileInputStream(archiveFile), BUFFER_SIZE),
    Charset.forName("ISO-8859-1")
);
while ((zipEntry = zipStream.getNextEntry()) != null) {
    // wrong name here, something like "M?nchen" instead of "München"
    System.out.println(zipEntry.getName());
    ...
}
1 answer

OMG, I played around with this for about two hours, but five minutes after I finally posted the question here, I came across the answer: the file names in my zip file were not encoded with ISO-8859-1, but with Cp437. Thus, the constructor call should be:

zipStream = new ZipInputStream(
    new BufferedInputStream(new FileInputStream(archiveFile), BUFFER_SIZE),
    Charset.forName("Cp437")
);
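
To see why the charset makes the difference, here is a minimal, self-contained sketch. It is not the original poster's code: the class name ZipNameCharsetDemo and its two helper methods are made up for illustration, and it assumes a Java 8+ runtime that ships the Cp437 charset. It writes a one-entry archive in memory with Cp437-encoded names and reads it back twice. In Cp437 the letter "ü" is the single byte 0x81, while ISO-8859-1 decodes that same byte to the control character U+0081, which a console will typically show as "?".

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; names are illustrative only.
public class ZipNameCharsetDemo {

    /** Builds an in-memory zip with one entry whose name is encoded in the given charset. */
    public static byte[] writeZip(String entryName, Charset charset) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(buffer, charset)) {
            out.putNextEntry(new ZipEntry(entryName));
            out.write("demo".getBytes(StandardCharsets.US_ASCII));
            out.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed here, so the archive is complete.
        return buffer.toByteArray();
    }

    /** Returns the name of the first entry, decoded with the given charset. */
    public static String firstEntryName(byte[] zipBytes, Charset charset) {
        try (ZipInputStream in =
                 new ZipInputStream(new ByteArrayInputStream(zipBytes), charset)) {
            ZipEntry entry = in.getNextEntry();
            return entry == null ? null : entry.getName();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: the runtime provides Cp437 (an extended charset).
        Charset cp437 = Charset.forName("Cp437");
        // "M\u00fcnchen.txt" is "München.txt", escaped so the source file encoding does not matter.
        byte[] zip = writeZip("M\u00fcnchen.txt", cp437);

        String wrong = firstEntryName(zip, StandardCharsets.ISO_8859_1);
        String right = firstEntryName(zip, cp437);

        // Print the code point of the second character for both decodings.
        System.out.println("ISO-8859-1: U+" + Integer.toHexString(wrong.charAt(1)));
        System.out.println("Cp437:      U+" + Integer.toHexString(right.charAt(1)));
    }
}
```

If you do not know which charset an archive was written with, comparing the decoded names for a few likely candidates (Cp437, ISO-8859-1, UTF-8) in this way is usually quicker than guessing.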

