Java byte[] to/from String conversion

Why does this JUnit test fail?

import org.junit.Assert;
import org.junit.Test;

import java.io.UnsupportedEncodingException;

public class TestBytes {
    @Test
    public void testBytes() throws UnsupportedEncodingException {
        byte[] bytes = new byte[]{0, -121, -80, 116, -62};
        String string = new String(bytes, "UTF-8");
        byte[] bytes2 = string.getBytes("UTF-8");
        System.out.print("bytes2: [");
        for (byte b : bytes2) System.out.print(b + ", ");
        System.out.print("]\n");
        Assert.assertArrayEquals(bytes, bytes2);
    }
}

I would expect the input byte array to equal the result, but for some reason, probably because UTF-8 characters can take two bytes, the resulting array differs from the input array in both content and length.

Please enlighten me.

2 answers

The reason is that 0, -121, -80, 116, -62 is not a valid UTF-8 byte sequence. new String(bytes, "UTF-8") does not throw an exception in this situation (the Javadoc leaves the behavior unspecified for invalid input), but the result is hard to predict. See the discussion of invalid byte sequences at http://en.wikipedia.org/wiki/UTF-8.
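To see this in practice, here is a minimal sketch, not a definitive implementation. The class name Utf8RoundTrip and both helper methods are made up for illustration. It compares the lenient String round trip from the question with a strict CharsetDecoder that is configured to report malformed input instead of silently replacing it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, named for this example only.
public class Utf8RoundTrip {

    // Lenient round trip, as in the question: malformed input is
    // replaced during decoding, so the returned bytes may differ.
    static byte[] lenientRoundTrip(byte[] input) {
        String decoded = new String(input, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    // Strict check: returns false if the bytes are not well-formed UTF-8.
    static boolean isValidUtf8(byte[] input) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(input));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] bytes = {0, -121, -80, 116, -62};
        System.out.println("valid UTF-8: " + isValidUtf8(bytes));
        System.out.println("after round trip: "
                + Arrays.toString(lenientRoundTrip(bytes)));
    }
}
```

If the strict check fails, the bytes are probably not text at all, and converting them to a String with UTF-8 will lose information.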


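Only 7-bit values map one-to-one onto single bytes in UTF-8, so bytes2 will equal bytes only if every value is in the range 0..127. If you just need a copy of the array, use arraycopy: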

    byte[] bytes3 = new byte[bytes.length];
    System.arraycopy(bytes, 0, bytes3, 0, bytes.length);
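The 0..127 condition and the array copy can be checked with a short sketch. The class name AsciiRoundTrip and its methods are hypothetical, introduced only for this example, and clone() is shown as an assumed shorthand for the arraycopy call above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, named for this example only.
public class AsciiRoundTrip {

    // True when every byte is in 0..127 (Java bytes are signed,
    // so any value above 127 shows up as negative).
    static boolean isSevenBit(byte[] input) {
        for (byte b : input) {
            if (b < 0) {
                return false;
            }
        }
        return true;
    }

    // Decode as UTF-8 and encode again.
    static byte[] roundTrip(byte[] input) {
        return new String(input, StandardCharsets.UTF_8)
                .getBytes(StandardCharsets.UTF_8);
    }

    // Plain copy with no charset involved; same effect as
    // allocating a new array and calling System.arraycopy.
    static byte[] copy(byte[] input) {
        return input.clone();
    }

    public static void main(String[] args) {
        byte[] ascii = {0, 65, 116, 127};
        System.out.println(isSevenBit(ascii)
                && Arrays.equals(ascii, roundTrip(ascii)));

        byte[] bytes = {0, -121, -80, 116, -62};
        System.out.println(Arrays.equals(bytes, copy(bytes)));
    }
}
```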