Python prints the result as '7\xe6\x9c\x8810\xe6\x97\xa5', but I want '7月10日'

I fetched a webpage that contains Japanese, but when I print it to the console, I do not get the output as 7月10日. Instead, it prints: 7\xe6\x9c\x8810\xe6\x97\xa5

What should I do?

1 answer

You are getting the correct result. This is the UTF-8 representation of the Japanese string. The problem is that the console itself does not understand UTF-8. If you write this string to a file and open it with an editor that understands UTF-8, you will see the content you expect. You can also try changing the console encoding to UTF-8.
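The file-based check can be sketched roughly as follows. This is only an illustration, not part of the original answer: the file name `date_utf8.txt` and the use of the temp directory are assumptions made for the example, and `io.open` is used so the same code should run on Python 2.6+ and Python 3.

```python
import io
import os
import tempfile

# The raw UTF-8 bytes from the question (the b'' prefix works on Python 2.6+ and 3).
raw = b'7\xe6\x9c\x8810\xe6\x97\xa5'

# Hypothetical output path, chosen only for this example.
path = os.path.join(tempfile.gettempdir(), 'date_utf8.txt')

# Write the bytes unchanged; the file now contains UTF-8 encoded text.
with io.open(path, 'wb') as f:
    f.write(raw)

# Read the file back as text, telling Python that it is UTF-8 encoded.
with io.open(path, 'r', encoding='utf-8') as f:
    text = f.read()
```

Opening the file at `path` in an editor set to UTF-8 should show the date as readable Japanese text, and `text` holds the same characters as a Unicode string.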

Edit: you can also try this:

print '7\xe6\x9c\x8810\xe6\x97\xa5'.decode('utf-8')
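To make the difference between the bytes and the decoded text more concrete, here is a small sketch. The variable names `raw`, `text`, and `round_trip` are made up for illustration, and the `b''` literal is used so the snippet should behave the same on Python 2.6+ and Python 3.

```python
# The escaped form from the question is a sequence of 9 bytes:
# '7' (1 byte), 月 (3 bytes), '10' (2 bytes), 日 (3 bytes).
raw = b'7\xe6\x9c\x8810\xe6\x97\xa5'

# Decoding interprets those bytes as UTF-8 and yields 5 characters.
text = raw.decode('utf-8')

# Encoding the text again gives back exactly the original bytes,
# so no information is lost in either direction.
round_trip = text.encode('utf-8')
```

Calling `print(text)` afterwards displays the date correctly only on a console whose encoding can represent these characters, which is why the console encoding matters.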

For background on character encodings such as UTF-8 and "ISO Latin-1", see: http://www.joelonsoftware.com/articles/Unicode.html
