I'm stuck in Unicode hell.
My environment is Unix with Python 2.7.3 and these locale settings:
LC_CTYPE=zh_TW.UTF-8
LANG=en_US.UTF-8
I am trying to dump hex-encoded data in a readable format. Here is a simplified version of the code:
import sys
s=u"readable\n"
s2="fb is not \xfb"
s += s2
print s
print s.encode('utf-8')
print s.encode('utf-8','ignore')
print s.decode('iso8859-1')
f = open('out.txt','wb')
f.write(s)
I just want to print 0xfb.
I should describe more here. The key line is 's += s2', where s holds my previously decoded string and s2 is the next line to be appended to s.
If I change the code as follows, the error happens when writing the file instead.
s=u"readable\n"
s2="fb is not \xfb"
s += s2.decode('cp437')
print s
f=open('out.txt','wb')
f.write(s)
I want the result of out.txt to be
readable
fb is not \xfb
or
readable
fb is not 0xfb
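A note on why both attempts fail, as far as I can tell: in Python 2, mixing a unicode object with a byte str makes Python decode the str as ASCII behind the scenes, and writing a unicode object to a file opened with 'wb' makes it encode as ASCII. The sketch below performs those two conversions explicitly, so the same errors show up on any Python version. The variable names are only for illustration.

```python
raw = b"fb is not \xfb"

# Step 1: roughly what `s += s2` does implicitly in Python 2 (unicode + str).
try:
    raw.decode('ascii')
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc

# Step 2: roughly what f.write(s) does implicitly in Python 2 when s is a
# unicode object that contains a non-ASCII character.
text = raw.decode('cp437')  # succeeds, but byte 0xfb becomes a non-ASCII character
try:
    text.encode('ascii')
    encode_error = None
except UnicodeEncodeError as exc:
    encode_error = exc

print(repr(decode_error))
print(repr(encode_error))
```

Both errors point at the same offset, the position of the 0xfb byte, which is why decoding with cp437 only moves the failure from the concatenation to the write.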
[Solution]
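import sys
import binascii

def fmtstr(s):
    # Replace every non-ASCII byte with a literal \xNN escape.
    r = ''
    for c in s:
        if ord(c) >= 128:
            r = ''.join([r, "\\x" + binascii.hexlify(c)])
        else:
            r = ''.join([r, c])
    return r

s = u"readable\n"
s2 = "fb is not \xfb"
s += fmtstr(s2)  # fmtstr returns pure ASCII, so the implicit decode succeeds
print s
f = open('out.txt', 'wb')
f.write(s)
f.close()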
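An alternative to the hand-written loop, as a sketch rather than the one right way: decode the bytes as latin-1, then encode to ASCII with the standard 'backslashreplace' error handler. The helper name escape_non_ascii is made up for this example; the codec names and the error handler are standard and behave the same on Python 2.7 and Python 3.

```python
def escape_non_ascii(raw):
    """Return raw with every byte >= 0x80 replaced by a literal \\xNN escape."""
    # latin-1 maps each byte 0x00-0xff to the code point with the same value,
    # so decoding never fails and the escape shows the original byte value.
    text = raw.decode('latin-1')
    # backslashreplace turns each character that ASCII cannot represent
    # into an escape sequence instead of raising UnicodeEncodeError.
    return text.encode('ascii', 'backslashreplace')

escaped = escape_non_ascii(b"fb is not \xfb")
print(repr(escaped))
```

The result is a pure-ASCII byte string, so on Python 2 it can be appended to a unicode string with 's += escape_non_ascii(s2)' and written to a file opened with 'wb' without triggering either error. Unlike the string_escape or unicode_escape codecs, this leaves newlines and other ASCII control characters untouched.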