UTF-8 plus question marks

I have a website that displays user input by decoding it to Unicode using UTF-8. However, user input may include binary data, which obviously cannot always be decoded as UTF-8.

I am using Python and I get an error:

'utf8' codec can't decode byte 0xbf in position 0: unexpected code byte. You passed in '\xbf\xcd...

Is there a standard, efficient way to convert these undecodable characters to question marks?

It would be very helpful if the answer uses Python.

+3
2 answers

Try:

inputstring.decode("utf8", "replace")

See here for reference.
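One caveat: the "replace" error handler does not insert a literal "?" when decoding. It inserts U+FFFD, the Unicode replacement character, which browsers usually render as a question mark inside a diamond. If a plain "?" is wanted, a second step can swap it in. A minimal sketch, assuming the input is a byte string (the name raw and its sample bytes are made up for illustration, borrowing the two bytes from the error message):

```python
# Hypothetical sample input: two bytes that are not valid UTF-8, then ASCII text.
raw = b"\xbf\xcdabc"

# "replace" substitutes U+FFFD for each undecodable byte
# instead of raising UnicodeDecodeError.
decoded = raw.decode("utf-8", "replace")

# Optional second step: turn U+FFFD into a literal question mark.
with_question_marks = decoded.replace(u"\ufffd", u"?")

print(with_question_marks)  # ??abc
```

Note that the second step also turns any U+FFFD that was legitimately present in the input into "?", which is usually acceptable for display purposes.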

+6

Alternatively, to drop the undecodable bytes instead of replacing them:

str.decode('utf8','ignore')
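To make the difference from "replace" concrete, here is a small sketch under the same assumption of a byte-string input (raw and its contents are again hypothetical):

```python
# Hypothetical sample input: two bytes that are not valid UTF-8, then ASCII text.
raw = b"\xbf\xcdabc"

# "ignore" silently discards undecodable bytes, so nothing marks where they were.
dropped = raw.decode("utf-8", "ignore")

print(dropped)  # abc
```

Since the question asks for question marks, "ignore" only fits if losing the bad bytes without any visible marker is acceptable.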

+1
