Incorrectly encoded characters ended up in my MySQL database

I have an encoding problem: I have data stored in a MySQL table. While I was doing some work, one of the columns in the table picked up some obelisks (daggers) and negation signs, or a diamond with a question mark, depending on the encoding. Instead of manually fixing each row, is there a quick way to find and remove these characters from the database?

I experimented with the browser's encoding settings, trying UTF-8, Western (Windows-1252), and ISO-8859-1. I was happy with how the data was encoded before; I just want to remove the incorrectly encoded characters from the database. I tried writing a quick PHP script to catch all of these characters and replace them, but I can't figure out what they are. Any ideas?

Here are the characters as they appear in UTF-8: ¬†

1 answer

I don't know whether MySQL lets you do this, but you could try something like:

-- Assumes MySQL 8.0+ or MariaDB 10.0.5+, where REGEXP_REPLACE() exists
UPDATE `table` SET `column` = REGEXP_REPLACE(`column`, '[\\x00-\\x1F\\x80-\\xFF]', '');

Make sure you run this as a SELECT first, or try it in a temporary sandbox. I have not tested it and do not know whether it is valid in MySQL.

I know there are third-party regular-expression libraries that can do this, but they require changes to your database setup. I do not know how they work.

EDIT

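You would be better off writing a small PHP script to do this for you. The regex above will strip out the garbage characters. Note that it removes every control byte and every byte above 0x7F, so legitimate non-ASCII characters (accented letters, for example) are removed as well.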

$data = preg_replace('/[\x00-\x1F\x80-\xFF]/', '', $data);
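
Since the question mentions not knowing what the offending characters are, here is a minimal sketch that first lists the suspect bytes as hex and then strips them. The function names find_suspect_bytes() and strip_suspect_bytes() are hypothetical helpers made up for this illustration, and the sample string (a UTF-8 non-breaking space between two words) is an assumption, not the asker's real data.

```php
<?php
// Hypothetical helper: return the distinct control / non-ASCII bytes
// found in $data, as hex strings, so you can inspect them before deleting.
function find_suspect_bytes(string $data): array
{
    // No /u modifier, so the pattern matches single bytes, not characters.
    preg_match_all('/[\x00-\x1F\x80-\xFF]/', $data, $matches);
    return array_values(array_unique(array_map('bin2hex', $matches[0])));
}

// Hypothetical helper: remove the same byte ranges with preg_replace().
function strip_suspect_bytes(string $data): string
{
    return preg_replace('/[\x00-\x1F\x80-\xFF]/', '', $data);
}

// Assumed sample: "\xC2\xA0" is the UTF-8 encoding of a non-breaking space.
$sample = "Hello\xC2\xA0world";

print_r(find_suspect_bytes($sample));      // lists "c2" and "a0"
echo strip_suspect_bytes($sample), "\n";   // prints "Helloworld"
```

Listing the bytes first makes it easier to decide whether a blanket strip is acceptable or whether only specific byte sequences should be replaced.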

Once again, in case this was unclear before: DO NOT RUSH AHEAD WITH MY SQL STATEMENT, as I have no idea what will actually happen.
