I know that character encoding mismatches between different character sets are an age-old problem, but I am stuck on one involving Windows "curly quotes".
We have a client who likes to copy and paste data into a text field and then publish it in our application. That data often contains curly quotes. I have been using the following function to convert them to their plain equivalents:
function convert_smart_quotes($string) {
    // UTF-8 byte sequences for curly single/double quotes, en dash, em dash, ellipsis
    $badwordchars = array("\xe2\x80\x98", "\xe2\x80\x99", "\xe2\x80\x9c", "\xe2\x80\x9d", "\xe2\x80\x93", "\xe2\x80\x94", "\xe2\x80\xa6");
    // Plain ASCII replacements, in the same order
    $fixedwordchars = array("'", "'", '"', '"', '-', '--', '...');
    return str_replace($badwordchars, $fixedwordchars, $string);
}
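For reference, here is a small sketch of what this function does and does not catch. The search strings are UTF-8 byte sequences, so the replacement only fires when the incoming text really is UTF-8. The sample strings are made up for illustration, and the function is repeated so the snippet runs on its own.

```php
<?php
// Same function as in the post, repeated so this snippet is self-contained.
function convert_smart_quotes($string) {
    $badwordchars = array("\xe2\x80\x98", "\xe2\x80\x99", "\xe2\x80\x9c", "\xe2\x80\x9d", "\xe2\x80\x93", "\xe2\x80\x94", "\xe2\x80\xa6");
    $fixedwordchars = array("'", "'", '"', '"', '-', '--', '...');
    return str_replace($badwordchars, $fixedwordchars, $string);
}

// UTF-8 input: the curly double quotes are e2 80 9c / e2 80 9d, so they match.
$utf8 = "\xe2\x80\x9cHi\xe2\x80\x9d";
echo convert_smart_quotes($utf8), "\n";              // prints "Hi"

// Same text in Windows-1252: the quotes are the single bytes 93 / 94,
// which are not in the search list, so the string comes back unchanged.
$cp1252 = "\x93Hi\x94";
echo bin2hex(convert_smart_quotes($cp1252)), "\n";   // prints 93486994
```

So if anything upstream changes the encoding of the submitted text, the function silently stops matching, which fits the symptom of it working for months and then not at all.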
This worked perfectly for several months. Then, after some changes (we switched servers, updated the system, upgraded PHP, etc.), we learned that it no longer works. When I looked into it, I found that the curly quotes are all turning into different characters. In this case, they turn into the following:
“ = ¡È
” = ¡É
‘ = ¡Æ
’ = ¡Ç
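One way to narrow this down is to look at the raw bytes rather than the rendered characters. The sketch below is only a guess at the mechanism, not a confirmed diagnosis. It assumes the mbstring extension is loaded, and it checks which target encoding turns a UTF-8 curly quote into the byte pair that displays as "¡È" when read as ISO-8859-1. The list of candidate encodings is my own assumption.

```php
<?php
// U+201C LEFT DOUBLE QUOTATION MARK as UTF-8 bytes.
$curly = "\xe2\x80\x9c";

// "¡È" read as ISO-8859-1 is the byte pair a1 c8.
$observed = "\xa1\xc8";

// Hypothetical list of encodings to try; adjust to whatever the server might use.
$candidates = array('ISO-8859-1', 'Windows-1252', 'SJIS', 'EUC-JP');

foreach ($candidates as $encoding) {
    // Convert the UTF-8 quote into the candidate encoding and compare bytes.
    $converted = mb_convert_encoding($curly, $encoding, 'UTF-8');
    $match = ($converted === $observed) ? 'MATCH' : 'no match';
    echo $encoding, ': ', bin2hex($converted), ' (', $match, ")\n";
}
```

EUC-JP encodes “ ” ‘ ’ as a1 c8, a1 c9, a1 c6, a1 c7, which lines up with all four mangled pairs listed above. If that is what is happening, something in the new setup (possibly an mbstring setting such as mbstring.internal_encoding or mbstring.http_output) may be converting the submitted text to EUC-JP, but that part is speculation on my side.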
" - " . mySQL latin1_swedish_ci, , . , , utf-8 , latin1_swedish_ci ISO-8859-1, ... .
- , utf-8. ISO-8859-1, .
"¡È" "¡É" , . , :
// Byte pairs that display as ¡È, ¡É, ¡Æ, ¡Ç when read as ISO-8859-1
$string = str_replace("\xa1\xc8", '"', $string);
$string = str_replace("\xa1\xc9", '"', $string);
$string = str_replace("\xa1\xc6", "'", $string);
$string = str_replace("\xa1\xc7", "'", $string);
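A more compact form of the same workaround, as a sketch: one lookup table instead of four calls. The function name fix_mangled_quotes is hypothetical. It also covers a case I am only assuming might apply: if the mangled text has been re-encoded as UTF-8 somewhere along the way, "¡È" is stored as c2 a1 c3 88 rather than a1 c8, and a search for the raw byte pair would never match.

```php
<?php
// Hypothetical helper: map the mangled sequences back to plain quotes.
function fix_mangled_quotes($string) {
    $map = array(
        // Mangled characters stored as UTF-8 (assumption: the mojibake was re-encoded).
        "\xc2\xa1\xc3\x88" => '"',   // ¡È
        "\xc2\xa1\xc3\x89" => '"',   // ¡É
        "\xc2\xa1\xc3\x86" => "'",   // ¡Æ
        "\xc2\xa1\xc3\x87" => "'",   // ¡Ç
        // Raw two-byte pairs a1 c8, a1 c9, a1 c6, a1 c7.
        "\xa1\xc8" => '"',
        "\xa1\xc9" => '"',
        "\xa1\xc6" => "'",
        "\xa1\xc7" => "'",
    );
    // strtr() with an array tries the longest keys first and never
    // rescans text it has already replaced.
    return strtr($string, $map);
}

echo fix_mangled_quotes("\xa1\xc8quoted\xa1\xc9"), "\n";   // prints "quoted"
```

This only treats the symptom. Blindly replacing byte pairs can also hit legitimate text that happens to contain the same bytes, so finding where the encoding changes would be the better fix.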
That did not work, and googling "¡É" has not turned up anything useful.
Any help would be appreciated!