When I moved the site to the innereyes.com server, I ran into a problem accessing the Hebrew text in the database. To help others who run into the same problem, I am publishing a solution. I wrote the solution in English, for obvious reasons (I, by the way, was helped by a solution written in Slovak. Yes, really...).
Problem
I see question marks instead of non-Latin UTF-8 characters when reading a MySQL database using PHP. For instance, the Latvian, Slovak or Hebrew characters in my WordPress blog have all turned into "??????".
Longer and more formally:
Let [DB] be a MySQL (4.1, or other(?)) database in which the encoding is, say, UTF8.
Let [PHP] be a PHP (5, or other(?)) file which makes a query using the [DB] database.
Now, in some configurations, if the connection's character set is not set properly, [PHP] will fail to read non-latin1 characters from [DB]. In this case, a question mark is shown in place of every byte of a character that is not latin1. Since UTF-8 encodes characters such as these in two bytes, one will see "??" instead of, say, "א" or "ř".
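To make the two-byte claim concrete, here is a minimal sketch that counts the bytes of the two example characters. The array keys are hypothetical labels, and the characters are written as byte escapes so that the result does not depend on the encoding of the script file itself.

```php
<?php
// UTF-8 byte sequences for the example characters, written as escapes.
$samples = array(
    'alef'    => "\xD7\x90", // U+05D0, Hebrew letter alef
    'r_caron' => "\xC5\x99", // U+0159, Latin small letter r with caron
    'a'       => "a",        // U+0061, plain ASCII for comparison
);

foreach ($samples as $label => $bytes) {
    // strlen() counts bytes, not characters.
    echo $label, ': ', strlen($bytes), " byte(s)\n";
}
```

The first two entries report 2 bytes each and the ASCII letter reports 1, which matches the pair of question marks described above.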
Solution
In WordPress, edit the wp-includes/wp-db.php file and add the following lines before any data is retrieved (use the $this->select($dbname); line as a landmark), replacing CHARSET with the name of the encoding you use, as a quoted string (most likely 'utf8'):
$res = @mysql_query('SET character_set_results='.CHARSET, $this->dbh);
$res = @mysql_query('SET character_set_connection='.CHARSET, $this->dbh);
$res = @mysql_query('SET character_set_client='.CHARSET, $this->dbh);
If you have the same problem outside WordPress, make similar @mysql_query calls, adapted to the variable names you use.
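For a script outside WordPress, the same three statements could be generated by a small helper. This is only a sketch: charset_statements() and $link are hypothetical names that do not come from WordPress or PHP, and the name check is an added precaution because the value is concatenated straight into SQL. Note that mysql_query() belongs to the old mysql extension, which was removed in PHP 7, so the call itself is left as a comment.

```php
<?php
// Hypothetical helper: builds the three SET statements from the fix above
// for a given character set name.
function charset_statements($charset)
{
    // Accept only simple names such as utf8 or latin1.
    if (!preg_match('/^[A-Za-z0-9_]+$/', $charset)) {
        return array();
    }

    $statements = array();
    foreach (array('results', 'connection', 'client') as $variable) {
        $statements[] = 'SET character_set_' . $variable . '=' . $charset;
    }
    return $statements;
}

foreach (charset_statements('utf8') as $sql) {
    // With an open connection in $link (an assumed variable name), the call
    // from the WordPress fix would go here: @mysql_query($sql, $link);
    echo $sql, "\n";
}
```

Run the statements once, right after connecting and before the first SELECT, so that every later query on that connection uses the chosen encoding.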
