UTF-8 makes life a lot easier for web (application) developers. You don’t have to worry about encodings as much as you used to, e.g. when internationalizing and localizing web pages. But it has also created new problems, especially if you need to use UTF-8 in an ISO world.
One point of possible failure is the database server (and especially the connection to the database server).
In MySQL you can set the collation for each database and table (and, in newer versions, for each field in a table). The collation specifies how the data is compared and sorted, and it also implies the character set, i.e. the encoding of the data in the table. If you want to use UTF-8, set it to utf8_unicode_ci. (Note: not utf8_general_ci. That one is faster, but its simplified comparison rules sort some characters incorrectly.)
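To make the three levels concrete, here is a small Python sketch that only builds the corresponding SQL strings. The database, table and field names (shop, articles, title) are made up for illustration, and the statements still have to be sent to the server by whatever client you use.

```python
CHARSET = "utf8"
COLLATION = "utf8_unicode_ci"


def create_database_sql(database: str) -> str:
    """Database level: default character set and collation for new tables."""
    return (
        f"CREATE DATABASE {database} "
        f"CHARACTER SET {CHARSET} COLLATE {COLLATION}"
    )


def create_table_sql(table: str, field: str) -> str:
    """Table level (the DEFAULT clause) and field level (inside the parentheses)."""
    return (
        f"CREATE TABLE {table} ("
        f"{field} VARCHAR(255) CHARACTER SET {CHARSET} COLLATE {COLLATION}"
        f") DEFAULT CHARACTER SET {CHARSET} COLLATE {COLLATION}"
    )


# Hypothetical names, purely for illustration.
print(create_database_sql("shop"))
print(create_table_sql("articles", "title"))
```

A field without its own setting inherits the table default, and a table without one inherits the database default, so in practice the database-level statement is often enough.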
But this is not the only setting. You can also set the encoding of the connection. As I wrote earlier, you can set the character encoding of your MySQL queries by setting the MySQL session variable character_set_client to utf8.
If you also want to get UTF-8 results, you need to set the MySQL session variable character_set_results to utf8.
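The two settings above can be applied right after connecting. The following is a minimal Python sketch that assumes a DB-API style cursor, i.e. any object with an execute() method as MySQL drivers for Python typically provide. The helper name use_utf8 is made up for this example.

```python
# The two connection settings described above, as plain SQL statements.
UTF8_SESSION_STATEMENTS = (
    "SET character_set_client = utf8",   # encoding of the queries you send
    "SET character_set_results = utf8",  # encoding of the results you get back
)


def use_utf8(cursor) -> None:
    """Switch the current connection to UTF-8 in both directions.

    `cursor` is assumed to be a DB-API style cursor; only its
    execute() method is used.
    """
    for statement in UTF8_SESSION_STATEMENTS:
        cursor.execute(statement)
```

Call use_utf8(cursor) once per connection, before the first real query. MySQL also has a shortcut, SET NAMES utf8, which sets both variables (plus character_set_connection) in a single statement.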
On most systems the variables for the MySQL connection still default to latin1, which means ISO (ISO-8859-1), even if the rest of the system already uses UTF-8. This might have to do with Apache, which still defaults to ISO document encoding on most systems. But that’s just a guess.
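What goes wrong when UTF-8 data passes through a latin1 connection can be reproduced without a database. The sketch below uses Python's latin-1 codec as a stand-in for MySQL's latin1, which is an approximation (the two are not byte-for-byte identical). Both function names are made up for this example.

```python
def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then decode the bytes as if they were latin1."""
    return text.encode("utf-8").decode("latin-1")


def repair(garbled: str) -> str:
    """Undo the misreading: back to the raw bytes, then decode as UTF-8."""
    return garbled.encode("latin-1").decode("utf-8")


garbled = misread_as_latin1("é")
print(garbled)          # Ã©
print(repair(garbled))  # é
```

Every non-ASCII character turns into two or more odd-looking characters, which is the typical symptom of a connection that is still set to latin1.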
More info on all these variables can be found in the MySQL manual: Connection Character Sets and Collations


