Switching To Unicode

Converting your web application to Unicode is easier than you might think. Just follow these steps:

  1. Switch your character encoding to UTF-8. In Apache, you can add the following line to your virtual host container or .htaccess file:

    AddDefaultCharset UTF-8
  2. If you declared another encoding in your HTML files (for example in a meta tag), or used something like PHP's header() function to force the server to send pages with another encoding, change those files, too.

  3. Back up your database. In MySQL you can do this with mysqldump:

    mysqldump -u username -p database table > table-yyyymmdd.sql
  4. Convert the SQL dump to UTF-8 using something like iconv. Replace ISO-8859-1 in the command with the encoding your data currently uses.
    Don’t overwrite your backup; write the output of iconv to another file. This saves time if you need to go back to the old state or if something gets messed up. Note: not every encoding is supported out of the box. See iconv -l for the list of encodings your iconv supports.

    iconv -f ISO-8859-1 -t UTF-8 table-yyyymmdd.sql > table-yyyymmdd-utf8.sql
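  5. Replace the character encoding information in the converted SQL data. Again: don’t touch your backup; edit the converted file. The trailing /g makes each substitution apply to every occurrence on a line, not just the first. Be aware that the second substitution also rewrites the word latin1 if it appears inside your data.

    vi table-yyyymmdd-utf8.sql
    :%s/latin1_german2_ci/utf8_unicode_ci/g
    :%s/latin1/utf8/g
    :wq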
  6. Replace the data in your database with the converted data. In MySQL this is quick:

    mysql -u username -p database < table-yyyymmdd-utf8.sql
  7. Check that everything works.
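
Steps 4 and 5 can also be done without opening an editor. The following is only a minimal sketch under stated assumptions: it builds a tiny made-up Latin-1 dump in a temporary directory (the file names and table contents are invented for illustration), converts it with iconv, and uses sed in place of the vi substitutions. For a real migration you would point src at your own dump instead.

```shell
#!/bin/sh
# Sketch of steps 4 and 5 without an interactive editor.
# All file names and the sample data are hypothetical stand-ins.
set -eu

workdir=$(mktemp -d)
src="$workdir/table-latin1.sql"   # stands in for table-yyyymmdd.sql (the backup)
dst="$workdir/table-utf8.sql"     # converted copy; the backup stays untouched

# Build a tiny Latin-1 "dump". \374 is the ISO-8859-1 byte for "ü".
printf 'CREATE TABLE t (name VARCHAR(20)) DEFAULT CHARSET=latin1 COLLATE=latin1_german2_ci;\n' > "$src"
printf "INSERT INTO t VALUES ('M\\374ller');\n" >> "$src"

# Step 4: convert the bytes from ISO-8859-1 to UTF-8 into a new file.
iconv -f ISO-8859-1 -t UTF-8 "$src" > "$dst.tmp"

# Step 5: rewrite the collation first, then the remaining charset names.
# Caution: this also rewrites the word "latin1" if it occurs inside your data.
sed -e 's/latin1_german2_ci/utf8_unicode_ci/g' -e 's/latin1/utf8/g' "$dst.tmp" > "$dst"
rm "$dst.tmp"

# Sanity checks: iconv exits non-zero if the result is not valid UTF-8,
# and grep -c counts the lines that still mention latin1.
iconv -f UTF-8 -t UTF-8 "$dst" > /dev/null && echo "valid UTF-8: $dst"
grep -c 'latin1' "$dst" || true
```

Running the collation substitution before the generic one matters: if latin1 were replaced first, latin1_german2_ci would become utf8_german2_ci and the first pattern would no longer match.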

Done.


One Response to “Switching To Unicode”

  1. Maxime says:

    didn’t we have another solution?
