URL:http://www.taylanpince.com/blog/AJAX-and-Multibyte-Character-Support.html
Blogged by taylan at 3:22 pm, 6 February 2006 under PHP
Tags: AJAX, PHP, multibyte, encoding
I love AJAX, and I realize that there are more than enough tutorials on the subject floating around the web these days (my favourite is still the one over at the Apple Developer Connection). But when I wanted to use a simple AJAX menu for a site I was developing in Turkish, I quickly realized that there are some issues with character encoding in dynamically loaded AJAX elements, and that there were no apparent solutions.
The famous XMLHttpRequest object, when used to get the data directly through the responseText property, assumes the response is UTF-8 by default. Unfortunately, there are times when I am not able to use UTF-8 as my main encoding, and I needed a solution that would let me use the features of AJAX regardless of the language the site was in.
One option was to use the responseXML property rather than responseText, as I read somewhere that it was possible to declare the encoding in the XML itself and make the browser understand that the content wasn't in UTF-8. That's a good idea, but the problem is that parsing XML through JavaScript is still a huge issue, especially due to the different approaches taken by different browsers, and I didn't want to get into that just to fix a seemingly simple(r) encoding problem.
The first solution came to me when I started looking at PHP rather than JavaScript. I put a simple header() call at the beginning of the PHP file that was going to be loaded into a DIV element through AJAX, and redefined the encoding of the document.
header("Content-type: text/html; charset=ISO-8859-9");
Simple though it may seem, this one line of code fixed the issue in almost all browsers, except Internet Explorer. I hate IE just as most web developers and designers do, but I could not ignore the fact that 70% of my visitors were still using this cursed web browser. So I had to continue looking for another solution.
I finally managed to write PHP code that worked in every browser by letting go of my original encoding and embracing UTF-8. The solution was to use PHP's Multibyte String (mbstring) functions to convert the original text into UTF-8.
header("Content-type: application/x-javascript; charset=UTF-8");
echo "<a href=\"index.php?action=load\">".mb_convert_encoding("Ü ü Ğ ğ İ ı Ş ş", "UTF-8", "ISO-8859-9")."</a>";
This time, the header() call declares the encoding as UTF-8 and the content type as JavaScript, just to make sure that more intelligent browsers like Safari won't try to use the encoding of the parent page. Then we convert just the text part from ISO-8859-9 to UTF-8 using the mb_convert_encoding() function. The result is happily displayed in all browsers.
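One thing to watch with the snippet above: the string literal handed to mb_convert_encoding() has to really be ISO-8859-9 bytes, which depends on how the PHP file itself was saved. Here is a minimal sketch that sidesteps the question by writing the Turkish letters as byte escapes. The function name renderMenuLink() and the SAPI check are my own assumptions for illustration, not part of the original code.

```php
<?php
// Hypothetical helper: builds the menu link from the post and converts
// only the label from ISO-8859-9 to UTF-8.
function renderMenuLink($labelLatin5)
{
    $label = mb_convert_encoding($labelLatin5, "UTF-8", "ISO-8859-9");
    return "<a href=\"index.php?action=load\">" . $label . "</a>";
}

// In ISO-8859-9, "\xDC" is Ü and "\xFE" is ş. Using escapes keeps the
// example independent of the encoding this file is saved in.
$html = renderMenuLink("\xDC\xFE");

// Only send the header when running under a web server and before any
// output has started (assumption: the CLI is used just for testing).
if (PHP_SAPI !== "cli" && !headers_sent()) {
    header("Content-type: application/x-javascript; charset=UTF-8");
}

echo $html;
```

If the label comes from a database or an include file instead of a literal, the same call applies, as long as that source really delivers ISO-8859-9 bytes.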
Of course, if you are going to use this conversion regularly, you might want to create your own PHP function for encoding conversion.
function encodeAJAX($str)
{
    return mb_convert_encoding($str, "UTF-8", "ISO-8859-9");
}
echo encodeAJAX("Ü ü Ğ ğ İ ı Ş ş");
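If the site is not always in Turkish, the same idea can be sketched with the source encoding passed in as a parameter. The name encodeAJAXFrom() and its default argument are hypothetical; this is just one way to generalize the function above, not a definitive implementation.

```php
<?php
// Hypothetical variant of encodeAJAX() that takes the source encoding
// as a parameter instead of hard-coding ISO-8859-9.
function encodeAJAXFrom($str, $fromEncoding = "ISO-8859-9")
{
    // Nothing to convert when the source is already declared as UTF-8.
    if (strcasecmp($fromEncoding, "UTF-8") === 0) {
        return $str;
    }
    return mb_convert_encoding($str, "UTF-8", $fromEncoding);
}

// Byte escapes keep the example independent of how this file is saved.
echo encodeAJAXFrom("\xDD\xFD");            // ISO-8859-9 bytes for İ and ı
echo encodeAJAXFrom("\xE9", "ISO-8859-1");  // ISO-8859-1 byte for é
```

Keeping ISO-8859-9 as the default means existing calls written for the Turkish site keep working unchanged, while pages in other encodings only need the second argument.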
I hope this solution helps other people use the AJAX technology with multilingual sites.