
Managing UTF8-Strings in PHP

PHP has good multibyte support, but some functions I often need when dealing with UTF8 strings are missing.

1. Testing if a string is UTF8 encoded

This function uses the built-in PHP function mb_detect_encoding() to “guess” the character encoding of a multibyte string and turns the result into a boolean TRUE or FALSE.

function is_utf8($str){
      // With strict mode (third argument), mb_detect_encoding() returns false
      // if the string is not valid in any of the candidate encodings.
      return mb_detect_encoding($str, 'auto', true) === 'UTF-8';
}
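
Because the detection above is a guess, a plain validity check can be a useful complement. The following is only a minimal sketch and not part of the original function: is_valid_utf8() is a hypothetical helper name, and it swaps in mb_check_encoding(), which answers the narrower question of whether the bytes form valid UTF8 at all (pure ASCII counts as valid).

```php
<?php
// Hypothetical helper (not from the original post): validates instead of guessing.
function is_valid_utf8(string $str): bool {
    // mb_check_encoding() returns true only if the bytes are valid in the given encoding.
    return mb_check_encoding($str, 'UTF-8');
}

var_dump(is_valid_utf8("Gr\xC3\xBC\xC3\x9Fe")); // UTF-8 bytes of "Grüße": bool(true)
var_dump(is_valid_utf8("Gr\xFC\xDFe"));         // ISO-8859-1 bytes of the same word: bool(false)
```

The byte escapes keep the example independent of the encoding the source file itself is saved in.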

2. Transform a string to UTF8 encoding

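This function first checks whether the string is already UTF8 or whether mb_detect_encoding() fails to detect an encoding (it then returns false, which compares equal to an empty string). In both cases the string is returned unchanged. Detection mostly fails on strings with mixed encodings, which would get broken in the conversion process.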

Before conversion, PHP is told to discard unsupported characters instead of printing a “?”, both via ini_set('mbstring.substitute_character', 'none') and via the //IGNORE directive. In principle, one of the two should be enough, but it works for me in more cases if both are set together.

The //TRANSLIT directive ensures that when a character can’t be represented in the target encoding (here UTF8), it is approximated through one or several similar-looking characters.

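function to_utf8($str){
      $ret = $str;

      // Detect the source encoding; in strict mode the result is false if nothing matches.
      $enc = mb_detect_encoding($str, 'auto', true);
      // Convert only if an encoding was detected and it is not already UTF-8.
      if($enc != 'UTF-8' && $enc != ''){
            // Drop unsupported characters instead of substituting a "?".
            ini_set('mbstring.substitute_character', 'none');
            $ret = iconv($enc, 'UTF-8//TRANSLIT//IGNORE', $str);
      }
      return $ret;
}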
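
One thing worth checking on your own setup: as far as I can tell, the 'auto' candidate list depends on the mbstring.language setting, so single-byte encodings such as ISO-8859-1 may never be considered. The sketch below is a hypothetical variant, not the original function. The name to_utf8_from() and its default candidate list are assumptions, and it uses mb_convert_encoding() in place of iconv() so that it does not depend on how the local iconv library handles the //TRANSLIT and //IGNORE suffixes.

```php
<?php
// Hypothetical variant (not from the original post) with an explicit candidate list.
function to_utf8_from(string $str, array $candidates = ['UTF-8', 'ISO-8859-1']): string {
    // Strict detection: returns a candidate the bytes are valid in, or false.
    $enc = mb_detect_encoding($str, $candidates, true);
    if ($enc === false || $enc === 'UTF-8') {
        return $str; // nothing detected, or already UTF-8: leave the string untouched
    }
    // Assumption: mb_convert_encoding() stands in for iconv() in this sketch.
    return mb_convert_encoding($str, 'UTF-8', $enc);
}

// ISO-8859-1 bytes of "Grüße" come back as the matching UTF-8 bytes.
var_dump(to_utf8_from("Gr\xFC\xDFe") === "Gr\xC3\xBC\xC3\x9Fe"); // bool(true)
```

The candidate list should only name encodings you actually expect, since nearly any byte string is valid in a single-byte encoding like ISO-8859-1.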

You are free to use my code samples if you respect this small license.