Snippet. PHP. Function to Convert Multibyte String to Array (mb_str_to_array)

PHP is notorious for its poor awareness of Unicode: the core string functions operate on bytes rather than characters, so working with non-ASCII strings can quickly become problematic. However, a separate extension called mbstring is Unicode-aware. Its functions mimic the normal PHP string functions and carry the mb_ prefix. For instance, strlen becomes mb_strlen, strpos becomes mb_strpos, and so on. However, not all of the standard string functions have had a multibyte counterpart. Before PHP 7.4, one such function was str_split, which converts a string into an array of characters. This small snippet creates the missing function and DOES NOT rely on regular expressions.
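To make the byte-versus-character difference concrete, here is a minimal sketch. It assumes the mbstring extension is loaded and PHP 7.0 or later (for the "\u{...}" escapes); the sample word is only an illustration.

```php
<?php
// "ñandú" written with escapes so the bytes are unambiguous:
// 5 characters, 7 bytes in UTF-8 (ñ and ú take two bytes each).
$word = "\u{00F1}and\u{00FA}";

echo strlen($word), "\n";              // 7: strlen counts bytes
echo mb_strlen($word, 'UTF-8'), "\n";  // 5: mb_strlen counts characters

// str_split cuts byte by byte, so each two-byte character
// is torn into two fragments that are not valid UTF-8 on their own.
$bytes = str_split($word);
echo count($bytes), "\n";              // 7
```

This is the problem a multibyte-aware split has to avoid: every array element should hold one whole character, not one byte.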

//======================== START OF FUNCTION ==========================//
// FUNCTION: mb_str_to_array                                           //
//=====================================================================//
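/**
 * A substitute for str_split that also works with non-ASCII strings.
 * @param string $string UTF-8 encoded input
 * @return array One element per character
 */
function mb_str_to_array($string){
   $chars = array();
   // Important: pass the encoding explicitly instead of changing the
   // global internal encoding, and measure the length only once.
   $length = mb_strlen($string, "UTF-8");
   for ($i = 0; $i < $length; $i++) {
      $chars[] = mb_substr($string, $i, 1, "UTF-8");
   }
   return $chars;
}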
//=====================================================================//
//  FUNCTION: mb_str_to_array                                          //
//========================== END OF FUNCTION ==========================//

Example

$string = 'Iñtërnâtiônàlizætiøn';

$array = mb_str_to_array($string);

echo count($array);

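Note. PHP 6, which was expected to bring Unicode support by default, was never released. Since PHP 7.4 the mbstring extension provides mb_str_split, so the function above is mainly useful on older versions.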
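For code that has to run on several PHP versions, one option is a small wrapper that uses mb_str_split when it is available (PHP 7.4 and later) and otherwise falls back to the same mb_substr loop. This is only a sketch: the name split_chars is hypothetical and not part of the original snippet, and the code assumes PHP 7.0 or later with mbstring loaded.

```php
<?php
// Hypothetical helper, for illustration only.
function split_chars(string $string, string $encoding = 'UTF-8'): array
{
    // Prefer the built-in function where it exists (PHP >= 7.4).
    if (function_exists('mb_str_split')) {
        return mb_str_split($string, 1, $encoding);
    }

    // Fallback: the same loop as mb_str_to_array.
    $chars = [];
    $length = mb_strlen($string, $encoding);
    for ($i = 0; $i < $length; $i++) {
        $chars[] = mb_substr($string, $i, 1, $encoding);
    }
    return $chars;
}

// "Iñtërnâtiôn" written with escapes so the bytes are unambiguous.
$chars = split_chars("I\u{00F1}t\u{00EB}rn\u{00E2}ti\u{00F4}n");

echo count($chars), "\n";                       // 11
echo implode('', array_reverse($chars)), "\n";  // the string reversed, character by character
```

Reversing the array and joining it back is a quick check that whole characters, not bytes, were split: the accented letters survive intact.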


Updated on: 23 Nov 2024