We are talking about multibyte (e.g. UTF-8) strings here, so a byte-oriented preg_split (one without the /u modifier) will fail for the following string:
'Weiße Rosen sind nicht grün!'
Because I didn't find a regex to simulate str_split, I optimized the first solution from adjwilli a bit:
<?php
$string = 'Weiße Rosen sind nicht grün!';
// Count the characters once instead of re-measuring the string on every pass.
$stop = mb_strlen( $string);
$result = array();
for( $idx = 0; $idx < $stop; $idx++)
{
    $result[] = mb_substr( $string, $idx, 1);
}
?>
Here is an example with adjwilli's function:
<?php
mb_internal_encoding( 'UTF-8');
mb_regex_encoding( 'UTF-8');
function mbStringToArray( $string)
{
    $stop = mb_strlen( $string);
    $result = array();
    for( $idx = 0; $idx < $stop; $idx++)
    {
        $result[] = mb_substr( $string, $idx, 1);
    }
    return $result;
}
// Pass true to print_r (not to mbStringToArray) so the dump is returned instead of printed.
echo '<pre>', PHP_EOL,
    print_r( mbStringToArray( 'Weiße Rosen sind nicht grün!'), true), PHP_EOL,
    '</pre>';
?>
Let me know [by personal email] if someone finds a regex that simulates str_split with mb_split.
mb_split
(PHP 4 >= 4.2.0, PHP 5)
mb_split — Split multibyte string using regular expression
Description
array mb_split ( string $pattern , string $string [, int $limit = -1 ] )
Splits a multibyte string using the regular expression pattern and returns the result as an array.
Parameters
- pattern
  The regular expression pattern.
- string
  The string being split.
- limit
  If the optional parameter limit is specified, the string is split into at most limit elements.
Return Values
The result as an array.
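As a minimal sketch of the signature above, the following splits the sample sentence from the user notes on whitespace, first without and then with a limit. The pattern `\s+` and the variable names are illustrative assumptions rather than part of the manual text, and the example assumes the source file is saved as UTF-8.

```php
<?php
// Assumption: the source file and the input string are UTF-8.
mb_regex_encoding('UTF-8');

$sentence = 'Weiße Rosen sind nicht grün!';

// mb_split takes the bare expression, without delimiters around the pattern.
$words = mb_split('\s+', $sentence);
// $words: 'Weiße', 'Rosen', 'sind', 'nicht', 'grün!'

// With a limit of 2, the remainder stays unsplit in the last element.
$firstAndRest = mb_split('\s+', $sentence, 2);
// $firstAndRest: 'Weiße', 'Rosen sind nicht grün!'

print_r($words);
print_r($firstAndRest);
?>
```

Unlike preg_split, the pattern is passed without delimiters or modifiers, so options such as /u do not apply here.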
Notes
Note: The internal encoding, or the character encoding specified by mb_regex_encoding(), will be used as the character encoding for this function.
See Also
- mb_regex_encoding() - Returns current encoding for multibyte regex as string
- mb_ereg() - Regular expression match with multibyte support
mb_split
gert dot matern at web dot de
03-Aug-2009 07:34
Sezer Yalcin
19-Feb-2009 10:13
To split a string into multibyte characters, use preg_split with the /u modifier instead of calling mb functions thousands of times.
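One way this suggestion might look in practice is sketched below. The helper name splitUtf8Chars is hypothetical; only preg_split and the PREG_SPLIT_NO_EMPTY flag are built-in, and the input is assumed to be valid UTF-8.

```php
<?php
// Hypothetical helper name; only preg_split and its flag are built-in.
function splitUtf8Chars($string)
{
    // An empty pattern with /u matches between every UTF-8 character;
    // PREG_SPLIT_NO_EMPTY drops the empty strings at both ends.
    $chars = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);

    // preg_split returns false when the input is not valid UTF-8.
    return $chars === false ? array() : $chars;
}

$chars = splitUtf8Chars('Weiße Rosen sind nicht grün!');
echo count($chars), PHP_EOL; // 28 characters, although strlen() reports 30 bytes
?>
```

The whole string is processed in a single call, which avoids one mb_substr call per character.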
adjwilli at yahoo dot com
27-Dec-2007 02:37
I figure most people will want a simple way to break up a multibyte string into its individual characters. Here's a function I'm using to do that. Change UTF-8 to your chosen encoding.
<?php
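function mbStringToArray ($string) {
    $array = array(); // initialize so an empty string returns an empty array
    $strlen = mb_strlen($string, "UTF-8");
    while ($strlen) {
        // Take the first character, then drop it from the string.
        $array[] = mb_substr($string, 0, 1, "UTF-8");
        $string = mb_substr($string, 1, $strlen, "UTF-8");
        $strlen = mb_strlen($string, "UTF-8");
    }
    return $array;
}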
?>
