Testimonials  >

The "Technoligiya Bezopasnoisti" (Eng. Security Systems) company is engaged in the wide spectrum of activities in the field of security systems and electric equipment. The enterprise offers their customers practically everything they need...

"Tekhnologiya bezopasnosti",
Glazkov A.

Articles  >

Articles  >  Programming  >  PHP & UTF-8. Chapter 2

Continuing the theme of work with strings encoded in UTF-8, we shall consider some more functions (utf8_strpos and utf8_substr_count), working without Multibyte String Functions extension:

function utf8_strpos($haystack, $needle, $offset = 0)
{
    # get substring (if isset offset param)
    $offset = ($offset<0) ? 0 : $offset;
    if ($offset>0)
    {
        preg_match('/^.{' . $offset . '}(.*)/us', $haystack, $dummy);
        $haystack = (isset($dummy[1])) ? $dummy[1] : '';
    }

    # get relative pos
    $p = strpos($haystack, $needle);
    if ($haystack=='' or $p===false) return false;
    $r = $offset;
    $i = 0;

    # calc real pos
    while($i<$p)
    {
        if (ord($haystack[$i])<128)
        {
            # ascii symbol
            $i = $i + 1;
        }
        else
        {
            # non-ascii symbol with variable length
            # (handling first byte)
            $bvalue = decbin(ord($haystack[$i]));       
            $i = $i + strlen(preg_replace('/^(1+)(.+)$/', '\1', $bvalue));
        }
        $r++;
    }
    return $r;
}

function utf8_substr_count($h, $n)
{
    # preparing $n for using in reg. ex.
    $n = preg_quote($n, '/');

    # select all matches
    preg_match_all('/' . $n . '/u', $h, $dummy);
    return count($dummy[0]);
}

See also: PHP & UTF-8. Chapter 1.

← To publications list

Nikolay I. Yarovoy,
03/19/2006.

Last projects:  Contact lens, Ekaterinburg

Back to top© 2021 ControlStyle, web site development. All rights reserved.
Web site promotion and advertising.