Storing polish character as variable in PHP

I am creating a simple registration form in which you must type your nickname into a textarea. The problem occurs when the user enters Polish characters (ś, ż, ć, etc.). When I display the whole string with echo, it looks just as it should, but when I try to display only one character, it shows this weird � symbol.

function ech($nam)
{
    echo $nam;
    echo "<br>";
    echo $nam[1];
}

$te = $_POST['sth']; //$te equals "śżć" now
ech($te);

Output:

śżć
�

Solution:

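Indexing a string with an offset, as in $nam[1], returns only a single byte, but each of these characters is encoded as multiple bytes in UTF-8. Use multibyte-safe string functions instead: mb_substr($nam, 0, 1) returns the first character, and mb_substr($nam, 1, 1) is the character-based equivalent of $nam[1].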
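
As a minimal sketch of that fix, the function from the question could be rewritten as follows. The name ech_mb is hypothetical, the literal "śżć" stands in for $_POST['sth'], and the code assumes the mbstring extension is enabled and the input is UTF-8.

```php
<?php
// Sketch: multibyte-safe variant of the ech() function from the question.
// Assumptions: mbstring is enabled and the input string is UTF-8.

function ech_mb(string $nam): string
{
    // mb_substr counts characters, not bytes:
    // start at character index 1 and take 1 character.
    $second = mb_substr($nam, 1, 1, 'UTF-8');

    // Whole string, a line break, then the second character.
    return $nam . "<br>" . $second;
}

$te = "śżć"; // stand-in for $_POST['sth']
echo ech_mb($te);
```

Passing the encoding explicitly avoids depending on the server's default internal encoding setting.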

  • PHP strings are byte arrays (in contrast to, for example, JavaScript, where strings are sequences of UTF-16 code units). In UTF-8 the string "ś" contains 2 bytes: strlen("ś") gives you 2 and bin2hex("ś") gives you "c59b". When you do $str[0], you fetch only the first of the 2 bytes that make up ś, which on its own is not a valid character, hence the � you get. (FWIW, echo $str[0].$str[1] would also work, because ś happens to be 2 bytes and you would be fetching both of them manually.)
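
The byte-versus-character distinction in the comment above can be checked with a short script. This is only an illustrative sketch: it assumes the file is saved as UTF-8, that mbstring is enabled, and that PHP is 7.4 or later (for mb_str_split); the variable names are made up for the example.

```php
<?php
// Sketch: bytes versus characters for a UTF-8 string in PHP.
// Assumptions: file saved as UTF-8, mbstring enabled, PHP 7.4+.

$str = "ś";

$byteCount  = strlen($str);                        // counts bytes
$charCount  = mb_strlen($str, 'UTF-8');            // counts characters
$hex        = bin2hex($str);                       // raw bytes as hex
$firstByte  = bin2hex($str[0]);                    // only the lead byte
$rejoined   = $str[0] . $str[1];                   // both bytes form ś again
$validAlone = mb_check_encoding($str[0], 'UTF-8'); // lone lead byte is invalid UTF-8

// mb_str_split splits a string into whole characters instead of bytes.
$chars = mb_str_split("śżć", 1, 'UTF-8');

var_dump($byteCount, $charCount, $hex, $firstByte, $rejoined === $str, $validAlone);
print_r($chars);
```

Iterating over the result of mb_str_split is one way to process a nickname character by character without ever touching individual bytes.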