Reading data from an ANSI text file with special characters in Yii2 is shifting values, why?

My question is connected to this thread: How to use a text file as database in Yii2

I have managed to read data from the structured, ANSI-encoded text file (not CSV; the positions and lengths of the columns are specified) and show it in a GridView. I have saved the column position and length definitions in a MySQL table so that I can access them. Now I read these column definitions and can successfully read each column:

foreach ($lines as $line) {
    if (substr($line, 0, 1) === $type) {
        $model = new DynamicModel($dynamicAttributes);
        foreach ($fielddefs as $fielddef) {
            $attributeName = $fielddef->attributes['variable'];
            $value = substr($line, $fielddef->pos, $fielddef->len);
            $model->$attributeName = $value;
        }
        $models[] = $model;
    }
}
return $models;

This works until a special character such as ü or ö appears in one of the columns, for example in the description. In that case the values in the following columns are shifted, so the special character seems to change how the later columns are read: the positions or lengths no longer match. I tried converting the whole file to UTF-8 with mb_convert_encoding(), but unfortunately that didn't help. As soon as I remove such special characters, the whole record is read perfectly, all columns included. Has anyone seen something like this before? How could I fix it? Thanks.

This is what the text file looks like:

[image]

And this is what the result looks like in the GridView:

[image]

Solution:

Your data contains multi-byte characters, and substr() counts bytes rather than characters, so you will need to use mb_substr():

mb_substr(
    string $string,
    int $start,
    ?int $length = null,
    ?string $encoding = null
): string

mb_substr() performs a multi-byte safe substr() operation based on the number of characters. Positions are counted from the beginning of the string: the first character is at position 0, the second at position 1, and so on.
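To make the difference concrete, here is a minimal sketch comparing the two functions on one made-up record. The record layout (a 10-character description followed by a 5-character code) and the variable names are assumptions for illustration only, and the mbstring extension is assumed to be enabled.

```php
<?php
// Hypothetical fixed-width record: 10-character description, then a 5-character code.
// "\u{00FC}" is "ü" written as an escape, so the string is UTF-8 regardless of
// how this source file is saved. In UTF-8, "ü" occupies 2 bytes.
$line = "M\u{00FC}ller    A1234";

// Byte-based slicing: the 2-byte "ü" pushes everything after it one byte to the right.
$descBytes = substr($line, 0, 10);   // "Müller" plus 3 spaces (only 9 characters)
$codeBytes = substr($line, 10, 5);   // " A123" (shifted by one position)

// Character-based slicing: positions match the column definitions again.
$descChars = mb_substr($line, 0, 10, 'UTF-8');  // "Müller" plus 4 spaces
$codeChars = mb_substr($line, 10, 5, 'UTF-8');  // "A1234"

var_dump($codeBytes, $codeChars);
```

Passing the encoding explicitly avoids depending on the configured default internal encoding.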

You may see the effect of this function in this sandbox
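Applied to the loop from the question, the change could look roughly like the sketch below. It assumes that "ANSI" here means Windows-1252 and that the column definitions are available as plain arrays with `variable`, `pos` and `len` keys. The function name `parseFixedWidthLine` and the sample record are hypothetical and not part of Yii2.

```php
<?php
/**
 * Slice one fixed-width record into named values.
 *
 * $fieldDefs is a hypothetical list of arrays with the keys
 * 'variable', 'pos' and 'len', mirroring the definitions stored in MySQL.
 * The source encoding is an assumption; adjust it to the real file.
 */
function parseFixedWidthLine(string $rawLine, array $fieldDefs, string $sourceEncoding = 'Windows-1252'): array
{
    // Convert once to UTF-8 so the values display correctly in the view.
    $line = mb_convert_encoding($rawLine, 'UTF-8', $sourceEncoding);

    $values = [];
    foreach ($fieldDefs as $def) {
        // Count characters, not bytes, and name the encoding explicitly.
        $values[$def['variable']] = mb_substr($line, $def['pos'], $def['len'], 'UTF-8');
    }
    return $values;
}

// Made-up record: "\xFC" is the single byte for "ü" in Windows-1252.
$raw = "1M\xFCller    A1234";
$defs = [
    ['variable' => 'type',        'pos' => 0,  'len' => 1],
    ['variable' => 'description', 'pos' => 1,  'len' => 10],
    ['variable' => 'code',        'pos' => 11, 'len' => 5],
];

$row = parseFixedWidthLine($raw, $defs);
var_dump($row);
```

Because Windows-1252 stores every character in a single byte, slicing the raw, unconverted line with substr() and converting each value afterwards should give the same result. The shift only appears when byte offsets are applied to text that has already been converted to UTF-8.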
