Why is the printed string shorter when a locale-specific character is used?

I wrote the following code. I’m trying to print a string of specified length with a non-ASCII character.

#include <locale.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    setlocale(LC_ALL, "pl_PL");
    printf("%-10sx\n", "ą");
    printf("%-10sx\n", "a");
    return 0;
}

The output is as follows:

ą        x
a         x

There is one (white)space less when a non-ASCII character is used. Why does it behave like this?

Solution:

Why is the printed string shorter when a locale-specific character is used?

Because the field width in a conversion such as %-10s is counted in bytes, and the number of bytes a multi-byte character takes is not equal to the number of columns needed to display it.

Why does it behave like this?

The string "ą" takes 2 bytes in UTF-8 (plus 1 byte for the terminating null character) but is displayed in 1 column. printf counts the 2 bytes, so there will be only 8 spaces of padding.

The string "a" has a length of 1 byte, so there will be 9 spaces of padding.
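The arithmetic above can be checked with a small sketch. The helper name spaces_emitted is hypothetical, and the string "ą" is written as its UTF-8 bytes \xc4\x85, on the assumption that the source file in the question was saved as UTF-8.

```c
#include <stdio.h>
#include <string.h>

/* Return how many padding spaces printf adds for "%-10s".
 * Hypothetical helper; assumes s is shorter than the 64-byte buffer. */
int spaces_emitted(const char *s)
{
    char buf[64];

    /* Format into a buffer instead of printing to the terminal. */
    snprintf(buf, sizeof buf, "%-10s", s);

    /* Everything after the string's own bytes is padding. */
    return (int)(strlen(buf) - strlen(s));
}
```

Under these assumptions, spaces_emitted("\xc4\x85") gives 8 and spaces_emitted("a") gives 9, which matches the output in the question.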

Is there any way to overcome this issue without manually changing the desired length when the string contains a non-ASCII character?

Use a library that has a database of mappings between characters and their display widths for the encoding you are using. Iterate over the string to get the number of columns needed to display it. Then subtract that number from the field width you want, and print that many spaces of padding yourself instead of relying on the width in the format string.
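As a rough illustration of that approach, here is a minimal sketch that uses only standard C. It counts characters with mbrtowc and assumes every character occupies exactly one column, which holds for "ą" but not for wide (for example CJK) or combining characters. On POSIX systems, wcwidth and wcswidth report the real column width and could replace the count. The names count_chars, padding_for and print_padded are hypothetical.

```c
#include <locale.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>

/* Count characters (not bytes) in s, decoded with the current locale's
 * encoding. Assumption: one character is one column wide.
 * Falls back to the byte count if s cannot be decoded. */
size_t count_chars(const char *s)
{
    mbstate_t state;
    const char *p = s;
    size_t bytes_left = strlen(s);
    size_t chars = 0;

    memset(&state, 0, sizeof state);
    while (bytes_left > 0) {
        size_t n = mbrtowc(NULL, p, bytes_left, &state);
        if (n == (size_t)-1 || n == (size_t)-2)
            return strlen(s);   /* invalid or incomplete sequence */
        if (n == 0)
            break;              /* decoded a null character */
        p += n;
        bytes_left -= n;
        chars++;
    }
    return chars;
}

/* Number of spaces needed to fill s out to width columns. */
size_t padding_for(const char *s, size_t width)
{
    size_t used = count_chars(s);
    return used < width ? width - used : 0;
}

/* Print s left-aligned, padded by characters rather than by bytes. */
void print_padded(const char *s, size_t width)
{
    printf("%s%*s", s, (int)padding_for(s, width), "");
}
```

After a successful setlocale(LC_ALL, "") in a UTF-8 environment, calling print_padded("ą", 10) and print_padded("a", 10), each followed by printing "x", should put both "x" characters in the same column. If no multi-byte locale is active, the helper may fall back to counting bytes and behave like the original printf call.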
