How to validate that a character belongs to the Portable Character Set

A number of POSIX standards refer to the Portable Character Set.

I need to check that user input conforms to the standards and consists only of acceptable characters. Is there a convenient way to do the check?

There is the tedious approach of manually porting the table from Wikipedia:

portable_set = '\0\a\b...'

def check(sample):
    return all(c in portable_set for c in sample)

But POSIX is all around us, so I believe such a set should already be defined somewhere in the Python standard library. I just don't know where to find it.

Solution:

I don't believe such a set is built into Python. If it did exist I'd expect it to reside in the string module, and it's not there.

However, Python does have string.printable, which consists of the digits, ASCII letters, punctuation, and whitespace (space, tab, newline, carriage return, vertical tab, form feed). That covers every member of the Portable Character Set except the first three: NUL, alert, and backspace. You can make your definition more terse by just tacking those three onto it:

import string

portable_set = set(string.printable + '\0\a\b')
def check(sample):
    return set(sample).issubset(portable_set)
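
To cross-check the shortcut above, here is a minimal sketch that builds the same set from ASCII code points rather than from string.printable, and reports which characters fail instead of returning only True or False. The names PORTABLE_CHARACTER_SET, find_non_portable, and is_portable are hypothetical helpers made up for this example, not part of the standard library.

```python
import string

# Assumed layout of the Portable Character Set in ASCII (103 characters):
# NUL, the controls BEL..CR (0x07-0x0D), and space..tilde (0x20-0x7E).
PORTABLE_CHARACTER_SET = frozenset(
    ["\0"]
    + [chr(code) for code in range(0x07, 0x0E)]
    + [chr(code) for code in range(0x20, 0x7F)]
)


def find_non_portable(sample):
    """Return the characters of sample that fall outside the set,
    without duplicates, in order of first appearance."""
    offenders = []
    for ch in sample:
        if ch not in PORTABLE_CHARACTER_SET and ch not in offenders:
            offenders.append(ch)
    return offenders


def is_portable(sample):
    """True if every character of sample is in the set."""
    return not find_non_portable(sample)
```

Comparing PORTABLE_CHARACTER_SET against set(string.printable + '\0\a\b') should confirm that the two definitions agree. Returning the offending characters can be handy when the check is used to build an error message for the user.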