Python regex pattern to find whether a line of code ends with a space or tab character

Sorry for asking such a basic question, but I really tried to look for the answer before coming here.
I have a script that searches inside .py files and reads their code line by line. The goal of the script is to find whether a line ends with a space or a tab, as in the example below:

i = 5 
z = 25 

After the i assignment there should be a trailing space (\s), and after the z assignment a trailing tab (\t). (I hope the code formatting does not erase them.)

import fileinput
import logging
import os
import re


def custom_checks(file, rule):
    """
    @param file: file: file in-which you search for a specific character
    @param rule: the specific character you search for
    @return: dict obj with the form { line number : character }
    """
    rule=re.escape(rule)
    logging.info(f"     File {os.path.abspath(file)} checked for {repr(rule)} inside it ")
    result_dict = {}

    file = fileinput.input([file])
    for idx, line in enumerate(file):
        if re.search(rule, line):
            result_dict[idx + 1] = str(rule)

    file.close()
    if not result_dict:
        logging.info(f"Zero non-compliance found based on the rule: {rule!r}")
    else:
        logging.warning(f'Found the next errors:{result_dict}')

    # the docstring promises the dict, so return it
    return result_dict

After that, when I check the logging output, I see this:
checked for ‘\+s\\s\$’ inside it
I don't know why the backslashes are doubled.
Also, I read all the regexes from a config.json, which is this one:

{
  "ends with tab":"+\\t$",
  "ends with space":"+s\\s$"

}

Could someone please point me in the right direction? I know I could do this in other ways, such as reversing the line with [::-1], taking the first character, and checking whether it is whitespace, but I really want to do it with a regex.
Thanks!

>Solution :

Try:

import re

rules = {
  'ends with tab': re.compile(r'\t$'),
  'ends with space': re.compile(r' $'),
}
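As to why the patterns in the question never behave like regexes: a likely cause is the call to re.escape(), which turns every special character into a literal, combined with repr(), which displays each backslash doubled in the log. The sketch below illustrates this under the assumption that the config value decodes to the text \t$; the variable names are made up for the example. It also shows that a leading + (as in the question's config) is not a valid pattern once it is no longer escaped.

```python
import re

# Pattern text as it would look after the JSON config has been decoded.
pattern_from_config = r"\t$"

# re.escape() makes every special character literal.
escaped = re.escape(pattern_from_config)

line = "z = 25\t\n"

# Used as-is, the pattern means "a tab, then end of line".
print(bool(re.search(pattern_from_config, line)))  # True

# Escaped, it means "a backslash, the letter t, a dollar sign".
print(bool(re.search(escaped, line)))              # False

# print() shows the real characters, repr() doubles every backslash again.
print(escaped)
print(repr(escaped))

# A pattern starting with "+" has nothing to repeat and fails to compile.
try:
    re.compile(r"+\t$")
except re.error as exc:
    print("invalid pattern:", exc)
```

So the doubled backslashes in the log are partly real (added by re.escape) and partly only a display effect of repr.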

Note: iterating over a file leaves the newline ('\n') at the end of each line, but without re.MULTILINE, $ in a regex matches both at the very end of the string and just before a newline that is the last character of the string. Thus, if using regex, you don’t need to explicitly strip newlines.

if rule.search(line):
    ...
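To make the behaviour of $ concrete, here is a small sketch using the space rule from above (the name trailing_space is only for this example):

```python
import re

trailing_space = re.compile(r" $")

# Space directly before the final newline: matches.
print(bool(trailing_space.search("i = 5 \n")))  # True

# Space at the very end of a string with no newline: matches.
print(bool(trailing_space.search("i = 5 ")))    # True

# No trailing space: no match.
print(bool(trailing_space.search("i = 5\n")))   # False

# Without re.MULTILINE, "$" does not match before interior newlines,
# so the space after "a" is not reported here.
print(bool(trailing_space.search("a \nb\n")))   # False
```

Since each line read from a file contains at most one newline, at its end, the last case does not arise when checking line by line.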

Personally, however, I would use line.rstrip() != line.rstrip('\n') to flag trailing whitespace of any kind in one shot.
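A minimal sketch of that comparison, wrapped in a hypothetical helper named has_trailing_whitespace:

```python
def has_trailing_whitespace(line: str) -> bool:
    # rstrip() removes all trailing whitespace, rstrip("\n") only newlines.
    # If the two results differ, something other than a newline was stripped.
    return line.rstrip() != line.rstrip("\n")


samples = ["i = 5 \n", "z = 25\t\n", "ok = True\n", "\n", "   \n"]
for sample in samples:
    print(repr(sample), has_trailing_whitespace(sample))
```

Note that a line consisting only of spaces is flagged as well, while a completely empty line is not.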

If you want to directly check for specific characters at the end of the line, you then need to strip any newline, and you need to check if the line isn’t empty. For example:

char = '\t'
s = line.rstrip('\n')  # strip only the trailing newline, not leading ones

if s and s[-1] == char:
    ...
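The same check can be sketched as a small function. The name ends_with_char is hypothetical; str.endswith is shown as an alternative that handles the empty-line case by itself.

```python
def ends_with_char(line: str, char: str) -> bool:
    # Drop the trailing newline, then compare the last character.
    s = line.rstrip("\n")
    return bool(s) and s[-1] == char


def ends_with_char_alt(line: str, char: str) -> bool:
    # str.endswith returns False for an empty string, so no extra guard is needed.
    return line.rstrip("\n").endswith(char)


for sample in ["z = 25\t\n", "i = 5 \n", "\n", ""]:
    print(repr(sample), ends_with_char(sample, "\t"), ends_with_char_alt(sample, "\t"))
```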

Addendum: read rules from JSON config

import json
import re

# here from a string, but could be in a file, of course
json_config = """
{
    "ends with tab": "\\t$",
    "ends with space": " $"
}
"""

rules = {k: re.compile(v) for k, v in json.loads(json_config).items()}
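
Putting the pieces together, one possible shape for the checker is sketched below. It is only an illustration: check_lines and the sample lines are made up, and the config is held in a raw string so that its text is exactly what a config.json file would contain.

```python
import json
import re

# Raw string: the JSON text contains \\t, which JSON decodes to the regex \t.
json_config = r"""
{
    "ends with tab": "\\t$",
    "ends with space": " $"
}
"""

rules = {name: re.compile(pattern) for name, pattern in json.loads(json_config).items()}


def check_lines(lines, rules):
    """Return {line number: [names of the rules that matched]}."""
    findings = {}
    for number, line in enumerate(lines, start=1):
        matched = [name for name, rule in rules.items() if rule.search(line)]
        if matched:
            findings[number] = matched
    return findings


# Stand-in for the lines of a .py file; an open file object works the same way.
source = ["i = 5 \n", "z = 25\t\n", "total = i + z\n"]
print(check_lines(source, rules))
# {1: ['ends with space'], 2: ['ends with tab']}
```

Because the patterns are compiled once and never passed through re.escape, they keep their regex meaning.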