Issues when using re.finditer with a + sign character in the search string

November 16, 2022

I am using the following code to find the start index of several strings, including a temperature, all of which are read from a text file.
The list searchString contains what I’m looking for. The code does locate the index of the first character of each string. The issue is that unless I put a backslash in front of the string +25°C, finditer raises an error.
(Alternatively, if I remove the + sign, it works, but I need to look for the specific +25.) My question is whether I am escaping the + sign correctly, since the line: print('Looking for: ' + headerName + ' in the file: ' + filename )
displays: Looking for: \+25°C in the file: 123.txt (with the backslash showing in front of the +)
Am I just ‘getting away with this’, or is this escaping as it should?
Thanks

import re

path = 'C:\mypath\\'
searchString =["Power","Cal", "test", "Frequency", "Max", "\+25°C"]
filename = '123.txt' # file name to check for text

def search_str(file_path):
    with open(file_path, 'r') as file:
        content = file.read()

        for headerName in searchString:
            print('Looking for: ' + headerName + ' in the file: ' + filename )
            match =re.finditer(headerName, content)
            sub_indices=[]
            for temp in match:
                index = temp.start()
                sub_indices.append(index)   
            print(sub_indices ,'\n')

>Solution :

You should use the re.escape() function to escape your string pattern. It escapes all the special regex characters in the given string, for example:

>>> print(re.escape('+25°C'))
\+25°C
>>> print(re.escape('my_pattern with specials+&$@('))
my_pattern\ with\ specials\+\&\$@\(
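
To see why the escape matters, here is a minimal sketch, assuming Python 3.7 or later and a made-up sample string (not the asker's file). An unescaped leading + is a quantifier with nothing before it to repeat, so the pattern fails to compile, while the escaped pattern matches the literal text.

```python
import re

sample = "Max temp +25°C, min temp -25°C"  # hypothetical sample text

# An unescaped leading '+' is a quantifier with nothing to repeat,
# so compiling the pattern raises re.error.
try:
    re.compile("+25°C")
    compiled = True
except re.error:
    compiled = False

# re.escape() yields the same pattern as escaping the '+' by hand
# in a raw string.
escaped = re.escape("+25°C")
starts = [m.start() for m in re.finditer(escaped, sample)]

print(compiled)                # whether the unescaped pattern compiled
print(escaped == r"\+25°C")    # whether both spellings are the same pattern
print(starts)                  # start indices of the literal "+25°C"
```

This also speaks to the original question: in an ordinary (non-raw) string literal, Python leaves an unrecognized escape such as \+ in place as a backslash followed by +, which is why the hand-escaped version works and why the backslash shows up in the printed message. Recent Python versions emit a warning for such sequences, so a raw string (r"\+25°C") or re.escape() is the safer way to write it.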

So replace the entries in your searchString with plain literal strings (that is, "+25°C" without the backslash) and try:

def search_str(file_path):
    with open(file_path, 'r') as file:
        content = file.read()

    for headerName in searchString:
        print('Looking for: ' + headerName + ' in the file: ' + filename)
        # re.escape() makes every character in headerName match literally
        matches = re.finditer(re.escape(headerName), content)
        # Collect the start index of each match
        sub_indices = [m.start() for m in matches]
        print(sub_indices, '\n')
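
Since every entry in searchString is a literal string rather than a real pattern, a regex is not strictly required. As an alternative, here is a rough sketch using str.find; the helper name find_all and the sample content are hypothetical and only for illustration.

```python
def find_all(needle, haystack):
    """Return the start index of every non-overlapping occurrence of needle."""
    if not needle:
        return []  # an empty needle would otherwise never advance
    indices = []
    pos = haystack.find(needle)
    while pos != -1:
        indices.append(pos)
        # Resume the search just past the current match
        pos = haystack.find(needle, pos + len(needle))
    return indices

content = "Power +25°C Max +25°C"  # hypothetical file contents
print(find_all("+25°C", content))
```

With this approach nothing needs escaping, because str.find never interprets + as a special character. re.finditer with re.escape() remains the better fit if some search terms later become real patterns.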