re.findall not reading a file

December 9, 2021

I have a simple regex script to find IPs from a text file and add them to a list.

import re


pattern = r'^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$'
ip_list = []
ifilepath = input('Please specify full file path location' + '\n')
with open(ifilepath) as inputf:
    ips = re.findall(pattern, inputf.read())
    print(ips) ##Just to test if re.findall is matching against the file
    print(ifilepath)
    for ip in ips:
        ip_list.append(ip)

print('IPs matching the regex pattern: ')
print(ip_list)
print('\n')

After running, the output that I am seeing:

Please specify full file path location
C:\Users\Samson\Desktop\IP.txt

[]
C:\Users\Samson\Desktop\IP.txt
IPs matching the regex pattern: 
[]

It seems that re.findall() is not matching anything in the file, although a similar script using re.match() works. A bit of a head scratcher: what am I missing here?

Sample input text file

192.168.0.1 proxy123 10.10.0.1
192.168.0.2 httpstatus=404 proxy_result=block 10.10.0.2
192.163.0.3 %%%
192.168.0.4
abcde
%&&%#(%#(%#

Solution:

You need to remove the ^ and $ anchors, which match only at the start and end of the whole string (or of each line when re.M is set).

Consider:

>>> print(t)
192.168.0.1 proxy123 10.10.0.1
192.168.0.2 httpstatus=404 proxy_result=block 10.10.0.2
192.163.0.3 %%%
192.168.0.4
abcde
%&&%#(%#(%#

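Your pattern finds no matches because the text spans multiple lines: without re.M, ^ matches only at the very start of the string and $ only at its very end (or just before a final newline), so the whole file would have to be a single IP:

>>> re.findall(r'^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$', t)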
[]

The line '192.168.0.4' is found once you add the re.M flag, provided that line has no trailing whitespace:

>>> re.findall(r'^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$', t, flags=re.M)
['192.168.0.4']

Dropping the anchors altogether matches every IP-shaped substring, wherever it sits on a line:

>>> re.findall(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}', t)
['192.168.0.1', '10.10.0.1', '192.168.0.2', '10.10.0.2', '192.163.0.3', '192.168.0.4']
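
One caveat with the unanchored pattern: \d{1,3} only checks the shape of each field, so a string such as 999.1.1.1 would be returned as well. If that matters, one option is to filter the candidates afterwards. The following is a minimal sketch, assuming the standard-library ipaddress module is acceptable; the helper name find_valid_ips and the sample text are made up for illustration and are not part of the original answer.

```python
import ipaddress
import re

# Same unanchored pattern as above: four dot-separated runs of 1-3 digits.
CANDIDATE = re.compile(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}')


def find_valid_ips(text):
    """Return IPv4-shaped substrings of text that ipaddress accepts.

    Hypothetical helper for illustration only.
    """
    valid = []
    for candidate in CANDIDATE.findall(text):
        try:
            ipaddress.IPv4Address(candidate)
        except ValueError:
            # Rejected by ipaddress, e.g. an octet above 255.
            continue
        valid.append(candidate)
    return valid


# Illustrative input: the second line has the right shape but is not a valid address.
sample = "192.168.0.1 proxy123 10.10.0.1\n999.1.1.1 bad\n192.168.0.4\n"
print(find_valid_ips(sample))
```

The regex still does the searching; ipaddress is only used as a second pass to reject candidates that are not real IPv4 addresses.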

Your anchored pattern does work if you first split the text into whitespace-separated tokens and test each token on its own:

>>> pattern=r'^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$'
>>> [s for s in t.split() if re.match(pattern, s)]
['192.168.0.1', '10.10.0.1', '192.168.0.2', '10.10.0.2', '192.163.0.3', '192.168.0.4']
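
The split-and-match approach can probably also be folded into a single findall call. A sketch, assuming IPs in the file are always delimited by whitespace as in the sample input: replace ^ and $ with lookarounds that require whitespace (or the string boundary) on each side. The names TOKEN_IP and find_ip_tokens are illustrative, and the last line of the sample text is added here only to show what gets rejected.

```python
import re

# (?<!\S) : the match must not be preceded by a non-whitespace character
# (?!\S)  : the match must not be followed by a non-whitespace character
TOKEN_IP = re.compile(r'(?<!\S)\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}(?!\S)')


def find_ip_tokens(text):
    """Return IP-shaped substrings that form a whole whitespace-delimited token."""
    return TOKEN_IP.findall(text)


sample = (
    "192.168.0.1 proxy123 10.10.0.1\n"
    "192.168.0.2 httpstatus=404 proxy_result=block 10.10.0.2\n"
    "192.163.0.3 %%%\n"
    "192.168.0.4\n"
    "abcde x1.2.3.4 1.2.3.4.5\n"  # neither of these tokens is a bare IP
)
print(find_ip_tokens(sample))
```

In the script from the question, this would be used as ips = find_ip_tokens(inputf.read()). Unlike the fully unanchored pattern, it skips IP-like fragments embedded in longer tokens such as x1.2.3.4 or 1.2.3.4.5.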