Extract a substring from a string in Python?

March 2, 2023

I have two strings like the ones below:

1)500 Rahway Avenue, Westfield, NJ 07090
2)910 N Harbor Drive, San Diego CA 92101

I want to get two outputs from them. The first expected output:

1)Westfield
2)San Diego

And

1)NJ
2)CA

I tried the approach below to get NJ and CA:

s1.rsplit(" ")[-2]

But this is not the right approach.
Any help would be appreciated.

>Solution :

Assuming the city comes right before the state code, and that the state code always consists of two capital letters, you could use a regular expression like this:

import re

s = """500 Rahway Avenue, Westfield, NJ 07090
910 N Harbor Drive, San Diego CA 92101"""

results = re.findall(r", ([^,]*),? ([A-Z]{2}\b)", s)

print(results)

Output:

[('Westfield', 'NJ'), ('San Diego', 'CA')]
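To make the pattern easier to read, here is a sketch of the same regex written in verbose mode with named groups. The names ADDRESS_RE, city, state and split_city_state are only illustrative and not part of the original answer; the matching rules are meant to be identical.

```python
import re

# Same pattern as above, spelled out with re.VERBOSE.
# In verbose mode a literal space must be written as "[ ]".
ADDRESS_RE = re.compile(
    r"""
    ,[ ]                 # the comma and space that end the street part
    (?P<city>[^,]*)      # city: any run of characters other than a comma
    ,?[ ]                # optional comma, then the space before the state
    (?P<state>[A-Z]{2})  # state: exactly two capital letters
    \b                   # word boundary: the two letters must end the word
    """,
    re.VERBOSE,
)


def split_city_state(address):
    """Return (city, state), or None when the address does not match."""
    match = ADDRESS_RE.search(address)
    if match is None:
        return None
    return match.group("city"), match.group("state")


print(split_city_state("500 Rahway Avenue, Westfield, NJ 07090"))
print(split_city_state("910 N Harbor Drive, San Diego CA 92101"))
```

In the second address there is no comma after the city, so the greedy [^,]* first runs to the end of the line and then backtracks until a space followed by two capital letters fits.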

With the zip function you can turn that list of pairs into a sequence of cities and a sequence of state codes:

cities, states = zip(*re.findall(r", ([^,]*),? ([A-Z]{2}\b)", s))

print(cities)
print(states)

Output:

('Westfield', 'San Diego')
('NJ', 'CA')
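One caveat with the unpacking above: if nothing matches, re.findall returns an empty list, zip(*[]) yields nothing, and the two-name assignment raises ValueError. A minimal sketch of a guard, assuming you would rather get two empty tuples back in that case (the helper name cities_and_states is hypothetical):

```python
import re

PATTERN = r", ([^,]*),? ([A-Z]{2}\b)"


def cities_and_states(text):
    """Return (cities, states) as two tuples; both are empty if nothing matches."""
    pairs = re.findall(PATTERN, text)
    if not pairs:
        # Avoid "not enough values to unpack" when there are no matches.
        return (), ()
    cities, states = zip(*pairs)
    return cities, states


s = """500 Rahway Avenue, Westfield, NJ 07090
910 N Harbor Drive, San Diego CA 92101"""

print(cities_and_states(s))
print(cities_and_states("no address here"))
```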

If you have a separate string for each address, you could also use the same regex as follows:

import re

lst = [
    "500 Rahway Avenue, Westfield, NJ 07090",
    "910 N Harbor Drive, San Diego CA 92101"
]

for s in lst:
    match = re.search(r", ([^,]*),? ([A-Z]{2}\b)", s)
    if match:  # re.search returns None when the pattern is not found
        city, state = match.groups()
        print(city, state)
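For comparison, here is a sketch that stays closer to the rsplit attempt from the question and uses no regex. This is a different technique from the answer above, and it assumes every address ends with a state code and a ZIP code separated by whitespace, with a ", " before the city. The function name city_state_by_split is hypothetical.

```python
def city_state_by_split(address):
    """Split 'street, city[,] ST 12345' into (city, state) using str methods only."""
    # Peel off the last two whitespace-separated tokens: state and ZIP code.
    # This raises ValueError if the address has fewer than three tokens.
    rest, state, _zip_code = address.rsplit(None, 2)
    # The city is whatever follows the last ", " once a trailing comma is removed.
    city = rest.rstrip(",").rsplit(", ", 1)[-1]
    return city, state


for address in [
    "500 Rahway Avenue, Westfield, NJ 07090",
    "910 N Harbor Drive, San Diego CA 92101",
]:
    print(city_state_by_split(address))
```

Unlike the regex, this version does not check that the state is two capital letters, so it is only suitable when the input format is already known to be consistent.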