I have two strings like the ones below:
1)500 Rahway Avenue, Westfield, NJ 07090
2)910 N Harbor Drive, San Diego CA 92101
I want to get two outputs. The cities:
1)Westfield
2)San Diego
And the state codes:
1)NJ
2)CA
I tried the approach below to get NJ and CA:
s1.rsplit(" ")[-2]
But this is not the right approach.
Any help would be appreciated.
>Solution :
Assuming the city is placed right before the state code, and that the state code always consists of two capital letters, you could use a regular expression like this:
import re
s = """500 Rahway Avenue, Westfield, NJ 07090
910 N Harbor Drive, San Diego CA 92101"""
results = re.findall(r", ([^,]*),? ([A-Z]{2}\b)", s)
print(results)
Output:
[('Westfield', 'NJ'), ('San Diego', 'CA')]
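To see what each part of that pattern does, here is the same regex spelled out with re.VERBOSE. This is only an annotated restatement of the pattern above, not a different method; the name PATTERN is made up for this sketch, and the spaces are written as [ ] because verbose mode ignores bare whitespace.

```python
import re

# Annotated restatement of the pattern above.
# PATTERN is a hypothetical name introduced for this sketch.
PATTERN = re.compile(r"""
    ,[ ]            # the comma and space that end the street part
    ([^,]*)         # group 1: the city, any run of characters except a comma
    ,?[ ]           # an optional comma, then a space
    ([A-Z]{2}\b)    # group 2: two capital letters ending at a word boundary
""", re.VERBOSE)

s = """500 Rahway Avenue, Westfield, NJ 07090
910 N Harbor Drive, San Diego CA 92101"""

results = PATTERN.findall(s)
print(results)  # [('Westfield', 'NJ'), ('San Diego', 'CA')]
```

The optional comma is what lets the same pattern handle both "Westfield, NJ" and "San Diego CA".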
With the zip function you can turn that into a sequence of cities and a sequence of state codes:
cities, states = zip(*re.findall(r", ([^,]*),? ([A-Z]{2}\b)", s))
print(cities)
print(states)
Output:
('Westfield', 'San Diego')
('NJ', 'CA')
If each address is a separate string, you could apply the same regex to each one with re.search:
import re
lst = [
"500 Rahway Avenue, Westfield, NJ 07090",
"910 N Harbor Drive, San Diego CA 92101"
]
for s in lst:
    # groups() returns the two captured parts: the city and the two-letter state code
    city, state = re.search(r", ([^,]*),? ([A-Z]{2}\b)", s).groups()
    print(city, state)
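Note that re.search returns None when a string does not fit the pattern, so calling .groups() directly would raise an AttributeError on such input. A minimal sketch of a guard, assuming you would rather get None back than an exception (ADDRESS_RE and split_city_state are hypothetical names, not part of the answer above):

```python
import re

# Same pattern as above, compiled once for reuse.
# ADDRESS_RE and split_city_state are hypothetical names for this sketch.
ADDRESS_RE = re.compile(r", ([^,]*),? ([A-Z]{2}\b)")

def split_city_state(address):
    """Return (city, state), or None if the address does not fit the pattern."""
    match = ADDRESS_RE.search(address)
    if match is None:
        return None
    return match.groups()

print(split_city_state("500 Rahway Avenue, Westfield, NJ 07090"))  # ('Westfield', 'NJ')
print(split_city_state("910 N Harbor Drive, San Diego CA 92101"))  # ('San Diego', 'CA')
print(split_city_state("no address here"))                         # None
```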