How to manipulate strings in a list?

September 6, 2022

I have a list of strings in different formats. Some examples:


a=['08/58/13ND','08/58/16ND','08/58/18ND','114/15ND','04/2010/PB','AB/23/2016/CHE']

In each string, the last group of digits after a "/" is the year. Some strings already have a four-digit year, such as 2010 and 2016, and I want to leave those as they are. In the other strings the year has only two digits (13, 16, 18, 15, etc.) and is followed directly by letters.

I want those years in YYYY format, with a "/" separating the year from the trailing letters.

Expected Output:


['08/58/2013/ND','08/58/2016/ND','08/58/2018/ND','114/2015/ND','04/2010/PB','AB/23/2016/CHE']

>Solution :

You could try as follows:

import re
a = ['08/58/13ND','08/58/16ND','08/58/18ND','114/15ND','04/2010/PB','AB/23/2016/CHE']

pattern = r'(\d{2})(?=[A-Z]+$)'
b = [re.sub(pattern,r'20\1/', x) for x in a]

print(b)
# ['08/58/2013/ND', '08/58/2016/ND', '08/58/2018/ND', '114/2015/ND', '04/2010/PB', 'AB/23/2016/CHE']
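
If the real data may contain a four-digit year written directly before the letters (for example '04/2010PB', which is not in the question's sample), the pattern above would match its last two digits and produce '04/202010/PB'. A slightly more defensive sketch is shown here. The `(?<!\d)` guard, the `expand_year` name, and the `century` parameter are assumptions added for illustration, not part of the original answer.

```python
import re

# (?<!\d) is an added guard: it stops the pattern from matching the
# tail of a longer number, such as the '10' in '04/2010PB'.
SHORT_YEAR = re.compile(r'(?<!\d)(\d{2})(?=[A-Z]+$)')

def expand_year(code, century='20'):
    """Rewrite a trailing two-digit year as '<century>YY/' before the letter suffix."""
    # A function replacement avoids any ambiguity between the literal
    # century digits and the backreference.
    return SHORT_YEAR.sub(lambda m: f'{century}{m.group(1)}/', code)

a = ['08/58/13ND', '114/15ND', '04/2010/PB', '04/2010PB']
result = [expand_year(x) for x in a]
print(result)
```

Like the original, this assumes every two-digit year belongs to the 2000s; pass a different `century` if that does not hold for your data.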

Explanation r'(\d{2})(?=[A-Z]+$)':

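(\d{2}) is a capturing group for two digits. It matches only if those digits are followed by one or more characters in [A-Z] that run to the end of the string ($). That condition is expressed with the positive lookahead (?=[A-Z]+$), which checks the text that follows without consuming it, so the letters stay in place after the substitution.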
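
To see which strings the pattern actually touches, it can help to run `re.search` on a few of the sample values. This is only an illustrative check; the helper name `find_short_year` is made up for this example.

```python
import re

pattern = re.compile(r'(\d{2})(?=[A-Z]+$)')

def find_short_year(s):
    """Return the two digits the pattern captures, or None if nothing matches."""
    m = pattern.search(s)
    return m.group(1) if m else None

samples = ['08/58/13ND', '114/15ND', '04/2010/PB', 'AB/23/2016/CHE']
matches = {s: find_short_year(s) for s in samples}
print(matches)
# {'08/58/13ND': '13', '114/15ND': '15', '04/2010/PB': None, 'AB/23/2016/CHE': None}
```

The strings that already have a four-digit year do not match, because their digits are followed by "/" rather than by letters running to the end of the string.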

Explanation r'20\1/':