Iterate through a list in Python and delete the characters after the second instance of a character in each element


Sorry, very new to Python.

Essentially I have a long list of file names, some in the format NAME_XX123456 and others in the format NAME_XX123456_123456.

I need to remove everything from the second underscore onward in each element.
The code below only handles the first two elements, though, and when it encounters a second underscore it doesn’t delete the remainder, it just splits the string.

sample_list=['NAME_XX011024', 'NAME_XX011030_1234', 'NAME_XX011070', 'NAME_XX090119_15165']

shortlist=[]
item  = "_"
count = 0
i=0
for i in range(0,len(sample_list)):
        if(item in sample_list[i]):
               count =  count + 1
               if(count == 2):
                     shortlist.append(sample_list[i].rpartition("_"))
                     i+=1
                     
               if (count == 1):
                   shortlist.append(sample_list[i])
                   i+=1
                   
               
        print(shortlist)

Solution:

Here is a simple split-and-join approach: split each string on the underscore, keep the first two pieces, and join them back together using an underscore as the separator.

sample_list = ['NAME_XX011024', 'NAME_XX011030_1234', 'NAME_XX011070', 'NAME_XX090119_15165']
output = ['_'.join(x.split('_')[0:2]) for x in sample_list]
print(output)
# ['NAME_XX011024', 'NAME_XX011030', 'NAME_XX011070', 'NAME_XX090119']
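
If some names might contain more than two underscores, or none at all, the same idea can be wrapped in a small helper. This is only a sketch and not part of the original answer: the function name trim_after_second_underscore is hypothetical, and passing maxsplit=2 to str.split is an optional tweak that stops splitting once the second underscore is reached.

```python
def trim_after_second_underscore(name: str) -> str:
    """Return name cut off just before its second underscore, if it has one."""
    # maxsplit=2 yields at most three pieces; everything after the
    # second underscore stays in the (discarded) third piece.
    pieces = name.split("_", 2)
    # Keep the first two pieces; a name with fewer underscores is unchanged.
    return "_".join(pieces[:2])


sample_list = ['NAME_XX011024', 'NAME_XX011030_1234', 'NAME_XX011070', 'NAME_XX090119_15165']
output = [trim_after_second_underscore(x) for x in sample_list]
print(output)
```

Names with no underscore, or with only one, come back unchanged, so the helper can be applied to the whole list without checking each element first.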

You could also use regular expressions here:

import re

sample_list = ['NAME_XX011024', 'NAME_XX011030_1234', 'NAME_XX011070', 'NAME_XX090119_15165']
output = [re.sub(r'([^_]+_[^_]+)_.*', r'\1', x) for x in sample_list]
print(output)
# ['NAME_XX011024', 'NAME_XX011030', 'NAME_XX011070', 'NAME_XX090119']
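
As for why the loop in the question stops after two elements: count is never reset, so it counts list elements that contain an underscore rather than underscores within one name, and from the third element on neither branch matches. Also, rpartition returns a 3-tuple (head, separator, tail), which is why the second entry ends up split instead of shortened. A possible explicit-loop version, assuming you want to keep the loop structure, could look like the sketch below; the variable names first and second are illustrative only.

```python
sample_list = ['NAME_XX011024', 'NAME_XX011030_1234', 'NAME_XX011070', 'NAME_XX090119_15165']

shortlist = []
for name in sample_list:
    # Locate the first underscore, then search for a second one after it.
    first = name.find("_")
    second = name.find("_", first + 1) if first != -1 else -1
    if second != -1:
        # Slice away the second underscore and everything after it.
        shortlist.append(name[:second])
    else:
        # Fewer than two underscores: keep the name as it is.
        shortlist.append(name)

# Print once, after the loop has finished.
print(shortlist)
```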
