My input:
import pandas as pd

df = pd.DataFrame({
    'items': ['foo', 'bar', 'xyz'],
    'path': ['c/folderOne/folderTwo/folderThree', 'c/folder1/folder2/folder3', 'c/folderO/folderT/folderTh'],
})
I am trying to insert a fixed segment into the path column, right after the leading 'c/'.
My idea was to split each path into a list, insert a new substring such as NEWPATH, and join the list again, but I ran into an unexpected problem.
This is my code:
df['path'] = df['path'].map(lambda x: x.split("/"))
df['path'] = df['path'].map(lambda x: x.insert(1,'NEWPATH'))
df['path'] = df['path'].map(lambda x: x.join('/'))
The code returns None in the path column. After the second line of my code I expected df['path'] to hold something like ['c','NEWPATH','folderOne'....] in each row (cell) of the column.
>Solution :
A regex is probably a better fit here:
NEWPATH = 'abc'
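# regex=True is needed on pandas >= 2.0, where str.replace defaults to literal matching
df['path2'] = df['path'].str.replace(r'^([^/]+/)', rf'\1{NEWPATH}/', regex=True)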
NB: the result is assigned to a new column here for clarity.
Output:
   items                               path                                  path2
0    foo  c/folderOne/folderTwo/folderThree  c/abc/folderOne/folderTwo/folderThree
1    bar          c/folder1/folder2/folder3          c/abc/folder1/folder2/folder3
2    xyz         c/folderO/folderT/folderTh         c/abc/folderO/folderT/folderTh
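To see what the pattern does on a single string, here is a minimal sketch using the standard `re` module instead of pandas. The variable names `first` and `result` are just for illustration. It also uses `\g<1>` rather than `\1` in the replacement, which is a small precaution of mine in case NEWPATH ever starts with a digit (`\1` followed by a digit would be read as a different group number).

```python
import re

NEWPATH = 'abc'
path = 'c/folderOne/folderTwo/folderThree'

# ^([^/]+/) anchors at the start and captures everything up to and
# including the first slash, i.e. the first path segment.
first = re.match(r'^([^/]+/)', path).group(1)

# The replacement writes the captured prefix back, then the new segment.
# \g<1> is the unambiguous spelling of a reference to group 1.
result = re.sub(r'^([^/]+/)', rf'\g<1>{NEWPATH}/', path)

print(first)
print(result)
```

The same pattern and replacement strings should work with `Series.str.replace(..., regex=True)`, since pandas hands them to `re.sub` under the hood.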
To insert after the Nth path segment instead (here N = 3):
NEWPATH = 'abc'
df['path2'] = df['path'].str.replace(r'^((?:[^/]+/){3})', rf'\1{NEWPATH}/', regex=True)
or, with the position as a variable:
NEWPATH = 'abc'
N = 3
df['path2'] = df['path'].str.replace(rf'^((?:[^/]+/){{{N}}})', rf'\1{NEWPATH}/', regex=True)
Output:
   items                               path                                  path2
0    foo  c/folderOne/folderTwo/folderThree  c/folderOne/folderTwo/abc/folderThree
1    bar          c/folder1/folder2/folder3          c/folder1/folder2/abc/folder3
2    xyz         c/folderO/folderT/folderTh         c/folderO/folderT/abc/folderTh
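As for why the original attempt produced None: `list.insert` modifies the list in place and returns None, so the second `map` replaced every cell with None. Also, `join` is a method of the separator string, not of the list, so the last line needs `'/'.join(x)`. If you prefer the split/insert/join route over a regex, a rough sketch could look like this (the helper name `insert_segment` and its `position` parameter are made up for illustration, and plain Python lists stand in for the column):

```python
def insert_segment(path, segment, position=1):
    """Return path with segment inserted at the given index of its '/'-separated parts."""
    parts = path.split('/')
    parts.insert(position, segment)   # mutates parts in place; its return value is None
    return '/'.join(parts)            # join is called on the separator string

# list.insert itself returns None, which is what ended up in the column
returned = ['c', 'folderOne'].insert(1, 'NEWPATH')

paths = [
    'c/folderOne/folderTwo/folderThree',
    'c/folder1/folder2/folder3',
    'c/folderO/folderT/folderTh',
]
fixed = [insert_segment(p, 'NEWPATH') for p in paths]

print(returned)
for p in fixed:
    print(p)
```

With the DataFrame from the question, the helper could be applied as df['path'].map(lambda p: insert_segment(p, 'NEWPATH')), which keeps the whole transformation in a single `map` call so no intermediate list column is needed.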