I am converting a .txt file into labels.csv by adding some columns to a data frame. How can I remove the folder prefix (for example images/0/) from a column that contains paths such as images/1/19997.jpg, images/1/19998.jpg, images/1/19999.jpg? The number after images/ is a folder name, and it varies from time to time.
Code
import pandas as pd
# Read space-separated columns without header
data = pd.read_csv('/media/cvpr/CM_24/synthtiger/results/gt.txt', sep=r"\s+", header=None)
# Update columns
data.columns = ['filename', 'words']
# Save to required format
data.to_csv('labels.csv')
>Solution :
There is probably a more efficient method using slicing (assuming the filenames have a fixed format), but you can use os.path.basename. It returns the final component of a path, so the folder part is dropped no matter which folder name appears.
import os
data['filename_clean'] = data['filename'].apply(os.path.basename)
![Result Example](https://i.stack.imgur.com/t8znU.png)
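For reference, here is a minimal, self-contained sketch of both approaches. The sample rows are made up for illustration and only stand in for the contents of gt.txt, and the column name filename_split is a hypothetical name chosen here. The second option uses pandas' string methods to split on the last "/" instead of calling os.path.basename per row; it assumes the paths always use "/" as the separator.

```python
import os

import pandas as pd

# Hypothetical sample rows standing in for the contents of gt.txt
data = pd.DataFrame({
    "filename": ["images/0/1.jpg", "images/1/19997.jpg", "images/12/19998.jpg"],
    "words": ["foo", "bar", "baz"],
})

# Option 1: apply os.path.basename to each path
data["filename_clean"] = data["filename"].apply(os.path.basename)

# Option 2: split once on the last "/" and keep the final piece
# (assumes "/" is always the path separator)
data["filename_split"] = data["filename"].str.rsplit("/", n=1).str[-1]

print(data[["filename", "filename_clean", "filename_split"]])
```

Both columns should hold the same values. If you want the cleaned name to replace the original column, assign the result back to data['filename'] before calling to_csv.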