I have a comma-separated CSV file, which contains:
file1a.extension.extension,file1b.extension.extension
file2a.extension.extension,file2b.extension.extension
The problem is that these files are named like file.extension.extension.
I'm trying to feed both columns to parallel while removing all extensions.
I tried some variations of:
cat /home/filepairs.csv | sed 's/\..*//' | parallel --colsep ',' echo column 1 = {1}.extension.extension column 2 = {2}
Which I expected to output
column 1 = file1a.extension.extension column 2 = file1b
column 1 = file2a.extension.extension column 2 = file2b
But outputs:
column 1 = file1a.extension.extension column 2 =
column 1 = file2a.extension.extension column 2 =
The sed command works, but it feeds only column 1 to parallel.
>Solution :
As currently written, the sed command prints only one name per line:
$ sed 's/\..*//' filepairs.csv
file1a
file2a
Where:
\. matches on first literal period (.)
.* matches rest of line (ie, everything after the first literal period to the end of the line)
// says to remove everything from the first literal period to the end of the line
I’m guessing what you really want is two names per line … one sed idea:
$ sed 's/\.[^,]*//g' filepairs.csv
file1a,file1b
file2a,file2b
Where:
\. matches on a literal period (.)
[^,]* matches on everything up to the next comma (or end of line)
//g says to remove the literal period and everything after it (up to a comma or end of line); the g says to do it repeatedly (in this case the replacement occurs twice per line, once for each column)
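To see the difference between the two patterns side by side, here is a small sketch that runs both sed commands on the two sample rows from the question. The variable names (rows, greedy, per_field) are only illustrative, and the data is fed in with printf instead of reading filepairs.csv.

```shell
# Sample rows copied from the question (stand-in for filepairs.csv).
rows='file1a.extension.extension,file1b.extension.extension
file2a.extension.extension,file2b.extension.extension'

# Greedy pattern: .* runs to the end of the line, so the comma and the
# whole second column are removed along with the extensions.
greedy=$(printf '%s\n' "$rows" | sed 's/\..*//')

# Per-field pattern: [^,]* stops at the comma, and the g flag repeats the
# substitution for the second column on the same line.
per_field=$(printf '%s\n' "$rows" | sed 's/\.[^,]*//g')

printf 'greedy:\n%s\n\nper-field:\n%s\n' "$greedy" "$per_field"
```

The greedy result has one name per line, while the per-field result keeps the comma, so a downstream tool that splits on ',' still sees two columns.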
NOTE: I don't have parallel on my system, so I'm unable to test that portion of OP's code.
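For completeness, here is a hedged sketch of how the corrected sed might plug into the rest of OP's pipeline. It assumes GNU parallel (the --colsep option and the {1}/{2} positional replacement strings are GNU parallel features). The -k (keep order) flag is an addition, not part of OP's command, so that output follows input order. run_pairs is a hypothetical helper name, and the while/read loop is only a plain-shell stand-in for machines without GNU parallel, not something parallel itself does.

```shell
# Sample rows copied from the question (stand-in for /home/filepairs.csv).
pairs='file1a.extension.extension,file1b.extension.extension
file2a.extension.extension,file2b.extension.extension'

# Hypothetical helper: reads "name1,name2" lines on stdin and prints one
# line per pair.
run_pairs() {
  if parallel --version </dev/null 2>/dev/null | grep -q 'GNU parallel'; then
    # GNU parallel splits each input line on ',' into {1} and {2}.
    parallel -k --colsep ',' echo column 1 = {1}.extension.extension column 2 = {2}
  else
    # Fallback when GNU parallel is not installed: split on ',' with read.
    while IFS=, read -r a b; do
      echo "column 1 = $a.extension.extension column 2 = $b"
    done
  fi
}

result=$(printf '%s\n' "$pairs" | sed 's/\.[^,]*//g' | run_pairs)
printf '%s\n' "$result"
```

With the real file, the printf would be replaced by reading /home/filepairs.csv directly, for example by passing the filename to sed as in the commands above.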