Making duplicate record unique using awk

I am trying to use awk to identify duplicate records in a file and apply the changes directly to it. The file has six columns and no header. My aim is to make each duplicate record unique by appending a counter to its second column, incremented every time the record reappears. The data looks like this:

1 A B C D E
1 A B C D E   (This is a duplicate record1)
1 A B C D E   (This is a duplicate record2)
2 F G H I J
3 K L M N O

The desired output:

1 A   B C D E
1 A-1 B C D E
1 A-2 B C D E
2 F   G H I J
3 K   L M N O

Edit:

I tried this code from the post "How to rename duplicate lines with awk?":

awk 'cnt[$0]++{$0=$0" variant "cnt[$0]-1} 1' file

but it adds the numbers at the end of the record instead of to the second column.

>Solution :

With your shown samples, please try the following awk code. In one-liner form:

awk '++arr1[$0]>1{$2=$2"-"++arr[$2]}1' Input_file

Or, written across multiple lines:

awk '
++arr1[$0]>1{
  $2=$2"-"++arr[$2]
}
1
'  Input_file
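
To check the command against the sample from the question, here is a minimal sketch. It assumes the parenthetical notes in the sample are annotations and not part of the data, and it writes the sample to a temporary file created with mktemp. The variable names sample and result are only for illustration.

```shell
# Recreate the sample data from the question in a temporary file.
sample=$(mktemp)
printf '%s\n' '1 A B C D E' '1 A B C D E' '1 A B C D E' \
              '2 F G H I J' '3 K L M N O' > "$sample"

# Run the one-liner and capture what it prints.
result=$(awk '++arr1[$0]>1{$2=$2"-"++arr[$2]}1' "$sample")
printf '%s\n' "$result"
# 1 A B C D E
# 1 A-1 B C D E
# 1 A-2 B C D E
# 2 F G H I J
# 3 K L M N O

rm -f "$sample"
```

Assigning to $2 makes awk rebuild the record with single spaces between fields, so the extra padding shown after A, F and K in the desired output is not reproduced.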

Explanation: a commented version of the awk code above.

awk '                               ##Start of the awk program.
++arr1[$0]>1{                       ##Count the whole line in arr1; run the block only from its second occurrence onward.
  $2=$2"-"++arr[$2]                 ##Append "-" plus a running counter (kept in arr, keyed by the original $2) to the second field.
}
1                                   ##1 is an always-true pattern, so every line is printed, edited or not.
' Input_file                        ##Name of the input file.
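
The question asks for the changes to be applied directly to the file, while the command above only prints to standard output. One portable way to do this is to write to a temporary file and copy it back, sketched below. The helper name dedupe_in_place is hypothetical, and the sketch assumes mktemp is available. With GNU awk 4.1 or later, running gawk -i inplace with the same program should achieve the same result without the temporary file.

```shell
# Hypothetical helper: rewrite the file given as $1 so that repeated
# records get "-N" appended to their second column.
dedupe_in_place() {
  tmp=$(mktemp) || return 1
  if awk '++arr1[$0]>1{$2=$2"-"++arr[$2]}1' "$1" > "$tmp"; then
    # Copy the content back instead of moving the temp file,
    # so the original file keeps its permissions.
    cat "$tmp" > "$1"
    rm -f "$tmp"
  else
    rm -f "$tmp"
    return 1
  fi
}

# Usage: dedupe_in_place Input_file
```

The original file is only overwritten after awk has finished successfully, which avoids truncating the input while it is still being read.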