I have this two-column file:
ctg0F chr_1
ctg1F chr_2
ctg2F chr_3
ctg3F chr_4
ctg4F chr_5
ctg5F chr_6
ctg6F chr_4
ctg7F chr_7
ctg8F chr_8
The values in the first column are all different. I'd like to add an index only to the values that are repeated in the second column. Here chr_4 appears twice, so the desired output is:
ctg0F chr_1
ctg1F chr_2
ctg2F chr_3
ctg3F chr_4_1
ctg4F chr_5
ctg5F chr_6
ctg6F chr_4_2
ctg7F chr_7
ctg8F chr_8
I tried this:
awk '{ if (++count[$2]>1) print $1,$2"_"count[$2]; else print $1,$2"_"count[$2]}'
But this also adds the index "_1" to the unique values.
Any help?
Solution:
$ awk '
NR==FNR { tot[$2]++; next }
{ print $0 (tot[$2]>1 ? "_" (++cnt[$2]) : "") }
' file file
ctg0F chr_1
ctg1F chr_2
ctg2F chr_3
ctg3F chr_4_1
ctg4F chr_5
ctg5F chr_6
ctg6F chr_4_2
ctg7F chr_7
ctg8F chr_8
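
The command above reads the file twice: the first pass (NR==FNR) counts how often each second-column value occurs, and the second pass appends a running index only where that count is greater than 1. If the input cannot be read twice (for example, when it arrives through a pipe), a single-pass variant that buffers the lines in memory might look like the sketch below. It is only an illustration: the temporary file and the array names line, key, tot and cnt are made up for this example, and it assumes the whole input fits in memory.

```shell
# Hypothetical sample input, same data as in the question.
input=$(mktemp)
printf '%s\n' \
    'ctg0F chr_1' 'ctg1F chr_2' 'ctg2F chr_3' \
    'ctg3F chr_4' 'ctg4F chr_5' 'ctg5F chr_6' \
    'ctg6F chr_4' 'ctg7F chr_7' 'ctg8F chr_8' > "$input"

result=$(awk '
    # Buffer every line and count each second-column value.
    { line[NR] = $0; key[NR] = $2; tot[$2]++ }
    END {
        # Replay the lines in their original order.
        for (i = 1; i <= NR; i++) {
            k = key[i]
            # Append a running index only if the value occurs more than once.
            print line[i] (tot[k] > 1 ? "_" (++cnt[k]) : "")
        }
    }
' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

The two-pass version needs memory only for the counts, while this one holds every line until the end, so the two-pass form is probably the better choice whenever the input is a regular file.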