Add an index when a column value is repeated

I have this two-column file:

ctg0F     chr_1
ctg1F     chr_2
ctg2F     chr_3
ctg3F     chr_4
ctg4F     chr_5
ctg5F     chr_6
ctg6F     chr_4
ctg7F     chr_7
ctg8F     chr_8

The values in the first column are all different. I'd like to add an index only to the repeated values in the second column. Here chr_4 appears twice, so the expected output is:

ctg0F     chr_1
ctg1F     chr_2
ctg2F     chr_3
ctg3F     chr_4_1
ctg4F     chr_5
ctg5F     chr_6
ctg6F     chr_4_2
ctg7F     chr_7
ctg8F     chr_8

I tried this:

awk '{ if (++count[$2]>1) print $1,$2"_"count[$2]; else print $1,$2"_"count[$2]}' 

But this also adds the index "_1" to the unique values.

Any help?

>Solution:

$ awk '
    NR==FNR { tot[$2]++; next }
    { print $0 (tot[$2]>1 ? "_" (++cnt[$2]) : "") }
' file file
ctg0F     chr_1
ctg1F     chr_2
ctg2F     chr_3
ctg3F     chr_4_1
ctg4F     chr_5
ctg5F     chr_6
ctg6F     chr_4_2
ctg7F     chr_7
ctg8F     chr_8
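
The command above reads the file twice: the first pass (NR==FNR) counts how often each second-column value occurs, and the second pass appends a running index only where that count is greater than 1. If the input comes from a pipe and cannot be read twice, a single-pass variant that buffers the lines in memory might work instead. This is only a sketch, assuming the whole input fits in memory; the temporary file and the names `input` and `result` are made up for the illustration.

```shell
# Hypothetical sample input, copied from the question
input=$(mktemp)
printf '%s\n' \
    'ctg0F     chr_1' \
    'ctg1F     chr_2' \
    'ctg2F     chr_3' \
    'ctg3F     chr_4' \
    'ctg4F     chr_5' \
    'ctg5F     chr_6' \
    'ctg6F     chr_4' \
    'ctg7F     chr_7' \
    'ctg8F     chr_8' > "$input"

# Single pass: store every line and its second field, counting each value.
# In END, replay the stored lines and number only the repeated values.
result=$(awk '
    { line[NR] = $0; key[NR] = $2; tot[$2]++ }
    END {
        for (i = 1; i <= NR; i++) {
            k = key[i]
            print line[i] (tot[k] > 1 ? "_" (++cnt[k]) : "")
        }
    }
' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

The trade-off is memory: the two-pass version keeps only the counts, while this one keeps every line until the end of the input.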
