Print lines that have no duplicates in a file and preserve sort order linux

I have the following file:

2
1
4
3
2
1

I want the output like this (unique lines that don’t have any duplicates and preserve order):

4
3

I tried sort file.txt | uniq -u and it works, but the output is sorted:

3
4

I tried awk '!x[$0]++' file.txt and it keeps the order, but it prints every value once, including the duplicated ones:

2
1
4
3

Solution:

A couple of ideas to choose from:

a) read the input file twice:

awk '
FNR==NR         { counts[$1]++; next }  # 1st pass: keep count
counts[$1] == 1                         # 2nd pass: print rows with count == 1
' file.txt file.txt

b) read the input file once (requires all rows to be stored in memory – via an array):

awk '
    { lines[NR] = $1                    # maintain ordering of rows
      counts[$1] ++
    }
END { for ( i=1;i<=NR;i++ )             # run through the indices of the lines[] array and ...
          if ( counts[lines[i]] == 1 )  # if the count for that line's value == 1 then ...
             print lines[i]             # print the array entry to stdout
    }
' file.txt

Both of these generate:

4
3
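
If the input arrives on a pipe (which rules out reading it twice), or whole lines rather than the first field should be compared, option b) can be wrapped in a small shell function. This is only a sketch: the name print_unique_lines is made up for illustration, and it assumes $0 (the whole line) is the right key where the commands above use $1.

```shell
#!/bin/sh
# Hypothetical helper: print lines that occur exactly once, in input order.
# Reads the files given as arguments, or stdin when no arguments are given.
print_unique_lines() {
    awk '
        { lines[NR] = $0; counts[$0]++ }     # remember each line and how often it occurs
    END { for (i = 1; i <= NR; i++)          # walk the lines in input order
              if (counts[lines[i]] == 1)     # keep only lines seen exactly once
                  print lines[i]
        }
    ' "$@"
}

# Same sample data as in the question, fed through a pipe.
printf '2\n1\n4\n3\n2\n1\n' | print_unique_lines
```

Like option b), this holds every line in memory, so for very large files the two-pass version in a) may be the better fit.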