Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to get only the first occurrence of the pattern in bash script

Hi All can you please help me I am trying to get the following output so basically I have 2 input files and we need only the common :names from both the input files along with there the lines below them the .name/of/file lines.

Till now I have tried:

awk 'FNR==NR { a[$1]; next }NF<=1   { flag=0   } $1 in a { flag=1   }flag' file1 file2

Output:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

:name1
    ./name/of/file [logfile] [ error in file coming since Day : 1 ]
    ./name/of/file [logfile] [ error in file coming since Day : 1 ]
:name3
    ./name/of/file [logfile] [ error in file coming since Day : 24 ]
    ./name/of/file [logfile] [ error in file coming since Day : 24 ]
:name1
    ./name/of/file [logfile] [ error in file coming since Day : 40]
    ./name/of/file [logfile] [ error in file coming since Day : 40 ]
:name4
    ./name/of/file [logfile] [ error in file coming since Day : 3 ]
    ./name/of/file [logfile] [ error in file coming since Day : 3 ]
    ./name/of/file [logfile] [ error in file coming since Day : 3 ]
    ./name/of/file [logfile] [ error in file coming since Day : 3 ]

:name4
./name/of/file [logfile] [ error in file coming since Day : 10 ]
./name/of/file [logfile] [ error in file coming since Day : 10 ]
./name/of/file [logfile] [ error in file coming since Day : 10 ]
./name/of/file [logfile] [ error in file coming since Day : 10 ]

But its printing all the occrance of :name1 and :name4 we only need the first occurrence of :name1 and :name4 from input file 1

Input file1:

:name1
./name/of/file [logfile] [ error in file coming since Day : 1 ]
./name/of/file [logfile] [ error in file coming since Day : 1 ]

:name2
./name/of/file [logfile] [ error in file coming since Day : 1 ]

:name3
./name/of/file [logfile] [ error in file coming since Day : 24 ]
./name/of/file [logfile] [ error in file coming since Day : 24 ]

 :name1
./name/of/file [logfile] [ error in file coming since Day : 40]
./name/of/file [logfile] [ error in file coming since Day : 40 ]

:name4
./name/of/file [logfile] [ error in file coming since Day : 3 ]
./name/of/file [logfile] [ error in file coming since Day : 3 ]
./name/of/file [logfile] [ error in file coming since Day : 3 ]
./name/of/file [logfile] [ error in file coming since Day : 3 ]

:name5
./name/of/file [logfile] [ error in file coming since Day : 6 ]
./name/of/file [logfile] [ error in file coming since Day : 6 ]

:name4
./name/of/file [logfile] [ error in file coming since Day : 10 ]
./name/of/file [logfile] [ error in file coming since Day : 10 ]
./name/of/file [logfile] [ error in file coming since Day : 10 ]
./name/of/file [logfile] [ error in file coming since Day : 10 ]  

Input file2:

:name1
:name3
:name4

Required Outputfile:

:name1
    ./name/of/file [logfile] [ error in file coming since Day : 1 ]
    ./name/of/file [logfile] [ error in file coming since Day : 1 ]
:name3
    ./name/of/file [logfile] [ error in file coming since Day : 24 ]
    ./name/of/file [logfile] [ error in file coming since Day : 24 ]
:name4
    ./name/of/file [logfile] [ error in file coming since Day : 3 ]
    ./name/of/file [logfile] [ error in file coming since Day : 3 ]
    ./name/of/file [logfile] [ error in file coming since Day : 3 ]
    ./name/of/file [logfile] [ error in file coming since Day : 3 ]

>Solution :

Since an ‘occurrence’ is based on the name existing as an index in the a[] array, removal of that entry from the array should disable any further ‘occurrences’.

We can accomplish this removal by adding a delete to the current awk code:

awk '
FNR==NR { a[$1]; next          }
NF<=1   { flag=0               }      # clear flag if zero or one non-white-space fields is present in the current line
$1 in a { flag=1; delete a[$1] }      # set flag if 1st field is an index in the a[] array; delete entry from array to insure no more occurrences of this name ($1)
flag                                  # if flag == 1 then print current line to stdout
' file2 file1

This generates:

:name1
./name/of/file [logfile] [ error in file coming since Day : 1 ]
./name/of/file [logfile] [ error in file coming since Day : 1 ]
:name3
./name/of/file [logfile] [ error in file coming since Day : 24 ]
./name/of/file [logfile] [ error in file coming since Day : 24 ]
:name4
./name/of/file [logfile] [ error in file coming since Day : 3 ]
./name/of/file [logfile] [ error in file coming since Day : 3 ]
./name/of/file [logfile] [ error in file coming since Day : 3 ]
./name/of/file [logfile] [ error in file coming since Day : 3 ]

If the logfile lines need to be indented (eg, 4 spaces as in the question), one idea:

awk '
FNR==NR { a[$1]; next }
NF<=1   { flag=0 }
$1 in a { print; flag=1; delete a[$1]; next }
flag    { printf "    %s\n",$0 }
' file2 file1

This generates:

:name1
    ./name/of/file [logfile] [ error in file coming since Day : 1 ]
    ./name/of/file [logfile] [ error in file coming since Day : 1 ]
:name3
    ./name/of/file [logfile] [ error in file coming since Day : 24 ]
    ./name/of/file [logfile] [ error in file coming since Day : 24 ]
:name4
    ./name/of/file [logfile] [ error in file coming since Day : 3 ]
    ./name/of/file [logfile] [ error in file coming since Day : 3 ]
    ./name/of/file [logfile] [ error in file coming since Day : 3 ]
    ./name/of/file [logfile] [ error in file coming since Day : 3 ]
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading