awk: print lines where the nth column value multiplied by X is greater than Y

This is my file:

header 1 (which exists in my original file)
col1  col2 col3
name1 1K   1M
name2 2M   1K
name3 2K   2K
name4 2M   2M
name5 1K   5M

I want to do this:

Look at the third column: if it ends in M (megabytes) and (third column * 1024 ** 2 >= 2097152), write the whole line to another file.

Desired output (it does not matter whether the column-name line (col1, col2, col3) appears in the output):

header 1 (which exists in my original file)
col1  col2 col3
name4 2M   2M
name5 1K   5M

These are my failed attempts (including the ones with syntax errors):

tail -$(( $(wc -l the_file | awk {'print  $1'}) - 2 )) the_file | grep M | awk '$3 >= 2M'
tail -$(( $(wc -l the_file | awk {'print  $1'}) - 2 )) the_file | grep M | awk '$3 >= 2*1024**M'
tail -$(( $(wc -l the_file | awk {'print  $1'}) - 2 )) the_file | grep M | awk '$3+0 > 1M'

And then write the output to another file (the values should be kept in M)

Solution:

This awk should get the job done:

awk 'NR <= 2 || ($3 ~ /M$/ && $3+0 >= 2)' file

header 1 (which exists in my original file)
col1  col2 col3
name4 2M   2M
name5 1K   5M
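
The question also asks for the result to be written to another file. A minimal sketch of that step, assuming the input is named the_file and using filtered_file as a hypothetical output name, is a plain shell redirect. The sample data is recreated here only so the snippet runs on its own:

```shell
# Work in a scratch directory so nothing existing is overwritten.
cd "$(mktemp -d)"

# Recreate the sample input from the question (the name the_file is an assumption).
cat > the_file <<'EOF'
header 1 (which exists in my original file)
col1  col2 col3
name1 1K   1M
name2 2M   1K
name3 2K   2K
name4 2M   2M
name5 1K   5M
EOF

# Same filter as above; the shell redirect sends the matching lines
# to filtered_file (a hypothetical name) instead of the terminal.
awk 'NR <= 2 || ($3 ~ /M$/ && $3+0 >= 2)' the_file > filtered_file

cat filtered_file
```

The values in filtered_file stay in their original form (2M, 5M) because awk prints the unmodified record.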

The command prints every record for which the following condition is true:

  • NR <= 2: record number is 1 or 2
  • ||: OR
  • ($3 ~ /M$/ && $3+0 >= 2): 3rd column ends with M and the numeric value of the 3rd column is greater than or equal to 2 (adding 0 makes awk convert the leading digits of a string such as 2M to the number 2)
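
If you would rather compare in bytes, as the question's formula (third column * 1024 ** 2 >= 2097152) does, the conversion can be spelled out. This is only a sketch under a few assumptions: min_bytes is a variable name made up for illustration, the third column is expected to be whole digits followed by M, and 1024 * 1024 is written out because the ** operator is not available in every awk implementation.

```shell
# Feed the sample rows from the question through the filter and keep the result.
result=$(printf '%s\n' \
    'header 1 (which exists in my original file)' \
    'col1  col2 col3' \
    'name1 1K   1M' \
    'name2 2M   1K' \
    'name3 2K   2K' \
    'name4 2M   2M' \
    'name5 1K   5M' |
awk -v min_bytes=2097152 '
    NR <= 2 { print; next }               # always keep the two header lines
    $3 ~ /^[0-9]+M$/ {                    # third column: digits followed by M
        bytes = ($3 + 0) * 1024 * 1024    # $3+0 keeps only the leading number
        if (bytes >= min_bytes) print     # print the original, unconverted line
    }
')

printf '%s\n' "$result"
```

With the threshold passed in through -v, a different limit needs no change to the awk program itself. Values with a decimal point such as 1.5M would need a looser pattern than the one assumed here.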
