I have a file that contains some information about daily storage utilization. There are two columns – DD.MM date and usage in KB for every day.
I’m using awk to show, for each line, how much the usage grew compared to the previous line, converted to GB.
Example file:
20.09 10485760
21.09 20971520
22.09 26214400
23.09 27262976
My awk command:
awk 'NR > 1 {a=($2-prev)/1024^2" GB"} {prev=$2} {print $1,$2,a}' file
This outputs:
20.09 10485760
21.09 20971520 10 GB
22.09 26214400 5 GB
23.09 27262976 1 GB
I would also like to add the weekday name before the first column. The date format in the file is always DD.MM, so, to make GNU date accept it as valid input and return the weekday name, I composed this pipeline:
echo '20.09.2022' | awk -v FS=. -v OFS=- '{print $3,$2,$1}' | date -f - +%a
It works, but I want to call it from the first awk command for every processed line, passing the first-column date with ".2022" appended as its argument, and put the output of this external pipeline (the weekday name) before the date in the first column.
Example output:
Tue 20.09 10485760
Wed 21.09 20971520 10 GB
Thu 22.09 26214400 5 GB
Fri 23.09 27262976 1 GB
I looked at the system() function in awk, but I couldn’t get it to work with my pipeline and my first awk command.
>Solution :
1st solution: With your shown samples, please try the following awk code. What the system() function does in awk is run the given commands in a separate shell, so the call chain is awk -> system -> shell -> commands. The commands print their output directly through the shell, so that output can't be merged with awk's own output. Instead, one awk gets the weekday names for all days (based on the 1st field of your Input_file), and its output is passed as the first input to another awk, which does the actual space calculation and merges the two. We could also do it with a while loop, but IMHO doing it with awk could be faster.
awk '
FNR==NR{
arr[FNR]=$0
next
}
prev{
a=($2-prev)/1024^2" GB"
}
{
print arr[FNR],$1,$2,a
prev=$2
}
' <(awk '{split($1,arr,".");system("d=\"2022-" arr[2]"-"arr[1]"\";date -d \"$d\" +%a")}' Input_file) Input_file
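A possible variant of the 1st solution, given only as a sketch: awk can read a command's output into a variable with `cmd | getline var`, so the weekday could be captured and printed in the same pass instead of being collected by a separate awk. This assumes GNU date and the year 2022 as in the question, and that the first column really is DD.MM. The `weekday_usage` function name, the `year` variable and the `LC_ALL=C` prefix (to get English day names) are my own additions, not part of the code above.

```shell
# Sample data from the question, written to a file named Input_file.
printf '%s\n' '20.09 10485760' '21.09 20971520' \
              '22.09 26214400' '23.09 27262976' > Input_file

# weekday_usage FILE: hypothetical helper name, used here for illustration only.
weekday_usage() {
  awk -v year=2022 '
  {
    split($1, d, ".")                  # d[1] = day, d[2] = month
    cmd = "LC_ALL=C date -d " year "-" d[2] "-" d[1] " +%a"
    wd = "?"                           # fallback if date prints nothing
    cmd | getline wd                   # read the command output into wd
    close(cmd)                         # close the pipe before the next line
    if (NR > 1)
      print wd, $1, $2, ($2 - prev) / 1024^2 " GB"
    else
      print wd, $1, $2
    prev = $2
  }' "$1"
}

weekday_usage Input_file
```

This still starts one date process per input line, so it is not necessarily faster than the approach above; the close(cmd) call just keeps awk from holding one open pipe per distinct date.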
2nd solution: Using a shell loop along with awk. I recommend not using this if your Input_file is huge, since it runs date and awk once for every line; adding it here as another approach.
while read -r first second
do
day=${first%\.*}
month=${first#*\.}
val="2022-${month}-${day}"
dateNew=$(date -d "$val" +%a)
count=$(( count + 1 ))
awk -v date="$dateNew" -v till="$count" 'prev{a=($2-prev)/1024^2" GB"} FNR==till{print date,$1,$2,a;exit} {prev=$2}' Input_file
done < "Input_file"
Output with shown samples will be as follows:
Tue 20.09 10485760
Wed 21.09 20971520 10 GB
Thu 22.09 26214400 5 GB
Fri 23.09 27262976 1 GB
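For a large Input_file, another option might be to build on the `date -f -` call from the question, which reads many dates from standard input and so needs only one date process for the whole file. The following is a rough sketch under the same assumptions as above (GNU date, year 2022, DD.MM in the first column); the `weekday_batch` function name and the `LC_ALL=C` prefix are my own additions.

```shell
# Sample data from the question, written to a file named Input_file.
printf '%s\n' '20.09 10485760' '21.09 20971520' \
              '22.09 26214400' '23.09 27262976' > Input_file

# weekday_batch FILE [YEAR]: hypothetical helper name, for illustration only.
weekday_batch() {
  # 1) turn every DD.MM into YYYY-MM-DD, one date per line
  awk -v year="${2:-2022}" '{ split($1, d, "."); print year "-" d[2] "-" d[1] }' "$1" |
    # 2) a single date process converts all of them to weekday names
    LC_ALL=C date -f - +%a |
    # 3) glue each weekday name in front of the matching original line
    paste -d ' ' - "$1" |
    # 4) usage is now the 3rd field; append the growth in GB from line 2 on
    awk '{
      if (NR > 1) print $0, ($3 - prev) / 1024^2 " GB"
      else print
      prev = $3
    }'
}

weekday_batch Input_file
```

This reads Input_file twice (once for the dates, once in paste), so it works with a regular file but not with data coming from a pipe.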