Hoping someone will be able to point me in the right direction. I am new to bash scripting, and I believe awk should be able to solve this problem.
I have multiple files that I want to process. The key in $1 is the same in every file, the separator is a single space, and the numbers in $2 vary from file to file.
I wish to sum $2 from the multiple files and output to a new file. Example below:
File1.txt
DATA:TEST0 20
DATA:TEST1 4
DATA:TEST2 39
DATA:TEST3 11
File2.txt
DATA:TEST0 2
DATA:TEST1 0
DATA:TEST2 26
DATA:TEST3 9
File3.txt
DATA:TEST0 44
DATA:TEST1 16
DATA:TEST2 21
DATA:TEST3 7
Output.txt is the output I wish to achieve from the above files:
DATA:TEST0 66
DATA:TEST1 20
DATA:TEST2 86
DATA:TEST3 27
I have tried the following, but it does not work:
paste file* | awk '{$2=$1+$2}1' | tee output.txt
Any advice would be appreciated. Thanks in advance.
>Solution :
paste puts the files side by side, which you don’t need here. Just give all the filenames as arguments to awk and it will process them sequentially.
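To see why the paste attempt falls short, here is a small sketch using the first two rows of the question's sample data. The scratch directory and the names file1.txt, file2.txt, pasted.txt and attempt.txt are assumptions made for illustration only.

```shell
# Work in a throwaway directory (assumption) so nothing else is touched.
cd "$(mktemp -d)" || exit 1

# Two shortened sample files based on the data in the question.
printf 'DATA:TEST0 20\nDATA:TEST1 4\n' > file1.txt
printf 'DATA:TEST0 2\nDATA:TEST1 0\n' > file2.txt

# paste joins corresponding lines with a tab, so awk sees four fields per line:
# key, count, key, count.
paste file1.txt file2.txt > pasted.txt
cat pasted.txt

# In the original attempt, $1 is the text key (numeric value 0) and $2 is the
# first file's count, so $1+$2 just reproduces that count. The second file's
# count is in $4 and is never added.
awk '{$2=$1+$2}1' pasted.txt > attempt.txt
cat attempt.txt
```

The counts from the later files end up in fields 4, 6 and so on, which is why summing $1 and $2 never reaches them.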
Use an associative array, keyed on column 1, to accumulate the sum for each keyword.
awk '{sum[$1] += $2} END {for (i in sum) print i, sum[i]}' file* | tee output.txt
The order in which for (i in sum) visits the keys is unspecified, so if you want the output sorted, pipe it through the sort command before tee.
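A minimal sketch of the sorted variant, run against the three sample files from the question. The scratch directory and the File[0-9]*.txt glob are assumptions for illustration; adjust the glob to match your real file names, and make sure it does not also match the output file.

```shell
# Work in a throwaway directory (assumption) so the glob only sees the samples.
cd "$(mktemp -d)" || exit 1

# Recreate the sample data from the question.
printf 'DATA:TEST0 20\nDATA:TEST1 4\nDATA:TEST2 39\nDATA:TEST3 11\n' > File1.txt
printf 'DATA:TEST0 2\nDATA:TEST1 0\nDATA:TEST2 26\nDATA:TEST3 9\n' > File2.txt
printf 'DATA:TEST0 44\nDATA:TEST1 16\nDATA:TEST2 21\nDATA:TEST3 7\n' > File3.txt

# Accumulate column 2 per key across all files, print the totals at the end,
# then sort the lines by key and write them to output.txt.
awk '{sum[$1] += $2} END {for (i in sum) print i, sum[i]}' File[0-9]*.txt | sort > output.txt
cat output.txt
```

With the sample data this should print the four totals from the question (66, 20, 86, 27) in key order.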