AWK match string using regex and combine with previous string

July 5, 2022

I have been reading articles and posts on how to match and compare strings, but I am struggling to put the two together. Unfortunately, I do not have a working awk command to show, because I have not managed to get that far. Below is what I have been trying to work with; I found it in "comparing strings in consecutive lines with awk". My hope was that if I changed the match condition from the previous line to any value of 32 or less, I would start to get some output I could work with. I also modified the NR condition to start at the 4th string, which would be the first subnet mask.

awk '$0<=32 && NR>3 {print (NR)/f} {f=$0} END {print NR,$0}'

My current input looks like this:

hostname1           hostname2           127.0.0.1             27              127.0.0.2              24              127.0.0.3             28              hostname3           127.0.0.4               27              127.0.0.5              24              127.0.0.6            28              127.0.0.7             27              127.0.0.8              24       127.0.0.9             28

The output I am looking to have would be:

hostname1           hostname2           127.0.0.1/27              127.0.0.2/24              127.0.0.3/28              hostname3           127.0.0.4/27              127.0.0.5/24              127.0.0.6/28              127.0.0.7/27              127.0.0.8/24       127.0.0.9/28

These are IP addresses and subnet masks. My thinking was to look for 16-32 using a regex, match the previous string (which would always be an IP address), and combine the two. Does anyone have any examples of this being done? I have to use variables, as the number of IP address and subnet mask combinations in the input varies.

Solution:

Using sed

$ sed 's#\(\<[[:digit:].]\+\)[^[:digit:]]*\([[:digit:]]\+\)#\1/\2#g' input_file
hostname1           hostname2           127.0.0.1/27              127.0.0.2/24              127.0.0.3/28              hostname3           127.0.0.4/27              127.0.0.5/24              127.0.0.6/28              127.0.0.7/27              127.0.0.8/24       127.0.0.9/28

\(\<[[:digit:].]\+\) – This is the first capture group, as it is enclosed within capturing parentheses. It retains a run of digits and periods. The word boundary \< anchors the match at the start of a word, so the trailing digit in a name such as hostname1 is not matched.

[^[:digit:]]* – This part is not within parentheses, so it is matched but not captured. It consumes everything up to the next occurrence of a digit.

\([[:digit:]]\+\) – This is the second capture group, which retains one or more digits.

\1/\2 – This is the replacement. As we captured two groups, they can be returned with the back-references \1 and \2 respectively.
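To see the two capture groups in isolation, the same expression can be tried on a single address and mask pair. This is only a small sketch: the function name join_pair is made up for illustration, and it assumes GNU sed, since \+ and \< are GNU extensions to basic regular expressions.

```shell
# Hypothetical helper wrapping the substitution from the answer.
# GNU sed is assumed: \+ and \< are GNU extensions to basic regex syntax.
join_pair() {
  sed 's#\(\<[[:digit:].]\+\)[^[:digit:]]*\([[:digit:]]\+\)#\1/\2#'
}

# Group 1 captures the address, the spaces are matched but dropped,
# and group 2 captures the mask.
printf '127.0.0.1      27\n' | join_pair
```

A line that contains only host names should pass through unchanged, because their digits do not sit at the start of a word.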

The default delimiter / for sed has been changed to # to avoid conflicting with the / that appears in the replacement text.
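Since the question asked for awk, the idea described there (find a mask, check that the previous string is an IP address, combine the two) might be sketched roughly as follows. The function name join_cidr is hypothetical, the address check is a loose "four dot-separated numbers" pattern rather than a strict IPv4 validation, and any integer from 0 to 32 is accepted as a mask. Note that, unlike the sed version, this rebuilds each line with single spaces between fields.

```shell
# Hypothetical helper: append each mask to the address that precedes it.
join_cidr() {
  awk '{
    out = ""
    for (i = 1; i <= NF; i++) {
      # Treat a field as a mask when it is an integer no larger than 32
      # and the field before it looks like four dot-separated numbers.
      if (i > 1 && $i ~ /^[0-9]+$/ && ($i + 0) <= 32 &&
          $(i-1) ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/)
        out = out "/" $i
      else
        out = out (i > 1 ? " " : "") $i
    }
    print out
  }' "$@"
}

printf 'hostname1  hostname2  127.0.0.1   27   127.0.0.2   24\n' | join_cidr
```

Because the loop runs over NF, it does not matter how many address and mask pairs a line contains; join_cidr can also be given a file name, for example join_cidr input_file.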