AWK match string using regex and combine with previous string

I have been reviewing articles and posts on how to match and compare strings, but I am struggling to put the two together. Unfortunately, I do not have a working awk command to show, because I can't seem to even get that far. Below is what I have been trying to work with; I found it at "comparing strings in consecutive lines with awk". My hope was that if I changed the match condition from the previous line to instead be anything up to 32, I would start to get some output I could work with. I also modified the NR condition to start on the 4th string, which would be the first subnet mask.

awk '$0<=32 && NR>3 {print (NR)/f} {f=$0} END {print NR,$0}'

My current input looks like this:

hostname1           hostname2           127.0.0.1             27              127.0.0.2              24              127.0.0.3             28              hostname3           127.0.0.4               27              127.0.0.5              24              127.0.0.6            28              127.0.0.7             27              127.0.0.8              24       127.0.0.9             28  

The output I am looking to have would be:

hostname1           hostname2           127.0.0.1/27              127.0.0.2/24              127.0.0.3/28              hostname3           127.0.0.4/27              127.0.0.5/24              127.0.0.6/28              127.0.0.7/27              127.0.0.8/24       127.0.0.9/28          

These are IP addresses and subnet masks. My thinking was to look for 16-32 using a regex, match the previous string (which would always be an IP address), and combine the two. Does anyone have any examples of this being done? I have to use variables, as the number of input IP address and subnet mask combinations varies.

>Solution :

Using sed

$ sed 's#\(\<[[:digit:].]\+\)[^[:digit:]]*\([[:digit:]]\+\)#\1/\2#g' input_file
hostname1           hostname2           127.0.0.1/27              127.0.0.2/24              127.0.0.3/28              hostname3           127.0.0.4/27              127.0.0.5/24              127.0.0.6/28              127.0.0.7/27              127.0.0.8/24       127.0.0.9/28

\(\<[[:digit:].]\+\) – This is the first capture group, as it is enclosed in capturing parentheses. It retains one or more digits and periods, which is the IP address. The word boundary \< at the start keeps the match from beginning in the middle of a word, such as at the trailing digit of hostname1.

[^[:digit:]]* – This part is not inside parentheses, so it is matched but not retained. It consumes everything up to the next occurrence of a digit.

\([[:digit:]]\+\) – The second capture group, which retains one or more digits (the subnet mask).

\1/\2 – This is the replacement. As we captured two groups, they can be returned with the back-references \1 and \2 respectively, here joined by a /.
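
To see what the word boundary contributes, the same substitution can be run with and without \< on a shortened sample line. This is only a sketch: it assumes GNU sed (which the \< and \+ syntax above already relies on), and the variable names line, loose and strict are made up for illustration.

```shell
# Shortened sample line; not the full input from the question.
line='hostname1 hostname2 127.0.0.1 27'

# Without \< the first group may start at the trailing digit of a hostname.
loose=$(printf '%s\n' "$line" |
  sed 's#\([[:digit:].]\+\)[^[:digit:]]*\([[:digit:]]\+\)#\1/\2#g')

# With \< the first group has to start at the beginning of a word.
strict=$(printf '%s\n' "$line" |
  sed 's#\(\<[[:digit:].]\+\)[^[:digit:]]*\([[:digit:]]\+\)#\1/\2#g')

printf '%s\n' "$loose"    # hostname1/2 127.0.0.1/27
printf '%s\n' "$strict"   # hostname1 hostname2 127.0.0.1/27
```

In the first run the digits at the end of the two hostnames are treated as an address and mask pair, which is what the boundary prevents.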

The default delimiter / for sed has been changed to # so that it does not conflict with the literal / used in the replacement.
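
Since the question asked about awk, here is a rough awk sketch of the same idea, offered as one possible approach rather than the answer's method. It assumes whitespace-separated fields, treats a field that looks like a dotted IPv4 address followed by a number of 32 or less as a pair, and rebuilds each line with single spaces, so the original column spacing is not preserved. The function name join_masks is hypothetical.

```shell
# join_masks is a hypothetical helper name, not part of the original answer.
# It reads from the files given as arguments, or from standard input.
join_masks() {
  awk '{
    out = ""
    for (i = 1; i <= NF; i++) {
      field = $i
      # Assumption: an IPv4-looking field followed by a number of 32 or less
      # is an address/mask pair, so the two are joined with "/".
      if (field ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ &&
          i < NF && $(i + 1) ~ /^[0-9]+$/ && $(i + 1) + 0 <= 32) {
        field = field "/" $(i + 1)
        i++   # skip the mask field that was just consumed
      }
      out = (out == "" ? field : out " " field)
    }
    print out
  }' "$@"
}

printf '%s\n' 'hostname1 hostname2 127.0.0.1 27 127.0.0.2 24 hostname3 127.0.0.4 28' |
  join_masks
# hostname1 hostname2 127.0.0.1/27 127.0.0.2/24 hostname3 127.0.0.4/28
```

Because the loop walks the fields of each line, the number of address and mask pairs does not need to be known in advance, and hostnames are passed through unchanged.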
