Extract a string between a combination of words and characters

I would like to keep the strings between FROM and as, and between From and the newline character.

Input:

FROM some_registry as registry1
FROM another_registry

Output:

some_registry
another_registry

Using the following sed commands, I can extract the strings. Is there a way to combine the two sed commands into one?

sed -e 's/.*FROM \(.*\) as.*/\1/' | sed s/"FROM "//

Solution:

Merging the two into a single regular expression is hard here because POSIX regular expressions do not support lazy quantifiers, so .* always matches as much as it can.

With GNU sed, you can pass both substitutions to a single sed invocation, separated by a semicolon:

sed 's/.*FROM \(.*\) as.*/\1/;s/FROM //' file

See this online demo.
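If the registry name never contains whitespace (an assumption; the question does not state it), a single substitution may be enough. The sketch below replaces the lazy match with a negated character class, which POSIX sed does support. It anchors on an uppercase FROM at the start of the line, as in the question's sample input.

```shell
#!/bin/bash
# Sample input from the question.
input='FROM some_registry as registry1
FROM another_registry'

# One substitution: capture the run of non-whitespace characters after "FROM "
# and discard the rest of the line.
# Assumption: the name itself contains no whitespace.
result=$(sed 's/^FROM \([^[:space:]]*\).*/\1/' <<< "$input")

printf '%s\n' "$result"
# some_registry
# another_registry
```

Lines that do not start with FROM pass through unchanged; adding -n and a trailing p flag to the substitution would print only the lines that matched.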

However, if you have GNU grep with PCRE support (the -P option), you can use a more precise expression:

#!/bin/bash
s='FROM some_registry as registry1
From another_registry'
grep -oP '(?i)\bFROM\s+\K.*?(?=\s+as\b|$)' <<< "$s"

See the online demo. Details:

  • (?i) – case insensitive matching ON
  • \b – a word boundary
  • FROM – a word
  • \s+ – one or more whitespaces
  • \K – "forget" all text matched so far
  • .*? – zero or more characters other than line break characters, as few as possible
  • (?=\s+as\b|$) – a positive lookahead that matches a location immediately followed by one or more whitespace characters and then the whole word as, or by the end of the line (grep processes each line separately).
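
To see why the lazy quantifier matters, the following sketch runs the same pattern with .*? and with a greedy .* on one sample line. It assumes GNU grep built with -P support; the variable names are only illustrative.

```shell
#!/bin/bash
line='FROM some_registry as registry1'

# Lazy .*? stops at the first position where the lookahead succeeds,
# which is right before " as".
lazy=$(grep -oP '(?i)\bFROM\s+\K.*?(?=\s+as\b|$)' <<< "$line")

# Greedy .* first runs to the end of the line, where the $ alternative
# of the lookahead already succeeds, so nothing is trimmed.
greedy=$(grep -oP '(?i)\bFROM\s+\K.*(?=\s+as\b|$)' <<< "$line")

printf 'lazy:   %s\n' "$lazy"     # lazy:   some_registry
printf 'greedy: %s\n' "$greedy"   # greedy: some_registry as registry1
```

Because (?i) applies to the whole pattern, the word as is also matched case-insensitively.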