Word extraction with regex string

Following this post, I am able to match the pattern object.* with the regex m/(?<=object\.)\w*/. However, since I am unfamiliar with Linux, I cannot get sed or perl to extract the desired tokens. My best guess is grep -E -n object file.txt | perl -nle 'm/(?<=object\.)\w*/; print $1', but I need help getting it to work.

>Solution :

You can use grep or sed:

grep -oP '(?<=object\.)\w+' file
sed -nE 's/.*object\.([[:alnum:]_]+).*/\1/p' file

See the online demo.

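grep -oP uses a PCRE regex (the -P option, available in GNU grep) and prints only the matched text, one match per output line (the -o option). The lookbehind (?<=object\.) requires object. immediately before the match without including it in the output.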

The sed command is more involved and extracts at most one match per line (the last one on the line, because the leading .* is greedy). It suppresses the default line output with -n and selects the POSIX ERE flavor with -E. The pattern matches the whole line, captures the one or more alphanumeric or underscore characters that follow object. into Group 1 (\1), and replaces the line with that group. The p flag then prints only the lines where a substitution took place.
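To make the difference between the two commands concrete, here is a minimal sketch that runs both on a small sample. The temporary file and its three lines are made up for illustration, and the grep line assumes GNU grep built with PCRE support.

```shell
# Hypothetical sample input, written to a temporary file for illustration only.
tmpfile=$(mktemp)
printf '%s\n' \
  'x = object.foo + object.bar' \
  'no match here' \
  'y = object.baz_1;' > "$tmpfile"

# grep -oP prints every match on its own line (assumes GNU grep with -P support).
grep_out=$(grep -oP '(?<=object\.)\w+' "$tmpfile")
printf 'grep:\n%s\n' "$grep_out"
# grep:
# foo
# bar
# baz_1

# sed prints at most one token per input line: the last one,
# because the leading .* is greedy.
sed_out=$(sed -nE 's/.*object\.([[:alnum:]_]+).*/\1/p' "$tmpfile")
printf 'sed:\n%s\n' "$sed_out"
# sed:
# bar
# baz_1

rm -f "$tmpfile"
```

If every occurrence is needed, the grep form is the simpler choice. The sed form is mainly useful where grep -P is unavailable and one token per line is enough.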
