Word extraction with regex string

October 19, 2022

From this post, I am able to recognize the pattern object.* by using the regex m/(?<=object\.)\w*/. However, since I am unfamiliar with Linux, I cannot use the sed or perl commands properly to extract the desired tokens, so I need your help. My best guess is grep -E -n object file.txt | perl -nle 'm/(?<=object\.)\w*/; print $1'.

>Solution :

You can use grep or sed:

grep -oP '(?<=object\.)\w+' file
sed -nE 's/.*object\.([[:alnum:]_]+).*/\1/p' file

See the online demo.

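grep -oP lets you use a PCRE regex (the -P option, which is specific to GNU grep) and prints only the matched text instead of the whole line (the -o option). Each match is printed on its own output line, so every occurrence on an input line is extracted.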
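To make the grep behavior concrete, here is a small sketch run against a made-up input. The sample lines and the variable names (sample, grep_out) are assumptions for illustration only, and the command assumes GNU grep built with PCRE support.

```shell
# Hypothetical sample input; the contents are invented for this illustration.
sample=$(mktemp)
printf '%s\n' \
  'x = object.foo + object.bar_1' \
  'no match here' \
  'call(object.baz)' > "$sample"

# -P enables the lookbehind (?<=object\.), -o prints only the matched words.
grep_out=$(grep -oP '(?<=object\.)\w+' "$sample")
printf '%s\n' "$grep_out"

rm -f "$sample"
```

With this input the command prints foo, bar_1 and baz, each on its own line: both matches from the first line are reported, and the line without object. produces no output.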

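The sed command is more involved and extracts at most one match per line, namely the last one. The -n option suppresses the default output of every line, and -E sets the regex flavor to POSIX ERE. The pattern matches the whole line: a greedy .*, then object., then one or more alphanumeric or underscore characters captured into Group 1 (\1), then the rest of the line. The substitution replaces the full line with the Group 1 value, and the p flag prints the result only for lines where the substitution succeeded. Because the leading .* is greedy, the capture lands on the last object.<word> on the line.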