I’m trying to use sed (GNU sed 4.2.2) on CentOS 7 (the OS doesn’t seem to matter, as the same behavior occurs on AWS Linux 2), and my capture group is not being added back to the substitution string.
I’m trying to add a directory to an m3u8 file’s resources. The regex is correct, as it does the replacement, but it loses what should be captured in the first capture group.
code:
eregex='([0-9]+_?[0-9]*[.](ts|key))'
find . -type f -exec grep -lZEe "$eregex" {} + | xargs -r0 sed -i -E "s~$eregex~CH/$1~g"
original data:
https://example.com/dir/dir2/number/12345.key
behavior after execution:
https://example.com/dir/dir2/number/CH/
expected result:
https://example.com/dir/dir2/number/CH/12345.key
I’ve tried using it as a back reference, \1, but that didn’t address the issue either. Is my syntax wrong here, or are the capture groups not working as intended? I also tried using a non-capturing group for the possible extensions, but that didn’t seem to be supported.
https://regex101.com/r/CSWeFx/1
>Solution :
I’ve tried using it as a back reference \1 but that didn’t address the issue either. Is my syntax wrong here,
Yes. The syntax for a backreference in sed’s regex dialect is \1, \2, etc.
Your command line is processed by the shell before it invokes any commands. That includes parameter expansion, on which you are depending to provide the regex via the variable eregex. But $1 is a variable reference too (the first positional parameter), and it will also be expanded (apparently to nothing in your case).
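To see this for yourself, you can build the sed script in a variable and print it before running anything. This is only a sketch: the variable name expanded is made up for illustration, and set -- is used to clear the positional parameters on the assumption that $1 was unset in your session.

```shell
# The regex from the question, single-quoted so the shell leaves it alone.
eregex='([0-9]+_?[0-9]*[.](ts|key))'

# Assumption: no positional parameters are set, as in an ordinary interactive shell.
set --

# Inside double quotes the shell expands both $eregex and $1.
expanded="s~$eregex~CH/$1~g"

# Show the script exactly as sed would receive it: the replacement is just "CH/".
printf '%s\n' "$expanded"
```

The printed script has nothing after CH/, which matches the truncated output you observed.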
I’ve tried using it as a back reference \1 but that didn’t address the issue either.
The backslash (\) is a single-character quote character to the shell. Outside of quotes, \1 is equivalent to 1: the shell converts the former to the latter during the quote removal stage of command-line processing. Inside double quotes the backslash is removed only when it precedes $, `, ", \, or a newline, so whether \1 survives depends on exactly how the argument was quoted. To be sure a literal \ gets through to sed, either double it or enclose it in a single-quoted string. For example,
sed -i -E "s~${eregex}~CH/\\1~g"
or
sed -i -E "s~${eregex}~CH/"'\1~g'
(The curly braces are not essential in this case, but I consider it a matter of good form to use curly braces in variable references.)
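As a quick check, the doubled-backslash form can be tried on the sample line from the question without touching any files. This is a minimal sketch: it reads from a pipe instead of using -i, and the variable names line and result are illustrative only.

```shell
# The regex and sample data from the question.
eregex='([0-9]+_?[0-9]*[.](ts|key))'
line='https://example.com/dir/dir2/number/12345.key'

# "\\1" inside double quotes reaches sed as \1, the first capture group.
result=$(printf '%s\n' "$line" | sed -E "s~${eregex}~CH/\\1~g")

printf '%s\n' "$result"
```

Once the output looks right, the same sed expression can be dropped back into the find ... | xargs pipeline with -i.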
or are the capture groups not working as intended? Tried using a non capture group as well for the possible extensions but that didn’t seem to be supported.
Correct, sed does not recognize Perl-style non-capturing groups such as (?:ts|key).
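An ordinary capturing group around the extensions is harmless here, since it only shifts the group numbering. And because your outer group wraps the entire pattern, one possible alternative is to drop it and use &, which sed replaces with the whole match and which needs no backslash at all. A sketch under those assumptions, with illustrative variable names:

```shell
# Same pattern without the outer group; (ts|key) is now group 1 and is not referenced.
eregex='[0-9]+_?[0-9]*[.](ts|key)'
line='https://example.com/dir/dir2/number/12345.key'

# & stands for the entire matched text, so no backreference is needed.
with_amp=$(printf '%s\n' "$line" | sed -E "s~${eregex}~CH/&~g")

printf '%s\n' "$with_amp"
```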