Bash – Multi-character string replacement when the strings are of unknown length but consist of a single repeated character


Assume a multi-line text string in which some lines start with a key-character ("#" in our case). Further assume that you wish to replace all instances of a target character ("o" in our case) with a different character ("O" in our case), if – and only if – that target character occurs as a string of two or more adjacent copies (e.g., "ooo"). This replacement is to be done in all lines that do not start with the key-character and must be case-sensitive.

For example, the following lines …

#Foo bar
Foo bar
#Baz foo
Baz foo

are supposed to be converted into:

#Foo bar
FOO bar
#Baz foo
Baz fOO

The following attempt using sed does not retain the correct number of target characters:

$ echo -e "#Foo bar\nFoo bar\n#Baz foo\nBaz foo" | sed '/^#/!s/o\{2,\}/O/g'
#Foo bar
FO bar
#Baz foo
Baz fO

What code (with sed or otherwise) would conduct the desired replacement correctly?

Solution:

You can use Perl:

echo -e "#Foo bar\nFoo bar\n#Baz foo\nBaz foo" | perl -pe 's/o{2,}/"O" x length($&)/ge unless /^#/'

Here, o{2,} matches two or more consecutive o characters, and "O" x length($&) builds the replacement by repeating O once per matched character ($& holds the matched text). The e flag after g makes Perl evaluate the right-hand side as code instead of treating it as a literal string. The trailing unless /^#/ skips the substitution on lines that start with #; without it, the commented lines would be converted as well.

As a complete script:

#!/bin/bash
s="#Foo bar
Foo bar
#Baz foo
Baz foo"
perl -pe 's/o{2,}/"O" x length($&)/ge unless /^#/' <<< "$s"

Output:

#Foo bar
FOO bar
#Baz foo
Baz fOO
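
Since the question also asks about sed, here is a sketch of a sed-only alternative. It is not part of the original answer and it assumes GNU sed (it relies on \n in the replacement text). The function name upcase_runs and the extra sample lines are made up for illustration. The idea: sed cannot compute a replacement length, so each "oo" pair is first turned into two placeholder characters, a loop then absorbs any single "o" left over from an odd-length run, and finally the placeholders are turned into "O". A newline serves as the placeholder because it can never occur inside a line that sed has just read.

```shell
#!/bin/bash
# Sketch only, assuming GNU sed. upcase_runs is a hypothetical helper name.
# Uppercases every run of two or more "o" on lines that do not start with "#".
upcase_runs() {
  sed '/^#/!{
    # Mark each adjacent pair "oo" with two placeholder newlines.
    s/oo/\n\n/g
    :a
    # A lone "o" right after a placeholder belongs to the same run.
    s/\no/\n\n/
    ta
    # Turn every placeholder into the final character.
    y/\n/O/
  }'
}

printf '%s\n' '#Foo bar' 'Foo bar' '#Baz foo' 'Baz fooo' 'Oops o' | upcase_runs
```

With GNU sed this should print "#Foo bar", "FOO bar", "#Baz foo", "Baz fOOO" and "Oops o". Using a placeholder instead of writing "O" directly keeps the loop from touching a single "o" that happens to follow an uppercase "O" already present in the input, as in "Oops".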
