Replace 2nd and 3rd occurrence of a character with another character, for each line, Bash

I am trying to reformat the reference legend files to make them compatible with bcftools.

Essentially, I need to go from this:

id position a0 a1 TYPE AFR AMR EAS EUR SAS ALL
1:123:A:T 123 A T SNP 0.01 0.01 0 0 0 0.01
1:679:A:T 123 A T SNP 0.01 0.01 0 0 0 0.01

to this:

id position a0 a1 TYPE AFR AMR EAS EUR SAS ALL
1:123_A_T 123 A T SNP 0.01 0.01 0 0 0 0.01
1:679_A_T 123 A T SNP 0.01 0.01 0 0 0 0.01

ideally using bash.

Solution:

If sed is an option:

sed 's/:/_/2; s/:/_/2' file > reformatted_file

(s/:/_/2 replaces the second ":" on each line with an underscore. The same expression is run twice because, once the original second ":" has been replaced, the original third ":" is now the second one on the line, so the second s/:/_/2 catches it.)
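To see the two substitutions one at a time, here is a small sketch that runs s/:/_/2 twice on one line from the example. The variable names (line, step1, step2) are only for illustration.

```shell
# One data line from the example legend file
line='1:123:A:T 123 A T SNP 0.01 0.01 0 0 0 0.01'

# First pass: the second ":" (after 123) becomes "_"
step1=$(printf '%s\n' "$line" | sed 's/:/_/2')

# Second pass: what used to be the third ":" is now the second, so it is replaced
step2=$(printf '%s\n' "$step1" | sed 's/:/_/2')

printf '%s\n' "$step1"   # 1:123_A:T 123 A T SNP 0.01 0.01 0 0 0 0.01
printf '%s\n' "$step2"   # 1:123_A_T 123 A T SNP 0.01 0.01 0 0 0 0.01
```

The header line contains no ":", so both passes leave it unchanged.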

Or with only bash:

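# IFS= keeps leading/trailing whitespace; -r keeps backslashes literal
while IFS= read -r line
do
    tmp="${line//:/_}"            # replace every ":" with "_"
    printf '%s\n' "${tmp/_/:}"    # turn the first "_" back into ":"
done < file > reformatted_file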

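(*This works with your example, but it replaces every ":" on the line with an underscore and then turns the first underscore back into a ":". That can have unintended effects on your file: colons in other columns are changed too, and any line that has an underscore before its first ":" (or a header containing an underscore) gets that underscore turned into a ":".)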
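If those side effects are a concern, one possible bash-only variant is to touch only the first column. This is a rough sketch, not part of the original answer: the function name reformat_id_column is made up, it assumes the id is the first whitespace-delimited field, and it replaces every ":" after the first one in that field (which is the same as the 2nd and 3rd for ids like 1:123:A:T).

```bash
reformat_id_column() {
    local line id rest head tail
    while IFS= read -r line || [[ -n $line ]]; do
        id=${line%%[[:space:]]*}    # everything before the first whitespace
        rest=${line#"$id"}          # remainder of the line, separator included
        if [[ $id == *:* ]]; then
            head=${id%%:*}          # part before the first ":"
            tail=${id#*:}           # part after the first ":"
            id="$head:${tail//:/_}" # keep the first ":", replace the others
        fi
        printf '%s\n' "$id$rest"
    done
}
```

It reads standard input, so it could be used as reformat_id_column < file > reformatted_file. Lines whose first field has no ":" (such as the header) pass through untouched.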
