Join strings within a loop using awk (involves replacing one array with another)

I’m writing a one-liner using awk to join strings within a loop.

Basically, there’s an array of strings (a) and a copy of that array (b, made with the clone function). I concatenate each string m from a with each string n from b and store the concatenated string (m n) in a temporary array (tmp). Next, I replace b with tmp and repeat the concatenation. This time b holds the contents of tmp, so the concatenated string becomes m n n, and so on. I put the concatenation in a loop (here with k=3), but I wasn’t able to print the result after the loop finished.

awk -v k=3 'function clone(original, copy) {for (i in original) {if (isarray(original[i])) {copy[i][1]=""; delete copy[i][1]; clone(original[i], copy[i])} else {copy[i]=original[i]}}} BEGIN {a["A"]; a["T"]; a["G"]; a["C"]; clone(a, b); for (i=1; i<k; i++) {for (m in a) {for (n in b) {tmp[m n]}}; delete b; clone(tmp, b)}; for (i in b) {print i}}'

I was able to do this in Perl and Python; here is how I did it.

Equivalent function in Perl:

sub kmer_generator {
    my ($k) = @_;
    my @bases_1 = ("A", "T", "G", "C");
    my @bases_2 = @bases_1;
    for (my $i = 1; $i < $k; $i++) {
        my @temporary;
        for my $base_1 (@bases_1) {
            for my $base_2 (@bases_2) {
                push @temporary, $base_1 . $base_2;
            }
        }
        @bases_2 = @temporary;
    }
    return @bases_2;
}

Equivalent function in Python:

def generate_kmer(k):
    bases_1 = ["A", "T", "G", "C"]
    bases_2 = bases_1.copy()
    i = 0
    while i < k - 1:
        i += 1
        temp = []
        for m in bases_1:
            for n in bases_2:
                temp.append(m + n)
        bases_2 = None
        bases_2 = temp
    return bases_2

Solution:

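Hah, that was fun. Your one-liner never gets to the print: clone() iterates with the global i, which overwrites the counter of the outer for loop on every pass, so that loop never ends (tmp is also never emptied between passes). The following script, which needs GNU awk for isarray and runs your Python version for comparison,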

#!/bin/bash

awk '
# https://unix.stackexchange.com/a/456316/209955
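# i is an extra parameter so that it stays local; a global i here would
# overwrite the loop counter of any caller that also uses i.
function clone(lhs, rhs,    i) {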
    for (i in rhs) {
        if (isarray(rhs[i])) {
            lhs[i][1] = ""
            delete lhs[i][1]
            clone(lhs[i], rhs[i])
        } else {
            lhs[i] = rhs[i]
        }
    }
}
# documentation...
# bases_2 is the return array
# k is an int
function generate_kmer(bases_2, k,
       # Extra parameters act as local variables; awk has no other way to declare them.
       bases_1, i, temp, m, n) {
    # Easy array initialization.
    split("A T G C", bases_1, " ")
    clone(bases_2, bases_1)
    # Mirrors the Python while loop; for (i = 1; i < k; i++) would be more idiomatic.
    i = 0
    while (i < k - 1) {
        i += 1
        temp = ""
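        # Index loops rather than for (m in ...): for-in order is unspecified in awk.
        for (m = 1; (m in bases_1); m++) {
            for (n = 1; (n in bases_2); n++) {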
                # I used just a string separated with spaces, easier to append.
                temp = temp (temp ? " " : "") bases_1[m] bases_2[n]
            }
        }
        split(temp, bases_2, " ")
    }
}
END {
    generate_kmer(output, 2)
    # Walk the indices in order; for (i in output) has no guaranteed order.
    count = length(output)
    for (i = 1; i <= count; i++) {
        printf("%s%s", output[i], i == count ? "" : " ")
    }
    printf("\n")
}' </dev/null

python -c '
def generate_kmer(k):
    bases_1 = ["A", "T", "G", "C"]
    bases_2 = bases_1.copy()
    i = 0
    while i < k - 1:
        i += 1
        temp = []
        for m in bases_1:
            for n in bases_2:
                temp.append(m + n)
        bases_2 = temp
    return bases_2
print(" ".join(generate_kmer(2)))
'

outputs:

AA AT AG AC TA TT TG TC GA GT GG GC CA CT CG CC
AA AT AG AC TA TT TG TC GA GT GG GC CA CT CG CC
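
As a side note, the repeated "prepend one base to every string" loop builds the Cartesian product of the alphabet with itself k times, so on the Python side the same list can be sketched with the standard library's itertools.product. This is an alternative to the loop in the question, not the asker's method; the function name kmers_product and the bases default are made up here for illustration.

```python
from itertools import product


def kmers_product(k, bases="ATGC"):
    """Return all k-mers over `bases` (hypothetical helper, not from the question)."""
    # product(bases, repeat=k) yields k-tuples with the rightmost position
    # varying fastest, which matches the order of the nested-loop version.
    return ["".join(kmer) for kmer in product(bases, repeat=k)]


if __name__ == "__main__":
    print(" ".join(kmers_product(2)))
```

With k=2 this should print the same sixteen strings, in the same order, as the two lines above.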