How to split values inside a column and create two different tables?

I guess it's easy, but I couldn't figure out how to do it properly.
I have this kind of table:

samplexxx       EH      Tred    GangSTR
dijen006_100    17      17      10,17
dijen006_75     .       .       .
dijen017_100    17,21   17,21   12,17
dijen017_75     17      21      20,20
dijen081_100    17,20   17,20   10,19
dijen081_75     17      17      18,18
dijen082_100    21,22   21,22   14,22
dijen082_75     22      22,27   21,21
dijen083_100    20      20      10,20
dijen083_75     20      20      19,19
dijen1013_100   17,20   17,20   9,19
dijen1013_75    18,20   17,20   17,19
dijen1014_100   17,18   17,18   8,17
dijen1014_75    18      18      18,18
dijen1015_100   nofile  .       7,15
dijen1015_75    16      16      15,16
dijen402_100    21,31   21,29   9,27
dijen402_75     27,31   21,38   18,36

I would like to create two new tables: one with the first value (before the comma) from each column, and a second with the second value, if it exists.
I tried using awk:

less HTT.tsv  | awk -F ',' '{print $1}'
    samplexxx   EH  Tred    GangSTR
    dijen006_100    17  17  10
    dijen006_75 .   .   .
    dijen017_100    17
    dijen017_75 17  21  20
    dijen081_100    17
    dijen081_75 17  17  18
    dijen082_100    21
    dijen082_75 22  22
    dijen083_100    20  20  10
    dijen083_75 20  20  19
    dijen1013_100   17
    dijen1013_75    18
    dijen1014_100   17
    dijen1014_75    18  18  18
    dijen1015_100   nofile  .   7
    dijen1015_75    16  16  15
    dijen402_100    21
    dijen402_75 27

but obviously it's not correct, as values are missing on some rows. Does anyone have an idea? Thanks!

Solution:
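
The attempt in the question seems to lose columns because `-F ','` makes the comma the field separator for the whole line, so `$1` is everything up to the first comma anywhere on the row, not the first value of each column. A minimal demonstration on one row copied from the sample table (tabs between columns are assumed):

```shell
# One row from the question's table, written with explicit tabs.
# With -F ',' the line is cut at commas only, so $1 ends at the first comma
# and the Tred and GangSTR columns are dropped.
printf 'dijen017_100\t17,21\t17,21\t12,17\n' | awk -F ',' '{print $1}'
```

This prints only the sample name and `17`, which matches the truncated rows shown in the question.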

Assuming you want every input line to be represented in both output files:

$ cat tst.awk
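BEGIN { FS=OFS="\t" }           # input and output are tab-separated
NR == 1 {
    str1 = str2 = $0            # copy the header line to both outputs
}
NR > 1 {
    str1 = str2 = $1            # start each output row with the sample name
    for (i=2; i<=NF; i++) {
        split($i,a,/,/)         # a[1] = value before the comma, a[2] = value after it (empty if none)
        str1 = str1 OFS a[1]
        str2 = str2 OFS a[2]
    }
}
{
    print str1 > "foo"          # first values
    print str2 > "bar"          # second values
}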

$ awk -f tst.awk file

$ head -50 foo bar
==> foo <==
samplexxx       EH      Tred    GangSTR
dijen006_100    17      17      10
dijen006_75     .       .       .
dijen017_100    17      17      12
dijen017_75     17      21      20
dijen081_100    17      17      10
dijen081_75     17      17      18
dijen082_100    21      21      14
dijen082_75     22      22      21
dijen083_100    20      20      10
dijen083_75     20      20      19
dijen1013_100   17      17      9
dijen1013_75    18      17      17
dijen1014_100   17      17      8
dijen1014_75    18      18      18
dijen1015_100   nofile  .       7
dijen1015_75    16      16      15
dijen402_100    21      21      9
dijen402_75     27      21      18

==> bar <==
samplexxx       EH      Tred    GangSTR
dijen006_100                    17
dijen006_75
dijen017_100    21      21      17
dijen017_75                     20
dijen081_100    20      20      19
dijen081_75                     18
dijen082_100    22      22      22
dijen082_75             27      21
dijen083_100                    20
dijen083_75                     19
dijen1013_100   20      20      19
dijen1013_75    20      20      19
dijen1014_100   18      18      17
dijen1014_75                    18
dijen1015_100                   15
dijen1015_75                    16
dijen402_100    31      29      27
dijen402_75     31      38      36
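
In the `bar` output above, a cell is left empty whenever the input had no second value. If empty cells are awkward for later processing, one possible variation is to use the return value of `split()`, which is the number of pieces found, to write a placeholder instead. This is only a sketch: the function name `split_pairs` and the choice of `.` as the placeholder are assumptions for illustration, and it prints both rows to standard output rather than to two files.

```shell
# split_pairs: read tab-separated rows on stdin and print two lines per row:
# the first values, then the second values ("." where there is none).
split_pairs() {
    awk 'BEGIN { FS = OFS = "\t" }
    {
        first = second = $1
        for (i = 2; i <= NF; i++) {
            n = split($i, a, /,/)                      # n = number of comma-separated parts
            first  = first  OFS a[1]
            second = second OFS (n > 1 ? a[2] : ".")   # placeholder is an assumption
        }
        print first
        print second
    }'
}

# Hypothetical single-row input taken from the sample table.
printf 'dijen082_75\t22\t22,27\t21,21\n' | split_pairs
```

For that row, the first line carries `22`, `22`, `21` and the second carries `.`, `27`, `21`. Note that a `.` placeholder cannot be told apart from the `.` that already marks missing data in the input, so another marker may be preferable.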