How to replace "00" with "N/A", skipping the first row and first column, in Python

I’m working with GWAS data that has 2 million columns and 522 rows. I need to replace "00" with "N/A" throughout the data. Since the file is huge, I’m using the open_reader method. Can anyone please help?

Note: Need to skip the first row and first column

sample data:

ID,kgp11270025,kgp570033,rs707,kgp7500
1,CT,GT,CA,00
200,00,TG,00,GT
300,AA,00,CG,AA
400,GG,CC,AA,TA 

Desired Output:

ID,kgp11270025,kgp570033,rs707,kgp7500
1,CT,GT,CA,N/A
200,N/A,TG,N/A,GT
300,AA,N/A,CG,AA
400,GG,CC,AA,TA 

The code I have written:

import re

input_file = "test.csv"
output_file = "testresult.csv"

# print("Processing data from", input_file)
with open(input_file) as f:
    lineno = 0
    for line in f:
        lineno = lineno + 1
        if (lineno == 1):
            #need to skip first line
            # print("Skipping line 1 which is a header")
            print(line.rstrip())
        else:
            # print("Processing line {}".format(lineno))
            line = re.sub(r',00', ',N/A', line.rstrip())
            print(line)
    # print("Processed {} lines".format(lineno))

I have tried this, but it is not working. Please help!

>Solution :

"When I use print(line), it shows fine"

Then just use the file keyword argument of print, as follows:

import re

input_file = "test.csv"
output_file = "testresult.csv"

# Open the input for reading and the output for writing in one statement
with open(input_file) as f, open(output_file, "w") as g:
    for lineno, line in enumerate(f, start=1):
        line = line.rstrip()
        if lineno > 1:
            # Data row: the leading comma in the pattern leaves the first column untouched
            line = re.sub(r',00', ',N/A', line)
        # The header (line 1) is written unchanged; print adds the newline back
        print(line, file=g)
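
The pattern ,00 in the code above only matches a "00" that follows a comma, which is why the first column is never touched. A small check on an in-memory string may make this clearer. The sample row and the names LOOSE and STRICT are made up for illustration, and the stricter pattern with a lookahead is an assumption about what you might want if some fields could merely start with "00"; it is not part of the code above.

```python
import re

# Pattern used in the answer: "00" must be preceded by a comma,
# so a value in the first column can never match.
LOOSE = re.compile(r",00")

# Assumed stricter variant: the field must also end right after "00"
# (next character is a comma, or the line ends).
STRICT = re.compile(r",00(?=,|$)")

# Hypothetical row: first column is "00", one field is "001".
row = "00,00,CT,001,00"

print(LOOSE.sub(",N/A", row))   # 00,N/A,CT,N/A1,N/A
print(STRICT.sub(",N/A", row))  # 00,N/A,CT,001,N/A
```

For genotype fields that are always two characters long, both patterns should give the same result.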

Note that when opening the input file, the file name alone is sufficient because the default mode is read-text, but specifying the write mode ("w") is required for the output file.
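
As an alternative to the regular expression, the same task can be sketched with the standard csv module, which compares whole fields instead of substrings. This is only a sketch under some assumptions: the function name replace_missing and its parameters are hypothetical, and it assumes a plain comma-separated file like the sample data.

```python
import csv

def replace_missing(input_file, output_file, missing="00", replacement="N/A"):
    """Copy a CSV file, replacing fields equal to `missing`
    everywhere except the first row and the first column."""
    with open(input_file, newline="") as f, \
         open(output_file, "w", newline="") as g:
        reader = csv.reader(f)
        writer = csv.writer(g, lineterminator="\n")
        for rowno, row in enumerate(reader):
            if rowno > 0:
                # Keep the first column (ID) as is; compare the rest exactly
                row = row[:1] + [
                    replacement if cell == missing else cell
                    for cell in row[1:]
                ]
            # The header row (rowno 0) is written unchanged
            writer.writerow(row)
```

Calling replace_missing("test.csv", "testresult.csv") would process the files named in the question. Rows are still handled one at a time, so the whole file is never held in memory at once.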
