Remove multiple comment lines from a txt file in Python

I have multiple txt files like this one:

https://ftp.ncbi.nlm.nih.gov/dbgap/studies/phs001672/analyses/phs001672.pha004730.txt

The file is saved as C:\Users\test.txt

How can we remove the comment lines at the top of the file (let's say the first 20 lines) and save only the table as a new csv file in Python?

>Solution:

You can use pandas read_table with a custom comment character:

import pandas as pd

url = ("https://ftp.ncbi.nlm.nih.gov/dbgap/studies/"
       "phs001672/analyses/phs001672.pha004730.txt")

# comment="#" drops the lines that start with "#" at the top of the file
df = pd.read_table(url, comment="#")


Output:

print(df)

              ID  Analysis ID       SNP ID  ...  Coded Allele  Sample size  Bin ID
0      506214698         4730    rs1300646  ...             A         8542       6
1      506218329         4730   rs76749734  ...             A          942     158
2      506216207         4730   rs80286553  ...             A        90924      26
...          ...          ...          ...  ...           ...          ...     ...
31662  506245867         4730   rs71334010  ...             A       317118    1422
31663  506245880         4730  rs113480342  ...             A       314121    1422
31664  506245884         4730  rs140069817  ...             T       307546    1422

[31665 rows x 22 columns]
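
The question also asks for the table to be saved as a csv file, and mentions several txt files. A minimal sketch of that step follows, assuming the files are tab-separated with "#" comment lines like the one above. The helper names txt_to_csv and convert_folder are hypothetical, introduced here only for illustration.

```python
from pathlib import Path

import pandas as pd


def txt_to_csv(txt_path, csv_path=None):
    """Read a tab-separated file, drop '#' comment lines, save the table as CSV."""
    txt_path = Path(txt_path)
    # Default output name: same folder and file stem, with a .csv extension
    csv_path = Path(csv_path) if csv_path else txt_path.with_suffix(".csv")
    # comment="#" drops the commented lines at the top of the file
    df = pd.read_table(txt_path, comment="#")
    # index=False keeps the pandas row index out of the CSV
    df.to_csv(csv_path, index=False)
    return csv_path


def convert_folder(folder):
    """Convert every .txt file in `folder`; returns the list of CSV paths written."""
    return [txt_to_csv(p) for p in sorted(Path(folder).glob("*.txt"))]
```

For the file in the question, txt_to_csv(r"C:\Users\test.txt") would be expected to write C:\Users\test.csv next to it. One caveat: comment="#" also cuts off the rest of any data line that contains "#", so this sketch assumes "#" only appears in the comment lines.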