Proper way of cleaning a CSV file

I’ve got a huge CSV file, which looks like this:

1. 02.01.18;"""2,871""";"""2,915""";"""2,871""";"""2,878""";"""+1,66 %""";"""57.554""";"""166.075 EUR""";"""0,044"""
2. 03.01.18;"""2,875""";"""2,965""";"""2,875""";"""2,925""";"""+1,63 %""";"""39.116""";"""114.441 EUR""";"""0,090"""
3. 04.01.18;"""2,915""";"""3,005""";"""2,915""";"""2,988""";"""+2,15 %""";"""58.570""";"""174.168 EUR""";"""0,090"""

In the end I only want to extract the date and ratio. The dataset should look like this:

1.02.01.18, +1,66 %
2.03.01.18, +1,63 %
3.04.01.18, +2,15 %

I tried this, but so far it has only caused more trouble:

import pandas as pd
df = pd.read_csv("Dataset.csv", nrows=0)
print(df)
data = []
for response in df:
    data.append(
       response.split(';')
    )
print(data[0])

Do you know some better way to clean up this dataset?

Solution:

Using pandas:

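import pandas as pd

# The file has no header row, so read only columns 0 and 5 and name them here.
df = pd.read_csv('data.csv', sep=';', usecols=[0, 5], names=['date', 'rate'])
# The parser unwraps one layer of quotes; strip the literal quotes that remain.
df['rate'] = df['rate'].str.strip('"')
print(df)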

Result:

          date     rate
0  1. 02.01.18  +1,66 %
1  2. 03.01.18  +1,63 %
2  3. 04.01.18  +2,15 %

As mentioned in the comments, you probably don't need the extra index in the date column. Also, the index and the excessive quoting suggest the file was not created properly in the first place, and the process that generates it should be fixed.
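
If the file cannot be regenerated, the leading index can be removed after loading. The following is only a sketch: it assumes every date value starts with digits, a dot and optional whitespace (as in the three sample rows), and it reads two of those rows from an in-memory string (the name SAMPLE is made up for illustration) instead of the real file.

```python
import io
import pandas as pd

# Two rows copied from the question; with the real file, pass its path to read_csv.
SAMPLE = '''\
1. 02.01.18;"""2,871""";"""2,915""";"""2,871""";"""2,878""";"""+1,66 %""";"""57.554""";"""166.075 EUR""";"""0,044"""
2. 03.01.18;"""2,875""";"""2,965""";"""2,875""";"""2,925""";"""+1,63 %""";"""39.116""";"""114.441 EUR""";"""0,090"""
'''

df = pd.read_csv(io.StringIO(SAMPLE), sep=';', usecols=[0, 5], names=['date', 'rate'])
df['rate'] = df['rate'].str.strip('"')

# Assumption: the unwanted index looks like "<digits>. " at the start of the value.
df['date'] = df['date'].str.replace(r'^\d+\.\s*', '', regex=True)
print(df)
```

Anchoring the pattern with ^ keeps the dots inside the date itself untouched.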

Note that right now both columns are of type str, which is probably not what you want.
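
One possible way to convert both columns is sketched below. It rests on two assumptions that the question does not confirm: the dates are day.month.two-digit-year, and the comma in the rate is a decimal separator. The DataFrame is rebuilt by hand from the result shown above so that the snippet runs on its own.

```python
import pandas as pd

# Hand-built copy of the result printed above, used only for illustration.
df = pd.DataFrame({
    'date': ['1. 02.01.18', '2. 03.01.18', '3. 04.01.18'],
    'rate': ['+1,66 %', '+1,63 %', '+2,15 %'],
})

# Keep the last whitespace-separated token (the date itself), then parse it.
# Assumption: the format is day.month.year with a two-digit year.
df['date'] = pd.to_datetime(df['date'].str.split().str[-1], format='%d.%m.%y')

# Drop the percent sign, switch the decimal comma to a dot, convert to float.
# Assumption: "1,66" means 1.66, so the values are percentages.
df['rate'] = (
    df['rate']
    .str.replace('%', '', regex=False)
    .str.replace(',', '.', regex=False)
    .str.strip()
    .astype(float)
)

print(df.dtypes)
```

If some rows do not match these formats, pd.to_datetime and astype(float) will raise an error, which is usually preferable to silently keeping bad values.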
