Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Read csv with json column which is nested multiple times into dataframe

I have the following csv file with a json column data. The json data is a little bit complicated and I can’t unwrap it the way I want. My goal is to get some data from configuration into columns and the corrosponding values from readings into the row as value. I found something here but I can’t go deeper in the nested data to extract the id and the value with the solutions customized to my data.

The data I have:

system;datetime;data;
ACD;19.10.2021 11:24:00,000;{"id":"device1","type":"dataValues","payload":{"deviceId":"device1","start":"2021-10-19 11:23:00.000","end":"2021-10-19 11:24:00.000","configuration":[{"name":"DD51","id":1},{"name":"VV22","id":2}],"detectors":[{"readings":[{"reading":{"count":8}}],"id":1},{"readings":[{"reading":{"count":2}}],"id":2}]}};
ACD;19.10.2021 11:25:00,000;{"id":"device1","type":"dataValues","payload":{"deviceId":"device1","start":"2021-10-19 11:24:00.000","end":"2021-10-19 11:25:00.000","configuration":[{"name":"DD51","id":1},{"name":"VV22","id":2}],"detectors":[{"readings":[{"reading":{"count":7}}],"id":1},{"readings":[{"reading":{"count":4}}],"id":2}]}};

What I want:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

system;datetime;data;DD51;VV22;
ACD;19.10.2021 11:24:00,000;8;2;
ACD;19.10.2021 11:25:00,000;7;4;

>Solution :

If your_file.csv contains:

system;datetime;data;
ACD;19.10.2021 11:24:00,000;{"id":"device1","type":"dataValues","payload":{"deviceId":"device1","start":"2021-10-19 11:23:00.000","end":"2021-10-19 11:24:00.000","configuration":[{"name":"DD51","id":1},{"name":"VV22","id":2}],"detectors":[{"readings":[{"reading":{"count":8}}],"id":1},{"readings":[{"reading":{"count":2}}],"id":2}]}};
ACD;19.10.2021 11:25:00,000;{"id":"device1","type":"dataValues","payload":{"deviceId":"device1","start":"2021-10-19 11:24:00.000","end":"2021-10-19 11:25:00.000","configuration":[{"name":"DD51","id":1},{"name":"VV22","id":2}],"detectors":[{"readings":[{"reading":{"count":7}}],"id":1},{"readings":[{"reading":{"count":4}}],"id":2}]}};

Then:

from ast import literal_eval

df = pd.read_csv("your_file.csv", sep=";")[["system", "datetime", "data"]]
df = pd.concat(
    [
        df,
        df.pop("data")
        .apply(literal_eval)
        .apply(
            lambda x: pd.Series(
                {
                    c["name"]: d["readings"][0]["reading"]["count"]
                    for c, d in zip(
                        x["payload"]["configuration"], x["payload"]["detectors"]
                    )
                }
            )
        ),
    ],
    axis=1,
)

Prints:

  system                 datetime  DD51  VV22
0    ACD  19.10.2021 11:24:00,000     8     2
1    ACD  19.10.2021 11:25:00,000     7     4
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading