Assign values to different columns of the same row in a dataframe in python

I have the following dictionary:

   test =  {'AAGUFU 60 (MDE).jpg': 0.2825904813711154,
     'AAGUFU 60 (MCE).jpg': 0.27073007232248,
     'AAGUFU 60 (MCA).jpg': 0.3736480594737323,
     'AAGUFU 60 (MCP).jpg': 0.45155877307246917}

and the following initialized dataframe:

df = pd.DataFrame(columns = ["specimen", "MDE", "MCE", "MCA", "MCP"])
specimen    MDE MCE MCA MCP

I wrote code that is supposed to: 1) extract the catalog name from each filename (e.g. AAGUFU 60) and then the abbreviation between parentheses (e.g. MDE); 2) store the catalog name (AAGUFU 60) in the specimen column of the dataframe, and then store each dictionary value in the column matching its abbreviation, in the same row as that catalog name.


I wrote the following code, but it isn't working. I read somewhere that adding values to the rows of a dataframe iteratively is computationally expensive. Is there an alternative? Maybe creating a list of dictionaries and applying from_dict() to it? Also, I think the nested for loops in my code aren't efficient, and I would like some hints on how to improve them.

for specimenCatalog in test:

    filename = specimenCatalog

    #get filename
    specimen = re.search('.+(?= \()', filename)
    specimen.group(0)

    #get abbreviation btw parenthesis
    muscle = filename[filename.find('(')+1:filename.find(')')]
    muscle

    for measurement in test.values():

        for index, value in df.iterrows():
            if pd.isna(df.specimen[index]) == True:
                df.specimen[index] = specimen.group(0)
            else:
                continue

            df.at[index, muscle] = measurement

My expected output is a dataframe like the one below. I will also need to add more rows to it from other, similar dictionaries:

specimen    MDE     MCE     MCA    MCP
AAGUFU 60   0.282  0.270  0.373  0.451

Solution:

What you could do is:

import pandas as pd

test =  {'AAGUFU 60 (MDE).jpg': 0.2825904813711154,
     'AAGUFU 60 (MCE).jpg': 0.27073007232248,
     'AAGUFU 60 (MCA).jpg': 0.3736480594737323,
     'AAGUFU 60 (MCP).jpg': 0.45155877307246917}
test = [(key, value) for key, value in test.items()]
# turn dictionary into DataFrame
df = pd.DataFrame(
    columns=['specimen', 'value'],
    data=test
)

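# define and use functions for transforming strings
def get_type(s):
    return s[s.find('(') + 1:s.find(')')]
def get_specimen(s):
    return s[:s.find('(')].strip()  # drop the space before '('
# store the abbreviation in its own column before overwriting 'specimen'
df['type'] = df['specimen'].apply(get_type)
df['specimen'] = df['specimen'].apply(get_specimen)

# turn MDE, MCE etc. into columns, one row per specimen
df = df.pivot(index='specimen', columns='type', values='value').reset_index()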
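
The question also asks about building the table from a list of dictionaries and about adding rows from further, similar dictionaries. A minimal sketch of that idea follows. It collects one record per specimen in plain Python and creates the DataFrame once at the end, so no rows are appended iteratively. The names `parse_key`, `build_table` and `KEY_PATTERN` are hypothetical helpers, the second dictionary `other` is made up purely for illustration, and the sketch assumes every key looks like `'<specimen> (<abbreviation>).jpg'`.

```python
import re

import pandas as pd

# Assumed key format: '<specimen> (<abbreviation>).<extension>'
KEY_PATTERN = re.compile(r'^(?P<specimen>.+?)\s*\((?P<muscle>[^)]+)\)')


def parse_key(filename):
    """Split 'AAGUFU 60 (MDE).jpg' into ('AAGUFU 60', 'MDE')."""
    match = KEY_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected filename format: {filename!r}")
    return match.group('specimen'), match.group('muscle')


def build_table(dictionaries, columns=('MDE', 'MCE', 'MCA', 'MCP')):
    """Build one row per specimen from any number of measurement dicts."""
    rows = {}  # specimen -> record (a dict that becomes one DataFrame row)
    for measurements in dictionaries:
        for filename, measurement in measurements.items():
            specimen, muscle = parse_key(filename)
            record = rows.setdefault(specimen, {'specimen': specimen})
            record[muscle] = measurement
    # A list of dictionaries goes straight into the DataFrame constructor;
    # abbreviations missing for a specimen are left as NaN.
    return pd.DataFrame(list(rows.values()), columns=['specimen', *columns])


test = {'AAGUFU 60 (MDE).jpg': 0.2825904813711154,
        'AAGUFU 60 (MCE).jpg': 0.27073007232248,
        'AAGUFU 60 (MCA).jpg': 0.3736480594737323,
        'AAGUFU 60 (MCP).jpg': 0.45155877307246917}

# Invented second dictionary, only to show how more rows are added.
other = {'BBXYZ 12 (MDE).jpg': 0.31,
         'BBXYZ 12 (MCE).jpg': 0.29}

result = build_table([test, other])
print(result)
```

Passing `columns` explicitly keeps the column order from the question (MDE, MCE, MCA, MCP) instead of the alphabetical order that `pivot` produces.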