Assign values to different columns of the same row in a dataframe in python

July 8, 2022

I have the following dictionary:

test = {'AAGUFU 60 (MDE).jpg': 0.2825904813711154,
        'AAGUFU 60 (MCE).jpg': 0.27073007232248,
        'AAGUFU 60 (MCA).jpg': 0.3736480594737323,
        'AAGUFU 60 (MCP).jpg': 0.45155877307246917}

and the following initialized dataframe:

df = pd.DataFrame(columns = ["specimen", "MDE", "MCE", "MCA", "MCP"])
specimen    MDE MCE MCA MCP

I want to: 1) extract the catalog name from each filename (e.g. AAGUFU 60) and then the abbreviation between the parentheses (e.g. MDE); 2) store the catalog name (AAGUFU 60) in the specimen column of the dataframe, and then store each dictionary value in the column that matches its abbreviation, in the same row as that catalog name.

I wrote the following code, but it isn't working. I read somewhere that adding values to the rows of a dataframe iteratively is discouraged because it is computationally expensive. Is there an alternative? Maybe building a list of dictionaries and applying from_dict() to it? Also, I think the nested for loops in my code aren't efficient, and I would like some hints to improve them.

for specimenCatalog in test:

    filename = specimenCatalog

    # get filename
    specimen = re.search(r'.+(?= \()', filename)
    specimen.group(0)

    # get abbreviation between parentheses
    muscle = filename[filename.find('(')+1:filename.find(')')]
    muscle

    for measurement in test.values():

        for index, value in df.iterrows():
            if pd.isna(df.specimen[index]) == True:
                df.specimen[index] = specimen.group(0)
            else:
                continue

            df.at[index, muscle] = measurement

So my expected output would be a dataframe as follows, and I will need to add more rows to the dataframe from other similar dictionaries:

specimen    MDE     MCE     MCA    MCP
AAGUFU 60   0.282  0.270  0.373  0.451

>Solution :

What you could do is:

import pandas as pd

test = {'AAGUFU 60 (MDE).jpg': 0.2825904813711154,
        'AAGUFU 60 (MCE).jpg': 0.27073007232248,
        'AAGUFU 60 (MCA).jpg': 0.3736480594737323,
        'AAGUFU 60 (MCP).jpg': 0.45155877307246917}
test = [(key, value) for key, value in test.items()]
# turn dictionary into DataFrame
df = pd.DataFrame(
    columns=['specimen', 'value'],
    data=test
)

# define and use functions for transforming strings
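def get_type(s):
    # abbreviation between the parentheses, e.g. 'MDE'
    return s[s.find('(') + 1:s.find(')')]
def get_specimen(s):
    # text before the opening parenthesis, without the trailing space
    return s[:s.find('(')].strip()
# store the abbreviation in a new 'type' column before overwriting 'specimen'
df['type'] = df['specimen'].apply(get_type)
df['specimen'] = df['specimen'].apply(get_specimen)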

# turn MDE, MCE etc. into columns
df.pivot(index='specimen', columns='type', values='value')
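
Since the question also asks about building a list of dictionaries and about adding rows from other similar dictionaries, here is a minimal sketch of that alternative. It is not part of the answer above. The helper name build_rows, the regular expression, and the second dictionary (its specimen name and values are invented purely for illustration) are all assumptions. The idea is to collect one plain dictionary per specimen first and create the DataFrame once at the end, so no dataframe rows are modified inside a loop.

```python
import re
from collections import defaultdict

import pandas as pd

test = {'AAGUFU 60 (MDE).jpg': 0.2825904813711154,
        'AAGUFU 60 (MCE).jpg': 0.27073007232248,
        'AAGUFU 60 (MCA).jpg': 0.3736480594737323,
        'AAGUFU 60 (MCP).jpg': 0.45155877307246917}

# Hypothetical second dictionary: the specimen name and values are invented
# for illustration only and do not come from the question.
other = {'EXAMPLE 01 (MDE).jpg': 0.1,
         'EXAMPLE 01 (MCE).jpg': 0.2,
         'EXAMPLE 01 (MCA).jpg': 0.3,
         'EXAMPLE 01 (MCP).jpg': 0.4}

# Assumed filename layout: "<specimen> (<abbreviation>).<extension>"
pattern = re.compile(r'^(?P<specimen>.+?)\s*\((?P<muscle>[^)]+)\)')

def build_rows(*dictionaries):
    """Group the measurements by specimen, one dict per future row."""
    rows = defaultdict(dict)
    for d in dictionaries:
        for filename, measurement in d.items():
            match = pattern.match(filename)
            if match is None:
                continue  # skip names that do not follow the assumed layout
            rows[match['specimen']][match['muscle']] = measurement
    return [{'specimen': name, **values} for name, values in rows.items()]

columns = ['specimen', 'MDE', 'MCE', 'MCA', 'MCP']
# Build the DataFrame in a single call from the list of dictionaries
df = pd.DataFrame(build_rows(test, other), columns=columns)
print(df)
```

Each dictionary is traversed only once, and any abbreviation missing for a specimen would simply show up as NaN in that row.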