I have a dictionary called model_scores_for_datasets that looks like this:
{'Unprocessed': {'Logistic Regression': '0.967', 'Support Vector Machine': '0.967', 'Decision Tree': '0.933', 'Random Forest': '0.933', 'LinearDiscriminant': '1.000', 'K-Nearest Neighbour': '1.000', 'Naive Bayes': '0.967', 'XGBoost': '0.933'}, 'Standardisation': {'Logistic Regression': '0.933', 'Support Vector Machine': '0.967', 'Decision Tree': '0.933', 'Random Forest': '0.967', 'LinearDiscriminant': '0.967', 'K-Nearest Neighbour': '0.967', 'Naive Bayes': '0.967', 'XGBoost': '0.933'}, 'Normalisation': {'Logistic Regression': '0.967', 'Support Vector Machine': '0.967', 'Decision Tree': '0.933', 'Random Forest': '0.967', 'LinearDiscriminant': '0.967', 'K-Nearest Neighbour': '0.967', 'Naive Bayes': '0.967', 'XGBoost': '0.933'}, 'Rescale': {'Logistic Regression': '0.967', 'Support Vector Machine': '0.967', 'Decision Tree': '0.933', 'Random Forest': '0.933', 'LinearDiscriminant': '0.967', 'K-Nearest Neighbour': '0.967', 'Naive Bayes': '0.967', 'XGBoost': '0.933'}}
{'Unprocessed': {'Logistic Regression': '0.967', 'Support Vector Machine': '0.967', 'Decision Tree': '0.933', 'Random Forest': '0.933', 'LinearDiscriminant': '1.000', 'K-Nearest Neighbour': '1.000', 'Naive Bayes': '0.967', 'XGBoost': '0.933'}, 'Standardisation': {'Logistic Regression': '0.933', 'Support Vector Machine': '0.967', 'Decision Tree': '0.933', 'Random Forest': '0.967', 'LinearDiscriminant': '0.967', 'K-Nearest Neighbour': '0.967', 'Naive Bayes': '0.967', 'XGBoost': '0.933'}, 'Normalisation': {'Logistic Regression': '0.967', 'Support Vector Machine': '0.967', 'Decision Tree': '0.933', 'Random Forest': '0.967', 'LinearDiscriminant': '0.967', 'K-Nearest Neighbour': '0.967', 'Naive Bayes': '0.967', 'XGBoost': '0.933'}, 'Rescale': {'Logistic Regression': '0.967', 'Support Vector Machine': '0.967', 'Decision Tree': '0.933', 'Random Forest': '0.933', 'LinearDiscriminant': '0.967', 'K-Nearest Neighbour': '0.967', 'Naive Bayes': '0.967', 'XGBoost': '0.933'}}
I want to get the average for each, dictionary in the list of dictionaries. There are 4 total "Unprocessed
Standardisation
Normalisation
Rescale"
and 8 total metrics for each that look like this:
{'Logistic Regression': '0.967', 'Support Vector Machine': '0.967', 'Decision Tree': '0.933', 'Random Forest': '0.933', 'LinearDiscriminant': '1.000', 'K-Nearest Neighbour': '1.000', 'Naive Bayes': '0.967', 'XGBoost': '0.933'}
So each of the 4 scales have 8 different ML altos and I want to get an average to say that for example on average "standardisation" scored the highest so it will be used during the machine learning process.
This is the code, but it giving me an error: TypeError: can't convert type 'str' to numerator/denominator
avgDict = model_scores_for_datasets
for st,vals in avgDict.items():
print(st,(vals))
#print (st)
for st,vals in avgDict.items():
print("Average for {} is {}".format(st,mean(vals)))
>Solution :
import numpy as np
for mode in results.keys():
mean = np.mean([float(value) for value in results[mode].values()])
print(f"{mode}: {mean}")
Out:
Unprocessed: 0.9624999999999999
Standardisation: 0.9542499999999999
Normalisation: 0.9584999999999999
Rescale: 0.9542499999999999
For PythonCrazy
print({mode: np.mean([float(value) for value in results[mode].values()]) for mode in results.keys()})