I have been struggling to find a nice solution to this problem:
I have a 100-row dataframe df1:
, Col 0, Col 1, Col 2, Score, Combined
...
23, Oslo, isCapitalOf, Norway, 0.9, None
...
I also have another 100-row dataframe df2:
, Col 0, Col 1, Col 2, Score, Combined
...
43, Norway, highestMountain, Galdhøpiggen, 0.7, None
...
The aim is to fill the Combined column in df2 by applying a function (for instance multiplication) to the Score values in df1 and df2 wherever the value in df1 Col 2 equals the value in df2 Col 0, which is Norway in the example (0.9 * 0.7 = 0.63):
, Col 0, Col 1, Col 2, Score, Combined
...
43, Norway, highestMountain, Galdhøpiggen, 0.7, 0.63
...
Then, do this for all mappings between these columns, not just "Norway". Any help on this is much appreciated!
>Solution :
With this solution you can use any function you want:
import numpy as np

def apply_function_on_mapped_scores(df1, df2, func):
    # Create a mapping from df1 'Col 2' values to their 'Score' values
    # (this assumes the 'Col 2' values in df1 are unique)
    mapping = df1.set_index('Col 2')['Score']
    # Combine each df2 score with the mapped df1 score; rows with no match use 1
    df2['Combined'] = df2.apply(lambda x: func(x['Score'], mapping.get(x['Col 0'], 1)), axis=1)
    return df2

# Test the function with multiplication
def multiply(x, y):
    return x * y

df2 = apply_function_on_mapped_scores(df1, df2, multiply)

# Test the function with minimum
def minimum(x, y):
    return np.minimum(x, y)

df2 = apply_function_on_mapped_scores(df1, df2, minimum)
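If the frames grow beyond a few hundred rows, a vectorized variant may be worth trying instead of a row-wise apply. The following is only a sketch and not part of the answer above: the name combine_scores_vectorized, the default parameter, the sample rows other than the Oslo and Galdhøpiggen ones, and the choice to keep the first score when df1 repeats a Col 2 value are all assumptions. It also requires func to work elementwise on whole Series (for example multiplication or np.minimum).

```python
import pandas as pd

# Hypothetical sample data; only the Oslo and Galdhøpiggen rows come from the question.
df1 = pd.DataFrame({
    "Col 0": ["Oslo", "Paris"],
    "Col 1": ["isCapitalOf", "isCapitalOf"],
    "Col 2": ["Norway", "France"],
    "Score": [0.9, 0.8],
})
df2 = pd.DataFrame({
    "Col 0": ["Norway", "Sweden"],
    "Col 1": ["highestMountain", "highestMountain"],
    "Col 2": ["Galdhøpiggen", "Kebnekaise"],
    "Score": [0.7, 0.6],
})

def combine_scores_vectorized(df1, df2, func, default=1.0):
    # Keep one score per key so the lookup is unambiguous
    # (assumption: the first occurrence wins).
    mapping = df1.drop_duplicates("Col 2").set_index("Col 2")["Score"]
    # Look up the df1 score for every df2 row; unmatched rows get the default.
    mapped = df2["Col 0"].map(mapping).fillna(default)
    # Work on a copy so the caller's df2 is left unchanged.
    out = df2.copy()
    out["Combined"] = func(out["Score"], mapped)
    return out

result = combine_scores_vectorized(df1, df2, lambda a, b: a * b)
print(result[["Col 0", "Score", "Combined"]])
```

Here the Sweden row has no match in df1, so it is combined with the default of 1.0 and keeps its own score under multiplication. Pick a different default (or drop the fillna) if unmatched rows should end up as NaN instead.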