I have a dataframe. Column one holds a list of numbers in each row. Column two holds the average of that list. I need to create a third column by subtracting the mean from each element of the list in column one.
import numpy as np
import pandas as pd
df = pd.DataFrame({'A':[[4.2,2.3,6.5,2.3],[4.1,5.3,6.5,3.8]]})
df['avg'] = df['A'].apply(lambda p: np.average(p))
df['a_avg'] = df['A'].apply(lambda p: (np.array(p)-df['avg']).to_list())
Expected output:
df
                      A    avg                           a_avg
0  [4.2, 2.3, 6.5, 2.3]  3.825  [0.375, -1.525, 2.675, -1.525]
1  [4.1, 5.3, 6.5, 3.8]  4.925  [-0.825, 0.375, 1.575, -1.125]
I created column two only for my own clarity. If there is a way to get column three directly from column one, that is also fine. What's wrong with the code I have written?
>Solution :
"...if there is a way we can directly get column three from column one, that is also good."
I would do it this way:
df["a_avg"] = [[round(e - np.average(lst), 3) for e in lst] for lst in df["A"]]
Output:
print(df)
                      A                           a_avg
0  [4.2, 2.3, 6.5, 2.3]  [0.375, -1.525, 2.675, -1.525]
1  [4.1, 5.3, 6.5, 3.8]  [-0.825, 0.375, 1.575, -1.125]
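As for what is wrong with the original attempt: as far as I can tell, the lambda refers to df['avg'], which is the whole column rather than the mean for the current row, so each 4-element array is subtracted against a 2-element Series. If you want to keep the avg column, one option is to apply row-wise with axis=1 so that the row's own mean is available as a scalar. This is only a sketch of that idea, not the only way to do it; the rounding to 3 decimals is my assumption, added to match the expected output.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [[4.2, 2.3, 6.5, 2.3], [4.1, 5.3, 6.5, 3.8]]})
df["avg"] = df["A"].apply(np.mean)

# axis=1 passes one row at a time, so row["avg"] is a single number,
# not the whole column as in the original lambda.
df["a_avg"] = df.apply(
    lambda row: np.round(np.array(row["A"]) - row["avg"], 3).tolist(),
    axis=1,
)

print(df)
```

Note that a NumPy array is converted with .tolist(); .to_list() is a pandas Series method.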