I have the following pandas dataframe:
import numpy as np
import pandas as pd

df = pd.DataFrame({0: [11, 12, 31], 1: [6, 14, 27], 2: [11, 24, 21], 3: [1, 24, 20]})
    0   1   2   3
0  11   6  11   1
1  12  14  24  24
2  31  27  21  20
For each row, one at a time, I want to fit a polynomial regression in which the column names are the X values and the row values are the Y values.
I know I can use iterrows:
x = df.columns.to_numpy()
for index, row in df.iterrows():
    print(np.polyfit(x, row, 2))
which produces:
[-1.25 1.25 9.75]
[-0.5 6.1 11.1]
[ 0.75 -6.15 31.35]
but this can take a long time on large dataframes.
Is there a faster way to do this? Thanks
>Solution :
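np.polyfit accepts a 2-D y whose first dimension has the same length as x, and it fits each column of y independently. So pass the transposed DataFrame (one column per original row) and transpose the result back: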
np.polyfit(df.columns, df.T, 2).T
gives:
array([[-1.25,  1.25,  9.75],
       [-0.5 ,  6.1 , 11.1 ],
       [ 0.75, -6.15, 31.35]])
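To check that the vectorized call really matches the row-by-row loop, here is a minimal self-contained sketch using the DataFrame from the question. The names `coeffs`, `looped`, and `result`, and the column labels `a2`, `a1`, `a0`, are made up for illustration; np.polyfit returns coefficients with the highest power first, which is what the labels assume.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({0: [11, 12, 31], 1: [6, 14, 27], 2: [11, 24, 21], 3: [1, 24, 20]})

# x values come from the column labels; y is transposed so that each
# original row becomes one column, the layout np.polyfit expects for 2-D y.
x = df.columns.to_numpy(dtype=float)
y = df.to_numpy(dtype=float).T          # shape (len(x), n_rows)

# One call fits every row; transpose back to one row of coefficients per df row.
coeffs = np.polyfit(x, y, 2).T          # shape (n_rows, 3)

# Reference result: the original row-by-row approach.
looped = np.array([np.polyfit(x, row, 2) for row in df.to_numpy(dtype=float)])
assert np.allclose(coeffs, looped)

# Optional: wrap the coefficients in a DataFrame aligned with the original index.
# The labels a2, a1, a0 are hypothetical (highest power first).
result = pd.DataFrame(coeffs, index=df.index, columns=["a2", "a1", "a0"])
print(result)
```

The speed-up comes from solving a single least-squares problem with many right-hand sides instead of one problem per row, so the gain should grow with the number of rows.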