I tried to implement the R2 formula from Wikipedia, but my result differs from sklearn's r2_score. Why is that?
import statistics
import numpy as np
from sklearn.metrics import r2_score
y_true = np.array([1, 1, 0])
y_pred = np.array([1, 0, 1])
r2 = r2_score(y_true, y_pred)
print(r2)
y_true_mean = statistics.mean(y_true)
r2 = 1 - np.sum((y_true - y_pred) ** 2) / np.sum((y_true - y_true_mean) ** 2)
print(r2)
-1.9999999999999996
0.0
>Solution :
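The different outcome appears to originate in statistics.mean. If that is Python's standard-library statistics module, it tries to return a value of the same type as its input, so for a NumPy integer array the mean 2/3 seems to get truncated to 0. That is consistent with your output: with a mean of 0 the denominator is 2 rather than 2/3, and the formula gives 1 - 2/2 = 0.0. Try np.mean instead, which returns a float. That gives the same R2 as sklearn: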
import numpy as np
y_true = np.array([1, 1, 0])
y_pred = np.array([1, 0, 1])
y_true_mean = np.mean(y_true)
r2 = 1 - np.sum((y_true - y_pred) ** 2) / np.sum((y_true - y_true_mean) ** 2)
print(r2)
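To make the effect of the mean visible, here is a minimal sketch that computes the formula twice: once with the mean fixed at 0 (what the truncation appears to produce) and once with a float mean. The helper name r2_manual is hypothetical, not part of NumPy or sklearn, and the cast to float is an assumption added to guard against integer truncation.

```python
import numpy as np

def r2_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (hypothetical helper)."""
    # Cast to float so the mean can never be truncated to an integer.
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot

y_true = [1, 1, 0]
y_pred = [1, 0, 1]

# Same residual sum of squares (2), but with the mean taken as 0
# the denominator is 1 + 1 + 0 = 2 instead of 2/3.
r2_truncated = 1 - 2 / np.sum((np.array(y_true) - 0) ** 2)
r2_value = r2_manual(y_true, y_pred)

print(r2_truncated)  # result with the truncated mean
print(r2_value)      # result with the float mean, approximately -2
```

The sketch does not handle the case where all values in y_true are equal, in which the denominator is zero.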