Why do I get NaN when calculating the correlation of these two pandas Series?

I have two pandas Series objects:

print(s1)
print("="*80)
print(s2)

# output

0   -0.443538
1   -0.255012
2   -0.582948
3   -0.393485
4    0.430831
5    0.232216
6   -0.014269
7   -0.133158
8    0.127162
9   -1.855860
Name: s1, dtype: float64
================================================================================
29160   -0.650857
29161   -0.135428
29162    0.039544
29163    0.241506
29164   -0.793352
29165   -0.054500
29166    0.901152
29167   -0.660474
29168    0.098551
29169    0.822022
Name: s2, dtype: float64

I want to calculate the correlation of these two series:

s1.corr(s2)

#output

nan

I don't know why I get `nan` here. Using NumPy gives the correct result:

np.corrcoef(s1,s2)[0][1]

#output

-0.4918385039519204

Did I do something wrong in the above code?

Solution:

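Your series indices are not aligned. Series.corr matches the two series on their index labels before computing the correlation. s1 is indexed 0 to 9 and s2 is indexed 29160 to 29169, so they share no labels, no paired observations remain, and the result is NaN. np.corrcoef ignores the index and pairs the values by position, which is why it returns a number.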
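The effect can be reproduced with a small sketch. The values and the names `a` and `b` below are made up for illustration (they are not the data from the question); only the disjoint index labels matter.

```python
import numpy as np
import pandas as pd

# Made-up values; the two series have completely different index labels.
a = pd.Series([1.0, 2.0, 3.0, 4.0], index=[0, 1, 2, 3], name="a")
b = pd.Series([2.0, 1.0, 4.0, 3.0], index=[100, 101, 102, 103], name="b")

# The indexes share no labels.
shared_labels = a.index.intersection(b.index)
print(len(shared_labels))

# Label-based pairing finds no common observations, so corr() gives NaN.
label_based = a.corr(b)
print(label_based)

# NumPy pairs the values by position and ignores the index.
position_based = np.corrcoef(a, b)[0, 1]
print(position_based)
```

Checking `index.intersection` first is a quick way to confirm that misaligned labels are the cause before changing any index.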

You can force the index of s1 on s2 using set_axis:

s1.corr(s2.set_axis(s1.index))

Output: -0.49183852303556697
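If overwriting one index with the other feels too implicit, there are a couple of alternatives that also pair the values by position. The sketch below uses made-up data and hypothetical names (`a`, `b`), and assumes both series have the same length and that positional pairing is what you actually want.

```python
import numpy as np
import pandas as pd

# Made-up values with disjoint index labels, as in the question.
a = pd.Series([1.0, 2.0, 3.0, 4.0], index=[0, 1, 2, 3])
b = pd.Series([2.0, 1.0, 4.0, 3.0], index=[100, 101, 102, 103])

# Option 1 (the approach from the answer): give b the index of a.
r_set_axis = a.corr(b.set_axis(a.index))

# Option 2: reset both indexes so each series is labelled 0..n-1.
r_reset = a.reset_index(drop=True).corr(b.reset_index(drop=True))

# Option 3: drop the indexes entirely and work on the raw arrays.
r_numpy = np.corrcoef(a.to_numpy(), b.to_numpy())[0, 1]

print(r_set_axis, r_reset, r_numpy)
```

All three pair the first value of one series with the first value of the other, and so on. That is only meaningful when the row order of both series corresponds to the same observations.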
