How to list the highest correlation pairs (one specific column with all others) in pandas?

To find all top correlations, you can use the following code, according to "List Highest Correlation Pairs from a Large Correlation Matrix in Pandas?":

import pandas as pd

d = {'col1': [1, 2], 'col2': [3, 4], 'col3': [7, 3]}
df = pd.DataFrame(data=d)


df.corr().unstack().sort_values().drop_duplicates()

How do I change the line above to compare just one specific column with all the others?

I do not want to compare col2 to col3. Only the correlations of col1 with col2 and of col1 with col3 matter to me.

Solution:

You can first compute the full correlation matrix with df.corr().
Then select the row (or, equivalently, the column, since the matrix is symmetric) of that matrix for the column you are interested in.

Say you are interested in the correlation between col1 and the others:

import pandas as pd

d = {'col1': [1, 2], 'col2': [3, 4], 'col3': [7, 3]}
df = pd.DataFrame(data=d)

df.corr().loc['col1']

# col1    1.0
# col2    1.0
# col3   -1.0
# Name: col1, dtype: float64
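
If you also want the result ranked, and without the trivial correlation of col1 with itself, something along these lines may help. This is only a sketch: the four-row data and the names target, corr_with_target, ranked and alt are made up for illustration and are not part of the question. It ranks by absolute value so that strong negative correlations are not pushed to the bottom, and it shows DataFrame.corrwith as a possible alternative that avoids building the full matrix.

```python
import pandas as pd

# Toy data, invented for illustration (slightly larger than the example above)
df = pd.DataFrame({
    'col1': [1, 2, 3, 4],
    'col2': [2, 4, 6, 9],
    'col3': [7, 3, 2, 1],
    'col4': [1, 3, 2, 4],
})

target = 'col1'  # the one column to compare against all others

# Take the target's column of the correlation matrix and drop the
# self-correlation entry (always 1.0)
corr_with_target = df.corr()[target].drop(target)

# Order by absolute correlation, strongest first, keeping the original signs
order = corr_with_target.abs().sort_values(ascending=False).index
ranked = corr_with_target.reindex(order)
print(ranked)

# Alternative: correlate every other column with the target directly,
# without computing the full matrix
alt = df.drop(columns=target).corrwith(df[target])
```

Both approaches use Pearson correlation by default, so they should give the same values; corrwith may be the cheaper option when the frame has many columns and only one of them is of interest.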