Why does the correlation matrix have fewer columns than the pandas DataFrame?

When I use pandas.DataFrame.corr() to create a correlation matrix, the result (corr_matrix) has 37 columns, while the DataFrame (all_data) has 80 columns. I expected the two counts to match; in other words, the correlation matrix should have shape (80, 80). I imputed all missing data before creating the correlation matrix, so why do the column counts differ?

The code

corr_matrix = all_data.corr(method="kendall").abs()
print("Missing value descending:\n{}\n".format(all_data.isnull().sum().sort_values(ascending=False)[:5]))
print("Original Dataframe shape: {}".format(all_data.shape))
print("Correlation Matrix shape: {}".format(corr_matrix.shape))

The output

Missing value descending:
MSSubClass     0
MSZoning       0
GarageYrBlt    0
GarageType     0
FireplaceQu    0
dtype: int64

Original Dataframe shape: (2904, 80)
Correlation Matrix shape: (37, 37)

>Solution :

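Does the all_data DataFrame contain categorical columns?

corr() only computes correlations between numerical columns; categorical (non-numeric) columns are ignored. At least, that is what the following example shows. (This is the default before pandas 2.0. From pandas 2.0 on, numeric_only defaults to False, so pass numeric_only=True to get the same result.)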

import pandas as pd

train = pd.DataFrame({
    "cat1": list("ABC"),
    "cat2": list("xyz"),
    "num1": [1,2,3],
    "num2": [-2,10,-5]
})

# 2 numerical and 2 categorical columns
>>> train 

  cat1 cat2  num1  num2
0    A    x     1    -2
1    B    y     2    10
2    C    z     3    -5

# only numerical columns are present 
>>> train.corr(method="kendall").abs()

          num1      num2
num1  1.000000  0.333333
num2  0.333333  1.000000
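
To check this on your own data, you can count the numeric columns directly and compare that count with the shape of the correlation matrix. The sketch below uses a small made-up stand-in for all_data (the column values are invented for illustration), and it uses the default Pearson method only to avoid extra dependencies; the choice of method should not affect which columns are included. It assumes pandas 1.5 or later, where corr() accepts numeric_only.

```python
import pandas as pd

# Hypothetical stand-in for all_data: two string columns and two numeric columns.
all_data = pd.DataFrame({
    "MSZoning": ["RL", "RM", "RL", "FV"],
    "GarageType": ["Attchd", "Detchd", "Attchd", "BuiltIn"],
    "LotArea": [8450, 9600, 11250, 9550],
    "YearBuilt": [2003, 1976, 2001, 1915],
})

# Columns with a numeric dtype: these are the ones corr() can use.
numeric_cols = all_data.select_dtypes(include="number").columns

# Everything else is left out of the correlation matrix.
non_numeric_cols = all_data.columns.difference(numeric_cols)

# numeric_only=True makes the filtering explicit (assumes pandas >= 1.5).
corr_matrix = all_data.corr(numeric_only=True).abs()

print("Numeric columns:", list(numeric_cols))
print("Dropped columns:", list(non_numeric_cols))
print("Correlation Matrix shape:", corr_matrix.shape)
```

If the number of numeric columns in all_data is 37, that accounts for the (37, 37) shape. To include the remaining columns, they would first need to be encoded as numbers.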