Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Calculate the ratio of specific columns with the same values as other columns in Pandas

Given a DataFrame df as follows:

   id  target1  target2  value1  value2  value3
0   1        1       -1     NaN     0.0     NaN
1   2       -1        1     1.0     1.0     1.0
2   3        1       -1     1.0    -1.0    -1.0
3   4        1       -1     1.0     1.0     1.0
4   5       -1       -1    -1.0     1.0     1.0
5   6       -1       -1     1.0    -1.0    -1.0
6   7       -1        1    -1.0     NaN     1.0

Let’s say for columns target1 and target2, I hope to know which value columns (value1, value2, value3, value4, etc.) have more same values with them respectively. Please note if value column’s values are NaNs, it will not be included in the comparison.

For example, for target1-value1, we find id is 3, 4, 5 and 7, they have same values, so the ratio is 4/6 = 0.666667 (id==1 will not be included since value1 is NaN for that cell). Same logic for other columns.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

So my question is how could I get the expected result below with Pandas?

  value_cols   target1   target2
0     value1  0.666667  0.333333
1     value2  0.333333  0.500000
2     value3  0.333333  0.666667

>Solution :

You could use apply. The general idea is to iterate over the "target" columns and count the number of matches with "value" columns (with eq on axis + sum); then find the ratio by dividing the totals found with the number of non-NaN rows for each "value" column:

val_cols = ['value1','value2','value3']
totals = df[val_cols].notna().sum()
out = df[['target1','target2']].apply(lambda col: df[val_cols].eq(col, axis=0).sum() / totals)

Output:

         target1   target2
value1  0.666667  0.333333
value2  0.333333  0.500000
value3  0.333333  0.666667
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading