Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Creating formula percentage between 2 column in dataframe

For example I have this kind of dataframe

df = pd.DataFrame({
    'A': [True, True, False, True, False],
    'B': [False, False, True, True, True],
    'C': [False, True, True, True, True],
    'D': [False, False, False, True, False],
    'E': [False, True, True, False, False]})

    A       B       C       D       E
0   True    False   False   False   False
1   True    False   True    False   True
2   False   True    True    False   True
3   True    True    True    True    False
4   False   True    True    False   False

I want to create a table that calculate the percentage of subset between two column that has True value based on this formula

(x,y) = countIsAllTrue( col_x & col_y ) / countIsTrue(col_x)

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

in this example, the output should be like this (I put the calculation process here just for explanation how to use the formula that I talk about)

    A       B       C       D       E
A   3/3=1   1/3     2/3     1/3     1/3
B   1/3     3/3=1   3/3     1/3     1/3
C   2/4     3/4     4/4=1   1/4     2/4
D   1/1=1   1/1=1   1/1=1   1/1=1   0/1=0
E   1/2     1/2     2/2=1   0/2=0   2/2=1

Anyone can help me to apply this formula?

Note: I used boolean because I thought it is better than using 1/0, and can be counted or sum as 1/0, or is it actually doesnt really matter of which one I used?

>Solution :

Let us do dot product

s = df.astype(int)
(s.T @ s).div(s.sum(), axis=0)

          A         B         C         D         E
A  1.000000  0.333333  0.666667  0.333333  0.333333
B  0.333333  1.000000  1.000000  0.333333  0.333333
C  0.500000  0.750000  1.000000  0.250000  0.500000
D  1.000000  1.000000  1.000000  1.000000  0.000000
E  0.500000  0.500000  1.000000  0.000000  1.000000
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading