Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

count uniq ids between column and index (like pivot table)

I have a table:

id Name
1 A
2 A
2 B
2 C
3 B
3 C
4 C
4 A

And I need a new table where unique IDs are calculated at the intersection of names. That is, for example, ID 2 has both A and B, that is, at the intersection of A and B, there is already one ID. Something like this:

Name A B C
A 1 2
B 1 2
C 2 2

I tried with pandas.pivot_table, but i couldn’t understand how

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

Let us use crosstab to create frequency table, then calculate the inner product to generate similarity / intersection matrix and mask the diagonal values with 0

s = pd.crosstab(df['Name'], df['id']).clip(upper=1)
s = s @ s.T
np.fill_diagonal(s.values, 0)

print(s)

Name  A  B  C
Name         
A     0  1  2
B     1  0  2
C     2  2  0
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading