Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Pandas Avoiding Summation of Duplicates

I am working with the Pandas Library and learning more about it with data manipulation and analysis. With this Dataframe ITEM

  ItemNo1 ItemNo2 ItemNo3 ItemNo4 
0   Paper   Paper    Rock   Paper
1 Scissor Scissor    Rock     NaN
2    Rock    Rock    Rock   Paper
3 Scissor   Paper Scissor   Paper
4    Rock   Paper Scissor    Rock

I’d like to create a Dataframe with two columns. Item and Unique Item Row Count. The unique item row count should only have the unique row values of Item. Meaning if there are three of the same item in the row, it will only take into account the first instance for count. This would mean the following output:


  Item UniqueItemRowCount
0 Paper                 4
1 Rock                  4
2 Scissor               3

There will be 4 for rock because for index 0 and 1 there is one rock. For index 2 and 4 there is already one Rock found so the rest are ignored/not added for the sum.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

My main issue is trying to create a lambda function for this as I cannot make it properly get the sum. I have also tried to make the values NaN (null) if there is a duplicate found in the row but this does not work either.

>Solution :

Use DataFrame.melt with DataFrame.drop_duplicates for count duplicates per rows and then count values by Series.value_counts:

df1 = (df.melt()
         .drop_duplicates()['value']
         .value_counts()
         .rename_axis('Item')
         .reset_index(name='UniqueItemRowCount'))
print (df1)
      Item  UniqueItemRowCount
0     Rock                   4
1  Scissor                   3
2    Paper                   3
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading