Home Python: Fastest Way to get no of duplicates of each uniqe row in a pandas DF full of duplicates

Questions

Python: Fastest Way to get no of duplicates of each uniqe row in a pandas DF full of duplicates

April 6, 2023

So I have a pandas dataframe full of repeats. I’m trying to count how many times each row has been repeated and put that into a column.

Here’s my code – found it on stack overflow

def getUniqCounts2(dupDF ):
    countARR=[]
    uniqDF = getUniq(dupDF)
    for i in range( 0,len(uniqDF)):
        df2 = len(dupDF[(dupDF["variable1"]==uniqDF['variable1'][i]) &
         (dupDF["variable2"]==uniqDF['variable2'][i])])
        print(df2)
        countARR.append(df2)
    uniqDF['count'] = countARR
    return uniqDF

Someone did recommend the pivot table function, but the problem is that the resulting DF has a column that is only partially filled even though the input data frame had all columns filled.

Now I have a DF of 1.5M rows. I just want to get the unique rows, and the no of repetitions found per row. What would be the fastest way to do this?

>Solution :

Suppose you have a DataFrame like named df, you can count the number of occurrences by simply using groupby() function like below:

counts = df.groupby(df.columns.tolist()).size().reset_index().rename(columns={0:'count'})

I guess this is much more efficient.

duplicates

byMR

Published April 06, 2023

Add a comment

How to access subclass variable from superclass?

byMR

April 6, 2023

Questions

Search and replace is adding not replacing

byMR

April 6, 2023

Questions

Dynamically accessing object property in typescript

byMR

April 6, 2023

Questions

How can I write a function that finds any and all elements of a list that add to a target?

byMR

April 6, 2023

Questions

Partially color the name of a listcolumn

byMR

April 6, 2023

Questions

How to make list item in the center of the screen?

byMR

April 6, 2023

Python: Fastest Way to get no of duplicates of each uniqe row in a pandas DF full of duplicates

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Like this:

Leave a ReplyCancel reply

Read more

How to access subclass variable from superclass?

Search and replace is adding not replacing

Dynamically accessing object property in typescript

How can I write a function that finds any and all elements of a list that add to a target?

Partially color the name of a listcolumn

How to make list item in the center of the screen?

Keep Up to Date with the Most Important News

Python: Fastest Way to get no of duplicates of each uniqe row in a pandas DF full of duplicates

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Share this:

Like this:

Leave a ReplyCancel reply

Keep Up to Date with the Most Important News

Read more

How to access subclass variable from superclass?

Search and replace is adding not replacing

Dynamically accessing object property in typescript

How can I write a function that finds any and all elements of a list that add to a target?

Partially color the name of a listcolumn

How to make list item in the center of the screen?

Discover more from Dev solutions