Home Identify duplicates and assign similar index in Pandas DataFrame

Questions

Identify duplicates and assign similar index in Pandas DataFrame

May 3, 2023

Suppose that I have a sample data set that can be generated using code below

# Sample DataFrame with duplicate rows
data = {'A': [1, 2, 1, 3, 1, 2, 3, 2],
        'B': [4, 5, 4, 6, 4, 5, 6, 5],
        'C': [1, 2, 3, 4, 5, 6, 7, 8]}
df = pd.DataFrame(data)

In above dataframe I want to assign duplicate rows similar index. For index 0 would be assigned to rows 0, 1 and 4. Similarly index 1 would be assigned to rows 1, 5 and 7. Duplicates should be identified using only column A and B

>Solution :

Use groupby().ngroup:

df.index = df.groupby(['A','B']).ngroup()

Output:

duplicates

byMR

Published May 03, 2023

Add a comment

filling missing values using tidyverse syntax based on conditions within the tibble

byMR

May 3, 2023

Questions

Multiple Group By and Count in LINQ Entity Framework

byMR

May 3, 2023

Questions

Format for dynamic connection string for Azure Storage that uses container name and SAS token

byMR

May 3, 2023

Questions

Filtering dynamic HTML table with JavaScript

byMR

May 3, 2023

Questions

Filtering dynamic HTML table with JavaScript

byMR

May 3, 2023

Questions

Creating an array by shifting values

byMR

May 3, 2023

Identify duplicates and assign similar index in Pandas DataFrame

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Like this:

Leave a ReplyCancel reply

Read more

filling missing values using tidyverse syntax based on conditions within the tibble

Multiple Group By and Count in LINQ Entity Framework

Format for dynamic connection string for Azure Storage that uses container name and SAS token

Filtering dynamic HTML table with JavaScript

Filtering dynamic HTML table with JavaScript

Creating an array by shifting values

Keep Up to Date with the Most Important News

Identify duplicates and assign similar index in Pandas DataFrame

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Share this:

Like this:

Leave a ReplyCancel reply

Keep Up to Date with the Most Important News

Read more

filling missing values using tidyverse syntax based on conditions within the tibble

Multiple Group By and Count in LINQ Entity Framework

Format for dynamic connection string for Azure Storage that uses container name and SAS token

Filtering dynamic HTML table with JavaScript

Filtering dynamic HTML table with JavaScript

Creating an array by shifting values

Discover more from Dev solutions