Check duplicated indices for each subset of values in pandas dataframe

I have the following dataframe:

import pandas as pd

df_test = pd.DataFrame(data=[['AP1', 'House1'],
                             ['AP1', 'House1'], 
                             ['AP2', 'House1'], 
                             ['AP3', 'House2'], 
                             ['AP4','House2'], 
                             ['AP5', 'House2']],
                       columns=['AP', 'House'],
                       index=[0, 1, 2, 0, 1, 1])

I need to check, for each subset of values in a column, whether there are duplicated indices. For example, in column House there are three entries of House1 and no duplicated indices, but for House2 the index 1 is duplicated.

I have tried this:


print(f'{df_test.index.duplicated().sum()} repeated entries')

But this reports 3 duplicated entries, because it checks the index as a whole and does not consider each value of the column separately.
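To see why: the index `[0, 1, 2, 0, 1, 1]` contains three repeated labels overall (the second `0` and the last two `1`s), regardless of which House each row belongs to:

```python
import pandas as pd

df_test = pd.DataFrame(data=[['AP1', 'House1'],
                             ['AP1', 'House1'],
                             ['AP2', 'House1'],
                             ['AP3', 'House2'],
                             ['AP4', 'House2'],
                             ['AP5', 'House2']],
                       columns=['AP', 'House'],
                       index=[0, 1, 2, 0, 1, 1])

# duplicated() marks every occurrence of a label after the first one
print(df_test.index.duplicated())
# → [False False False  True  True  True]
print(df_test.index.duplicated().sum())
# → 3
```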

Solution:

A possible solution:

print(df_test.reset_index().duplicated(['index', 'AP']).sum())
print(df_test.reset_index().duplicated(['index', 'House']).sum())

Output:

0
1
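If a per-group breakdown is more useful than a single total, a possible alternative (a sketch, not part of the original answer) is to group by the column and count duplicated index labels within each group:

```python
import pandas as pd

df_test = pd.DataFrame(data=[['AP1', 'House1'],
                             ['AP1', 'House1'],
                             ['AP2', 'House1'],
                             ['AP3', 'House2'],
                             ['AP4', 'House2'],
                             ['AP5', 'House2']],
                       columns=['AP', 'House'],
                       index=[0, 1, 2, 0, 1, 1])

# for each House value, count index labels repeated within that group
dup_per_house = df_test.groupby('House').apply(lambda g: g.index.duplicated().sum())
print(dup_per_house)
```

This returns a Series keyed by House value (0 duplicates for House1, 1 for House2), which also tells you *where* the duplicated indices occur, not just how many there are.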