Get duplicate rows with respect to a column of lists

I am trying to find duplicate rows with respect to a column that contains lists, but I don't get the expected result. The sample DataFrame I used is:

import pandas as pd

df = pd.DataFrame(
    {
        "author": ["Jefe9", "Jefe98", "Alex", "Alex", "Qbert"],
        "date": [1423112400, 1423112400, 1603112400, 1423115600, 1663526834],
        "ingredients": [
            ["ingredA", "ingredB", "ingredC"],
            ["ingredA", "ingredB", "ingredC"],
            ["ingredA", "ingredB", "ingredD"],
            ["ingredA", "ingredB", "ingredD", "ingredE"],
            ["ingredB", "ingredC", "ingredF"],
        ],
    }
)

The sample DataFrame looks like this:

   author        date                           ingredients
0   Jefe9  1423112400           [ingredA, ingredB, ingredC]
1  Jefe98  1423112400           [ingredA, ingredB, ingredC]
2    Alex  1603112400           [ingredA, ingredB, ingredD]
3    Alex  1423115600  [ingredA, ingredB, ingredD, ingredE]
4   Qbert  1663526834           [ingredB, ingredC, ingredF]

The expected output is:

   author        date                  ingredients
0   Jefe9  1423112400  [ingredA, ingredB, ingredC]
1  Jefe98  1423112400  [ingredA, ingredB, ingredC]

The code I tried is:

df[df.duplicated(['ingredients'])]

It raised an error, apparently because duplicated expects a single scalar value in each cell rather than a list. Thanks in advance.

>Solution :

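You can convert each value in the ingredients column to a tuple first. Lists are unhashable, which is why duplicated raises an error on them, while tuples are hashable. Sorting before the conversion makes the comparison ignore element order, and keep=False marks every row of a duplicate group rather than only the later occurrences.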

out = df[(df.assign(ingredients=df['ingredients'].apply(lambda x: tuple(sorted(x))))
          .duplicated(['ingredients'], keep=False))]
print(out)

   author        date                  ingredients
0   Jefe9  1423112400  [ingredA, ingredB, ingredC]
1  Jefe98  1423112400  [ingredA, ingredB, ingredC]
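
If element order should sometimes matter, the same idea can be wrapped in a small helper. The following is only a sketch: the function name duplicated_lists and its ignore_order flag are made up for illustration and are not part of pandas, and the sample data is altered so that row 1 holds the same ingredients as row 0 in a different order.

```python
import pandas as pd

# Sample data adapted from the question; row 1 is deliberately reordered
# (an assumption made here to show the effect of sorting).
df = pd.DataFrame(
    {
        "author": ["Jefe9", "Jefe98", "Alex", "Alex", "Qbert"],
        "ingredients": [
            ["ingredA", "ingredB", "ingredC"],
            ["ingredC", "ingredB", "ingredA"],
            ["ingredA", "ingredB", "ingredD"],
            ["ingredA", "ingredB", "ingredD", "ingredE"],
            ["ingredB", "ingredC", "ingredF"],
        ],
    }
)


def duplicated_lists(frame, column, ignore_order=True):
    """Hypothetical helper: boolean mask of rows whose list in `column` repeats."""
    if ignore_order:
        # Sort first so [A, B, C] and [C, B, A] produce the same key
        keys = frame[column].apply(lambda x: tuple(sorted(x)))
    else:
        # Plain tuple conversion keeps the original order in the key
        keys = frame[column].apply(tuple)
    # keep=False flags every member of a duplicate group
    return keys.duplicated(keep=False)


any_order = df[duplicated_lists(df, "ingredients")]
same_order = df[duplicated_lists(df, "ingredients", ignore_order=False)]

print(any_order["author"].tolist())   # ['Jefe9', 'Jefe98']
print(same_order["author"].tolist())  # []
```

With ignore_order=False the two rows no longer match, because their tuples differ in order. The original list column is left untouched in both cases, since the tuples are only used as a temporary comparison key.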