Creating a list of the top values in all columns in pandas

I am trying to get a list of the top 2 values by count in every column of my pandas DataFrame. The DataFrame looks something like this:

            column1          column2          column3
 1           apple            red               cat
 2          banana            blue              dog
 3          grapes            yellow            cat
 4           apple            blue              cat
 5          banana            red               tiger
 6          banana            blue              dog

I want the result to be in the form of a list. Something like this:

 ['banana', 'apple', 'blue', 'red', 'cat', 'dog']

Can someone please help me with this?

Solution:

Call Series.value_counts on each column. Because value_counts sorts by count in descending order, slicing the first two entries of its index gives the two most frequent values. Then flatten the result into a list:

a = df.apply(lambda x: x.value_counts()[:2].index.tolist()).to_numpy().ravel('F').tolist()
print(a)
['banana', 'apple', 'blue', 'red', 'cat', 'dog']

Alternatively, a list comprehension builds the flattened list directly:

a = [x for c in df.columns for x in df[c].value_counts()[:2].index]
print(a)
['banana', 'apple', 'blue', 'red', 'cat', 'dog']
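
For reference, here is a self-contained sketch of the same idea that can be run as-is. The DataFrame is rebuilt from the sample in the question, and the helper name top_values and its parameter n are hypothetical, introduced only for illustration. It uses head(n) in place of the [:2] slice, which should behave the same here.

```python
import pandas as pd

# Sample data reconstructed from the question (index 1..6 as shown there).
df = pd.DataFrame(
    {
        "column1": ["apple", "banana", "grapes", "apple", "banana", "banana"],
        "column2": ["red", "blue", "yellow", "blue", "red", "blue"],
        "column3": ["cat", "dog", "cat", "cat", "tiger", "dog"],
    },
    index=range(1, 7),
)


def top_values(frame, n=2):
    """Hypothetical helper: the n most frequent values of each column, flattened."""
    result = []
    for col in frame.columns:
        # value_counts() sorts by count in descending order,
        # so the first n index labels are the n most frequent values.
        result.extend(frame[col].value_counts().head(n).index.tolist())
    return result


a = top_values(df)
print(a)
# ['banana', 'apple', 'blue', 'red', 'cat', 'dog']
```

Because the loop extends a plain list, a column with fewer than n distinct values simply contributes fewer entries.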