pandas equivalent of SQL distinct

December 6, 2021

I have a dataframe with multiple columns and I want to extract the unique rows, in the manner of the SQL "select distinct" operation. Whenever I search forums for this, I find answers about counting distinct values (but I want the actual values) or, worse, about values that are distinct across two columns flattened into a single set (using ravel). What I want, for example with two columns, is the pairs of values that are distinct, with the result returned as a dataframe.

I am now considering that the most effective method might be to write it myself: do a stable sort on tuples and then scan for duplicates. Any pandas expression that is no simpler than essentially doing that is not an answer to this question. I am looking for a basic operation or a simple compound one.

For those who do not know what a "distinct" in a query does …

Starting with

we get back

Note: someone asked whether (2,1) and (1,2) should be considered the same. No, because tuples are ordered. Again, refer to the behaviour of SQL for the details.

Solution:

To get the unique values of a given column, try pandas.Series.unique():

values = df['column_name'].unique()
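
As a minimal sketch of the single-column case, the snippet below builds a small dataframe and collects its distinct values. The data and the column name "a" are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical sample data; the column name "a" is illustrative only.
df = pd.DataFrame({"a": [1, 2, 1, 3, 2]})

# Series.unique() returns a NumPy array of the distinct values,
# in order of first appearance (it does not sort them).
values = df["a"].unique()
print(values.tolist())
```

Note that the result is an array rather than a dataframe, so this covers only the one-column case.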

To get unique combinations of given columns, try pandas.DataFrame.drop_duplicates():

df.drop_duplicates(subset=['column_name1', 'column_name2'])
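
To make the pairwise behaviour concrete, here is a small sketch on made-up data (the columns "a", "b" and "c" are assumptions for illustration). It shows two variants: selecting the columns first, which mirrors SELECT DISTINCT a, b, and passing subset, which keeps the other columns from the first row of each distinct pair.

```python
import pandas as pd

# Hypothetical sample data; column names are illustrative only.
df = pd.DataFrame({
    "a": [1, 2, 1, 2, 1],
    "b": [2, 1, 2, 3, 2],
    "c": ["x", "y", "z", "w", "v"],
})

# Rough equivalent of: SELECT DISTINCT a, b FROM df
# Pairs are compared position by position, so (1, 2) and (2, 1) stay distinct.
distinct_pairs = df[["a", "b"]].drop_duplicates().reset_index(drop=True)
print(distinct_pairs)

# With subset=, all columns are kept and the first row of each
# distinct (a, b) pair is retained.
first_rows = df.drop_duplicates(subset=["a", "b"])
print(first_rows)
```

Both calls return a new dataframe and leave df unchanged. By default the first occurrence of each duplicate is kept, so the original row order is preserved.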

by MR

Published December 06, 2021
