Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

What is pandas equivalent of the following SQL?

OK, I have a dataframe that looks like the following:

>> df
id  trip_id segment_id  session_id  start_timestamp     lat_start   lon_start   lat_end     lon_end travelmode
563097015   563097  15  128618  2017-05-20 17:47:12+01  41.1783308  -8.5949878  41.1784478  -8.5948463  0
563097013   563097  13  128618  2017-05-20 17:45:29+01  41.1781344  -8.5951169  41.1782919  -8.5950689  0
563097011   563097  11  128618  2017-05-20 17:43:41+01  41.1781196  -8.5954075  41.1782139  -8.5950689  0
563097009   563097  9   128618  2017-05-20 17:41:48+01  41.1782497  -8.595197   41.1781101  -8.5954124  0
563097003   563097  3   128618  2017-05-20 17:10:29+01  41.1832512  -8.6081606  41.1782561  -8.5950259  0

In SQL, to filter unique segments (segment_id) by travelmode I will do:

SELECT travelmode, COUNT(DISTINCT segment_id) AS NumOfSegments
FROM df_table
GROUP BY travelmode

What is the pandas equivalent of this expression?

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

Maybe:

 df.groupby('travelmode').segment_id.nunique()

as suggested in this post.

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading