Pandas select rows where value is in either of two columns

I have a dataframe that looks like this:

Title       Description
Area 51     Aliens come to earth on the 4th of July.
Matrix      Hacker Neo discovers the shocking truth.
Spaceballs  A star-pilot for hire and his trusty sidekick must come to the rescue of a princess.

I want to select rows that contain the word…
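A minimal sketch of one way to select rows where a word appears in either column. The excerpt is cut off before it names the search word, so `word` below is a hypothetical placeholder, and the case-insensitive substring match is an assumption about what the asker wants.

```python
import pandas as pd

df = pd.DataFrame({
    "Title": ["Area 51", "Matrix", "Spaceballs"],
    "Description": [
        "Aliens come to earth on the 4th of July.",
        "Hacker Neo discovers the shocking truth.",
        "A star-pilot for hire and his trusty sidekick must come to the rescue of a princess.",
    ],
})

# Hypothetical search word; the original question is truncated before naming it.
word = "aliens"

# Build one boolean mask per column, then combine them with | (logical OR).
# regex=False treats the word as a literal substring; na=False treats missing values as non-matches.
in_title = df["Title"].str.contains(word, case=False, regex=False, na=False)
in_description = df["Description"].str.contains(word, case=False, regex=False, na=False)

selected = df[in_title | in_description]
print(selected)
```

Combining the two masks with `|` keeps a row as soon as either column matches; replacing it with `&` would require the word in both columns.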

Can the .loc command be used with groupby's apply function

This question already has answers here: Pandas conditional creation of a series/dataframe column (13 answers)

Create a list or array of date time using pandas

I am trying to create a list of date times in Python using pandas. I want something like:

2023-05-10_00:00:00
2023-05-10_01:00:00
2023-05-10_02:00:00
...
2023-05-10_23:00:00

So basically I want datetime data with a 1 hour increment. I tried the following:

dt = pd.to_datetime('2023-05-10', format='%Y-%m-%d')
print(dt)

which gives me the following, and it is not exactly what I…
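A possible approach, sketched under the assumption that the goal is 24 hourly timestamps for 2023-05-10 formatted with an underscore between date and time: `pd.date_range` generates the sequence, and `strftime` produces the strings. The variable names are illustrative, and the step is passed as a `pd.Timedelta` to avoid depending on a particular frequency alias.

```python
import pandas as pd

# 24 timestamps starting at midnight, one hour apart.
stamps = pd.date_range(start="2023-05-10", periods=24, freq=pd.Timedelta(hours=1))

# Format each timestamp as YYYY-MM-DD_HH:MM:SS and convert to a plain Python list.
labels = stamps.strftime("%Y-%m-%d_%H:%M:%S").tolist()

print(labels[0])   # 2023-05-10_00:00:00
print(labels[-1])  # 2023-05-10_23:00:00
```

If datetime objects are needed rather than strings, `stamps` itself (a `DatetimeIndex`) or `stamps.to_list()` can be used directly.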

How to remove everything after the last occurrence of a delimiter?

I want to remove everything after the last occurrence of the _ delimiter in the HTAN Parent Biospecimen ID column.

import pandas as pd
df_2["HTAN Parent Biospecimen ID"] = df_2["HTAN Parent Biospecimen ID"].str.rsplit("_", 1).str.get(0)

Traceback:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [41], in <cell line: 3>()
      1 # BulkRNA-seqLevel1
      2 df_2 =…
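The traceback above is truncated, so the actual cause of the TypeError cannot be confirmed from the excerpt. One plausible cause, assuming a recent pandas version, is that `Series.str.rsplit` accepts only the pattern positionally and the split count must be passed as the keyword `n`. The sketch below uses that form; the ID values are hypothetical sample data, not taken from the question.

```python
import pandas as pd

col = "HTAN Parent Biospecimen ID"

# Hypothetical sample IDs for illustration only.
df_2 = pd.DataFrame({col: ["HTA1_100_2001", "HTA1_200_3002", "HTA1"]})

# Split once from the right on "_" and keep the part before the last delimiter.
# Passing n=1 as a keyword works whether or not positional use is allowed.
df_2[col] = df_2[col].str.rsplit("_", n=1).str.get(0)

print(df_2[col].tolist())  # ['HTA1_100', 'HTA1_200', 'HTA1']
```

Values without any delimiter are left unchanged, because the split yields a single-element list and `.str.get(0)` returns that element.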

Reverse the sequence while keeping pairs of columns in a dataframe

Let’s say my dataframe df has this sequence of columns:

['e', 'f', 'c', 'd', 'a', 'b']

And I want to reverse the sequence while keeping pairs, resulting in this sequence:

['a', 'b', 'c', 'd', 'e', 'f']

If the column names were always the same, I could use this same list above to generate the desired…
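A minimal sketch of a position-based approach that does not depend on the column names: chunk the column list into consecutive pairs, reverse the order of the pairs, and flatten. It assumes an even number of columns and that pairs are adjacent, as in the example above; the one-row dataframe is illustrative data.

```python
import pandas as pd

# Illustrative dataframe with the column order from the question.
df = pd.DataFrame([[1, 2, 3, 4, 5, 6]], columns=["e", "f", "c", "d", "a", "b"])

cols = list(df.columns)

# Group columns into consecutive pairs: [['e', 'f'], ['c', 'd'], ['a', 'b']].
pairs = [cols[i:i + 2] for i in range(0, len(cols), 2)]

# Reverse the order of the pairs (not the columns inside each pair), then flatten.
new_cols = [name for pair in reversed(pairs) for name in pair]

reordered = df[new_cols]
print(list(reordered.columns))  # ['a', 'b', 'c', 'd', 'e', 'f']
```

Because the logic works on positions, the same code applies to any column names, and changing the chunk size from 2 keeps larger groups together.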

How to filter the rows based on a column value group by results

I have the below dataframe:

import pandas as pd
data = [['A','2022-07-01',3],['A','2022-07-01',4],['A','2022-07-01',5],['A','2022-07-02',5],['A','2022-07-03',6],['A','2022-07-03',2],['B','2022-07-01',3],['B','2022-07-01',4],['B','2022-07-02',5],['B','2022-07-02',6],['B','2022-07-03',2],['C','2022-07-01',3],['C','2022-07-02',4],['C','2022-07-02',5],['C','2022-07-03',6],['C','2022-07-04',2]]
df = pd.DataFrame(data, columns=['category','date','Value'])

I would like to get all the rows from each 'category' that has more than one duplicated date. Category A has three entries for date 2022-07-01 and two entries for 2022-07-03, so its count of unique duplicated dates (2022-07-01, 2022-07-03) is two,…
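The question is cut off before stating the exact rule, so the sketch below rests on an assumption: keep every row of a category that has more than one distinct date appearing more than once. The intermediate names (`counts`, `dup_dates`, `keep`) are illustrative.

```python
import pandas as pd

data = [
    ["A", "2022-07-01", 3], ["A", "2022-07-01", 4], ["A", "2022-07-01", 5],
    ["A", "2022-07-02", 5], ["A", "2022-07-03", 6], ["A", "2022-07-03", 2],
    ["B", "2022-07-01", 3], ["B", "2022-07-01", 4], ["B", "2022-07-02", 5],
    ["B", "2022-07-02", 6], ["B", "2022-07-03", 2],
    ["C", "2022-07-01", 3], ["C", "2022-07-02", 4], ["C", "2022-07-02", 5],
    ["C", "2022-07-03", 6], ["C", "2022-07-04", 2],
]
df = pd.DataFrame(data, columns=["category", "date", "Value"])

# Number of rows for each (category, date) combination.
counts = df.groupby(["category", "date"]).size()

# For each category, count how many distinct dates occur more than once.
dup_dates = counts[counts > 1].groupby(level="category").size()

# Assumed threshold: a category qualifies with more than one duplicated date.
keep = dup_dates[dup_dates > 1].index

result = df[df["category"].isin(keep)]
print(result)
```

With this data, A and B each have two duplicated dates and C has one, so only the rows for A and B remain. Adjusting the `> 1` comparisons changes the rule if the intended threshold differs.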