Create id number for unique combination of column values

I have a dataframe like this:

    df = pd.DataFrame({"year": [2000,2000,2000,2001,2001,2001], "A": [1,1,0,0,1,0], "B": [4,4,6,10,10,10]})
    df

       year  A   B
    0  2000  1   4
    1  2000  1   4
    2  2000  0   6
    3  2001  0  10
    4  2001  1  10
    5  2001  0  10

I would like to create a unique id number for each combination of…
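One possible approach, as a minimal sketch: the excerpt is cut off before it says which columns define a combination, so this assumes all three (`year`, `A`, `B`). It uses `groupby(...).ngroup()`, and the output column name `id` is a placeholder.

```python
import pandas as pd

df = pd.DataFrame({
    "year": [2000, 2000, 2000, 2001, 2001, 2001],
    "A": [1, 1, 0, 0, 1, 0],
    "B": [4, 4, 6, 10, 10, 10],
})

# Assumption: a "combination" means the values of all three columns.
key_cols = ["year", "A", "B"]

# ngroup() labels each row with the number of the group it belongs to.
# sort=False numbers the groups in order of first appearance.
df["id"] = df.groupby(key_cols, sort=False).ngroup()

print(df)
```

Rows that share the same values in the key columns receive the same id. Dropping `sort=False` numbers the groups by sorted key order instead.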

Unable to split the column into multiple columns based on the first column value

I’ve a data frame which contains one column. Below is the example:

    Questionsbysortorder
    Q1-4,Q2-3,Q3-2,Q4-3,Q5-3
    Q1-1,Q2-2,Q3-1,Q4-1
    Q1-5,Q2-3,Q3-3

I’m trying to explode the column into multiple columns using the values already given in each row, as in the example below:

    Questionsbysortorder      Q1  Q2  Q3  Q4  Q5
    Q1-4,Q2-3,Q3-2,Q4-3,Q5-3  4   3   2   3   3
    Q1-1,Q2-2,Q3-1,Q4-1       1   2   1   1   NA
    Q1-5,Q2-3,Q5-3            5…
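A rough sketch of one way to do the split, assuming every item has the form `Q<n>-<value>` and items are separated by commas. Each row is parsed into a dict, and pandas fills the missing questions with NaN. The variable names `rows`, `wide`, and `result` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Questionsbysortorder": [
        "Q1-4,Q2-3,Q3-2,Q4-3,Q5-3",
        "Q1-1,Q2-2,Q3-1,Q4-1",
        "Q1-5,Q2-3,Q3-3",
    ]
})

# Turn "Q1-4,Q2-3" into {"Q1": "4", "Q2": "3"} for every row.
rows = [
    dict(item.split("-", 1) for item in cell.split(","))
    for cell in df["Questionsbysortorder"]
]

# One column per question; questions absent from a row become NaN.
wide = pd.DataFrame(rows, index=df.index).apply(pd.to_numeric)

result = df.join(wide)
print(result)
```

Because a missing question is stored as NaN, the numeric columns come out as floats. A nullable integer dtype could be used instead if integer display matters.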

Mask using groupby apply function

df

        person  time_bought  product
    42  abby    10min        fruit
    12  abby    5min         fruit
    10  abby    10min        other
    3   barry   12min        fruit
    …

How could I convert the lines below into a generalisable function, since I’m using groupby all the time?

    ref = df.groupby('person')['time_bought'].shift()
    m1 = df.loc[df['product'] == "fruit", 'time_bought'].groupby(df['person']).diff().gt(ref)
    m2 = df['product'].ne('fruit')
    df['new_group'] = m1.groupby(df['person']).cumsum().add(1).mask(m2)
    # gives…

Compute avg gap between dates and max date for each group using pandas

I have a dataframe as shown below:

    sub_id,teacher,div,pid,pos_date
    1,ABC,SCIENCE,A1,12/10/2021
    1,ABC,SCIENCE,A1,22/06/2019
    1,ABC,SCIENCE,A1,12/12/2018
    1,ABC,SCIENCE,A1,27/11/2020
    1,DEF,CHEMISTRY,A1,12/10/2021
    1,DEF,CHEMISTRY,A2,11/11/2018
    1,DEF,CHEMISTRY,A2,12/10/2021
    1,ABC,SCIENCE,A2,12/10/2019
    1,ABC,SCIENCE,A2,12/10/2020
    1,ABC,SCIENCE,A3,12/11/2021
    1,ABC,SCIENCE,A3,22/03/2022
    1,ABC,SCIENCE,A4,22/10/2021
    1,ABC,SCIENCE,A4,12/04/2021

    df = pd.read_clipboard()

I would like to do the following:

a) Group by sub_id, teacher, div and pid
b) For each group, compute:
   1) Max(pos_date)
   2) Average gap between each pos_date
   3) Median…
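A minimal sketch covering the first two requested aggregates (the third is cut off in the excerpt, so it is left out). It assumes the dates are day-first (`dd/mm/yyyy`) and that the "gap" is the difference between consecutive dates within a group after sorting. The column names `gap`, `max_date`, and `avg_gap` are placeholders.

```python
import io
import pandas as pd

csv_text = """sub_id,teacher,div,pid,pos_date
1,ABC,SCIENCE,A1,12/10/2021
1,ABC,SCIENCE,A1,22/06/2019
1,ABC,SCIENCE,A1,12/12/2018
1,ABC,SCIENCE,A1,27/11/2020
1,DEF,CHEMISTRY,A1,12/10/2021
1,DEF,CHEMISTRY,A2,11/11/2018
1,DEF,CHEMISTRY,A2,12/10/2021
1,ABC,SCIENCE,A2,12/10/2019
1,ABC,SCIENCE,A2,12/10/2020
1,ABC,SCIENCE,A3,12/11/2021
1,ABC,SCIENCE,A3,22/03/2022
1,ABC,SCIENCE,A4,22/10/2021
1,ABC,SCIENCE,A4,12/04/2021
"""
df = pd.read_csv(io.StringIO(csv_text))

# Assumption: dates are written day-first.
df["pos_date"] = pd.to_datetime(df["pos_date"], format="%d/%m/%Y")

keys = ["sub_id", "teacher", "div", "pid"]

# Sort so that diff() measures the gap to the previous date in the group.
df = df.sort_values(keys + ["pos_date"])
df["gap"] = df.groupby(keys)["pos_date"].diff()

summary = (
    df.groupby(keys)
      .agg(max_date=("pos_date", "max"), avg_gap=("gap", "mean"))
      .reset_index()
)
print(summary)
```

A group with a single date has no gap to measure, so its `avg_gap` is `NaT`.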

Create new column from a row value in a grouped data frame?

I have a data frame data:

    data_ = {'ID': [777, 777, 777, 777, 777, 777],
             'Month': [1, 1, 1, 2, 2, 2],
             'Salary': [130, 170, 50, 140, 180, 60],
             'O': ["AC", "BR", "BR", "AC", "BR", "BR"],
             'D': ["LF", "AC", "LF", "LF", "AC", "LF"],
             'B': [True, True, False, True, True, False]}
    data = pd.DataFrame(data=data_)

For each subgroup of this data frame:

    Subgroup = data.groupby(["ID", "Month"])

I would like to add a new column NEW_Salary filled with the values of Salary where B is false in each subgroup, as shown in the…
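A minimal sketch of one reading of this request. The expected output is cut off in the excerpt, so the sketch assumes each (ID, Month) subgroup has one row where `B` is False and that its `Salary` should be broadcast to every row of the subgroup.

```python
import pandas as pd

data_ = {
    "ID": [777, 777, 777, 777, 777, 777],
    "Month": [1, 1, 1, 2, 2, 2],
    "Salary": [130, 170, 50, 140, 180, 60],
    "O": ["AC", "BR", "BR", "AC", "BR", "BR"],
    "D": ["LF", "AC", "LF", "LF", "AC", "LF"],
    "B": [True, True, False, True, True, False],
}
data = pd.DataFrame(data=data_)

# Keep Salary only on rows where B is False; other rows become NaN.
salary_where_false = data["Salary"].where(~data["B"])

# "first" takes the first non-null value per subgroup, and transform
# broadcasts it back to every row of that subgroup.
data["NEW_Salary"] = salary_where_false.groupby(
    [data["ID"], data["Month"]]
).transform("first")

print(data)
```

If a subgroup had several rows with `B` False, this would pick only the first of them, and another aggregation would be needed.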

How to add new columns in Pandas for unique values of certain key (problem aggregate)

How to add a new column of aggregated data

I want to create 03 new columns in a dataframe:

Column 01: unique_list. Create a new column in the dataframe with the unique values of cfop_code for each key.
Column 02: unique_count. A column that counts the number of unique values shown in unique_list.
Column 03:…
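A rough sketch of the first two columns only, since the description of the third is cut off. The excerpt does not show the data or name the key column, so the column `key` and the sample values below are made up purely for illustration.

```python
import pandas as pd

# Hypothetical data: "key" and these values are not from the question.
df = pd.DataFrame({
    "key": ["a", "a", "a", "b", "b"],
    "cfop_code": [5102, 5102, 5405, 6102, 6102],
})

grouped = df.groupby("key")["cfop_code"]

# Unique cfop_code values per key, as a list, mapped back to each row.
unique_per_key = grouped.unique().map(list)
df["unique_list"] = df["key"].map(unique_per_key)

# Number of distinct cfop_code values per key, broadcast to each row.
df["unique_count"] = grouped.transform("nunique")

print(df)
```

Using `transform` keeps the original row count, which is what adding columns to the existing dataframe requires. Aggregating with `agg` would instead return one row per key.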

Append % symbol to dict numeric values in a dataframe column

I have a dataframe as shown below:

    key, values_list
    1, {'ABC':100}
    2, {'DEF':100}
    3, {'ASE':95,'ABC':5}
    4, {'ABC':55,'ASE':40,'DEF':5}
    5, {'DEF':90,'ABC':5,'ASE':2.5,'XYZ':2.5}

I would like to do the following:

a) Convert dict values to string and include % symbol at the end of each string

So, I tried the below:

    df['values_list'].str.replace(r'[0-9]+', '[0-9]%') # Approach 1
    np.where(df['values_list'].str.isdigit(),df['values_list']+'%',df['values_list']) #Approach…
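A minimal sketch of step a), assuming the `values_list` column holds real Python dicts rather than their string representation. The `.str` accessor used in the attempts above works on strings, so this sketch rebuilds each dict with a comprehension instead.

```python
import pandas as pd

df = pd.DataFrame({
    "key": [1, 2, 3, 4, 5],
    "values_list": [
        {"ABC": 100},
        {"DEF": 100},
        {"ASE": 95, "ABC": 5},
        {"ABC": 55, "ASE": 40, "DEF": 5},
        {"DEF": 90, "ABC": 5, "ASE": 2.5, "XYZ": 2.5},
    ],
})

# Build a new dict per row whose values are strings ending in "%".
df["values_list"] = df["values_list"].map(
    lambda d: {name: f"{value}%" for name, value in d.items()}
)

print(df)
```

If the column actually contains strings such as `"{'ABC':100}"`, they would first need to be parsed into dicts, for example with `ast.literal_eval`.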