Update a second column if a number in the first column is an integer

I have a dataframe as follows:

 A    |    B
-----------------
 AA   |    101
 AA   |    102
 AA   |    103.5
 AA   |    104
 AA   |    105

I would like to add a column C that increases by 1 on each row but skips any row where B is not a whole number, so that I get a dataframe like this:

 A    |    B    |    C
------------------------
 AA   |    101  |    1
 AA   |    102  |    2
 AA   |    103.5|    
 AA   |    104  |    3
 AA   |    105  |    4

I tried using something like this:

df.insert(2, 'C', range(1,  len(df)))

df.loc[is_integer(df['order']), 'detailed_category_id'] =...

But I’m not too sure if this is correct, so any help would be appreciated, thanks!

Solution:

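You can use df['B'].eq(df['B'].astype(int)) to check whether each value in B is a whole number. The cumulative sum of that boolean mask counts the whole-number rows seen so far, so assigning it only where the mask is True gives the counter and leaves the other rows as NaN: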

m = df['B'].eq(df['B'].astype(int))
df.loc[m, 'C'] = m.cumsum()

print(df)
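
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the dataframe from the question is built by hand as below; the construction step is not part of the original answer.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed construction)
df = pd.DataFrame({
    "A": ["AA"] * 5,
    "B": [101, 102, 103.5, 104, 105],
})

# True where B equals its integer-truncated value, i.e. B is a whole number
m = df["B"].eq(df["B"].astype(int))

# The running count of True values is the counter;
# assigning only where m is True leaves the other rows as NaN
df.loc[m, "C"] = m.cumsum()

print(df)
```

Because the skipped row holds NaN, pandas stores C as a float column.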

If A contains several groups and the count should restart for each group, use groupby.cumsum instead:

df.loc[m, 'C'] = m.groupby(df['A']).cumsum()

Output (the same for both variants with this sample data, since A contains a single group):

    A      B    C
0  AA  101.0  1.0
1  AA  102.0  2.0
2  AA  103.5  NaN
3  AA  104.0  3.0
4  AA  105.0  4.0
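
The sample data has only one group, so the restart behaviour is not visible in the output above. The sketch below uses a hypothetical second group "BB" (not in the question) to show it. It also converts C to the nullable Int64 dtype so the counter shows as integers rather than floats. That conversion is an optional extra and not part of the original answer.

```python
import pandas as pd

# Hypothetical data: rows from the question plus an invented group "BB"
df = pd.DataFrame({
    "A": ["AA", "AA", "AA", "BB", "BB", "BB"],
    "B": [101, 102, 103.5, 104, 105.5, 106],
})

# True where B is a whole number
m = df["B"].eq(df["B"].astype(int))

# Count whole-number rows within each group of A, blank out the other rows,
# then use the nullable integer dtype so missing values do not force floats
df["C"] = m.groupby(df["A"]).cumsum().where(m).astype("Int64")

print(df)
```

With Int64, the skipped rows show as <NA> instead of NaN, and the count starts again at 1 on the first whole-number row of "BB".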
