Assigning unique value wrt the groups in Pandas

February 27, 2022

I have a DataFrame like this:

df = pd.DataFrame({
            'idA': [11,25,11,45,11,25,35],
            'idB': [30,40,45,20,30,40,90]  
         })

I want to create a new column includes a unique value with respect to the idA and idB by each row. In other words, whenever a pair appears in any other place, the value should be exactly the same. But any other pair – including the reverse one – must be different. To overcome this, I thought to apply a made up math function like this:

df['NewValue'] = df.min(axis=1)*(df['idA']*10 + 15*df['idB'])

#   idA    idB  NewValue
#   11      30      6160
#   25      40      21250
#   11      45      8635
#   45      20      15000
#   11      30      6160
#   25      40      21250
#   35      90      59500

It seems working. But by following this, NewValue can take any number. I desire to have a simple incrementing integer instead. For this example :

NewValue = [1,2,3,4,1,2,5]

In reality, I have thousands of rows. So I am also seeking for an efficient solution. Any way to handle this?

>Solution :

IIUC, you could use groupby + ngroup:

df['NewValue'] = df.groupby(['idA','idB'], sort=False).ngroup().add(1)

Output:

   idA  idB  NewValue
0   11   30         1
1   25   40         2
2   11   45         3
3   45   20         4
4   11   30         1
5   25   40         2
6   35   90         5