In a Pandas Id column, how can I insert an incremented value (max + 1) for each missing Id value

July 13, 2022

I have a large data set (300K rows) with an Id column that serves as the primary key, and I append new rows of data to it. The new rows arrive without an Id, and I am having difficulty adding one. Each missing Id should get the next value in sequence: the current maximum of the Id column plus 1 for the first new row, plus 2 for the second, and so on.

The data looks something like this:

import pandas as pd
# example data frame
inp = [{'Id': 0, 'Col1': 1, 'Col2': 7},
   {'Id': 1, 'Col1': 1, 'Col2': 8},
   {'Id': 2, 'Col1': 3, 'Col2': 9},
   {'Id': '', 'Col1': 1, 'Col2': 10}, 
   {'Id': 4, 'Col1': 5, 'Col2': 11},
   {'Id': '', 'Col1': 1, 'Col2': 12}
   ]
df = pd.DataFrame(inp)
# format to be like my real data
df["Id"] = pd.to_numeric(df["Id"], errors='coerce')
df["Id"] = df["Id"].astype("Int64")

print(df)
      Id  Col1  Col2
0     0     1     7
1     1     1     8
2     2     3     9
3  <NA>     1    10
4     4     5    11
5  <NA>     1    12

Needed:

      Id  Col1  Col2
0     0     1     7
1     1     1     8
2     2     3     9
3     5     1    10
4     4     5    11
5     6     1    12

Fails:

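import numpy as np
# every missing row receives the same value (max + 1), so the new Ids are duplicated
df["Id"] = np.select([df["Id"].isna()], [df["Id"].max() + 1], default=df["Id"])
print(df)
      Id  Col1  Col2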
0     0     1     7
1     1     1     8
2     2     3     9
3     5     1    10
4     4     5    11
5     5     1    12

# ~ applied to a Python bool gives -2 or -1, both truthy, so every row is overwritten
df["Id"] = df.apply(lambda x: df["Id"].max() + 1 if ~isinstance(x["Id"], int) else x["Id"], axis=1)
   Id  Col1  Col2
0   5     1     7
1   5     1     8
2   5     3     9
3   5     1    10
4   5     5    11
5   5     1    12

df.sort_values(by=["Id"], inplace=True)
df.set_index("Id", inplace=True)
      Col1  Col2
Id              
0        1     7
1        1     8
2        3     9
4        5    11
<NA>     1    10
<NA>     1    12

I’m not sure if I’m even on the right track (this seems like such an obvious thing to do, but I can’t find it described from the perspective I’m taking). Lambda and looping techniques were also very slow. Am I missing some simple function to do exactly this?!?

Solution:

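Try:

Add max() to the cumulative sum (cumsum) of isna(), so each missing row gets its own running offset above the current maximum.
Then fillna with that result.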

Code:

df['Id'] = df['Id'].fillna(df['Id'].isna().cumsum()+df['Id'].max())

Output:

   Id  Col1  Col2
0   0     1     7
1   1     1     8
2   2     3     9
3   5     1    10
4   4     5    11
5   6     1    12
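
To see why the one-liner works, it may help to look at the intermediate values. The following is a minimal sketch of the same approach split into steps; the variable names (missing, offsets, candidates) are only illustrative, and it assumes the column holds at least one existing Id so that max() has a value to return.

```python
import pandas as pd

# Same shape as the example in the question: two rows are missing an Id.
df = pd.DataFrame({
    "Id": pd.array([0, 1, 2, None, 4, None], dtype="Int64"),
    "Col1": [1, 1, 3, 1, 5, 1],
    "Col2": [7, 8, 9, 10, 11, 12],
})

missing = df["Id"].isna()               # True where an Id has to be generated
offsets = missing.cumsum()              # running count of missing rows: 0, 0, 0, 1, 1, 2
candidates = offsets + df["Id"].max()   # shifted above the current maximum: 4, 4, 4, 5, 5, 6

# fillna aligns on the index, so only the missing rows pick up their candidate value
df["Id"] = df["Id"].fillna(candidates)

print(df)
```

Because the running count only increases on rows where the Id is missing, each of those rows gets a distinct value above the existing maximum, while rows that already have an Id are left unchanged. Everything is computed on whole columns, which avoids the per-row overhead of apply or an explicit loop.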

byMR

Published July 13, 2022
