Pandas – Replace Last Non-None Value with None Row-wise

One common thing people seem to want to do in pandas is to replace None values with the next or previous non-None value. This is easily done with .fillna. However, I want to do something similar but different.

I have a dataframe, df, with some entries. Every row has a different number of entries and they are all "left-adjusted" (if the df is 10 columns wide and some row has n<10 entries the first n columns hold the entries and the remaining columns are Nones).

What I want to do is find the last non-None entry in every row and change it to also be a None. This could be any of the columns from the first to the last.

I could of course do this with a for-loop but my dfs can be quite large so something quicker would be preferable. Any ideas?

Thanks!

Solution:

With help from NumPy, this is quite easy. Because the rows are left-adjusted, counting the number of None values in each row tells you which column holds the last non-None value in that row. NumPy indexing can then set that value to None (pandas displays it as NaN in float columns):

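import numpy as np
import pandas as pd

# 4 x 6 frame of random values, matching the shape of the output shown below
data = np.random.random((4, 6))

df = pd.DataFrame(data)
# Blank the tail of each row so the rows are left-adjusted
df.iloc[0, 3:] = None
df.iloc[1, 5:] = None
df.iloc[2, 3:] = None
df.iloc[3, 4:] = None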

The original dataframe looks like this:

          0         1         2         3         4   5
0  0.992337  0.651785  0.521422       NaN       NaN NaN
1  0.912962  0.292458  0.620195  0.507071  0.010205 NaN
2  0.061320  0.565979  0.344755       NaN       NaN NaN
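3  0.521936  0.057917  0.359699  0.484009       NaN NaN

isnull = df.isnull()

# Rows are left-adjusted, so (width - number of nulls - 1) is the
# position of the last non-null entry in each row.
col = (df.shape[1] - isnull.sum(axis=1) - 1).to_numpy()

# Work on a writable copy: df.values is not guaranteed to be a writable view.
values = df.to_numpy(copy=True)
values[np.arange(len(df)), col] = np.nan
df = pd.DataFrame(values, index=df.index, columns=df.columns)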

The updated dataframe looks like this:

          0         1         2         3   4   5
0  0.992337  0.651785       NaN       NaN NaN NaN
1  0.912962  0.292458  0.620195  0.507071 NaN NaN
2  0.061320  0.565979       NaN       NaN NaN NaN
3  0.521936  0.057917  0.359699       NaN NaN NaN
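
The counting trick above relies on every row being left-adjusted. As a hedged variation, here is a minimal sketch that finds the last non-null position from a boolean mask instead, so it should also cope with gaps inside a row and with rows that are entirely null. The function name blank_last_valid and the demo frame are made up for illustration, and the sketch assumes all columns are numeric (it casts to float).

```python
import numpy as np
import pandas as pd


def blank_last_valid(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with the last non-null entry of each row set to NaN.

    Hypothetical helper: assumes every column can be cast to float.
    """
    values = df.to_numpy(dtype=float, copy=True)
    notnull = ~np.isnan(values)

    # Rows that contain at least one non-null entry; all-null rows are skipped.
    rows = np.flatnonzero(notnull.any(axis=1))

    # argmax on the column-reversed mask gives the distance of the last
    # non-null entry from the right edge of each row.
    last = values.shape[1] - 1 - notnull[:, ::-1].argmax(axis=1)

    values[rows, last[rows]] = np.nan
    return pd.DataFrame(values, index=df.index, columns=df.columns)


# Illustrative data only: a partial row, a single-entry row,
# an all-null row, and a full row.
demo = pd.DataFrame(
    [
        [1.0, 2.0, 3.0, np.nan],
        [4.0, np.nan, np.nan, np.nan],
        [np.nan, np.nan, np.nan, np.nan],
        [5.0, 6.0, 7.0, 8.0],
    ]
)
result = blank_last_valid(demo)
print(result)
```

The function returns a new frame rather than modifying df in place, which avoids depending on whether pandas hands back a view or a copy of the underlying array.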