Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Python pandas find the largest value after date of each row

What is the best way to find the largest value after date of each row, for example i have this dataframe:

import pandas as pd
data = [[20200101, 10], [20200102, 16], [20200103, 14], [20200104, 18]]
df = pd.DataFrame(data, columns=['date', 'value'])
print(df)

       date  value
0  20200101     10
1  20200102     16
2  20200103     14
3  20200104     18

i need to get the first largest value date after each row date :

   date  value           largest_value_date
0  20200101     10                 20200102
1  20200102     16                 20200104
2  20200103     14                 20200104
3  20200104     18                        0

of course i tried with "for" but in big data it’s very slow:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

df['largest_value_date'] = 0
for i in range(0, len(df)):
    date = df['date'].iloc[i]
    value = df['value'].iloc[i]
    largestDate = df[(df['date'] > date) & (df['value'] > value)]
    if len(largestDate) > 0:
        df['largest_value_date'].iloc[i] = largestDate['date'].iloc[0]
print(df)


      date  value  largest_value_date
0  20200101     10            20200102
1  20200102     16            20200104
2  20200103     14            20200104
3  20200104     18                   0

>Solution :

We can speed up the whole process with numpy board cast then idxmax , get the most recent values’ id greater than the current row , then assign it back

s = df['value'].values
idx = pd.DataFrame(np.triu(s-s[:,None])).gt(0).idxmax(1)
df['new'] = df['date'].reindex(idx.replace(0,-1)).values
df
Out[158]: 
       date  value         new
0  20200101     10  20200102.0
1  20200102     16  20200104.0
2  20200103     14  20200104.0
3  20200104     18         NaN
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading