I have a dataframe similar to this:
import pandas as pd
id = [1001, 1002, 1003]
a = [156, 224, 67]
b = [131, 203, 61]
c = [97, 165, 54]
d = [68, 122, 50]
value = [71, 180, 66]
df = pd.DataFrame({'id':id, 'a':a, 'b':b, 'c':c, 'd':d, 'value':value})
| id | a | b | c | d | value |
|---|---|---|---|---|---|
| 1001 | 156 | 131 | 97 | 68 | 71 |
| 1002 | 224 | 203 | 165 | 122 | 180 |
| 1003 | 67 | 61 | 54 | 50 | 66 |
For each row, I would like to evaluate columns a-d and within them identify the next lowest and next highest values, as compared to value. So in this example, the expected result would look like:
| id | a | b | c | d | value | nxt_low | nxt_high |
|---|---|---|---|---|---|---|---|
| 1001 | 156 | 131 | 97 | 68 | 71 | 68 | 97 |
| 1002 | 224 | 203 | 165 | 122 | 180 | 165 | 203 |
| 1003 | 67 | 61 | 54 | 50 | 66 | 61 | 67 |
I have tried creating a single column with a numpy array from a-d and trying to do some operations that way, but I’m not applying it correctly and have been unable to get the desired result. Any help is greatly appreciated.
>Solution :
you can get nearest low following code:
df.apply(lambda x: x[x < x[-1]].max(), axis=1)
output:
0 68
1 165
2 61
dtype: int64
get nearest low and high and make result to columns:
df[['nxt_low', 'nxt_high']] = df.apply(lambda x: [x[x < x[-1]].max(), x[x > x[-1]].min()], axis=1, result_type='expand')
df:
id a b c d value nxt_low nxt_high
0 1001 156 131 97 68 71 68 97
1 1002 224 203 165 122 180 165 203
2 1003 67 61 54 50 66 61 67
If id is nearest low or high, modify code a bit.
df[['nxt_low', 'nxt_high']] = df.iloc[:, 1:].apply(lambda x: [x[x < x[-1]].max(), x[x > x[-1]].min()], axis=1, result_type='expand')