I am trying to replicate the variable aux_35 because my database has some missing values. Here is a small sample of the dataset:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import dateutil.relativedelta as rd
import math
from itertools import groupby
from itertools import repeat
from operator import itemgetter
import warnings
warnings.filterwarnings('ignore')
df = pd.DataFrame({'pdt_050':[[0.683522, 0.26141],
[0.683522, 0.26141],
[0.683522, 0.26141],
[0.726501, 0.373269, 0.159278],
[0.726501, 0.373269, 0.159278],
[0.596246, 0.288327, 0.120612],
[0.353175, 0.314364, 0.159139],
[0.595886, 0.25835],
[0.582035],
[0.726501, 0.373269, 0.159278],
[0.583463, 0.366378, 0.262419, 0.19254, 0.1288, 0.064597],
[0.751279, 0.436349, 0.248187, 0.110235]
],
'aux_35': [0.683522, 0.683522,0.683522, 0.726501, 0.726501, 0.596246, 0.159139,0.25835,0.582035, 0.373269, 0.583463,
0.436349
],
'tob': [1, 1,1, 1, 1, 1, 14, 2, 1, 1, 0, 1
]
})
Basically, aux_35 takes its value from pdt_050 based on the variable tob. For example, when tob is 1 or 0, aux_35 should be the first element of the list in pdt_050. When tob is greater than the number of elements in pdt_050, aux_35 should be the last element of pdt_050, as in row 6 (tob = 14).
I wrote this function to replicate that process:
def mmonths(df):
    pdo = []
    pdoriginal = df['pdt_050']
    tob_y = df['aux_35'].astype(int)
    for i in range(len(tob_y)):
        tob = tob_y[i]
        try:
            pdo.append(pdoriginal[i][(tob)])
        except:
            pdo.append(pdoriginal[i][0])
    return pdo
df['replica'] = mmonths(df)
However, the output is not what I expect. Can you help me, please?
Thanks!
>Solution :
Let's apply a custom indexer function along the column axis (axis=1), so it is called once per row:
def indexer(a, i):
    return a[max(1, min(int(i), len(a))) - 1]
df['aux_35'] = df.apply(lambda s: indexer(s['pdt_050'], s['tob']), axis=1)
Result
                                                      pdt_050  tob    aux_35
0                                         [0.683522, 0.26141]    1  0.683522
1                                         [0.683522, 0.26141]    1  0.683522
2                                         [0.683522, 0.26141]    1  0.683522
3                              [0.726501, 0.373269, 0.159278]    1  0.726501
4                              [0.726501, 0.373269, 0.159278]    1  0.726501
5                              [0.596246, 0.288327, 0.120612]    1  0.596246
6                              [0.353175, 0.314364, 0.159139]   14  0.159139
7                                         [0.595886, 0.25835]    2  0.258350
8                                                  [0.582035]    1  0.582035
9                              [0.726501, 0.373269, 0.159278]    1  0.726501
10  [0.583463, 0.366378, 0.262419, 0.19254, 0.1288, 0.064597]    0  0.583463
11                   [0.751279, 0.436349, 0.248187, 0.110235]    1  0.751279
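To see why the clamp in indexer works, here is a minimal sketch on plain Python lists. It treats tob as a 1-based position, which is an assumption read off the question's description. The names pdt, pdt_050_rows, tob_rows and replica are hypothetical stand-ins for illustration and do not come from the original data frame.

```python
def indexer(a, i):
    # Clamp the 1-based position i into [1, len(a)], then shift to a 0-based index.
    return a[max(1, min(int(i), len(a))) - 1]

# One sample list taken from the question's data (row 6).
pdt = [0.353175, 0.314364, 0.159139]

print(indexer(pdt, 0))   # tob = 0 is raised to 1, so the first element is returned
print(indexer(pdt, 1))   # first element
print(indexer(pdt, 2))   # second element
print(indexer(pdt, 14))  # tob > len(pdt) is lowered to 3, so the last element is returned

# Hypothetical plain-list stand-ins for the pdt_050 and tob columns.
pdt_050_rows = [[0.595886, 0.25835], [0.582035], pdt]
tob_rows = [2, 1, 14]

# Same logic as the row-wise apply, written as a list comprehension.
replica = [indexer(a, i) for a, i in zip(pdt_050_rows, tob_rows)]
print(replica)  # [0.25835, 0.582035, 0.159139]
```

On the real data, the same comprehension could be written with zip(df['pdt_050'], df['tob']) instead of the stand-in lists. Note also that the attempt in the question builds its index from aux_35 cast to int rather than from tob. Every aux_35 value in the sample is below 1, so that index is always 0 and the first element is always picked.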

