I have two long dataframes like so:
import pandas as pd

df1 = pd.DataFrame([[1, 1],
                    [2, 1],
                    [3, 1],
                    [2, 1],
                    [2, 2],
                    [3, 2],
                    [5, 3]], columns=['value', 'id'])
df2 = pd.DataFrame([1, 2, 3, 4])
The first column of df1 acts as the value, while the second column is an ID column.
What I want to do is multiply df1 by df2 position by position within each ID: the first value of each ID is multiplied by the first value of df2, the second value by the second, and so on. For instance, for id=1 we would get [1x1=1, 2x2=4, 3x3=9, 2x4=8], for id=2 we would get [2x1=2, 3x2=6],… It is guaranteed that no ID appears more times than the length of df2. Ultimately the result would be something like:
1
4
9
8
2
6
5
Solution:
Code:
out = df1.groupby('id').cumcount().map(df2[0]).mul(df1['value'])
out
0    1
1    4
2    9
3    8
4    2
5    6
6    5
dtype: int64
@wjandrea’s comment:
Enumerate each group of ids, use those to index into df2[0], then multiply.
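To make that comment concrete, here is the same one-liner split into its three steps. This is only a sketch for illustration: the intermediate names pos and factor are not in the original answer, and it assumes df2 keeps its default 0..n-1 index so that a position can be used directly as a lookup key.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 1], [2, 1], [3, 1], [2, 1],
                    [2, 2], [3, 2],
                    [5, 3]], columns=['value', 'id'])
df2 = pd.DataFrame([1, 2, 3, 4])

# Step 1: number the rows inside each id group, starting at 0.
# (pos is an illustrative name, not from the original answer.)
pos = df1.groupby('id').cumcount()
print(pos.tolist())     # [0, 1, 2, 3, 0, 1, 0]

# Step 2: look each position up in df2's only column (labelled 0).
# Series.map with a Series argument matches on that Series' index,
# so this assumes df2 has the default RangeIndex.
factor = pos.map(df2[0])
print(factor.tolist())  # [1, 2, 3, 4, 1, 2, 1]

# Step 3: multiply element-wise with the values; both Series share df1's index.
out = factor.mul(df1['value'])
print(out.tolist())     # [1, 4, 9, 8, 2, 6, 5]
```

If an ID had more rows than df2, the lookup in step 2 would find no match for the extra positions and yield NaN there, so the guarantee stated in the question is what keeps the result an integer Series.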