I’m new to pandas and I’m trying to merge the following two DataFrames into one:
                      nopat
0  2021-12-31  3.580000e+09
1  2020-12-31  6.250000e+08
2  2019-12-31 -1.367000e+09
3  2018-12-31  2.028000e+09

           capital_employed
0  2021-12-31  5.924000e+10
1  2020-12-31  6.062400e+10
2  2019-12-31  5.203500e+10
3  2018-12-31  5.441200e+10
When I try to apply a function to my new DataFrame, all the columns disappear. Here is my code:
roce_by_year = pd.merge(nopat, capital_employed) \
    .rename(columns={"": "date"}) \
    .sort_values(by='date') \
    .apply(lambda row: compute_roce(row['nopat'], row['capital_employed']), axis=1) \
    .reset_index(name='roce')
Here is the result :
index roce
0 3 3.727119
1 2 -2.627078
2 1 1.030945
3 0 6.043214
I would like to have the following result :
date roce
0 2018 3.727119
1 2019 -2.627078
2 2020 1.030945
3 2021 6.043214
Do you have an explanation?
>Solution :
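As for the explanation: `DataFrame.apply` with `axis=1` and a function that returns a single number gives back a Series, not a DataFrame, so only the computed values and the original row index survive. `reset_index(name='roce')` then turns that index into a column called `index`, which is the output you are seeing. A minimal sketch of this, using a two-row frame taken from your data and a hypothetical `compute_roce` (I am assuming it is a percentage ratio, since your function is not shown):

```python
import pandas as pd


def compute_roce(nopat_value, capital_value):
    # Hypothetical stand-in for the asker's function (assumed: ratio in percent)
    return nopat_value / capital_value * 100


df = pd.DataFrame({
    "date": ["2021-12-31", "2020-12-31"],
    "nopat": [3.58e9, 6.25e8],
    "capital_employed": [5.924e10, 6.0624e10],
})

# Row-wise apply returning a scalar collapses the frame into a Series
result = df.sort_values(by="date").apply(
    lambda row: compute_roce(row["nopat"], row["capital_employed"]), axis=1
)
print(type(result).__name__)

# reset_index moves the old row labels into a column named "index"
flat = result.reset_index(name="roce")
print(list(flat.columns))
```

So the `date` column is not really lost by the merge; it is dropped at the `apply` step. To keep it, the computed values have to be attached to the frame as a new column instead of replacing it.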
If you want a method-chained solution, you could use something like this:
import pandas as pd

roce_by_year = (
    pd.merge(nopat, capital_employed)
    .rename(columns={"": "date"})
    # Keep only the year of each date
    .assign(
        date=lambda xdf: pd.to_datetime(
            xdf["date"], errors="coerce"
        ).dt.year
    )
    # Compute ROCE row by row and store it as a new column
    .assign(
        roce=lambda xdf: xdf.apply(
            lambda row: compute_roce(
                row["nopat"], row["capital_employed"]
            ), axis=1
        )
    )
    .sort_values("date", ascending=True)
)[["date", "roce"]]
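If `compute_roce` is plain arithmetic, you may not need a row-wise `apply` at all. Here is a self-contained sketch under a few assumptions: the sample frames are rebuilt from the question with the date column already named `date`, and `compute_roce` is a hypothetical stand-in (a percentage ratio) that also works on whole columns. It adds `reset_index(drop=True)` so the row labels run from 0 again, as in the desired output.

```python
import pandas as pd

# Sample data copied from the question; the date column is assumed to be named "date"
nopat = pd.DataFrame({
    "date": ["2021-12-31", "2020-12-31", "2019-12-31", "2018-12-31"],
    "nopat": [3.58e9, 6.25e8, -1.367e9, 2.028e9],
})
capital_employed = pd.DataFrame({
    "date": ["2021-12-31", "2020-12-31", "2019-12-31", "2018-12-31"],
    "capital_employed": [5.924e10, 6.0624e10, 5.2035e10, 5.4412e10],
})


def compute_roce(nopat_value, capital_value):
    # Hypothetical stand-in for the asker's function (assumed: ratio in percent)
    return nopat_value / capital_value * 100


roce_by_year = (
    pd.merge(nopat, capital_employed, on="date")
    .assign(
        # Keep only the year of each date
        date=lambda xdf: pd.to_datetime(xdf["date"]).dt.year,
        # Column-wise computation: the function receives two whole Series
        roce=lambda xdf: compute_roce(xdf["nopat"], xdf["capital_employed"]),
    )
    .sort_values("date")
    .reset_index(drop=True)  # renumber rows 0..n-1 after sorting
    .loc[:, ["date", "roce"]]
)
print(roce_by_year)
```

If your real `compute_roce` contains logic that only works on scalars, keep the `apply(..., axis=1)` version above instead.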