I’m trying to use pandas.DataFrame.assign in Pandas 1.5.2. Let’s consider this code, for instance:
df = pd.DataFrame({"col1": [1, 2, 3], "col2": [4, 5, 6]})
df.assign(
    test1="hello",
    test2=df.test1 + " world"
)
I’m facing this error:
AttributeError: 'DataFrame' object has no attribute 'test1'
However, it’s explicitly stated in the documentation that:
Assigning multiple columns within the same assign is possible. Later items in **kwargs may refer to newly created or modified columns in df; items are computed and assigned into df in order.
So I don’t understand: how can I refer to newly created or modified columns in df when calling assign?
Solution:
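You can pass a callable to assign. In your code, df.test1 is evaluated on the original df before assign is even called, so the column does not exist yet and Python raises the AttributeError. If you pass a lambda instead, assign calls it on the intermediate DataFrame, which already contains test1.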
Parameters
**kwargs : dict of {str: callable or Series}
The column names are keywords. If the values are callable, they are computed on the DataFrame and
assigned to the new columns. The callable must not change input
DataFrame (though pandas doesn’t check it). If the values are not
callable, (e.g. a Series, scalar, or array), they are simply assigned.
import pandas as pd

df = pd.DataFrame({"col1": [1, 2, 3], "col2": [4, 5, 6]})
df.assign(
    test1="hello",
    # d is the intermediate DataFrame, which already has test1
    test2=lambda d: d.test1 + " world"
)
Output:
   col1  col2  test1        test2
0     1     4  hello  hello world
1     2     5  hello  hello world
2     3     6  hello  hello world
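To make the evaluation order concrete, here is a minimal sketch that contrasts the two forms. The column names total and doubled are made up for illustration; it assumes a pandas version where assign processes keywords in order (as the quoted documentation describes).

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2, 3], "col2": [4, 5, 6]})

# Eager form: df.test1 is looked up on the original df while Python is
# still building the arguments, so assign is never reached.
try:
    df.assign(test1="hello", test2=df.test1 + " world")
    error = None
except AttributeError as exc:
    error = str(exc)

# Lazy form: each lambda receives the intermediate DataFrame, so a later
# keyword can use a column created by an earlier one.
result = df.assign(
    total=lambda d: d["col1"] + d["col2"],  # hypothetical column: 5, 7, 9
    doubled=lambda d: d["total"] * 2,       # depends on total created above
)

print(error)
print(result["doubled"].tolist())  # [10, 14, 18]
print(list(df.columns))            # ['col1', 'col2']
```

Note that assign returns a new DataFrame and leaves df unchanged, so you need to keep the result (for example df = df.assign(...)) if you want the new columns afterwards.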