I’m new to Pandas and I’m trying to do something very simple. Using the flights.csv file, I’m defining a new column, underperforming, whose value is 1 if the number of passengers is below average. My problem is that something seems to be wrong with the logic, since the values are not being updated. Here is an example:
import pandas as pd

df = pd.read_csv('flights.csv')
passengers_mean = df['passengers'].mean()
df['underperforming'] = 0
for idx, row in df.iterrows():
    if (row['passengers'] < passengers_mean):
        row['underperforming'] = 1
print(df)
print(passengers_mean)
Any clue?
>Solution :
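According to the docs for DataFrame.iterrows:
You should never modify something you are iterating over. This is not guaranteed to work in all cases. Depending on the data types, the iterator returns a copy and not a view, and writing to it will have no effect.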
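To illustrate the warning, here is a minimal sketch using a small made-up DataFrame (the month names and passenger counts are invented for illustration, not taken from flights.csv). It shows that assigning to `row` inside the loop leaves the DataFrame untouched, and that writing through `df.at` does update it, although a loop like this is rarely the best option:

```python
import pandas as pd

# Hypothetical stand-in for flights.csv; the values are made up.
df = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar", "Apr"],
    "passengers": [100, 200, 300, 400],
})
passengers_mean = df["passengers"].mean()  # 250.0
df["underperforming"] = 0

# Writing to `row` only changes the Series that iterrows() yielded,
# not the underlying DataFrame.
for idx, row in df.iterrows():
    if row["passengers"] < passengers_mean:
        row["underperforming"] = 1
unchanged = df["underperforming"].tolist()  # still all zeros

# Writing through the DataFrame itself (by index label) does stick.
for idx, row in df.iterrows():
    if row["passengers"] < passengers_mean:
        df.at[idx, "underperforming"] = 1
updated = df["underperforming"].tolist()
```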
What you can do instead is:
df["underperforming"] = (df.passengers < df.passengers.mean()).astype('int')
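As a quick check of the vectorized approach, here is a small self-contained sketch. The data is made up for illustration and only assumes a numeric `passengers` column like the one in the question:

```python
import pandas as pd

# Hypothetical sample data; not the real contents of flights.csv.
df = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar", "Apr"],
    "passengers": [112, 118, 132, 129],
})

# The comparison yields a boolean Series aligned with df's index;
# astype(int) turns True/False into 1/0.
passengers_mean = df["passengers"].mean()
df["underperforming"] = (df["passengers"] < passengers_mean).astype(int)

print(df)
print(passengers_mean)
```

The comparison is applied to the whole column at once, so no explicit loop is needed and nothing is modified while it is being iterated over.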