How to fix SettingWithCopyWarning? (Python)

I have a dataframe with some columns of the same type:

['total_tracks', 't_dur0', 't_dur1', 't_dur2', 't_dance0', 't_dance1', 't_dance2', 
 't_energy0', 't_energy1', 't_energy2', 't_key0', 't_key1', 't_key2', 't_mode0', 
 't_mode1', 't_mode2', 't_speech0', 't_speech1', 't_speech2', 't_acous0', 't_acous1', 
 't_acous2', 't_ins0', 't_ins1', 't_ins2', 't_live0', 't_live1', 't_live2', 't_val0', 
 't_val1', 't_val2', 't_tempo0', 't_tempo1', 't_tempo2', 't_sig0', 't_sig1', 't_sig2', 
 'popularity', 'release_year', 'release_month']

And I am trying to combine the columns with the same type like this:

# Takes in a dataframe with three columns and returns a dataframe with one column of their means
def average_column(dataframe):
    dataframe["mean"] = dataframe.mean(axis=1)                        # Add column to the dataframe (axis=1 means the mean() is applied row-wise)
    mean_df = dataframe.iloc[: , -1:]                                 # Isolated column of the mean by selecting all rows (:) for the last column (-1:)
    print("Original: {}\tWith mean:\n{}".format(dataframe, mean_df))
    return mean_df

Inspired by this and this question, I tried to run this code:

t_name_df = df[["t_dur0", "t_dur1", "t_dur2"]]
print(t_name_df.columns.tolist())
average_column(t_name_df)

Which gave me this output:

['t_dur0', 't_dur1', 't_dur2']
Original:
      t_dur0  t_dur1  t_dur2         mean
0       2315    2310    2293  2306.000000
1       1558     886    1870  1438.000000
2        803     316     504   541.000000
3        498     815     677   663.333333
4       1508    1677    1386  1523.666667
...      ...     ...     ...          ...
[2833 rows x 4 columns]
With mean:
         mean
0     2306.000000
1     1438.000000
2      541.000000
3      663.333333
4     1523.666667
...           ...

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

To get rid of the warning I tried re-writing it:

t_name_df = df.loc['t_dur0', 't_dur0']
print(t_name_df.column.tolist())
average_column(t_name_df)

Which gave me this error:

KeyError: 't_dur0'

How do I get rid of this warning correctly?

Solution:

Change your average_column function to this:

def average_column(dataframe):
    # ADD THIS LINE:
    dataframe = dataframe.copy()
    
    dataframe["mean"] = dataframe.mean(axis=1)                        # Add column to the dataframe (axis=1 means the mean() is applied row-wise)
    mean_df = dataframe.iloc[: , -1:]                                 # Isolated column of the mean by selecting all rows (:) for the last column (-1:)
    print("Original: {}\tWith mean:\n{}".format(dataframe, mean_df))
    return mean_df

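The warning appears because t_name_df = df[["t_dur0", "t_dur1", "t_dur2"]] creates a new DataFrame that pandas still tracks as derived from df. When average_column then assigns dataframe["mean"], pandas warns that the new column is being set on that derived copy and will not show up in the original dataframe (df). Calling .copy() at the top of the function makes it work on an explicitly independent DataFrame, so the assignment is unambiguous, the warning goes away, and the caller's t_name_df is left unchanged.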
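As an alternative to copying, the assignment can be avoided altogether. The following is a minimal sketch, not the original answer's method: it assumes a helper named average_columns and a small sample df (both hypothetical, with the t_dur values taken from the first rows of the output in the question and the popularity values made up). It also shows the column selection written with .loc, which expects row labels first and column labels second; df.loc['t_dur0', 't_dur0'] raised a KeyError because 't_dur0' was looked up as a row label.

```python
import pandas as pd


def average_columns(dataframe, name="mean"):
    # Compute the row-wise mean (axis=1) and return it as a new
    # one-column DataFrame. Nothing is assigned to `dataframe`,
    # so there is no write that could trigger SettingWithCopyWarning.
    return dataframe.mean(axis=1).to_frame(name)


# Hypothetical sample data; the t_dur values mirror the question's output
df = pd.DataFrame({
    "t_dur0": [2315, 1558, 803],
    "t_dur1": [2310, 886, 316],
    "t_dur2": [2293, 1870, 504],
    "popularity": [50, 60, 70],  # made-up values for illustration
})

# .loc takes the row indexer first, then the column indexer;
# ":" selects every row, the list selects the three columns.
t_dur_df = df.loc[:, ["t_dur0", "t_dur1", "t_dur2"]]

mean_df = average_columns(t_dur_df, name="t_dur_mean")
print(mean_df)
```

Because the helper only reads from its argument, neither df nor t_dur_df gains a mean column, and the result can be joined back onto df later if needed.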
