pandas PerformanceWarning that I can't get rid of

I have a pandas DataFrame df with a column named ‘C’.
I am adding 280 duplicates of that column to the same DataFrame, named 1 … 280, as follows:

for l in range(1,281):
    df[str(l)] = df['C']

This works as expected, but I haven’t figured out how to do it more efficiently, and I get the following performance warning:

PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_base[str(d)]=col_vals 

I’ve tried to suppress this warning with

import warnings
import pandas as pd

warnings.simplefilter(action='ignore', category=pd.errors.PerformanceWarning)

The suppression works when running on 1 core; however, I’m running this code with joblib on 30 cores.

When running this operation with joblib, the warning suppression doesn’t work!

How can I get rid of this warning message with either of these 2 methods?

  1. How can I suppress the warning under joblib?
    or
  2. How can I create the duplicate columns more efficiently, with no warnings?

>Solution :

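You can do this in one go (note that this rebuilds df from column 'C' alone and labels the copies with the integers 1 to 280):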

df = pd.concat([df['C']] * 281, axis=1)
df.columns = list(range(1, 281)) + ['C']
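
If df has other columns that need to survive, or the new columns should carry the string names '1' to '280' as in the question, a variant along these lines may be closer to what was asked. This is only a sketch: the sample frame and its extra column 'A' are made up for illustration.

```python
import pandas as pd

# Hypothetical sample frame; 'A' stands in for any other columns df may have.
df = pd.DataFrame({"A": [10, 20, 30], "C": [1, 2, 3]})

# Build all 280 copies at once, keyed by the string names '1' ... '280'.
copies = pd.DataFrame({str(i): df["C"] for i in range(1, 281)})

# Attach them in a single concat instead of inserting column by column.
df = pd.concat([df, copies], axis=1)
```

Because the columns are added in one pd.concat call rather than 280 separate insertions, the frame should not end up fragmented, so the warning should not be raised in the first place.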

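As for the first option, a likely reason the filter has no effect under joblib is that its default backend runs tasks in separate worker processes, which do not necessarily see a filter installed in the parent. Assuming that is the cause, one workaround is to install the filter inside the function that the workers run. The function name add_copies below is hypothetical, and the loop is the one from the question.

```python
import warnings

import pandas as pd


def add_copies(frame):
    """Hypothetical worker: add columns '1'..'280' as copies of 'C'."""
    # Install the filter here so it takes effect in whichever process
    # actually executes this function, not just in the parent.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=pd.errors.PerformanceWarning)
        frame = frame.copy()
        for l in range(1, 281):
            frame[str(l)] = frame["C"]
    return frame


# Direct call for illustration; with joblib the same function would be
# handed to the workers instead.
result = add_copies(pd.DataFrame({"C": [1, 2, 3]}))
```

With joblib this would be dispatched as something like `Parallel(n_jobs=30)(delayed(add_copies)(f) for f in frames)`, where frames is a hypothetical list of DataFrames. This only hides the warning; the column-by-column insertion still happens.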