pandas.Series.str.split() not accepting 3 keyword arguments

I’m working on a project that uses the MIMIC-IV dataset as its source. I found a preprocessing pipeline that is widely used in many projects. The pipeline runs fine until I reach the step that generates the time-series data representation (I haven’t modified the data or the pipeline code in any way). The following error occurs:

TypeError                                 Traceback (most recent call last)
.../Downloads/MIMIC-IV-Data-Pipeline-main/mainPipeline.ipynb Cell 27 in <cell line: 20>()
     18     impute=False
     20 if data_icu:
---> 21     gen=data_generation_icu.Generator(cohort_output,data_mort,data_admn,data_los,diag_flag,proc_flag,out_flag,chart_flag,med_flag,impute,include,bucket,predW)
     22     #gen=data_generation_icu.Generator(cohort_output,data_mort,diag_flag,False,False,chart_flag,False,impute,include,bucket,predW)
     23     #if chart_flag:
     24     #    gen=data_generation_icu.Generator(cohort_output,data_mort,False,False,False,chart_flag,False,impute,include,bucket,predW)
     25 else:
     26     gen=data_generation.Generator(cohort_output,data_mort,data_admn,data_los,diag_flag,lab_flag,proc_flag,med_flag,impute,include,bucket,predW)

File ~/Downloads/MIMIC-IV-Data-Pipeline-main/model/data_generation_icu.py:22, in Generator.__init__(self, cohort_output, if_mort, if_admn, if_los, feat_cond, feat_proc, feat_out, feat_chart, feat_med, impute, include_time, bucket, predW)
     20 self.cohort_output=cohort_output
     21 self.impute=impute
---> 22 self.data = self.generate_adm()
     23 print("[ READ COHORT ]")
     25 self.generate_feat()

File ~/Downloads/MIMIC-IV-Data-Pipeline-main/model/data_generation_icu.py:64, in Generator.generate_adm(self)
     62 data['los']=pd.to_timedelta(data['outtime']-data['intime'],unit='h')
     63 data['los']=data['los'].astype(str)
---> 64 data[['days', 'dummy','hours']] = data['los'].str.split(' ', -1, expand=True)
     65 data[['hours','min','sec']] = data['hours'].str.split(':', -1, expand=True)
     66 data['los']=pd.to_numeric(data['days'])*24+pd.to_numeric(data['hours'])
...
    127     )
    128     raise TypeError(msg)
--> 129 return func(self, *args, **kwargs)

TypeError: split() takes from 1 to 2 positional arguments but 3 positional arguments (and 1 keyword-only argument) were given.

I’m assuming the problem lies in the use of the pandas.str.split() function (I’m using pandas version 2.0.3) but when I check the documentation it should accept 3 keyword arguments as far as I can tell.

Since it isn’t my code, I’m having a hard time debugging what is going wrong here, but maybe I’m missing something. Has anyone run into the same problem with this pipeline, or does anyone know how to fix it?

Solution:

In recent pandas versions (including the 2.0.3 you are using), many functions made most of their parameters keyword-only. You can see this in the str.split documentation:

              # positional    # keyword-only
Series.str.split(pat=None, *, n=-1, expand=False, regex=None)

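The * means that only pat can be passed positionally; n, expand and regex must be provided as keywords. In your traceback, ' ' and -1 are both passed positionally, which together with self makes the 3 positional arguments the error message complains about.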
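To illustrate what the bare * does in a signature, here is a minimal sketch in plain Python. The function split_like is a hypothetical stand-in, not part of pandas; it only mimics the shape of the signature shown above.

```python
def split_like(text, pat=None, *, n=-1):
    """Hypothetical helper: 'pat' may be positional, 'n' is keyword-only."""
    # str.split accepts the separator and the maximum number of splits
    return text.split(pat, n)


# Passing n by name works
ok = split_like("a b c", " ", n=1)

# Passing n positionally is rejected by Python itself
try:
    split_like("a b c", " ", 1)
    raised = False
except TypeError:
    raised = True
```

Python raises the TypeError before the function body runs, which is why the pipeline fails at the call site rather than somewhere inside pandas' string handling.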

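You need to pass n by name. The same change is needed on the next line of the pipeline (line 65 of data_generation_icu.py), which splits data['hours'] on ':' with the same positional -1: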

data[['days', 'dummy','hours']] = data['los'].str.split(' ', n=-1, expand=True)
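As a rough, self-contained check of the fix, the sketch below applies the keyword form to both splits from the traceback (lines 64 and 65). The two sample strings are made-up stand-ins for the pipeline's 'los' column after astype(str); the real data is assumed to have the same "N days HH:MM:SS" shape.

```python
import pandas as pd

# Hypothetical sample values in the "N days HH:MM:SS" form the pipeline splits
data = pd.DataFrame({"los": ["2 days 05:30:00", "1 days 02:15:00"]})

# Only the pattern is positional; n and expand are passed by keyword
data[["days", "dummy", "hours"]] = data["los"].str.split(" ", n=-1, expand=True)
data[["hours", "min", "sec"]] = data["hours"].str.split(":", n=-1, expand=True)

# Same arithmetic as line 66 of the pipeline: length of stay in whole hours
data["los"] = pd.to_numeric(data["days"]) * 24 + pd.to_numeric(data["hours"])
```

Since n=-1 is already the default, dropping it and writing str.split(' ', expand=True) should behave the same way.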

Earlier pandas versions emitted a FutureWarning about this change:

In a future version of pandas all arguments of StringMethods.split except for the argument ‘pat’ will be keyword-only.
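Warnings like this are easy to miss in notebook output. One way to surface them before an upgrade, sketched here with only the standard library, is to escalate FutureWarning to an error while testing. The function old_style_api is a hypothetical stand-in for any library call that emits such a warning.

```python
import warnings


def old_style_api(pat, n=-1):
    # Hypothetical function that announces an upcoming signature change
    warnings.warn(
        "all arguments except 'pat' will be keyword-only",
        FutureWarning,
        stacklevel=2,
    )
    return pat, n


with warnings.catch_warnings():
    # Inside this block, any FutureWarning is raised as an exception
    warnings.simplefilter("error", FutureWarning)
    try:
        old_style_api(" ", -1)
        escalated = False
    except FutureWarning as exc:
        escalated = True
        message = str(exc)
```

The catch_warnings context manager restores the previous warning filters on exit, so the stricter setting does not leak into the rest of the session.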
