Pandas Column Split but ignore splitting on specific pattern

I have a Pandas Series containing strings in several patterns, as below:

import pandas as pd

stringsToSplit = ['6  Wrap',
                  '1  Salad , 2  Pepsi , 2  Chicken Wrap',
                  '1  Kebab Plate  [1  Bread ]',
                  '1 Beyti Kebab , 1  Chicken Plate  [1  Bread ], 1 Kebab Plate  [1  White Rice ], 1 Tikka Plate  [1  Bread ]',
                  '1 Kebab Plate [1  Bread , 1  Rocca Leaves ], 1  Mountain Dew '
                 ]

s = pd.Series(stringsToSplit)
s

0                                              6  Wrap
1                1  Salad , 2  Pepsi , 2  Chicken Wrap
2                          1  Kebab Plate  [1  Bread ]
3    1 Beyti Kebab , 1  Chicken Plate  [1  Bread ],...
4    1 Kebab Plate [1  Bread , 1  Rocca Leaves ], 1...
dtype: object

I would like to split and explode it such that the result would be as follows:

0    6  Wrap
1    1  Salad
1    2  Pepsi
1    2  Chicken Wrap
2    1  Kebab Plate [1  Bread ]
3    1 Beyti Kebab
3    1  Chicken Plate  [1  Bread ]
3    1 Kebab Plate  [1  White Rice ]
3    1  Tikka Plate  [1  Bread ]
4    1 Kebab Plate [1  Bread , 1  Rocca Leaves ]
4    1  Mountain Dew

To explode the Series I first need to split each string. However, split(',') also splits on the commas between [ and ], which I do not want.
I have tried splitting with a regex but could not find the correct pattern.

I would appreciate the support.

>Solution :

You can use a regex with a negative lookahead:

s.str.split(r'\s*,(?![^\[\]]*\])').explode()

output:

0                                        6  Wrap
1                                       1  Salad
1                                       2  Pepsi
1                                2  Chicken Wrap
2                    1  Kebab Plate  [1  Bread ]
3                                  1 Beyti Kebab
3                  1  Chicken Plate  [1  Bread ]
3                1 Kebab Plate  [1  White Rice ]
3                     1 Tikka Plate  [1  Bread ]
4    1 Kebab Plate [1  Bread , 1  Rocca Leaves ]
4                               1  Mountain Dew 
dtype: object
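
To see why the lookahead skips commas inside brackets, here is a minimal sketch that applies the same pattern with Python's re module, written in verbose mode so each part can carry a comment. The helper name split_items is hypothetical and only for illustration. The sketch assumes brackets are never nested and are always closed.

```python
import re

# Same pattern as in the answer above, spelled out with re.VERBOSE.
PATTERN = re.compile(r"""
    \s*            # optional whitespace before the comma
    ,              # the comma we may split on
    (?!            # do NOT split if what follows is
        [^\[\]]*   #   a run of non-bracket characters
        \]         #   ending in a closing bracket (we are inside a [ ] group)
    )
""", re.VERBOSE)


def split_items(text):
    """Hypothetical helper: split on top-level commas, then trim each piece."""
    return [part.strip() for part in PATTERN.split(text)]


row = '1 Kebab Plate [1  Bread , 1  Rocca Leaves ], 1  Mountain Dew '
for item in split_items(row):
    print(repr(item))
```

The pattern only consumes whitespace before the comma, so each piece after the first keeps its leading space. The sketch trims the pieces with strip(). In pandas, chaining .str.strip() after .explode() should have the same effect.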
