How to replace a row in pandas with multiple rows after applying a function?

I have a pandas DataFrame with a single column of strings. I want to apply a function to each row that splits the string into sentences, and replace that row with the rows the function generates.

Example dataframe:

import pandas as pd
df = pd.DataFrame(["A sentence. Another sentence. More sentences here.", "Another line of text"])

Output of df.head():

                                                   0
0  A sentence. Another sentence. More sentences h...
1                               Another line of text

I tried using the apply() method as follows:

import re

def get_sentence(row):
    # Returns a whole DataFrame for each row
    return pd.DataFrame(re.split(r'\.', row[0]))
df.apply(get_sentence, axis=1)

But then df.head() gives:

0                          0
0            A sentenc...
1                            0
0  Another line of text

I want the output as:

                     0
0            A sentence
1      Another sentence
2   More sentences here
3  Another line of text

What is the correct way to do this?

>Solution :

You can use

df[0].str.split(r'\.(?!$)').explode().reset_index(drop=True).str.rstrip('.')

Output:

0              A sentence
1        Another sentence
2     More sentences here
3    Another line of text

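The regex \.(?!$) matches a dot that is not at the end of the string ((?!$) is a negative lookahead), so the final dot is never used as a split point. .explode() then spreads each list of pieces across separate rows, .reset_index(drop=True) renumbers the index, and .str.rstrip('.') removes the trailing dot left on the last piece.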
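As a self-contained sketch of the steps above: splitting on the dot alone leaves the space that followed it at the start of each later sentence, so this version assumes you also want a .str.strip() to trim it. The names sentences and result are illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(["A sentence. Another sentence. More sentences here.", "Another line of text"])

sentences = (
    df[0]
    .str.split(r"\.(?!$)")   # split on dots that are not at the end of the string
    .explode()               # one list element per row
    .str.rstrip(".")         # drop the trailing dot the lookahead left in place
    .str.strip()             # assumption: also trim the space left after each dot
    .reset_index(drop=True)  # renumber the rows from 0
)

# Back to a one-column DataFrame; the column keeps the label 0.
result = sentences.to_frame()
print(result)
```

If the leading spaces are acceptable, the .str.strip() step can be dropped and the chain reduces to the one-liner above.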

You can also use a Series.str.findall version:

>>> df[0].str.findall(r'[^.]+').explode().reset_index(drop=True)
0              A sentence
1        Another sentence
2     More sentences here
3    Another line of text

Here, [^.]+ matches one or more characters other than a dot.
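The findall approach can be wrapped in a small helper that returns a DataFrame, as in the desired output. This is only a sketch: split_sentences and its column parameter are hypothetical names, and the whitespace trimming and dropna() steps are assumptions about what is wanted, not part of the original answer.

```python
import pandas as pd

def split_sentences(frame, column=0):
    """Hypothetical helper: one row per dot-delimited piece of `column`."""
    pieces = (
        frame[column]
        .str.findall(r"[^.]+")   # every run of non-dot characters
        .explode()               # one piece per row; an empty list becomes NaN
        .str.strip()             # assumption: trim the space that followed each dot
        .dropna()                # drop rows that produced no pieces at all
        .reset_index(drop=True)  # renumber the rows from 0
    )
    return pieces.to_frame()     # one-column DataFrame, same column label

df = pd.DataFrame(["A sentence. Another sentence. More sentences here.", "Another line of text"])
out = split_sentences(df)
print(out)
```

Because findall never captures the dots themselves, no rstrip('.') step is needed here.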
