Replace all but the first occurrence of a repeating string in a column using pandas

I have a column in a pandas DataFrame in which a string sometimes repeats:

col1 col2
1 hello
2 bye
3 hello
4 morning
5 night
6 hello

What I would like to do is change all but the first occurrence of "hello" to "hello again", so that the first occurrence of "hello" remains the same:

col1 col2
1 hello
2 bye
3 hello again
4 morning
5 night
6 hello again

Solution:

You can find the indices of the rows containing "hello" and then modify all but the first occurrence using pandas.DataFrame.loc:

In [1]: import pandas as pd
In [2]: df = pd.DataFrame(data={'col1': [1, 2, 3, 4, 5, 6],
   ...:                         'col2': ['hello', 'bye', 'hello', 'morning', 'night', 'hello']})
In [3]: df
Out[3]: 
   col1     col2
0     1    hello
1     2      bye
2     3    hello
3     4  morning
4     5    night
5     6    hello
In [4]: hello_indices = df.index[df['col2'] == 'hello']
In [5]: hello_indices
Out[5]: Int64Index([0, 2, 5], dtype='int64')
In [6]: df.loc[hello_indices[1:],'col2'] = 'hello again'
In [7]: df
Out[7]: 
   col1         col2
0     1        hello
1     2          bye
2     3  hello again
3     4      morning
4     5        night
5     6  hello again
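
As an alternative, the same result can probably be reached with a boolean mask instead of index labels, which may be safer when the index contains duplicate labels. The sketch below is only one way to do it: `replace_all_but_first` is a hypothetical helper name, not part of pandas, and it assumes the match should be on exact equality with the target string. It relies on `Series.duplicated(keep="first")`, which flags every repeat of a value except its first appearance.

```python
import pandas as pd


def replace_all_but_first(df, column, target, replacement):
    """Hypothetical helper (not a pandas function).

    Returns a copy of df in which every occurrence of `target` in `column`
    except the first is replaced by `replacement`.
    """
    out = df.copy()
    # True for every row whose value equals the target string
    is_target = out[column].eq(target)
    # True for every row whose value has already appeared earlier in the column
    is_repeat = out[column].duplicated(keep="first")
    # Rows that are both the target and a repeat are the ones to overwrite
    out.loc[is_target & is_repeat, column] = replacement
    return out


df = pd.DataFrame(data={'col1': [1, 2, 3, 4, 5, 6],
                        'col2': ['hello', 'bye', 'hello', 'morning', 'night', 'hello']})
result = replace_all_but_first(df, 'col2', 'hello', 'hello again')
print(result)
```

Because the helper works on a copy, the original `df` is left unchanged; assign the result back to `df` if an in-place update is wanted.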