Split a comma delimited Pandas Column of Type Object

I have a pandas df with a column that has a mix of values like so:

| ID       | home_page                                       |
| ---------| ------------------------------------------------|
| 1        | facebook.com, facebook.com, meta.com            |
| 2        | amazon.com                                      |
| 3        | twitter.com, dev.twitter.com, twitter.com       |

I want to create a new column that contains the unique values from the home_page column. The final output should be:

| ID       | home_page                                       | unique                      |
| -------- | ----------------------------------------------- | --------------------------- |
| 1        | facebook.com, facebook.com, meta.com            | facebook.com,meta.com       |
| 2        | amazon.com                                      | amazon.com                  |
| 3        | twitter.com, dev.twitter.com, twitter.com       | twitter.com,dev.twitter.com |

I tried the following:

final["home_page"] = final["home_page"].str.split(',').apply(lambda x : ','.join(set(x)))

But when I do that I get

TypeError: float object is not iterable

The column has no NaN but just in case I tried

final["home_page"] = final["home_page"].str.split(',').apply(lambda x : ','.join(set(x)))

But the entire column returns empty when doing that.

>Solution :

You are right that the error comes from np.nan values, which are of type float. .str.split(',') leaves missing entries as NaN, so the lambda ends up calling set(np.nan), and a float is not iterable. The following should work for you (and should be faster).

import numpy as np
df["home_page"].str.replace(' ', '').str.split(',').apply(np.unique)

If you want a comma-joined string rather than an array, append the following to the chain:

.apply(lambda x: ','.join(str(i) for i in x))
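
One caveat: np.unique returns its result sorted, so row 3 would come out as dev.twitter.com,twitter.com rather than in the order shown in the desired output. If first-occurrence order matters, a minimal sketch along the following lines may help. The sample frame, the extra NaN row, and the helper name unique_in_order are assumptions made for illustration, not part of the original question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mirroring the question, plus one missing value.
df = pd.DataFrame(
    {
        "ID": [1, 2, 3, 4],
        "home_page": [
            "facebook.com, facebook.com, meta.com",
            "amazon.com",
            "twitter.com, dev.twitter.com, twitter.com",
            np.nan,
        ],
    }
)


def unique_in_order(cell):
    """Return the distinct comma-separated items of `cell`, first occurrence first."""
    if pd.isna(cell):
        return cell  # leave missing values untouched instead of raising
    items = (part.strip() for part in cell.split(","))
    # dict.fromkeys drops duplicates while keeping insertion order
    return ",".join(dict.fromkeys(item for item in items if item))


# .str.split leaves missing rows as NaN (a float); this shows the element types.
print(df["home_page"].str.split(",").map(type).tolist())

# Write to a new column so the original home_page values are preserved.
df["unique"] = df["home_page"].map(unique_in_order)
print(df["unique"].tolist())
```

Printing the element types after .str.split is a quick way to check whether a column that appears to have no NaN actually contains float entries. The sketch assumes every non-missing cell is a string.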