How to apply str.join to a groupby column that contains integers and strings

I have code that imports a file and concatenates the data horizontally. My input file looks like this:

X Y
a hello
a 3
a bye
a hi
b apple
b orange
b 4

and this is the output I need:

X Y
a hello,3,bye,hi
b apple,orange,4

I run this Python code in Jupyter:

import pandas as pd
# df=pd.read_excel('test.xlsx')
df = pd.DataFrame({"X": ["a", "a", "a", "a", "b", "b", "b"],
                   "Y": ["hello", 3, "bye", "hi", "apple", "orange", 4]})

orden=df.groupby('X').Y.apply(','.join)

It fails with this error:

TypeError: sequence item 0: expected str instance, int found

I have checked other data, and I suspect the failure is caused by the integers. How can I change my code so that it concatenates both numbers and strings?

Solution:

str.join only accepts str items, so convert the Y column to strings first:

import pandas as pd

df = pd.DataFrame({"X": ["a", "a", "a", "a", "b", "b", "b"],
                   "Y": ["hello", 3, "bye", "hi", "apple", "orange", 4]})
# Cast every value in Y to str so that ','.join accepts all of them
df["Y"] = df["Y"].astype(str)
orden = df.groupby('X').Y.apply(','.join)

which gives the following value for orden:

X
a    hello,3,bye,hi
b    apple,orange,4
Name: Y, dtype: object
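
If you would rather not overwrite the Y column, one possible variation is to convert each value to a string only at the moment of joining. This is a minimal sketch, not part of the original answer: the lambda and the `result` name are illustrative, and `reset_index()` is added on the assumption that you want a two-column X/Y table like the one in the question.

```python
import pandas as pd

df = pd.DataFrame({"X": ["a", "a", "a", "a", "b", "b", "b"],
                   "Y": ["hello", 3, "bye", "hi", "apple", "orange", 4]})

# Convert each value to str only while joining;
# df["Y"] itself keeps its original mixed int/str values.
orden = df.groupby("X")["Y"].agg(lambda s: ",".join(map(str, s)))

# reset_index() turns the grouped Series into a DataFrame with columns X and Y.
result = orden.reset_index()
print(result)
```

The trade-off is that a Python-level lambda runs once per group, which can be slower than a single astype(str) on a large frame, but the source data stays untouched.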