Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Converting rows to wide columns based on duplicated ids in another column in pandas

My question is similar to this, this, and this questions.

But still cannot resolve it.

I have a dataframe with duplicated ids

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

ID  Publication_type
1   Journal          
1   Clinical study   
1   Guideline        
2   Journal          
2   Letter           

I want to make it wide, but I do not know how many publication type will I have – maybe 2, maybe 20. Thus, I do not know how many columns wide will I need. The max size of wide columns for publication_type must be not be more than the number of types for each id.

Expected output

 ID Publication_type1 Publication_type2 Publication_type 3    etc
 1  Journal           Clinical Study    Guideline
 2  Journal           Letter            NaN

For now I do not need to put the same publication type into the same column. I do not need all articles in the same column. Thanks!

>Solution :

You can group by ID, aggregate via list, and then create a new DataFrame from the results:

col = 'Publication_type'
new_df = pd.DataFrame(df.groupby('ID')[col].agg(lambda x: x.tolist()).tolist()).replace({None: np.nan})
new_df.columns = [f'{col}{i}' for i in new_df.columns + 1]
new_df['ID'] = df['ID'].drop_duplicates().reset_index(drop=True)

Output:

>>> df
  Publication_type1 Publication_type2 Publication_type3  ID
0           Journal    Clinical-study         Guideline   1
1           Journal            Letter               NaN   2
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading