Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Collapse multiple rows in pandas dataframe and convert data from long to wide afterwards

I have a dataframe with repetitive names of people

Input:

{'name': {0: 'John Smith', 1: 'John Smith', 2: 'John Smith', 3: 'John Doo', 4: 'John Doo', 5: 'John Doo'}, 'journal': {0: 'Journal1', 1: 'Journal2', 2: 'Journal2', 3: 'Journal1', 4: 'Journal2', 5: 'Journal2'}, 'is_editor_in_chief_2019': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan}, 'is_editor_in_chief_2020': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan}, 'is_editor_in_chief_2021': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan}, 'is_editor_in_chief_2022': {0: 1.0, 1: nan, 2: nan, 3: 1.0, 4: nan, 5: nan}, 'is_editorial_board_member_2019': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan}, 'is_editorial_board_member_2020': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan}, 'is_editorial_board_member_2021': {0: nan, 1: nan, 2: 1.0, 3: nan, 4: nan, 5: 1.0}, 'is_editorial_board_member_2022': {0: nan, 1: 1.0, 2: nan, 3: nan, 4: 1.0, 5: nan}}

I want to collapse these rows and convert data from long to wide.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Expected output:

{'name': {0: 'John Smith', 1: 'John Doo'}, 'Journal1_is_editor_in_chief_2019': {0: nan, 1: nan}, 'Journal1_is_editor_in_chief_2020': {0: nan, 1: nan}, 'Journal1_is_editor_in_chief_2021': {0: nan, 1: nan}, 'Journal1_is_editor_in_chief_2022': {0: 1, 1: 1}, 'Journal1_is_editorial_board_member_2019': {0: nan, 1: nan}, 'Journal1_is_editorial_board_member_2020': {0: nan, 1: nan}, 'Journal1_is_editorial_board_member_2021': {0: 1, 1: 1}, 'Journal1_is_editorial_board_member_2022': {0: 1, 1: 1}, 'Journal2_is_editor_in_chief_2019': {0: nan, 1: nan}, 'Journal2_is_editor_in_chief_2020': {0: nan, 1: nan}, 'Journal2_is_editor_in_chief_2021': {0: nan, 1: nan}, 'Journal2_is_editor_in_chief_2022': {0: nan, 1: nan}, 'Journal2_is_editorial_board_member_2019': {0: nan, 1: nan}, 'Journal2_is_editorial_board_member_2020': {0: nan, 1: nan}, 'Journal2_is_editorial_board_member_2021': {0: nan, 1: nan}, 'Journal2_is_editorial_board_member_2022': {0: nan, 1: nan}}

The closest solution I found is here:

df_out = df.set_index(['name', df.groupby(['name']).cumcount()+1]).unstack().sort_index(level=1, axis=1)
df_out.columns = df_out.columns.map('{0[0]}_{0[1]}'.format)
df_out.reset_index()

But it does not give me the ideal output.

>Solution :

You can use pivot_table:

df_out = df.pivot_table(index='name', columns='journal', dropna=False)
df_out.columns = df_out.columns.map('{0[1]}_{0[0]}'.format)
df_out.reset_index()

output:

         name  Journal1_is_editor_in_chief_2019  Journal2_is_editor_in_chief_2019  Journal1_is_editor_in_chief_2020  Journal2_is_editor_in_chief_2020  Journal1_is_editor_in_chief_2021  Journal2_is_editor_in_chief_2021  Journal1_is_editor_in_chief_2022  Journal2_is_editor_in_chief_2022  Journal1_is_editorial_board_member_2019  Journal2_is_editorial_board_member_2019  Journal1_is_editorial_board_member_2020  Journal2_is_editorial_board_member_2020  Journal1_is_editorial_board_member_2021  Journal2_is_editorial_board_member_2021  Journal1_is_editorial_board_member_2022  Journal2_is_editorial_board_member_2022
0    John Doo                               NaN                               NaN                               NaN                               NaN                               NaN                               NaN                               1.0                               NaN                                      NaN                                      NaN                                      NaN                                      NaN                                      NaN                                      1.0                                      NaN                                      1.0
1  John Smith                               NaN                               NaN                               NaN                               NaN                               NaN                               NaN                               1.0                               NaN                                      NaN                                      NaN                                      NaN                                      NaN                                      NaN                                      1.0                                      NaN                                      1.0
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading