How to use duplicated rows in pandas DF to efficiently create new columns?

September 11, 2023

I am trying to convert the output of the pandas value_counts() method into a new DataFrame structure.

import pandas as pd

df = pd.DataFrame({'Value': {0: 1, 1: 1, 2: 1, 3: 2, 4: 2, 5: 2, 6: 3},
                   'Class': {0: 14, 1: 7, 2: 15, 3: 1, 4: 12, 5: 2, 6: 15},
                   'counts': {0: 2065, 1: 1268, 2: 78, 3: 54, 4: 40, 5: 33, 6: 6}})

    Value   Class   counts
0   1       14       2065
1   1       7        1268
2   1       15       78
3   2       1        54
4   2       12       40
5   2       2        33
6   3       15       6

I am trying to consolidate the rows to one per unique id from the Value field (1, 2, 3), and use the unique numbers from the Class field as the new columns. The values in those class columns will be the counts.

I am trying to get something like:

     Value    1     2     7     12    14    15
0    1        0     0     1268  0     2065  78
1    2        54    33    0     40    0     0
2    3        0     0     0     0     0     6

I know I can brute-force it, but I am working with 10+ million unique "Value" ids, so 50+ million count rows. I am trying to figure out a fast and effective way to do this.

>Solution :

This can be done with a pivot, using DataFrame.pivot:

df.pivot(index='Value', columns='Class', values='counts').fillna(0)
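
As a minimal sketch on the sample data from the question: the pivot leaves NaN where a Value/Class pair is missing, so the result is float after fillna(0), and Value ends up as the index. The extra astype(int) and reset_index() steps below are one way to get closer to the layout asked for; the name wide is only illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data from the question (the reshaped value_counts() output).
df = pd.DataFrame({
    'Value': [1, 1, 1, 2, 2, 2, 3],
    'Class': [14, 7, 15, 1, 12, 2, 15],
    'counts': [2065, 1268, 78, 54, 40, 33, 6],
})

wide = (
    df.pivot(index='Value', columns='Class', values='counts')
      .fillna(0)       # missing Value/Class pairs become 0
      .astype(int)     # NaN forced float dtype; cast the counts back to int
      .reset_index()   # turn the Value index back into a regular column
)
wide.columns.name = None  # drop the leftover 'Class' label on the column axis

print(wide)
```

Note that pivot raises a ValueError if the same Value/Class pair appears more than once. That should not happen with value_counts() output, but if it can in your data, pivot_table with an aggregation such as aggfunc='sum' is the usual alternative.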

pandas

by MR
