I have created the following pandas dataframe:
import pandas as pd
ds = {'col1' : ['A','A','B','C','C','D'],
'col2' : ['A','B','C','D','D','A']}
df = pd.DataFrame(data=ds)
The dataframe looks like this:
print(df)
col1 col2
0 A A
1 A B
2 B C
3 C D
4 C D
5 D A
The possible values in col1 and col2 are A, B, C and D.
I need to create 4 new columns, called:
countA: it counts how manyAare in each row / recordcountB: it counts how manyBare in each row / recordcountC: it counts how manyCare in each row / recordcountD: it counts how manyDare in each row / record
So, from the example above, the resulting dataframe would look like this:
Can anyone help me please?
>Solution :
Probably this could work for you
from numpy import unique
for k in unique(df.values):
df['count'+k] = df.eq(k).sum(axis=1)
which gives
col1 col2 countA countB countC countD
0 A A 2 0 0 0
1 A B 1 1 0 0
2 B C 0 1 1 0
3 C D 0 0 1 1
4 C D 0 0 1 1
5 D A 1 0 0 1
