Adding a new column to a dataframe based on values of columns and values from another dataframe

March 30, 2022

I have this dataframe, df, that holds boolean (0/1) values:

    A  B  C
0   0  1  0
1   0  1  1
2   0  1  1
3   1  0  1
4   0  0  0
5   1  0  0
6   0  0  0
7   0  0  1
8   1  0  0
9   0  0  0
10  1  0  1
11  1  0  1
12  0  1  1
13  1  0  0
14  1  0  0
15  0  1  0
16  1  1  0
17  0  0  1
18  1  0  1
19  1  0  0
20  1  0  1
21  1  1  0
22  1  1  1
23  1  1  1
24  1  0  0
25  1  1  0
26  0  0  1
27  0  1  1
28  0  1  0
29  1  1  0
30  1  0  1
31  0  1  0
32  0  0  1
33  1  1  1
34  0  1  0
35  1  1  0
36  0  1  0
37  0  0  1
38  0  1  1
39  0  1  1

I stored the row count as follows:

    N = len(df.index)  # 40 in this case

Using groupby, I counted how often each combination of values occurs in df as follows:

    count_series = df.groupby(["A", "B", "C"]).size()  # group by all columns
    new_df = count_series.to_frame(name="count").reset_index()
    print(new_df)

new_df looks like this:

   A  B  C  count
0  0  0  0     3
1  0  0  1     5
2  0  1  0     6
3  0  1  1     6
4  1  0  0     6
5  1  0  1     6
6  1  1  0     5
7  1  1  1     3

df has N = 40 rows, and I want to create a new dataframe, dfD, that has the same columns as df plus an additional column named P(A,B,C) holding the probability of each combination. For example, any row with the values 0,0,0 should have count/N (3/40), which is 0.075.
I found these posts, but none of them helped because they only handle specific cases, whereas my df won't always have just 3 columns (A, B, C) or just 40 rows; it might be bigger than that:
link1 link2
I want something that works with any dataframe of any size.
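
To make the target concrete, here is a minimal sketch of the P(A,B,C) column computed on the counts table shown above. The frame is rebuilt by hand from the printed new_df so the snippet runs on its own; it assumes the counts cover every row of df, so that they sum to N.

```python
import pandas as pd

# Counts copied from the printed new_df above.
new_df = pd.DataFrame({
    "A": [0, 0, 0, 0, 1, 1, 1, 1],
    "B": [0, 0, 1, 1, 0, 0, 1, 1],
    "C": [0, 1, 0, 1, 0, 1, 0, 1],
    "count": [3, 5, 6, 6, 6, 6, 5, 3],
})

# The counts cover every row of df, so their sum is N (40 here).
N = int(new_df["count"].sum())

# Relative frequency of each combination: count / N.
new_df["P(A,B,C)"] = new_df["count"] / N
print(new_df)
```

The row for 0,0,0 gets 3/40 = 0.075, and the probabilities over all eight combinations add up to 1.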

> Solution:

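Convert each row into a tuple and group by it. The result has one row per distinct combination, with its probability in a Probs column:

import pandas as pd

# One tuple per row, e.g. (0, 1, 0); it serves as the group key.
grp = df.apply(tuple, axis=1)
# The first row of each group gives the combination; group size / N gives its probability.
pd.concat([df.groupby(grp).first(),
           grp.groupby(grp).count().div(len(df)).rename("Probs")],
          axis=1).reset_index(drop=True)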
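
As an alternative sketch, the same probabilities can be obtained by grouping on the column list directly, which avoids building a tuple per row and does not depend on the number of columns or rows. The function names `joint_probability_table` and `with_joint_probability` and the default column name "P" are made up for this illustration; the code assumes that name is not already a column of df. The second function attaches the probability to every original row, in case dfD is meant to keep all rows of df rather than one row per combination.

```python
import pandas as pd


def joint_probability_table(df: pd.DataFrame, name: str = "P") -> pd.DataFrame:
    """Return one row per distinct combination of values with its relative frequency."""
    cols = list(df.columns)            # group by every column, however many there are
    counts = df.groupby(cols).size()   # occurrences of each observed combination
    return counts.div(len(df)).rename(name).reset_index()


def with_joint_probability(df: pd.DataFrame, name: str = "P") -> pd.DataFrame:
    """Return df with an extra column giving the probability of each row's combination."""
    cols = list(df.columns)
    table = joint_probability_table(df, name)
    # A left merge keeps every original row in its original order.
    return df.merge(table, on=cols, how="left")


if __name__ == "__main__":
    # Small hypothetical frame, not the df from the question.
    example = pd.DataFrame({"A": [0, 0, 1, 1], "B": [0, 0, 1, 0]})
    print(joint_probability_table(example))
    print(with_joint_probability(example))
```

Note that the merge returns a frame with a fresh 0..n-1 index, which matches df here but would drop a custom index.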