Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to count unique values in pandas column base on dictionary values

I have the below pandas data frame.

d = {'id1': ['85643', '85644','85643','8564312','8564314','85645','8564316','85646','8564318','85647','85648','85649','85655','56731','34566','78931','78931'],'ID': ['G-00001', 'G-00001','G-00002','G-00002','G-00002','G-00001','G-00001','G-00001','G-00001','G-00001','G-00002','G-00002','G-00002','G-00002','G-00003','G-00003','G-00003'],'col1': [671, 2,5,3,4,5,60,0,0,6,3,2,4,32,3,1,23],'Goal': [np.nan, 56,78,np.nan,89,73,np.nan ,np.nan ,np.nan, np.nan, np.nan, 34,np.nan, 7, 84,np.nan,5 ], 'col2': [793, 4,8,32,43,55,610,0,0,16,23,72,48,3,28,5,3],'col3': [500, 22,89,33,44,55,60,1,5,6,3,2,4,13,12,14,98],'Date': ['2021-06-13', '2021-06-13','2021-06-14','2021-06-13','2021-06-14','2021-06-15','2021-06-15','2021-06-13','2021-06-16','2021-06-13','2021-06-13','2021-06-13','2021-06-16','2021-05-23','2021-05-13','2021-03-26','2021-05-13']}
dff = pd.DataFrame(data=d)
dff

    id1     ID      col1    Goal        col2    col3    Date
0   85643   G-00001 671     NaN         793     500 2021-06-13
1   85644   G-00001 2       56.0000     4       22  2021-06-13
2   85643   G-00002 5       78.0000     8       89  2021-06-14
3   8564312 G-00002 3       NaN         32      33  2021-06-13
4   8564314 G-00002 4       89.0000     43      44  2021-06-14
5   85645   G-00001 5       73.0000     55      55  2021-06-15
6   8564316 G-00001 60      NaN         610     60  2021-06-15
7   85646   G-00001 0       NaN         0       1   2021-06-13
8   8564318 G-00001 0       NaN         0       5   2021-06-16
9   85647   G-00001 6       NaN         16      6   2021-06-13
10  85648   G-00002 3       NaN         23      3   2021-06-13
11  85649   G-00002 2       34.0000     72      2   2021-06-13
12  85655   G-00002 4       NaN         48      4   2021-06-16
13  56731   G-00002 32      7.0000      3       13  2021-05-23
14  34566   G-00003 3       84.0000     28      12  2021-05-13
15  78931   G-00003 1       NaN         5       14  2021-03-26
16  78931   G-00003 23      5.0000      3       98  2021-05-13

Also, I have the below dictionary

dic = {'G-00001':{'aasd':['G-00001','85646','85648','345_2','85655','85659','85647'],
'vaasd':['G-00001','85649','34554','85655','22183','45335','8564316']},
'G-00002':{'aasd2':['G-00002','85343','85649','85655','78931','45121','56731']},
'G-00003':{'gsd3':['G-00003','34566','8564312','45121','78931']}}

I want to get a unique count of id1, based on the list inside the dictionary based on theire ID column. For example, if we consider ID – ‘G-0002’ has only one list aasd2. ‘aasd2’:[‘G-0002′,’85343′,’85649′,’85655′,’78931′,’45121′,’56731’]. I want to get how many unique id1’s are in the ‘G-0002′ ID column in pandas’ data frame. So it should be for aasd2- 3 values (‘85649′,’85655′,’56731’). Those are the only three id1 values in pandas for aasd2 in ‘G-0002’ ID.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

So I want to create a table just like below for the list name and count of id1s

listName    count of id1s
aasd            2
vaasd           1
aasd2           3
gsd3            2

Is it possible to do this in python? Any suggestion would be appreciated. Thanks in advance!

>Solution :

From my previous answer, I slightly modified the code:

data = []
for g, d in dic.items():
    for k, l in d.items():
        data.extend([(g, v, k) for v in l])
df1 = pd.DataFrame(data, columns=['ID', 'id1', 'id2'])

out = dff.merge(df1, on=['ID', 'id1']) \
         .drop_duplicates(['ID', 'id1']) \
         .value_counts('id2')
print(out)

# Output:
id2
aasd2    3
aasd     2
gsd3     2
vaasd    1
dtype: int64
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading