Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Get joint-frequency of two binned variables from pandas dataframe?

Let’s say I have a dataframe with two timeseries variables as follows:

Temperature Humidity
17.6 88
22.1 81
13.6 88
26.4 71
25.6 72

Here is the pd.DataFrame for it:

data = {
    'Temperature': [17.6, 22.1, 13.6, 26.4, 25.6],
    'Humidity': [88, 81, 88, 71, 72]
}

df = pd.DataFrame(data)

I would like to obtain a table, which has these two variables, binned as the axes, and the values being the frequency (nominal or relative) of the combined condition.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

For example, the end output would be formatted with binned temperatures as columns and binned humidity as the index:

0-4 4-8 8-12 12-16 16-20 20-24 24-28 28-32
0-10
10-20
20-30
30-40
40-50
50-60
60-70
70-80
80-90
90-100

I know how to bin and get the counts for one binning, but I’m really not sure how to count the combined occurrences of two binned conditions. Can anyone please point me in the right direction?

df['T_binned'] = pd.cut(df['Temperature'],bins=np.arange(0,32,4))
df['H_binned'] = pd.cut(df['Humidity'],bins=np.arange(0,100,10))
counts_T = df.groupby(by='T_binned').count()
counts_H = df.groupby(by='H_binned').count()

>Solution :

Use a crosstab:

pd.crosstab(pd.cut(df['Temperature'],bins=np.arange(0,32,4)),
            pd.cut(df['Humidity'],bins=np.arange(0,100,10)))

Or for all categories:

pd.crosstab(pd.cut(df['Temperature'],bins=np.arange(0,32,4)),
            pd.cut(df['Humidity'],bins=np.arange(0,100,10)),
            dropna=False
           )

Output:

Humidity     (0, 10]  (10, 20]  (20, 30]  (30, 40]  (40, 50]  (50, 60]  (60, 70]  (70, 80]  (80, 90]
Temperature                                                                                         
(0, 4]             0         0         0         0         0         0         0         0         0
(4, 8]             0         0         0         0         0         0         0         0         0
(8, 12]            0         0         0         0         0         0         0         0         0
(12, 16]           0         0         0         0         0         0         0         0         1
(16, 20]           0         0         0         0         0         0         0         0         1
(20, 24]           0         0         0         0         0         0         0         0         1
(24, 28]           0         0         0         0         0         0         0         2         0
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading