Get joint-frequency of two binned variables from pandas dataframe?

Advertisements

Let’s say I have a dataframe with two timeseries variables as follows:

Temperature	Humidity
17.6	88
22.1	81
13.6	88
26.4	71
25.6	72

Here is the pd.DataFrame for it:

data = {
    'Temperature': [17.6, 22.1, 13.6, 26.4, 25.6],
    'Humidity': [88, 81, 88, 71, 72]
}

df = pd.DataFrame(data)

I would like to obtain a table, which has these two variables, binned as the axes, and the values being the frequency (nominal or relative) of the combined condition.

For example, the end output would be formatted with binned temperatures as columns and binned humidity as the index:

	0-4	4-8	8-12	12-16	16-20	20-24	24-28	28-32
0-10
10-20
20-30
30-40
40-50
50-60
60-70
70-80
80-90
90-100

I know how to bin and get the counts for one binning, but I’m really not sure how to count the combined occurrences of two binned conditions. Can anyone please point me in the right direction?

df['T_binned'] = pd.cut(df['Temperature'],bins=np.arange(0,32,4))
df['H_binned'] = pd.cut(df['Humidity'],bins=np.arange(0,100,10))
counts_T = df.groupby(by='T_binned').count()
counts_H = df.groupby(by='H_binned').count()

>Solution :

Use a crosstab:

pd.crosstab(pd.cut(df['Temperature'],bins=np.arange(0,32,4)),
            pd.cut(df['Humidity'],bins=np.arange(0,100,10)))

Or for all categories:

pd.crosstab(pd.cut(df['Temperature'],bins=np.arange(0,32,4)),
            pd.cut(df['Humidity'],bins=np.arange(0,100,10)),
            dropna=False
           )

Output:

Humidity     (0, 10]  (10, 20]  (20, 30]  (30, 40]  (40, 50]  (50, 60]  (60, 70]  (70, 80]  (80, 90]
Temperature                                                                                         
(0, 4]             0         0         0         0         0         0         0         0         0
(4, 8]             0         0         0         0         0         0         0         0         0
(8, 12]            0         0         0         0         0         0         0         0         0
(12, 16]           0         0         0         0         0         0         0         0         1
(16, 20]           0         0         0         0         0         0         0         0         1
(20, 24]           0         0         0         0         0         0         0         0         1
(24, 28]           0         0         0         0         0         0         0         2         0

Dev solutions

Solutions for development problems

Get joint-frequency of two binned variables from pandas dataframe?

>Solution :

Leave a ReplyCancel reply

>Solution :

Share this:

Leave a ReplyCancel reply

Discover more from Dev solutions