So I have some complicated analysis to do. I have this data in a df:
Date/Time TimeStamp CallerId CallType watts Band slot Channel
0 20:02.0 3113677432 17794800 C1 0.060303 12 1 2
1 20:02.0 3113677432 5520488 OP8 0.302229 12 1 1
2 20:02.0 3113677432 5520488 OP8 0.302229 13 1 1
3 20:02.0 3113677432 5520488 OP8 0.302229 12 2 1
4 20:02.0 3113677432 5520488 OP8 0.302229 13 2 1
5 20:02.0 3113677432 5520488 OP8 0.302229 12 3 1
6 20:02.0 3113677432 5520488 OP8 0.302229 13 3 1
7 20:02.0 3113677432 5520488 OP8 0.302229 12 4 1
8 20:02.0 3113677432 5520488 OP8 0.302229 13 4 1
9 20:07.0 3113677488 17794800 C1 0.151473 12 1 2
10 20:07.0 3113677488 5218651 CC8kds 0.475604 13 4 1
11 20:07.0 3113677488 5514318 BD 1.906933 12 1 6
12 20:11.0 3113677532 17794800 C1 0.038048 12 1 2
13 20:11.0 3113677532 5218651 CC8kds 0.300086 13 4 1
14 20:11.0 3113677532 5501460 PTN3 4.790000 12 1 5
15 21:51.0 3113678643 9895585 CC8kds 0.075378 12 1 1
16 21:51.0 3113678643 5482185 OP8 0.302229 13 1 1
17 21:51.0 3113678643 5482185 OP8 0.302229 13 2 1
18 21:51.0 3113678643 5482185 OP8 0.302229 13 3 1
19 21:51.0 3113678643 5482185 OP8 0.302229 13 4 1
20 21:51.0 3113678643 5513470 PTN3 4.790000 12 3 1
21 21:51.0 3113678643 5518399 PTN3 4.790000 12 3 5
The TimeStamp repeats over many of the items since each row captures a Band (10-20) and a Channel in that band (1-7). The slot further divides the channel into (1-4) time slots. I use a pivot_table to group the watts (power) in each (Band, Channel, slot) as follows:
df = df.pivot_table(index='TimeStamp', columns=['Band','Channel','slot'], values='watts',
aggfunc=sum, fill_value=0)
What I’d like to do is create a new series, freq = 1200+(Band-1)+Channel/12
and then sum the power in all 4 slots … for example:
Band 12, Channel 1 @ TimeStamp 3113677432 results in a total pwr of
0.302229+0.302229+0.302229+0.302229 = 1.208916
for a freq value of 1200+(12-1)+1/12=1211.083
… so I’d have something like this:
TimeStamp Freq Pwr ...
3113677432 1211.083 1.208916 ... # This is the total for Band 12, Channel 1 @ this timestamp
3113677432 1211.640 2.208916 ... # this isn't a real total
I need this for each freq computed at each TimeStamp. It would be nice to have those Freq as their own (multilevel) column headers (I’m making up numbers here):
Freq 1211.083 1211.64 1212.04...
TimeStamp
3113677432 1.208916 2.208916 2.208916...
3113677488 2.058406 0.475604 2.208916...
I can group on one col header like this:
df.groupby(level=0, axis=1).sum() # Groups by Band
or
df.groupby(level=1, axis=1).sum() # Groups by Channel - but across ALL Bands - WRONG
So is there an straightforward way to group by both Band and Channel and sum over all 4 slots?
>Solution :
There’s no need to pivot the table, just do a groupby on multiple columns and you’re good to go!
# loading data using solution from https://stackoverflow.com/a/53692642/8718701
from io import StringIO
import pandas as pd
d = '''
Date/Time TimeStamp CallerId CallType watts Band slot Channel
0 20:02.0 3113677432 17794800 C1 0.060303 12 1 2
1 20:02.0 3113677432 5520488 OP8 0.302229 12 1 1
2 20:02.0 3113677432 5520488 OP8 0.302229 13 1 1
3 20:02.0 3113677432 5520488 OP8 0.302229 12 2 1
4 20:02.0 3113677432 5520488 OP8 0.302229 13 2 1
5 20:02.0 3113677432 5520488 OP8 0.302229 12 3 1
6 20:02.0 3113677432 5520488 OP8 0.302229 13 3 1
7 20:02.0 3113677432 5520488 OP8 0.302229 12 4 1
8 20:02.0 3113677432 5520488 OP8 0.302229 13 4 1
9 20:07.0 3113677488 17794800 C1 0.151473 12 1 2
10 20:07.0 3113677488 5218651 CC8kds 0.475604 13 4 1
11 20:07.0 3113677488 5514318 BD 1.906933 12 1 6
12 20:11.0 3113677532 17794800 C1 0.038048 12 1 2
13 20:11.0 3113677532 5218651 CC8kds 0.300086 13 4 1
14 20:11.0 3113677532 5501460 PTN3 4.790000 12 1 5
15 21:51.0 3113678643 9895585 CC8kds 0.075378 12 1 1
16 21:51.0 3113678643 5482185 OP8 0.302229 13 1 1
17 21:51.0 3113678643 5482185 OP8 0.302229 13 2 1
18 21:51.0 3113678643 5482185 OP8 0.302229 13 3 1
19 21:51.0 3113678643 5482185 OP8 0.302229 13 4 1
20 21:51.0 3113678643 5513470 PTN3 4.790000 12 3 1
21 21:51.0 3113678643 5518399 PTN3 4.790000 12 3 5
'''
df = pd.read_csv(StringIO(d), sep='\s+')
# calculating total pwr -> sum(watts)
agg_df = (
df
.groupby(['TimeStamp', 'Band', 'Channel'], as_index=False)
.agg({'watts': 'sum'})
)
# freq = 1200+(Band-1)+Channel/12
agg_df['freq'] = (
1200
+ agg_df['Band']-1
+ agg_df['Channel']/12
)
print(agg_df.to_markdown(index=False, floatfmt=''))
Result:
| TimeStamp | Band | Channel | watts | freq |
|-------------:|-------:|----------:|---------:|-------------------:|
| 3113677432.0 | 12.0 | 1.0 | 1.208916 | 1211.0833333333333 |
| 3113677432.0 | 12.0 | 2.0 | 0.060303 | 1211.1666666666667 |
| 3113677432.0 | 13.0 | 1.0 | 1.208916 | 1212.0833333333333 |
| 3113677488.0 | 12.0 | 2.0 | 0.151473 | 1211.1666666666667 |
| 3113677488.0 | 12.0 | 6.0 | 1.906933 | 1211.5 |
| 3113677488.0 | 13.0 | 1.0 | 0.475604 | 1212.0833333333333 |
| 3113677532.0 | 12.0 | 2.0 | 0.038048 | 1211.1666666666667 |
| 3113677532.0 | 12.0 | 5.0 | 4.79 | 1211.4166666666667 |
| 3113677532.0 | 13.0 | 1.0 | 0.300086 | 1212.0833333333333 |
| 3113678643.0 | 12.0 | 1.0 | 4.865378 | 1211.0833333333333 |
| 3113678643.0 | 12.0 | 5.0 | 4.79 | 1211.4166666666667 |
| 3113678643.0 | 13.0 | 1.0 | 1.208916 | 1212.0833333333333 |