Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

pandas- create new series from multilevel columns and aggregate

So I have some complicated analysis to do. I have this data in a df:

    Date/Time   TimeStamp   CallerId    CallType    watts   Band    slot    Channel
0   20:02.0 3113677432  17794800    C1  0.060303    12  1   2
1   20:02.0 3113677432  5520488 OP8 0.302229    12  1   1
2   20:02.0 3113677432  5520488 OP8 0.302229    13  1   1
3   20:02.0 3113677432  5520488 OP8 0.302229    12  2   1
4   20:02.0 3113677432  5520488 OP8 0.302229    13  2   1
5   20:02.0 3113677432  5520488 OP8 0.302229    12  3   1
6   20:02.0 3113677432  5520488 OP8 0.302229    13  3   1
7   20:02.0 3113677432  5520488 OP8 0.302229    12  4   1
8   20:02.0 3113677432  5520488 OP8 0.302229    13  4   1
9   20:07.0 3113677488  17794800    C1  0.151473    12  1   2
10  20:07.0 3113677488  5218651 CC8kds  0.475604    13  4   1
11  20:07.0 3113677488  5514318 BD  1.906933    12  1   6
12  20:11.0 3113677532  17794800    C1  0.038048    12  1   2
13  20:11.0 3113677532  5218651 CC8kds  0.300086    13  4   1
14  20:11.0 3113677532  5501460 PTN3    4.790000    12  1   5
15  21:51.0 3113678643  9895585 CC8kds  0.075378    12  1   1
16  21:51.0 3113678643  5482185 OP8 0.302229    13  1   1
17  21:51.0 3113678643  5482185 OP8 0.302229    13  2   1
18  21:51.0 3113678643  5482185 OP8 0.302229    13  3   1
19  21:51.0 3113678643  5482185 OP8 0.302229    13  4   1
20  21:51.0 3113678643  5513470 PTN3    4.790000    12  3   1
21  21:51.0 3113678643  5518399 PTN3    4.790000    12  3   5

The TimeStamp repeats over many of the items since each row captures a Band (10-20) and a Channel in that band (1-7). The slot further divides the channel into (1-4) time slots. I use a pivot_table to group the watts (power) in each (Band, Channel, slot) as follows:

df = df.pivot_table(index='TimeStamp', columns=['Band','Channel','slot'], values='watts', 
                                       aggfunc=sum, fill_value=0) 

enter image description here
What I’d like to do is create a new series, freq = 1200+(Band-1)+Channel/12 and then sum the power in all 4 slots … for example:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Band 12, Channel 1 @ TimeStamp 3113677432 results in a total pwr of
0.302229+0.302229+0.302229+0.302229 = 1.208916 for a freq value of 1200+(12-1)+1/12=1211.083 … so I’d have something like this:

TimeStamp   Freq    Pwr ...
3113677432  1211.083    1.208916 ... # This is the total for Band 12, Channel 1 @ this timestamp
3113677432  1211.640    2.208916 ... # this isn't a real total

I need this for each freq computed at each TimeStamp. It would be nice to have those Freq as their own (multilevel) column headers (I’m making up numbers here):

Freq        1211.083    1211.64     1212.04...
TimeStamp 
3113677432  1.208916    2.208916    2.208916...
3113677488  2.058406    0.475604    2.208916...

I can group on one col header like this:

df.groupby(level=0, axis=1).sum() # Groups by Band

or

df.groupby(level=1, axis=1).sum() # Groups by Channel - but across ALL Bands - WRONG

So is there an straightforward way to group by both Band and Channel and sum over all 4 slots?

>Solution :

There’s no need to pivot the table, just do a groupby on multiple columns and you’re good to go!

# loading data using solution from https://stackoverflow.com/a/53692642/8718701
from io import StringIO
import pandas as pd


d = '''
    Date/Time   TimeStamp   CallerId    CallType    watts   Band    slot    Channel
0   20:02.0 3113677432  17794800    C1  0.060303    12  1   2
1   20:02.0 3113677432  5520488 OP8 0.302229    12  1   1
2   20:02.0 3113677432  5520488 OP8 0.302229    13  1   1
3   20:02.0 3113677432  5520488 OP8 0.302229    12  2   1
4   20:02.0 3113677432  5520488 OP8 0.302229    13  2   1
5   20:02.0 3113677432  5520488 OP8 0.302229    12  3   1
6   20:02.0 3113677432  5520488 OP8 0.302229    13  3   1
7   20:02.0 3113677432  5520488 OP8 0.302229    12  4   1
8   20:02.0 3113677432  5520488 OP8 0.302229    13  4   1
9   20:07.0 3113677488  17794800    C1  0.151473    12  1   2
10  20:07.0 3113677488  5218651 CC8kds  0.475604    13  4   1
11  20:07.0 3113677488  5514318 BD  1.906933    12  1   6
12  20:11.0 3113677532  17794800    C1  0.038048    12  1   2
13  20:11.0 3113677532  5218651 CC8kds  0.300086    13  4   1
14  20:11.0 3113677532  5501460 PTN3    4.790000    12  1   5
15  21:51.0 3113678643  9895585 CC8kds  0.075378    12  1   1
16  21:51.0 3113678643  5482185 OP8 0.302229    13  1   1
17  21:51.0 3113678643  5482185 OP8 0.302229    13  2   1
18  21:51.0 3113678643  5482185 OP8 0.302229    13  3   1
19  21:51.0 3113678643  5482185 OP8 0.302229    13  4   1
20  21:51.0 3113678643  5513470 PTN3    4.790000    12  3   1
21  21:51.0 3113678643  5518399 PTN3    4.790000    12  3   5
'''

df = pd.read_csv(StringIO(d), sep='\s+')

# calculating total pwr -> sum(watts)


agg_df = (
    df
    .groupby(['TimeStamp', 'Band', 'Channel'], as_index=False)
    .agg({'watts': 'sum'})
)

# freq = 1200+(Band-1)+Channel/12
agg_df['freq'] = (
    1200
    + agg_df['Band']-1
    + agg_df['Channel']/12
)

print(agg_df.to_markdown(index=False, floatfmt=''))

Result:

|    TimeStamp |   Band |   Channel |    watts |               freq |
|-------------:|-------:|----------:|---------:|-------------------:|
| 3113677432.0 |   12.0 |       1.0 | 1.208916 | 1211.0833333333333 |
| 3113677432.0 |   12.0 |       2.0 | 0.060303 | 1211.1666666666667 |
| 3113677432.0 |   13.0 |       1.0 | 1.208916 | 1212.0833333333333 |
| 3113677488.0 |   12.0 |       2.0 | 0.151473 | 1211.1666666666667 |
| 3113677488.0 |   12.0 |       6.0 | 1.906933 | 1211.5             |
| 3113677488.0 |   13.0 |       1.0 | 0.475604 | 1212.0833333333333 |
| 3113677532.0 |   12.0 |       2.0 | 0.038048 | 1211.1666666666667 |
| 3113677532.0 |   12.0 |       5.0 | 4.79     | 1211.4166666666667 |
| 3113677532.0 |   13.0 |       1.0 | 0.300086 | 1212.0833333333333 |
| 3113678643.0 |   12.0 |       1.0 | 4.865378 | 1211.0833333333333 |
| 3113678643.0 |   12.0 |       5.0 | 4.79     | 1211.4166666666667 |
| 3113678643.0 |   13.0 |       1.0 | 1.208916 | 1212.0833333333333 |
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading