Create new columns in pandas df by grouping and performing operations on an existing column

I have a DataFrame that looks like this (minimal reproducible example):

import pandas as pd

thermometers = ['T-10000_0001', 'T-10000_0002', 'T-10000_0003', 'T-10000_0004',
                'T-10001_0001', 'T-10001_0002', 'T-10001_0003', 'T-10001_0004',
                'T-10002_0001', 'T-10002_0002', 'T-10002_0003', 'T-10002_0004']

temperatures = [15.1, 14.9, 12.7, 10.8,
               19.8, 18.3, 17.7, 18.1,
               20.0, 16.4, 17.6, 19.3]

df_set = {'thermometers': thermometers,
         'Temperatures': temperatures}

df = pd.DataFrame(df_set)

Index  Thermometer   Temperature
0      T-10000_0001  15.1
1      T-10000_0002  14.9
2      T-10000_0003  12.7
3      T-10000_0004  10.8
4      T-10001_0001  19.8
5      T-10001_0002  18.3
6      T-10001_0003  17.7
7      T-10001_0004  18.1
8      T-10002_0001  20.0
9      T-10002_0002  16.4
10     T-10002_0003  17.6
11     T-10002_0004  19.3

I am trying to group the thermometers by prefix (i.e. 'T-10000', 'T-10001', 'T-10002') and create new columns with the min, max and average of the readings in each group. My final DataFrame would look like this:

Index  Thermometer  min_temp  average_temp  max_temp
0      T-10000      10.8      13.4          15.1
1      T-10001      17.7      18.5          19.8
2      T-10002      16.4      18.3          20.0

I tried creating a separate function, which I think requires a regular expression, but I'm unable to figure out how to go about it. Any help would be much appreciated.

Solution:

Group by the part of each thermometer ID that precedes the '_' delimiter, then aggregate the temperatures with whichever functions you need.
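
To make the grouping key concrete, here is a minimal sketch on a small sample of the question's IDs. The variable names (ids, key_split, key_regex) are illustrative only. Since the question mentions regular expressions, an equivalent str.extract form is shown alongside the split; either should yield the same prefix.

```python
import pandas as pd

# Small illustrative sample taken from the question's thermometer IDs.
ids = pd.Series(['T-10000_0001', 'T-10000_0002', 'T-10001_0001'],
                name='thermometers')

# Split on the delimiter and keep the first piece:
# 'T-10000_0001' -> ['T-10000', '0001'] -> 'T-10000'
key_split = ids.str.split('_').str.get(0)

# Regex alternative: capture everything before the first underscore.
# expand=False returns a Series rather than a one-column DataFrame.
key_regex = ids.str.extract(r'^([^_]+)_', expand=False)

print(key_split.tolist())  # ['T-10000', 'T-10000', 'T-10001']
print(key_regex.tolist() == key_split.tolist())  # True
```

Either Series can be passed straight to groupby as the grouping key.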

>>> df.groupby(df['thermometers']
...            .str.split('_')
...            .str.get(0))['Temperatures'].agg(['min', 'mean', 'max'])

                      min    mean   max
thermometers                           
T-10000              10.8  13.375  15.1
T-10001              17.7  18.475  19.8
T-10002              16.4  18.325  20.0
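
If the goal is the exact layout from the question (a Thermometer column plus min_temp, average_temp and max_temp), one possible sketch uses pandas named aggregation. The names group_key and summary are illustrative, and rounding to one decimal is an assumption based on the question's desired output.

```python
import pandas as pd

thermometers = ['T-10000_0001', 'T-10000_0002', 'T-10000_0003', 'T-10000_0004',
                'T-10001_0001', 'T-10001_0002', 'T-10001_0003', 'T-10001_0004',
                'T-10002_0001', 'T-10002_0002', 'T-10002_0003', 'T-10002_0004']
temperatures = [15.1, 14.9, 12.7, 10.8,
                19.8, 18.3, 17.7, 18.1,
                20.0, 16.4, 17.6, 19.3]

df = pd.DataFrame({'thermometers': thermometers,
                   'Temperatures': temperatures})

# Prefix before the underscore, renamed so it becomes the output column name.
group_key = df['thermometers'].str.split('_').str.get(0).rename('Thermometer')

summary = (
    df.groupby(group_key)['Temperatures']
      # Named aggregation: keyword = output column, value = aggregation function.
      .agg(min_temp='min', average_temp='mean', max_temp='max')
      .round(1)        # assumption: one decimal, as in the question's desired table
      .reset_index()   # turn the group key back into an ordinary column
)

print(summary)
```

Dropping the .round(1) call keeps the full-precision means shown in the output above.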