dataframe – how to perform actions on grouped dataframe

I have the following dataframe morphology:

  month site  depth num.core num.plant num.leaf
  <chr> <chr> <dbl>    <dbl>     <dbl>    <dbl>
1 Oct   SB       12        1         1        5
2 Oct   SB       12        1         2       29
3 Oct   SB       12        1         3        7
4 Oct   SB       12        2         1        9
5 Oct   SB       12        2         2        4
6 Oct   SB       12        2         3       13

My aim is to count the number of plants (num.plant) per core (num.core), for a given date (month) and depth.

I have grouped the dataframe and counted the number of plants per core as I need:

morpho.group <- morphology %>%
  group_by(month, site, num.core, depth) %>%
  count(month,site,num.core,depth, name = "plant.count.Xcore") 
  month site   num.core depth plant.count.Xcore
  <chr> <chr>     <dbl> <dbl>             <int>
1 Dec   D           1     3                 4
2 Dec   D           2     3                 2
3 Dec   D           3     3                 3
4 Dec   D           4     3                 3
5 Dec   N           1    12                 1
6 Dec   N           2    12                 5

My issue is that I need to perform more actions on the morphology dataframe, for example summing the number of leaves per core:

count.morpho <- morphology %>%
  group_by(month, site, num.core, depth) %>%
  summarise_at(vars("num.leaf", "num.roots"), sum)
  month site   num.core depth num.leaf num.roots
  <chr> <chr>     <dbl> <dbl>    <dbl>     <dbl>
1 Dec   D           1     3       11        13
2 Dec   D           2     3       17         8
3 Dec   D           3     3       14         4
4 Dec   D           4     3       40        10
5 Dec   N           1    12        3         2
6 Dec   N           2    12       40        10

I need to chain these operations so that the results accumulate in a single dataframe, instead of pulling each calculated column into a new dataframe.

Any help is much appreciated 🙂

>Solution :

count is really just a convenience function that reports n() for each group. You can call n() directly inside summarize and add other metrics alongside it.

(FYI, your data doesn’t include num.roots, so I replaced it with num.plant here just for demonstration.)

morphology %>%
  group_by(month, site, num.core, depth) %>%
  summarize(
    plant.count.Xcore = n(), 
    across(c(num.leaf, num.plant), sum)
  ) %>%
  ungroup()
# # A tibble: 2 x 7
#   month site  num.core depth plant.count.Xcore num.leaf num.plant
#   <chr> <chr>    <int> <int>             <int>    <int>     <int>
# 1 Oct   SB           1    12                 3       41         6
# 2 Oct   SB           2    12                 3       26         6

FYI, summarize_at is "superseded" by across. Notice how the call changes: use summarize as usual, and call across by itself, without assigning it to a name. The first argument to across is the set of columns to select, using the same methods as select, including c(col1, col2), starts_with("num"), and negation of those options. The second argument is one or more functions, supplied in various ways, similar to summarize_at's function argument(s). See the colwise vignette for more details.
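As a rough illustration of the "one or more functions" point, here is a minimal sketch that passes a named list of functions to across. It assumes dplyr 1.0.0 or later, rebuilds the six rows shown in the question by hand, and uses morpho.summary as a made-up name for the result. The `.groups = "drop"` argument is used here in place of the trailing ungroup().

```r
library(dplyr)

# The six rows shown in the question, rebuilt by hand for this sketch.
morphology <- tibble(
  month     = rep("Oct", 6),
  site      = rep("SB", 6),
  depth     = rep(12, 6),
  num.core  = c(1, 1, 1, 2, 2, 2),
  num.plant = c(1, 2, 3, 1, 2, 3),
  num.leaf  = c(5, 29, 7, 9, 4, 13)
)

# morpho.summary is a hypothetical name, not from the original post.
morpho.summary <- morphology %>%
  group_by(month, site, num.core, depth) %>%
  summarize(
    plant.count.Xcore = n(),                     # rows per group, as count() would give
    across(c(num.leaf, num.plant),
           list(sum = sum, mean = mean)),        # one output column per column/function pair
    .groups = "drop"                             # return an ungrouped tibble
  )

print(morpho.summary)
```

With a named list, across names each output column as the input column followed by the function name (num.leaf_sum, num.leaf_mean, and so on), so every metric ends up in the same dataframe.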
