Group variable and sum under condition in R

I would like to sum the values of "dollvalue" for each id, but only for rows where ispurchase == 1, and I could not find an efficient solution. The solutions I tried from other posts all seemed too complex and ended up not working. I tried the plyr approach, combining group_by and aggregate, but I get the error "argument FUN is missing".

library(plyr)
returntrip <- roundtrips %>%
  group_by(id) %>%
  aggregate(purchcost = sum(dollvalue[ispurchase==1], 
                            FUN = sum)) %>%
  ungroup

I also tried to simply aggregate it, and I think it almost works, but I get the following error:
Error in aggregate.data.frame(as.data.frame(x), …) :
arguments must have same length

I assume this is because the list and the data frame do not have the same length. Is there any way to fix this?

returntrip <- aggregate(x = roundtrips$dollvalue[roundtrips$ispurchase==1],
      by = list(roundtrips$id),
      FUN = sum)

This is what the head of the data frame looks like:

                   ethamount                               dollvalue     id ispurchase             dollarcum
 1:  0.0000877963125548729991613761125535   -0.0010491659350307322180057001403952    883          1  0.000000000000000000
 2:  0.0010000000000000000208166817117217   -0.0107400000000000012817524819297432  36927          1  0.000000000000000000
 3: 75.4154000000000053205440053716301918 -804.6823180000000093059497885406017303   2637          1  0.000000000000000000
 4:  0.1066286798619889564232465772875003   -1.0662867986198896197436170041328296  72274          1  0.000000000000000000
 5:  0.0100000000000000002081668171172169   -0.1000000000000000055511151231257827  94359          1  0.010899999999999993
 6:  0.1000000000000000055511151231257827   -0.9460000000000001740829702612245455   3083          1  0.000000000000000000
 7:  1.0000000000000000000000000000000000   -9.3499999999999996447286321199499071 102645          1  0.000000000000000000
 8:  0.0000000000000000010000000000000001   -0.0000000000000000098900000000000005 117464          1  0.000000000000000000
 9:  0.0100000000000000002081668171172169   -0.1108999999999999985789145284797996  91239          1 -0.010899999999999993
10: 12.0000000000000000000000000000000000 -144.9600000000000079580786405131220818  52894          1  0.000000000000000000
11: 14.7899999999999991473487170878797770 -207.0600000000000022737367544323205948  80993          1  0.000000000000000000
12: 55.2299999999999968736119626555591822 -689.2703999999999950887286104261875153  74580          1  0.000000000000000000
13:  0.1000000000000000055511151231257827   -1.2480000000000002202682480856310576 116147          1  0.000000000000000000
14:  1.9995590000000000863167315401369706  -37.4517400699999996049882611259818077  36943          1  0.000000000000000000
15:  0.3914821535012809605724726225162158   -5.5786206873932533412130396754946560  86862          1  0.000000000000000000
16:  0.4893235858000000160217268785345368   -6.3122742568200003177025791956111789  88279          1  0.000000000000000000
17:  0.0001392130443151549901940194908789   -0.0016510667055777380248654528926977  72433          1  0.000000000000000000
18:  0.1000000000000000055511151231257827   -1.0160000000000000142108547152020037  68487          1  0.000000000000000000
19:  0.7211898100000000422227230956195854   -8.3946493884000012997148587601259351  28354          1  0.000000000000000000
20:  0.6650000000000000355271367880050093   -8.0265500000000002955857780762016773  80397          1  0.000000000000000000

Many thanks for any type of hint or solution.

Solution:

Try the following code, which groups by id and subsets dollvalue with the condition inside sum() (df stands for your roundtrips data frame):

library(dplyr)
df %>%
  group_by(id) %>%
  summarise(
    purchcost = sum(dollvalue[ispurchase == 1]), .groups = "drop")

Output:

# A tibble: 20 × 2
       id purchcost
    <int>     <dbl>
 1    883 -1.05e- 3
 2   2637 -8.05e+ 2
 3   3083 -9.46e- 1
 4  28354 -8.39e+ 0
 5  36927 -1.07e- 2
 6  36943 -3.75e+ 1
 7  52894 -1.45e+ 2
 8  68487 -1.02e+ 0
 9  72274 -1.07e+ 0
10  72433 -1.65e- 3
11  74580 -6.89e+ 2
12  80397 -8.03e+ 0
13  80993 -2.07e+ 2
14  86862 -5.58e+ 0
15  88279 -6.31e+ 0
16  91239 -1.11e- 1
17  94359 -1   e- 1
18 102645 -9.35e+ 0
19 116147 -1.25e+ 0
20 117464 -9.89e-18
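
For the base R attempt, the "arguments must have same length" error most likely arises because x is subset by ispurchase == 1 while by still uses every row of roundtrips$id. A minimal sketch of one way around it, assuming the rows are filtered first so both arguments come from the same subset (the toy data and the name purchases are made up for illustration, not taken from the question):

```r
# Toy data standing in for the asker's `roundtrips` (values are invented)
roundtrips <- data.frame(
  id         = c(1, 1, 2, 2, 3),
  dollvalue  = c(-10, 5, -2, -3, 7),
  ispurchase = c(1, 0, 1, 1, 0)
)

# Filter the rows first, so `x` and `by` have the same length
purchases <- roundtrips[roundtrips$ispurchase == 1, ]

# Sum dollvalue per id over the purchase rows only
returntrip <- aggregate(
  x   = list(purchcost = purchases$dollvalue),
  by  = list(id = purchases$id),
  FUN = sum
)
print(returntrip)
```

Note one difference in behaviour: ids without any purchase rows (id 3 in the toy data) are absent from this result, whereas the dplyr version above keeps every id and reports 0 for such groups, since the sum of an empty vector is 0.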