Conditional dplyr::summarize() of a data.frame in R

In my DATA below, I wonder how to summarize() the number of 6 different Ethnicities (Hispanic, AmIndian, Asian, White, Pacific, AsiaPacific) chosen ("Y") when Ethinc_overall!="B"? library(tidyverse) DATA <- read.table(h=TRUE,text= "EL_Type Language Black Hispanic AmIndian Asian White Pacific AsiaPacific Ethinc_overall Current English Black Y N N N N N H Current English Black N N…

Summarize with arithmetic operations on rows by column entry

library(tidyverse) set.seed(1) start <- mdy("01/01/2022") end <- start + as.difftime(4, units = "days") days <- seq(from = start, to = end, by = 1) days <- sample(days, 100, replace = T) flip <- sample(c("Heads", "Tails"), 100, replace = TRUE) numbers <- rchisq(100, 30) df <- tibble(days, numbers, flip) I have this dataframe and would…

Mean and sd per group for multiple variables when NAs present

I would like to create a table of mean and sd for multiple variables for grouped data. However, the data has NAs, so I need to include the na.rm = TRUE argument. Using iris as a MWE, altered to include NAs: irisalt = iris irisalt[1,1] =NA irisalt[52,2] =NA irisalt[103,3]= NA First attempt: irisalt%>% group_by(Species)%>% summarise(count…
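One possible way to approach the question above, offered as a minimal sketch rather than the original poster's solution: it assumes dplyr 1.0 or later, where across() can apply several functions to several columns at once. The output name `result` and the `{.col}_{.fn}` naming pattern are choices made here for illustration and do not come from the excerpt.

```r
library(dplyr)

# Rebuild the altered iris data described in the excerpt
irisalt <- iris
irisalt[1, 1]   <- NA
irisalt[52, 2]  <- NA
irisalt[103, 3] <- NA

# Mean and sd of every numeric column per Species, skipping NAs.
# across() comes before n() so that the new count column is not
# itself picked up by where(is.numeric).
result <- irisalt %>%
  group_by(Species) %>%
  summarise(
    across(
      where(is.numeric),
      list(mean = ~ mean(.x, na.rm = TRUE),
           sd   = ~ sd(.x, na.rm = TRUE)),
      .names = "{.col}_{.fn}"
    ),
    count = n()
  )

print(result)
```

Passing na.rm = TRUE inside each lambda keeps the handling of missing values explicit per function. Note that count here is the number of rows per group, including rows that have an NA in some column.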

Collapsing multiple observations based on specific parameters in R

I am quite new to R. I have a dataset with 8081 observations for 113 variables. The data was collected in 4 waves (panels), with some individuals being interviewed multiple times. They were sometimes asked the same questions, but some questions were only asked during one wave. Most answers were on a scale (e.g.…

Group and add variable of type stock and another type in a single step?

I want to group by district, summing the ‘incoming’ values across quarters and taking the value of ‘stock’ in the last quarter (3), all in a single step. ‘stock’ cannot be summed across quarters. My example dataframe: library(dplyr) df <- data.frame ("district"= rep(c("ARA", "BJI", "CMC"), each=3), "quarter"=rep(1:3,3), "incoming"= c(4044, 2992, 2556, 1639, 9547, 1191,2038,1942,225), "stock"=…
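A hedged sketch of one way this could be done in a single summarise() call. The `stock` values below are hypothetical, because the excerpt is cut off before they appear; the output names `incoming_total` and `stock_last` are likewise made up for illustration.

```r
library(dplyr)

df <- data.frame(
  district = rep(c("ARA", "BJI", "CMC"), each = 3),
  quarter  = rep(1:3, 3),
  incoming = c(4044, 2992, 2556, 1639, 9547, 1191, 2038, 1942, 225),
  # Hypothetical stock values: the original data is truncated in the excerpt
  stock    = c(100, 110, 120, 200, 210, 220, 300, 310, 320)
)

result <- df %>%
  group_by(district) %>%
  summarise(
    # Flow variable: add up over all quarters
    incoming_total = sum(incoming),
    # Stock variable: take the value from the latest quarter in each group
    stock_last = stock[which.max(quarter)]
  )

print(result)
```

Indexing with which.max(quarter) avoids hard-coding the last quarter as 3, so the same call would still work if a district had fewer quarters recorded.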