Suppose I have a dataframe like the following: X Y Z 1 b 3 2 a 8 3 a 7 4 c 1 5 b 6 6 a 4 7 a 9 8 b 5 9 a 4 I want to create columns A and B, which are dummy variables for if the value of… Read More Create dummy variable for below or above median within group r
I have two tables: the first is a single column of Subject Ids and the second is a single column of Visit Names. library(tibble) subjects <- tibble("Subject Id" = c("1-1", "1-2", "1-3")) # A tibble: 3 x 1 `Subject Id` <chr> 1 1-1 2 1-2 3 1-3 visits <- tibble("Visits" = c("a", "b", "c"))… Read More How to apply a visit names to subject IDs using data from two tables
I often find myself creating a list from several variables, then naming the list's elements after those variables so I can access them with $ notation or use bind_rows with the .id argument: a <- 1:10 b <- 11:20 mylist <- list(a, b) names(mylist) <- c("a", "b") mylist$a… Read More Create a named list with list object's variable names as the list names
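One possible approach to the named-list question above, as a minimal sketch in base R: mget() returns a list named after the variables it looks up, and a small helper can capture argument names with substitute(). The helper name named_list is hypothetical and does not come from the question.

```r
a <- 1:10
b <- 11:20

# mget() looks variables up by name and returns a list named after them.
mylist <- mget(c("a", "b"), envir = environment())

# Hypothetical helper: names each element after the expression passed in.
named_list <- function(...) {
  values <- list(...)
  arg_exprs <- as.list(substitute(list(...)))[-1]
  names(values) <- vapply(arg_exprs, deparse, character(1))
  values
}

mylist2 <- named_list(a, b)
mylist2$a
```

Both results can be used with $ access; the helper assumes each argument is a plain variable name, since longer expressions would produce unwieldy element names.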
This might seem whimsical to you, because a similar problem is easily solved in dplyr. But I still want to know how to do it. To illustrate, imagine I am looking at employee data and the goal is to find how many records there are for a given employee-date pair. # Mockup employee data df… Read More How do I find the indices given value in base R by object?
This is probably answered elsewhere but I can't figure out the phrasing to look for. I have data like this: df<-structure(list(PROTOCOL_ID = c(124, 124, 38, 762, 74, 146), PROGRAM_AREA = c("LOCR", "CRC", "LOCR", "Pedi", "LOCR", "LOCR")), row.names = c(NA, 6L ), class = "data.frame") As you can see, two rows have the same "protocol_id", they’re… Read More Put duplicates in same row, R
I have a wide df with multiple measurements. I would like to change it from wide to long. How should I do this? I know how to do it for 2 columns, but not for multiple. Could someone guide me on this? Input on the top, and ideal output on the bottom: df<-structure(list(Subject = c("Tom", "Tom", "Tom", "Tom", "Tom", "Tom", "Tom",… Read More how to put multiple cols into long format base on suffix of variable names
Suppose I have a data frame like the one below, with two grouping variables "Group" and "Gender" and two additional variables with counts: Group <- c("Group1","Group1","Group2","Group2") Gender <- c("Male","Female","Male","Female") Y <- c(7,5,6,10) N <- c(45,8,2,11) data <- cbind.data.frame(Group,Gender,Y,N) > data Group Gender Y N 1 Group1 Male 7 45 2 Group1 Female 5 8 3… Read More Add total rows to data frame by group using two grouping variables in R
I have a vector of numbers (e.g. c(1, 11, 1232, 4221, 2)), and I need a corresponding vector of the digit sums of each element (c(1, 2, 8, 9, 2) in the previous example). I found some nice solutions for single numbers. The nicest (from Digit sum function in R) is: digitsum <- function(x)… Read More vectorised function to calculate sum of digits
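A vectorised digit sum can be sketched with integer arithmetic alone, peeling off the last digit of every element at once. This is not the truncated digitsum function quoted in the question; the name digit_sum is hypothetical, and the sketch assumes whole numbers with no NA values.

```r
# Hypothetical helper: digit sum of each element of a numeric vector.
# Assumes whole numbers and no NA values.
digit_sum <- function(x) {
  x <- abs(x)
  total <- numeric(length(x))
  while (any(x > 0)) {
    total <- total + x %% 10  # add the last digit of every element
    x <- x %/% 10             # drop the last digit of every element
  }
  total
}

digit_sum(c(1, 11, 1232, 4221, 2))  # 1 2 8 9 2
```

The loop runs once per digit of the largest element, and each pass operates on the whole vector. It also avoids converting to strings, where as.character() can switch to scientific notation for values such as 1e5.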
I often find myself wanting to update a data frame based on a separate data frame that has new values for a subset of columns and rows. For example: library(dplyr) df_original <- data.frame( id=c(1,2,3), name=c("John", "Rose", "Kanaya"), address=c("100 Street st.", "413 Old St.", "200 Drive Dr.") ) df_newinfo <- data.frame(id=c(2), address=c("612 New St.")) I want… Read More R pattern for updating a df column based on another df, when present
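One way the update above could be done in base R, as a minimal sketch. It assumes the goal is to overwrite address for ids present in df_newinfo and leave every other row untouched. The names idx, found and df_updated are illustrative, not from the question.

```r
df_original <- data.frame(
  id = c(1, 2, 3),
  name = c("John", "Rose", "Kanaya"),
  address = c("100 Street st.", "413 Old St.", "200 Drive Dr."),
  stringsAsFactors = FALSE
)
df_newinfo <- data.frame(id = c(2), address = c("612 New St."),
                         stringsAsFactors = FALSE)

# Position of each new-info id in the original (NA when the id is absent).
idx <- match(df_newinfo$id, df_original$id)
found <- !is.na(idx)

# Copy the original, then overwrite only the matched rows.
df_updated <- df_original
df_updated$address[idx[found]] <- df_newinfo$address[found]
df_updated
```

Since the question already loads dplyr, it may be worth noting that dplyr 1.0.0 and later also provide rows_update() for key-based updates of this kind.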
Apologies if the question isn’t formulated correctly in the title; I am fairly new to this and still not used to the exact terminology. I need to add a column to a data frame that contains the result of operations (e.g., mean, sum, etc.) grouped by values in other columns. At the same time, I… Read More Is there an R function to do grouped operations on a data frame without collapsing it?
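For grouped operations that keep every row, base R's ave() is one candidate: it applies a function within each group and returns a vector as long as the input. The sketch below uses made-up data, since the question's own data frame is not shown; the column names group and value are assumptions.

```r
# Hypothetical data: the column names are illustrative, not from the question.
df <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  value = c(1, 3, 2, 4, 6),
  stringsAsFactors = FALSE
)

# ave() repeats each group's result for every row in that group,
# so the data frame keeps its original number of rows.
df$group_mean <- ave(df$value, df$group, FUN = mean)
df$group_sum <- ave(df$value, df$group, FUN = sum)
df
```

More grouping columns can be passed as further arguments before FUN, for example ave(x, g1, g2, FUN = mean).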