Replace all values in column after occurrence of specific value

This is probably simple, but I’m missing it. In the example I have several ids with multiple values each. Within each id, I want to set x to 0 only after the occurrence of a 2, leaving the other values untouched. Is there a dplyr way to do this? test…

May 16, 2024 MR
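
A minimal sketch of one dplyr approach to the question above. The `test` data here is hypothetical, because the original object is cut off in the excerpt, and the sketch assumes "after" means rows strictly after the first 2 within each id, with the 2 itself left untouched.

```r
library(dplyr)

# Hypothetical data: the original `test` object is truncated in the excerpt.
test <- tibble(
  id = c(1, 1, 1, 1, 2, 2, 2),
  x  = c(1, 2, 3, 4, 5, 1, 2)
)

result <- test %>%
  group_by(id) %>%
  # cumsum(x == 2) counts the 2s seen so far in the group; lagging it by one
  # row makes the flag positive only on rows strictly after the first 2.
  mutate(x = if_else(lag(cumsum(x == 2), default = 0L) > 0, 0, x)) %>%
  ungroup()
```

If the 2 itself should also become 0, dropping the `lag()` call would be the obvious variation.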

Using glue on single rows in grouped dataframe in R

I have this very simple dataframe in R: df = data.frame( class=c("a", "a", "b") ) Now I want to check if a group has a size larger than one and based on that information create a new column called e.g. class_2 like this: df %>% group_by(class) %>% mutate(class_2 = if_else(n() > 1, glue("{class}_{row_number()}"), class))…

May 14, 2024 MR
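
A sketch of one way to build such a column. It swaps the vectorised `if_else()` from the excerpt for a plain `if`/`else`, on the assumption that the test `n() > 1` is a single value per group, and it converts the glue result to an ordinary character vector so both branches have the same type. The name `result` is illustrative.

```r
library(dplyr)
library(glue)

df <- data.frame(class = c("a", "a", "b"))

result <- df %>%
  group_by(class) %>%
  mutate(
    # n() > 1 is a single TRUE/FALSE per group, so a scalar if/else is enough.
    class_2 = if (n() > 1) {
      as.character(glue("{class}_{row_number()}"))
    } else {
      class
    }
  ) %>%
  ungroup()
```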

How to subset a variable in R inside summarize

For some reason this very basic subset is not working. Can anyone replicate the error that I find? I am simply trying to summarize a variable that meets a condition. library(tidyverse) dat |> summarise( invoice_usd=sum(invoice_usd,na.rm=TRUE) ,invoice_usd_overdue = sum(invoice_usd[overdue_indicator==0],na.rm = TRUE) ) I get the following results: Invoice_usd <dbl> invoice_usd_overdue <dbl> 3525924 3525924 Both are equal…

May 10, 2024 MR
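
One plausible explanation for the behaviour described above: `summarise()` evaluates its arguments in order, and a later expression sees any summary created earlier under the same name. Once `invoice_usd` has been replaced by its total, the second expression no longer refers to the original column. The sketch below uses hypothetical data (the original `dat` is not shown) and simply computes the conditional sum first.

```r
library(dplyr)

# Hypothetical data: the original `dat` is not shown in the excerpt.
dat <- tibble(
  invoice_usd       = c(100, 200, 300, NA),
  overdue_indicator = c(0, 1, 0, 1)
)

result <- dat |>
  summarise(
    # Computed first, while invoice_usd still refers to the full column.
    invoice_usd_overdue = sum(invoice_usd[overdue_indicator == 0], na.rm = TRUE),
    invoice_usd         = sum(invoice_usd, na.rm = TRUE)
  )
```

Giving the total a different name (for example a hypothetical `invoice_usd_total`) would avoid the name clash without depending on argument order.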

Using `case_when` and `mutate` to search multiple columns for conditional

I am trying to create a new column in my data frame (NEW) using the case_when functionality in dplyr. I am able to get the code below to run, but I am wondering if there is a way to create this new column based on the four columns that start with COL_ as opposed…

April 30, 2024 MR

Fill NAs when you have zeros in your dataset

Suppose you have the following dataframe: df <- data.frame(year=c(rep(2010,12),rep(2011,12),rep(2012,12)), country=c(rep("DEU",4),rep("ITA",4),rep("USA",4), rep("DEU",4),rep("ITA",4),rep("USA",4), rep("DEU",4),rep("ITA",4),rep("USA",4)), industry=c(rep(1:4,9)), stock1=c(rep(0,24),0,0,2,4,1,0,1,2,3,3,3,5), stock2=c(rep(0,24),0,3,3,4,5,0,1,1,2,2,2,5)) and you want to get the following outcome: df2 <- data.frame(year=c(rep(2010,12),rep(2011,12),rep(2012,12)), country=c(rep("DEU",4),rep("ITA",4),rep("USA",4), rep("DEU",4),rep("ITA",4),rep("USA",4), rep("DEU",4),rep("ITA",4),rep("USA",4)), industry=c(rep(1:4,9)), stock1=c(rep(NA,24),0,0,2,4,1,0,1,2,3,3,3,5), stock2=c(rep(NA,24),0,3,3,4,5,0,1,1,2,2,2,5)) The concept is that if, for a particular year, a specific country reports zeros in stock2 across ALL industries, then those zeros…

April 27, 2024 MR
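
A hedged sketch of one way to get from `df` to the shown `df2`. The rule is partly an assumption, since the sentence describing it is cut off: within each year and country, a stock column that is zero for every industry is treated as missing, and the rule is applied to `stock1` and `stock2` separately, as the expected output suggests.

```r
library(dplyr)

df <- data.frame(
  year     = c(rep(2010, 12), rep(2011, 12), rep(2012, 12)),
  country  = rep(rep(c("DEU", "ITA", "USA"), each = 4), 3),
  industry = rep(1:4, 9),
  stock1   = c(rep(0, 24), 0, 0, 2, 4, 1, 0, 1, 2, 3, 3, 3, 5),
  stock2   = c(rep(0, 24), 0, 3, 3, 4, 5, 0, 1, 1, 2, 2, 2, 5)
)

result <- df %>%
  group_by(year, country) %>%
  # Within one year/country block, an all-zero column becomes NA;
  # a block with at least one non-zero value keeps its zeros.
  mutate(across(c(stock1, stock2), ~ if (all(.x == 0)) NA_real_ else .x)) %>%
  ungroup()
```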

ggplot how to plot ribbons in R from dataframe

I have a dataframe which has avg_fc, avg_la, sd_fc and sd_la values in the column Category. df = data.frame (Date=c("2020-01-10","2020-01-10","2020-01-10","2020-01-10" ,"2020-01-11","2020-01-11","2020-01-11","2020-01-11", "2020-01-12","2020-01-12","2020-01-12","2020-01-12" ,"2020-01-13","2020-01-13","2020-01-13","2020-01-13"), Category=c("avg_fc","avg_la","sd_fc","sd_la","avg_fc","avg_la","sd_fc","sd_la","avg_fc", "avg_la","sd_fc","sd_la","avg_fc","avg_la","sd_fc","sd_la"), Value=c(25.5,40.5, 8.1,4.3, 29.5 ,31.5,5.6,9.1, 20.5,43.5, 4.1,8.3, 35.5 ,38.5,2.6,3.1)) From the given dataframe I would like to generate a ggplot where Category avg_fc and avg_la are the geom_line and Category sd_fc and sd_la become the…

April 8, 2024 MR

Selection of days in which one hour meets the condition

I have data with this structure (real data is more complicated): df1 <- read.table(text = "DT odczyt.1 odczyt.2 '2023-08-14 00:00:00' 362 1.5 '2023-08-14 23:00:00' 633 4.3 '2023-08-15 05:00:00' 224 1.6 '2023-08-15 23:00:00' 445 5.6 '2023-08-16 00:00:00' 182 1.5 '2023-08-16 23:00:00' 493 4.3 '2023-08-17 05:00:00' 434 1.6 '2023-08-17 23:00:00' 485 5.6 '2023-08-18 00:00:00' 686 1.5…

March 19, 2024 MR

Read multiple values as character vectors from an external file

I am currently reading data from a .csv that looks like this: param_file <- tribble( ~variable, ~value, "year", "2023", "version", "Saint_XL, Sinner_XY", "metric", "ATE, OFCE" ) I then need to load the variable and its value as character vectors so I am doing this, with dplyr: year <- param_file %>% filter(variable == "year") %>%…

March 17, 2024 MR
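
A small sketch of one alternative to filtering once per variable: split every `value` on commas in a single step and keep the pieces in a named list. The name `params` is illustrative, and the sketch assumes values are separated by a comma with optional whitespace.

```r
library(tibble)

param_file <- tribble(
  ~variable, ~value,
  "year",    "2023",
  "version", "Saint_XL, Sinner_XY",
  "metric",  "ATE, OFCE"
)

# Split each value on a comma followed by optional whitespace, then name
# each resulting character vector after its variable.
params <- setNames(strsplit(param_file$value, ",\\s*"), param_file$variable)
```

Each entry is then available as a character vector, for example `params$version` or `params[["metric"]]`.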

Creating a column by finding its corresponding value in a colname

My problem is pretty similar to the following post. In my database I have multiple columns containing values like M0, M6, M12… and so on. And I have columns having those names M0, M6, M12… I would like to replace the first columns containing the M0… with the value corresponding in the column As…

March 15, 2024 MR

Extract values from different columns based on ID

My data set contains ID and many columns that have the ID in their name. data = data.frame(ID = rep(1:3,2), col1 = 1:6, col2 = 7:12, col3 = 13:18) print(data) I’m trying to make a variable based on its ID from different columns. data$new = c(1, 8, 15, 4, 11, 18) I tried it with…

February 16, 2024 MR
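
A sketch of one way to build the `new` column described above, using base R matrix indexing rather than dplyr. It assumes each row should take its value from the column named `col` followed by that row's ID; the helper names `target_cols` and `vals` are illustrative.

```r
data <- data.frame(ID = rep(1:3, 2), col1 = 1:6, col2 = 7:12, col3 = 13:18)

# Name of the column each row should read from, e.g. ID 2 -> "col2".
target_cols <- paste0("col", data$ID)

# Indexing a matrix with a two-column (row, column) matrix picks
# exactly one cell per row.
vals <- as.matrix(data[c("col1", "col2", "col3")])
data$new <- vals[cbind(seq_len(nrow(data)), match(target_cols, colnames(vals)))]
```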

Dev solutions

Solutions for development problems

Tag: dplyr