I am using pmax and pmin to extract the max and min values from each row. Some of my values are not statistically significant, and these are surrounded by exclamation marks (for example, !5!). For some reason, pmax and pmin still take these values into consideration, so I cannot calculate the difference between the values that are significant. Below is an example:
| ID | Var1 | Var2 | Var3 | Var4 |
|---|---|---|---|---|
| A | 1 | !5! | NA | 10 |
| B | 20 | NA | NA | 3 |
| C | !20! | 10 | NA | NA |
| D | NA | NA | 30 | NA |
| E | !10! | NA | NA | NA |
I want the !xx! values to be excluded when I do the following:
DF1 = data.frame(ID=c("A","B","C","D","E"),
Var1=c("1","20","!20!","NA","!10!"),
Var2=c("!5!","NA","10","NA","NA"),
Var3=c("NA","NA","NA","30","NA"),
Var4=c("10","NA","NA","NA","NA"),
Var5=c("NA","!50!","20","NA","NA"))
DF1$max <- pmax(DF1$Var1,DF1$Var2,DF1$Var3,DF1$Var4,na.rm = TRUE)
DF1$min <- pmin(DF1$Var1,DF1$Var2,DF1$Var3,DF1$Var4,na.rm = TRUE)
This leads to me getting the following:

When the following is what I want:

How do I prevent the !xx! values from being taken up by pmax and pmin? I appreciate any help!
>Solution :
Assuming your "NA" entries are meant to be real NA values (not the string literal "NA"), convert them first:
DF1[-1] <- lapply(DF1[-1], function(z) replace(z, z=="NA", NA))
we can do this:
do.call(pmax, c(lapply(DF1[-1], function(z) replace(z, grepl("!", z), NA)), list(na.rm = TRUE)))
# [1] "10" "20" "20" "30" NA
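The call above relies on `na.rm = TRUE` to skip the entries that were masked to NA. As a minimal sketch of that behaviour, using made-up toy vectors `a` and `b` (hypothetical values, not the question's data):

```r
# Toy vectors (hypothetical values, not the question's data)
a <- c(1, NA, 7, NA)
b <- c(3,  4, NA, NA)

# Default behaviour: any NA in a position makes that result NA
keep_na <- pmax(a, b)

# With na.rm = TRUE the NAs are skipped; the result is NA only
# where every input is NA in that position
drop_na <- pmax(a, b, na.rm = TRUE)

print(keep_na)
print(drop_na)
```

This is why a row in which every value is flagged or missing still ends up as NA.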
results stored with:
DF1$max <- do.call(pmax, c(lapply(DF1[-1], function(z) replace(z, grepl("!", z), NA)), list(na.rm = TRUE)))
DF1$min <- do.call(pmin, c(lapply(DF1[-1], function(z) replace(z, grepl("!", z), NA)), list(na.rm = TRUE)))
DF1
#   ID Var1 Var2 Var3 Var4 Var5  max  min
# 1  A    1  !5! <NA>   10 <NA>   10    1
# 2  B   20 <NA> <NA> <NA> !50!   20   20
# 3  C !20!   10 <NA> <NA>   20   20   10
# 4  D <NA> <NA>   30 <NA> <NA>   30   30
# 5  E !10! <NA> <NA> <NA> <NA> <NA> <NA>
Note that we also need to add na.rm=TRUE: the masked !xx! entries become NA, and without it any row containing an NA would return NA.
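One caveat: the columns here are character vectors, so pmax and pmin compare them as strings (for instance, "9" would rank above "10"), and the results cannot be subtracted directly. If the goal is the difference between the significant values, a possible variation is to convert to numeric after masking. The sketch below assumes the same data as in the question; the names `DF2`, `to_significant`, `vals`, and the `diff` column are made up for illustration.

```r
# Copy of the question's data, with all values stored as strings
DF2 <- data.frame(ID   = c("A", "B", "C", "D", "E"),
                  Var1 = c("1", "20", "!20!", "NA", "!10!"),
                  Var2 = c("!5!", "NA", "10", "NA", "NA"),
                  Var3 = c("NA", "NA", "NA", "30", "NA"),
                  Var4 = c("10", "NA", "NA", "NA", "NA"),
                  Var5 = c("NA", "!50!", "20", "NA", "NA"),
                  stringsAsFactors = FALSE)

# Hypothetical helper: flagged "!xx!" entries and "NA" strings become NA,
# everything else is converted to numeric
to_significant <- function(z) {
  flagged <- grepl("!", z, fixed = TRUE) | z %in% "NA"
  z[flagged] <- NA
  as.numeric(z)
}

# Mask and convert every column except ID
vals <- lapply(DF2[-1], to_significant)

# Row-wise max and min over the significant values only
DF2$max  <- do.call(pmax, c(vals, list(na.rm = TRUE)))
DF2$min  <- do.call(pmin, c(vals, list(na.rm = TRUE)))
DF2$diff <- DF2$max - DF2$min

print(DF2)
```

With numeric columns, the comparison is by value rather than by character order, and rows with no significant value still come out as NA.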