Inconsistent behavior of the sprintf function in R

December 7, 2023

While working, I ran into seemingly inconsistent behavior of the sprintf function.
Consider the following code:

vec = c(0.01, seq(0.1, to = 0.95, by  = 0.05), 0.99)
RET <- lapply(vec, FUN = function(x){
  tryCatch({
    test <- sprintf("%02d", 100*x)
    print(paste0(x, " - correct"))
  },
  error = function(cond) {
    print(paste0(x, " - error"))
  })
})

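The result of the above lapply is that 0.15, 0.3, 0.45 and 0.55 are reported as "error", while all other values are reported as "correct".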

Where error means:
Error in sprintf("%02d", 100 * x) : invalid format '%02d'; use format %f, %e, %g or %a for numeric objects

Next, let’s run sprintf for the failing values outside the lapply loop:

sprintf("%02d", 100*0.15)
sprintf("%02d", 100*0.30)
sprintf("%02d", 100*0.45)
sprintf("%02d", 100*0.55)

Result:

I know that the problem can be solved by converting with as.integer(100*x), but my question is not about getting working code (I already have it). I want to understand where the inconsistency comes from: why do we sometimes get an error and sometimes not? I like to understand what I am doing and what the result is, but I cannot rationally explain this behavior, and that limits my trust in R.

Solution:

This is a variant of the "Why are these numbers not equal?" FAQ.

The documentation in help("sprintf") says regarding integer format strings:

Numeric variables with exactly integer values will be coerced to
integer.
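
The rule quoted above can be tried out directly. The following is a minimal sketch; the helper name `fmt_ok` is hypothetical and only wraps the call in tryCatch so that each case reports whether sprintf accepted the value.

```r
# Hypothetical helper: TRUE if sprintf("%02d", x) succeeds, FALSE if it errors.
fmt_ok <- function(x) {
  tryCatch({
    sprintf("%02d", x)
    TRUE
  }, error = function(cond) FALSE)
}

fmt_ok(15L)         # integer type: accepted
fmt_ok(15)          # double with an exactly integer value: coerced, accepted
fmt_ok(15.5)        # double with a fractional part: rejected
fmt_ok(100 * 0.15)  # looks like 15, but is not exactly 15: rejected
```

The last case is the one from the question: the product prints as 15, yet it is stored as a double slightly above 15, so the coercion rule does not apply.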

Let’s check which of the products are exactly integer-valued:

vec = c(0.01, seq(0.1, to = 0.95, by  = 0.05), 0.99)
sprintf("%.20f", 100 * vec)
# [1] "1.00000000000000000000"  "10.00000000000000000000" "15.00000000000000177636" "20.00000000000000000000" "25.00000000000000000000"
# [6] "30.00000000000000355271" "35.00000000000000000000" "40.00000000000000000000" "45.00000000000000710543" "50.00000000000000000000"
#[11] "55.00000000000000710543" "60.00000000000000000000" "65.00000000000000000000" "70.00000000000000000000" "75.00000000000000000000"
#[16] "80.00000000000000000000" "85.00000000000000000000" "90.00000000000000000000" "95.00000000000000000000" "99.00000000000000000000"

identical(100 * vec[1], 1)
#[1] TRUE
identical(100 * vec[3], 15)
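#[1] FALSE

So 100 * 0.15 is not exactly 15 in floating-point arithmetic. It is therefore not coerced to integer, and the %d format raises the error.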
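
To round this off, here is a minimal sketch of how one might locate the affected elements and format them robustly. It assumes that rounding to the nearest whole number is what is intended; the variable names `scaled`, `not_whole` and `labels` are illustrative and not taken from the original code.

```r
vec <- c(0.01, seq(0.1, to = 0.95, by = 0.05), 0.99)
scaled <- 100 * vec

# Positions whose product is not exactly integer-valued; these are the
# elements for which sprintf("%02d", ...) raises an error.
not_whole <- which(scaled != round(scaled))
print(not_whole)

# A comparison with a tolerance, such as all.equal(), treats the third
# product as equal to 15 even though identical() does not.
print(isTRUE(all.equal(scaled[3], 15)))

# Rounding first makes every value exactly integer-valued, so "%02d" is safe.
# as.integer() on its own truncates toward zero, so a product that lands
# slightly below a whole number would lose one; round() guards against that.
labels <- sprintf("%02d", as.integer(round(scaled)))
print(labels)
```

With the vector from the question, `not_whole` should point at the same four positions (3, 6, 9 and 11) that show non-zero trailing digits in the "%.20f" output above.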