I have a large number of matrices for which I need to perform a QR factorization and store the resulting Q matrices (normalized such that the R matrix has positive number in its diagonal). Is there another method than using the qr() function?
Here is the working example:
system.time({
# Parameters for the matrix to be generated
row_number <- 1e6/4
col_number <- 4
# Generate large matrix of random numbers normally distributed.
# Basically it's a matrix containing 4x4 matrices for which I will perform the QR factorization:
RND <- matrix(data = rnorm(n=1e6 , mean = 0,sd = 1),nrow = row_number, ncol = col_number)
# Allocate a 0 matrix where the result will be entered
QSTACK <- matrix(0, nrow = row_number, ncol = col_number)
number_of_blocks <- row_number/4 # The number of 4x4 matrices in RND => 62,500
for (k in c(1:number_of_blocks)) {
l1 <- 1 + col_number * (k-1)
l2 <- col_number * k
QR <- qr(RND[l1:l2,]) # Perform QR factorization
R <- qr.R(QR)
QSTACK[l1:l2,] <- qr.Q(QR) %*% diag(sign(diag(R))) # Normalize such that R diagonal elements are positive
}
})
# user system elapsed
# 3.04 0.03 3.07
So that took 3.07 seconds to compute 62,500 QR factorization. I’m wondering if there is something faster?
>Solution :
If you want:
-
the R factor to have positive diagonal elements
-
to explicitly form the Q factor (rather than its sequential Householder vectors format)
you can cheat by using Cholesky factorization:
cheatQR <- function (X) {
XtX <- crossprod(X)
R <- chol(XtX)
Q <- t(forwardsolve(R, t(X), upper.tri = TRUE, transpose = TRUE))
list(Q = Q, R = R)
}
The raw QR:
rawQR <- function (X) {
QR <- qr(X)
Q <- qr.Q(QR)
R <- qr.R(QR)
sgn <- sign(diag(R))
R <- sgn * R
Q <- Q * rep(sgn, each = nrow(Q))
list(Q = Q, R = R)
}
Benchmark:
X <- matrix(rnorm(10000 * 4), nrow = 10000, ncol = 4)
ans1 <- rawQR(X)
ans2 <- cheatQR(X)
all.equal(ans1, ans2)
#[1] TRUE
library(microbenchmark)
microbenchmark(rawQR(X), cheatQR(X))
#Unit: microseconds
# expr min lq mean median uq max neval
# rawQR(X) 3083.537 3109.222 3796.191 3123.2230 4782.583 13895.81 100
# cheatQR(X) 828.025 837.491 1421.211 865.9085 1434.657 32577.01 100
For further speedup, link your R to an optimized BLAS library, like OpenBLAS. In high performance computing (HPC) mode, Cholesky is way much faster than QR.