Why is `colSums(A, ,2)` valid syntax? – looks like an empty argument

I’m reading some code that’s highly optimized for speed on arrays, and it’s using colSums in place of apply in several cases.

First: Can someone explain why this syntax is valid, please? To my eye it looks as if an argument is left empty. RStudio also flags these lines as missing arguments. I even resorted to an AI chatbot, which incorrectly predicted the results and output dimensions when using colSums this way.

Second: Does anyone have a mnemonic or thinking device to help translate mentally between these two equivalent calls? colSums does not seem an intuitive way to handle arrays higher than two dimensions. I understand it’s an optimized method of summing an array along some dimension, it’s just hard to mentally parse.

Reprex:

A <- array(1:(2*3*4), dim=c(2,3,4))
A
colSums(A, ,2) 
# equivalent apply statement 
apply(A, 3, sum)

>Solution :

Because the arguments of an R function are evaluated lazily, a missing argument is not a problem unless the function actually tries to use it. For example, this runs fine:

foo <- function(a, b, c) {
  a + c
}

foo(1, ,5)
# [1] 6
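To see the other side of that rule, here is a minimal sketch of what happens when the empty argument is actually used. The names foo_checks and foo_uses_b are hypothetical helpers made up for this illustration; missing() and tryCatch() are base R.

```r
# Hypothetical helper: reports whether b was supplied, without evaluating it.
foo_checks <- function(a, b, c) {
  missing(b)
}
b_is_missing <- foo_checks(1, , 5)
b_is_missing
# [1] TRUE

# Hypothetical helper: actually uses b, so the empty slot becomes an error.
foo_uses_b <- function(a, b, c) {
  a + b + c
}
# Capture the error message instead of stopping the script.
msg <- tryCatch(foo_uses_b(1, , 5), error = function(e) conditionMessage(e))
msg
```

In an English locale the captured message should say that argument "b" is missing, with no default.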

In colSums(A, ,2) the empty slot is the second positional argument, na.rm. Nothing is supplied for it, so it falls back to its default (FALSE). If you look at the source of colSums you’ll see it passes na.rm on to .Internal, so the evaluation rules are slightly different there, but the idea is basically the same: an empty argument that has a default simply uses the default.
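As a quick check of which argument the empty slot stands for, the positional call can be compared with named versions of the same call. This sketch assumes the documented signature colSums(x, na.rm = FALSE, dims = 1); the variable names are only for illustration.

```r
A <- array(1:(2*3*4), dim = c(2, 3, 4))

# The call from the question: second positional argument left empty.
positional <- colSums(A, , 2)

# The same call with the third argument named instead of skipped.
named <- colSums(A, dims = 2)

# The same call with the skipped argument written out at its default.
explicit <- colSums(A, na.rm = FALSE, dims = 2)

identical(positional, named)
# [1] TRUE
identical(positional, explicit)
# [1] TRUE
```

Writing colSums(A, dims = 2) is probably the easiest form to read, and it avoids the missing-argument warning in the editor.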

I guess your second question is about the dims= parameter. The help page says:

dims integer: Which dimensions are regarded as ‘rows’ or ‘columns’ to sum over. For row*, the sum or mean is over dimensions dims+1, …; for col* it is over dimensions 1:dims.

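Since you are using colSums, dims = 2 means summing over dimensions 1:2 and keeping the rest. That is the complement of how you specify it with apply, where MARGIN names the dimensions to keep, which is why apply(A, 3, sum) gives the same values.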
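One possible way to hold this in your head: for an array with n dimensions, colSums(A, dims = k) collapses the first k dimensions and keeps (k+1):n, while rowSums(A, dims = k) keeps the first k and collapses the rest. The sketch below checks that reading against apply on the array from the question. It uses all.equal rather than identical because colSums returns doubles while apply with sum on integers returns integers.

```r
A <- array(1:(2*3*4), dim = c(2, 3, 4))
n <- length(dim(A))

# colSums: sum over 1:k, keep (k+1):n. The apply MARGIN is the kept part.
col_k2 <- colSums(A, dims = 2)
col_k2
# [1]  21  57  93 129
ok_col2 <- isTRUE(all.equal(col_k2, apply(A, 3, sum)))
ok_col1 <- isTRUE(all.equal(colSums(A, dims = 1), apply(A, 2:n, sum)))

# rowSums: keep 1:k, sum over (k+1):n. The apply MARGIN is 1:k.
ok_row1 <- isTRUE(all.equal(rowSums(A, dims = 1), apply(A, 1, sum)))
ok_row2 <- isTRUE(all.equal(rowSums(A, dims = 2), apply(A, 1:2, sum)))

c(ok_col2, ok_col1, ok_row1, ok_row2)
# [1] TRUE TRUE TRUE TRUE
```

Both functions only ever split the dimensions into a leading block and a trailing block, so a sum that keeps, say, only dimension 2 of a 3-d array still needs apply or aperm first.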
