Can you help me understand how R interprets square brackets with forms such as y[i:j - k]?
dummy data:
y <- c(1, 2, 3, 5, 7, 8)
Here’s what I do understand:
y[i] is the ith element of vector y.
y[i:j] is the ith to jth elements (inclusive) of vector y.
y[-i] is vector y without the ith element.
And so on.
However, what I don’t understand is what happens when you start mixing these options, and I haven’t found a good resource for explaining it.
For example:
y[1-1:4]
[1] 5 7 8
So y[1-1:4] returns the vector without the first three elements. But why?
and
y[1-4]
[1] 1 2 5 7 8
So y[1-4] returns the vector without the third element. Is that because 1-4 = -3 and it's interpreting it the same as y[-3]? If so, that doesn't seem consistent with my previous example, where y[1-1:4] would presumably be interpreted as y[0:4], but that isn't the case.
and
y[1:1+2-1]
[1] 2
Why does this return the second element? I encountered this while I was trying to code something along the lines of y[i:i + j - k], and it took me a while to figure out that I should write y[i:(i + j - k)] so the parentheses capture the whole of the right-hand side of the colon. But I still can't figure out what logic R was following when I didn't have those parentheses.
Thanks!
>Solution :
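It's best to look more closely at operator precedence and at the integer sequences you use for subsetting. The index expression is evaluated in full before [] does any subsetting. The colon operator binds more tightly than binary minus, so 1-1:4 is parsed as 1-(1:4), not (1-1):4. In other words, - is a function with two arguments (1 and 1:4), which are evaluated beforehand, and so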
> 1-1:4
[1] 0 -1 -2 -3
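If you want to check the parse yourself, one option is to quote the expression and look at its pieces. This is only a small sketch: the variable names (expr, top_level, right_arg, result) are made up for illustration, and it uses nothing beyond base R's quote(), deparse() and eval().

```r
# Capture the expression without evaluating it.
expr <- quote(1 - 1:4)

# The outermost call is the binary minus ...
top_level <- as.character(expr[[1]])

# ... and its second argument is the whole sequence 1:4.
right_arg <- deparse(expr[[3]])

# Evaluating the expression gives the index vector that [] receives.
result <- eval(expr)

print(top_level)  # "-"
print(right_arg)  # "1:4"
print(result)     # 0 -1 -2 -3
```

Seeing - at the top of the call confirms that the colon was grouped first.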
Negative indices in [] exclude the corresponding elements. There is no element 0: subsetting with 0 alone returns an empty vector of the same type (numeric(0) here), and a 0 mixed with negative indices is simply ignored. We thus expect y[1-1:4] to drop the first three elements of y and return the remainder.
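As a quick check of that reasoning, here is a sketch using the dummy data from the question. The names idx, a, b and only_zero are hypothetical and exist only for this illustration.

```r
y <- c(1, 2, 3, 5, 7, 8)

# The index vector that y[1-1:4] actually receives.
idx <- 1 - 1:4            # 0 -1 -2 -3

a <- y[idx]               # the 0 is ignored; elements 1 to 3 are excluded
b <- y[-(1:3)]            # the same exclusion written explicitly
only_zero <- y[0]         # a zero index on its own selects nothing

print(a)                  # 5 7 8
print(identical(a, b))    # TRUE
print(length(only_zero))  # 0
```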
As you correctly write, y[1-4] is y[-3], i.e. y without its third element.
Similarly, in 1:1+2-1 the colon is evaluated first: 1:1 is the one-element vector 1, and the rest is simple arithmetic, 1+2-1 = 2, so y[1:1+2-1] is y[2].
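To make the difference concrete, here is a sketch of the y[i:i + j - k] case from the question, with and without parentheses. The values i = 1, j = 2, k = 1 are assumptions chosen to reproduce y[1:1+2-1], and the variable names are illustrative.

```r
y <- c(1, 2, 3, 5, 7, 8)

# Hypothetical values that reproduce the 1:1+2-1 example.
i <- 1
j <- 2
k <- 1

# Without parentheses the colon groups first: (i:i) + j - k.
without_parens <- i:i + j - k

# With parentheses the arithmetic happens first: i:(i + j - k).
with_parens <- i:(i + j - k)

print(without_parens)     # 2
print(with_parens)        # 1 2
print(y[without_parens])  # 2
print(y[with_parens])     # 1 2
```

If the parentheses feel easy to forget, seq(i, i + j - k) is another way to write the intended range, since function-call arguments are always evaluated as complete expressions.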
For more on operator precedence, see Hadley’s excellent book.