Unable to subset a matrix of data.frames using lapply and sapply

I have a list whose Data element is a matrix in which every cell holds a data.frame. For example:

> ls <- list(Dates = seq.Date(as.Date('2023-01-01'), by = 'day', length.out = 3), Pars = seq(0.5, 2.0, 0.5))
> df <- data.frame(X = runif(10,0,1), Y = runif(10,0,1))
> ls$Data <- sapply(ls$Dates, function(d) lapply(ls$Pars, function(p) df)) # same df in every cell, this is just an example
> as.character(ls$Dates) -> colnames(ls$Data); ls$Pars -> rownames(ls$Data)
> ls

$Dates
[1] "2023-01-01" "2023-01-02" "2023-01-03"

$Pars
[1] 0.5 1.0 1.5 2.0

$Data
    2023-01-01   2023-01-02   2023-01-03  
0.5 data.frame,2 data.frame,2 data.frame,2
1   data.frame,2 data.frame,2 data.frame,2
1.5 data.frame,2 data.frame,2 data.frame,2
2   data.frame,2 data.frame,2 data.frame,2

I can easily subset a column in a data.frame:

> ls$Data[['1.5','2023-01-02']]$Y
 [1] 0.78773262 0.54989971 0.29513767 0.42966110 0.01719963 0.87326344 0.85021538 0.16226286 0.76293787
[10] 0.53882718

Building on this, I want to add to my list a matrix shaped like Data that holds the sum of the Y column of each data.frame. I tried sapply and lapply, subsetting my list as above, but I get an error:

> ls$SumY <- sapply(ls$Dates, function(d) lapply(ls$Pars, function(p) sum(ls$Data[[p, d]]$Y)))
Error in ls$Data[[p, d]] :
attempt to select less than one element in get1index <real>

Solution:

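The row and column names of ls$Data are character strings, so they can only be matched by character indices. Date and numeric values are treated as positions rather than compared with the names, which is what triggers the error in the question: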

## demonstration
ls$Data[['1.5', ls$Dates[2]]]$Y
# Error in ls$Data[["1.5", ls$Dates[2]]] : subscript out of bounds

## works if you convert to `character`
ls$Data[['1.5', as.character(ls$Dates[2])]]$Y
 # [1] 0.35346265 0.25918428 0.56523229 0.09214479 0.11412712 0.92853271 0.65296477 0.12045425
 # [9] 0.95620851 0.87551876
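
The reason is easier to see on a small stand-alone matrix. The sketch below uses a hypothetical list-matrix `m` (not part of the question's data) and shows that a Date is a day count underneath, so it is far larger than any column position, while its character form matches the column name.

```r
# Hypothetical 1 x 2 list-matrix with character dimnames (illustration only)
m <- matrix(list("a", "b"), nrow = 1,
            dimnames = list("1.5", c("2023-01-01", "2023-01-02")))

d <- as.Date("2023-01-02")

# A Date is stored as the number of days since 1970-01-01
unclass(d)
# [1] 19359

# Indexing with the Date itself is positional, so it fails
res <- try(m[["1.5", d]], silent = TRUE)
inherits(res, "try-error")
# [1] TRUE

# Indexing with the character form matches the column name
m[["1.5", as.character(d)]]
# [1] "b"
```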

## whole thing works if you convert to `character`
sapply(as.character(ls$Dates), function(d)
  lapply(as.character(ls$Pars), function(p)
    sum(ls$Data[[p, d]]$Y)
  )
)
#      2023-01-01 2023-01-02 2023-01-03
# [1,] 4.91783    4.91783    4.91783   
# [2,] 4.91783    4.91783    4.91783   
# [3,] 4.91783    4.91783    4.91783   
# [4,] 4.91783    4.91783    4.91783  
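
Because the inner call is lapply, the result above appears to be a list-matrix without row names. If a plain numeric matrix with both sets of names is preferred, one possible variant is to swap the inner lapply for vapply. This is a minimal sketch, assuming the same layout as the question; the seed, the helper names `dates` and `pars`, and building Data with matrix() are my own choices for a self-contained example.

```r
# Rebuild a small example with the same layout as the question
set.seed(1)  # assumption: fixed seed, only to make the example reproducible
ls <- list(Dates = seq.Date(as.Date("2023-01-01"), by = "day", length.out = 3),
           Pars  = seq(0.5, 2.0, 0.5))
df <- data.frame(X = runif(10), Y = runif(10))

# Character versions of the keys, used both as dimnames and as indices
dates <- as.character(ls$Dates)
pars  <- as.character(ls$Pars)

# 4 x 3 list-matrix of data.frames with character dimnames
ls$Data <- matrix(rep(list(df), length(pars) * length(dates)),
                  nrow = length(pars), dimnames = list(pars, dates))

# Inner vapply returns one named numeric vector per date;
# outer sapply binds those vectors together as columns
ls$SumY <- sapply(dates, function(d)
  vapply(pars, function(p) sum(ls$Data[[p, d]]$Y), numeric(1)))

is.numeric(ls$SumY)
# [1] TRUE
dim(ls$SumY)
# [1] 4 3
```

Since vapply and sapply keep the names of a character input, ls$SumY can then be indexed the same way as Data, for example ls$SumY['1.5', '2023-01-02'].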