I'd like to select a range of rows from a data frame (ds) within each level of a grouping variable, in my case STAND. I tried the following:
# Packages
library(dplyr)
# Data
ds <- read.csv("https://raw.githubusercontent.com/Leprechault/trash/main/my_ts_data.csv")
table(ds$STAND)
# ALCINA_001A ALDOSANI_031A AROEIRAB_017A ARROIOXAVIER_012A ARROIOXAVIER_027B
# 182 182 182 182 182
# AZAMBUJAI_001A AZAMBUJAI_018A BARBANEGRA_404F BARRONDAO_026A BOAVISTA_019A
# 182 182 182 364 182
# BOMRECREIO_010A CAMBARA_014A CANAFISTULA_046A CERRODACRUZ_013D CHACARA_002A
# 182 182 2548 182 182
# CORDILHEIRA_004B COXILHADOROSA_006B CUENTRILHOI_019D CUENTRILHOI_028A CUENTRILHOI_040A
# 182 182 182 182 182
# DOISIRMAOS_001A ESTRADAREALI_006A FIGUEIRAS_012A FOLLES_002A FOLLES_011A
# 182 182 546 182 182
# GRAXAIM_002A GUABIJU_007C GUABIJU_010C GUABIJU_011B GUABIJU_016B
# 182 182 182 182 182
# GUABIJU_017B GUAJUVIRAII_002A MONTECASTELO_035A MONTECASTELO_050B PANTERANEGRA_006F
# 364 728 182 182 182
# PARAISO_035D PARAISO_041A PARAISOII_029A PASSODAESTANCIA_002A PINHEIROS_040A
# 182 182 364 182 182
# PONTADASCANAS_021D QUITERIA_225A RAMOS_003B RAMOS_014A RANCHOVELHO_004A
# 182 182 182 182 182
# RINCAODOSSOARES_004A RINCAODOSSOARES_005D RINCAODOSSOARES_015E SANTAROSAI_026A SAOBRAS_004B
# 182 364 546 182 182
# SAOJOAO_004A SONHOMEU_004A TAQUAREMBO_003F TAQUAREMBOII_002A TERRADURA_033B
# 182 182 182 182 182
# VACACAI_009D VENDAVELHA_014B VILAPALMA_012B VILAPALMA_025A VILAPALMA_026A
# 182 182 182 182 182
# VILAPALMA_030A VILAPALMA_030C VILAPALMA_033C VILAPALMA_033K
# 182 182 182 182
This is not good, because I need length 182 for every STAND level, and to get the correct values I need rows 1:182 within each level (levels with the wrong length include GUABIJU_017B, GUAJUVIRAII_002A, RINCAODOSSOARES_005D, CANAFISTULA_046A, RINCAODOSSOARES_015E, BARRONDAO_026A). The remaining rows must be ignored.
My desired output is:
table(ds$STAND)
# ALCINA_001A ALDOSANI_031A AROEIRAB_017A ARROIOXAVIER_012A ARROIOXAVIER_027B
# 182 182 182 182 182
# AZAMBUJAI_001A AZAMBUJAI_018A BARBANEGRA_404F BARRONDAO_026A BOAVISTA_019A
# 182 182 182 182 182
# BOMRECREIO_010A CAMBARA_014A CANAFISTULA_046A CERRODACRUZ_013D CHACARA_002A
# 182 182 182 182 182
# CORDILHEIRA_004B COXILHADOROSA_006B CUENTRILHOI_019D CUENTRILHOI_028A CUENTRILHOI_040A
# 182 182 182 182 182
# DOISIRMAOS_001A ESTRADAREALI_006A FIGUEIRAS_012A FOLLES_002A FOLLES_011A
# 182 182 182 182 182
# GRAXAIM_002A GUABIJU_007C GUABIJU_010C GUABIJU_011B GUABIJU_016B
# 182 182 182 182 182
# GUABIJU_017B GUAJUVIRAII_002A MONTECASTELO_035A MONTECASTELO_050B PANTERANEGRA_006F
# 182 182 182 182 182
# PARAISO_035D PARAISO_041A PARAISOII_029A PASSODAESTANCIA_002A PINHEIROS_040A
# 182 182 182 182 182
# PONTADASCANAS_021D QUITERIA_225A RAMOS_003B RAMOS_014A RANCHOVELHO_004A
# 182 182 182 182 182
# RINCAODOSSOARES_004A RINCAODOSSOARES_005D RINCAODOSSOARES_015E SANTAROSAI_026A SAOBRAS_004B
# 182 182 182 182 182
# SAOJOAO_004A SONHOMEU_004A TAQUAREMBO_003F TAQUAREMBOII_002A TERRADURA_033B
# 182 182 182 182 182
# VACACAI_009D VENDAVELHA_014B VILAPALMA_012B VILAPALMA_025A VILAPALMA_026A
# 182 182 182 182 182
# VILAPALMA_030A VILAPALMA_030C VILAPALMA_033C VILAPALMA_033K
# 182 182 182 182
Can anyone help with this, please?
>Solution :
In other words, you want the first 182 rows of each STAND group? If so, you can use slice_head (the by argument requires dplyr 1.1.0 or later):
library(dplyr)
slice_head(ds, by = STAND, n = 182)
Note that other syntaxes in dplyr include:
ds |>
  group_by(STAND) |>
  slice_head(n = 182) |>
  ungroup()

ds |>
  filter(row_number() %in% 1:182, .by = STAND)
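To check the behaviour without downloading the full data, here is a minimal sketch on a hypothetical toy data frame (`toy`, with made-up groups "A" and "B", and a cut-off of 3 rows instead of 182). It also shows a base R equivalent using `ave()`, in case the installed dplyr does not yet support the `by` argument. The names `toy`, `trimmed`, `trimmed_base`, and `pos` are only illustrative.

```r
library(dplyr)

# Hypothetical toy data: group "A" has 5 rows, group "B" has 3 rows
toy <- data.frame(
  STAND = c(rep("A", 5), rep("B", 3)),
  value = 1:8
)

# Keep at most the first 3 rows of each STAND group
# (the `by` argument needs dplyr >= 1.1.0)
trimmed <- slice_head(toy, n = 3, by = STAND)

# Base R alternative: number the rows within each group,
# then keep only positions 1 to 3
pos <- ave(seq_len(nrow(toy)), toy$STAND, FUN = seq_along)
trimmed_base <- toy[pos <= 3, ]

# Every group should now have the same number of rows
table(trimmed$STAND)
```

On the real data, replacing 3 with 182 and `toy` with `ds` should give the desired table, provided the rows of each STAND are already in the order you want to keep.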