Create runs of repeating values in dplyr

Example data:

example_data <-
  data.frame(value = c(1,3,4,6,7,8,4,6,9,0),
             group = c("Not applicable",
                       "Large group",
                       "Large group",
                       "Not applicable",
                       "Group of 1",
                       "Large group",
                       "Large group",
                       "Large group",
                       "Group of 1",
                       "Not applicable"))

I would like to assign group numbers, starting with 1, to groups (both "Large group" and "Group of 1"), and zeroes to "Not applicable" values, using dplyr.

There can be more than one "Not applicable" value in a row. A "Group of 1" always contains exactly one row. A "Large group" may contain any number of rows.

Desired output:

   value          group group_number
1      1 Not applicable            0
2      3    Large group            1
3      4    Large group            1
4      6 Not applicable            0
5      7     Group of 1            2
6      8    Large group            3
7      4    Large group            3
8      6    Large group            3
9      9     Group of 1            4
10     0 Not applicable            0

I tried this solution from the answers to my previous question:

example_data %>%
  mutate(group_number = with(rle(group != "Not applicable"), 
                      rep(cumsum(values) * values, lengths)))

And got:

   value          group group_number
1      1 Not applicable            0
2      3    Large group            1
3      4    Large group            1
4      6 Not applicable            0
5      7     Group of 1            2
6      8    Large group            2
7      4    Large group            2
8      6    Large group            2
9      9     Group of 1            2
10     0 Not applicable            0

I would like to get separate numbers for Large group and Group of 1.
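
The merged numbers appear to come from applying `rle()` to the logical test rather than to the labels: rows 5 to 9 are all `TRUE`, so they count as a single run. A short base R check, given only as a sketch (the variable names `runs_logical` and `runs_label` are made up for illustration):

```r
group <- c("Not applicable", "Large group", "Large group", "Not applicable",
           "Group of 1", "Large group", "Large group", "Large group",
           "Group of 1", "Not applicable")

# Runs of the logical test: rows 5-9 are all TRUE, so they form one run
runs_logical <- rle(group != "Not applicable")
runs_logical$lengths  # 1 2 1 5 1
runs_logical$values   # FALSE TRUE FALSE TRUE FALSE

# Runs of the labels themselves: rows 5, 6-8 and 9 stay separate
runs_label <- rle(group)
runs_label$lengths    # 1 2 1 1 3 1 1
```

Any fix therefore needs to detect runs on `group` itself and only afterwards zero out the "Not applicable" runs.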

Solution:

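library(dplyr)
example_data %>%
  # rleid() numbers every run of identical labels; the product zeroes "Not applicable" runs
  mutate(gr = data.table::rleid(group) * (group != 'Not applicable'),
         # dense_rank() closes the gaps left by the zeroed runs; - 1 maps the zeros back to 0
         gr = dense_rank(gr) - 1) # or even gr = as.numeric(factor(gr)) - 1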

   value          group gr
1      1 Not applicable  0
2      3    Large group  1
3      4    Large group  1
4      6 Not applicable  0
5      7     Group of 1  2
6      8    Large group  3
7      4    Large group  3
8      6    Large group  3
9      9     Group of 1  4
10     0 Not applicable  0
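
One caveat: `dense_rank(gr) - 1` relies on at least one "Not applicable" row being present. Without one, the smallest run id gets rank 1, so the first group would be numbered 0. A possible alternative, sketched below, stays closer to the `rle()` attempt from the question and needs no data.table. The helper name `number_runs` and its `na_label` argument are hypothetical, and the sketch assumes `group` contains no missing values.

```r
library(dplyr)

# Hypothetical helper: number each run of identical labels, 0 for `na_label`.
# Assumes `x` has no NA values.
number_runs <- function(x, na_label = "Not applicable") {
  runs <- rle(as.character(x))      # runs of the labels themselves
  keep <- runs$values != na_label   # TRUE for runs that should get a number
  # cumsum() counts the kept runs; multiplying by `keep` zeroes the others
  rep(cumsum(keep) * keep, runs$lengths)
}

example_data <- data.frame(
  value = c(1, 3, 4, 6, 7, 8, 4, 6, 9, 0),
  group = c("Not applicable", "Large group", "Large group", "Not applicable",
            "Group of 1", "Large group", "Large group", "Large group",
            "Group of 1", "Not applicable")
)

result <- example_data %>%
  mutate(group_number = number_runs(group))

result$group_number  # 0 1 1 0 2 3 3 3 4 0
```

If dplyr 1.1.0 or later is available, `consecutive_id(group)` should be able to stand in for `data.table::rleid(group)` in the solution above, which would also remove the data.table dependency.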