Creating A New Calculated Category Within A Column in R

Suppose I have a data frame similar to this, only with thousands of observations:

df <- data.frame(Group = c('A', 'A', 'A', 'B', 'B',
                           'B','B','C','C','C','D','D','D','D','D'),
                 Values=c('5','7','9','0','8','4','5','2','1','3','6','3','1','3','5'))

What I want to do is add a new calculated group to the data frame, based on the values of a group that already exists, without replacing the original group's values. For example, let's say I want to retain group D but also create a new group containing all of group D's values plus 2.

An example of the resulting dataframe I would like is the following:

df <- data.frame(Group = c('A', 'A', 'A', 'B', 'B',
                           'B','B','C','C','C','D','D','D','D','D'
                           ,'Dadjusted','Dadjusted','Dadjusted','Dadjusted','Dadjusted'),
                 Values=c('5','7','9','0','8','4','5','2','1','3','6','3','1','3','5',
                          '8','5','3','5','7'))

I have tried using ifelse statements like the following:

   df$adjustedvalues <- ifelse(df$Group == 'D', as.integer(df$Values) + 2, as.integer(df$Values))

but this approach results in data frames that look like the following:

df <- data.frame(Group = c('A', 'A', 'A', 'B', 'B',
                           'B','B','C','C','C','D','D','D','D','D'),
                 Values=c('5','7','9','0','8','4','5','2','1','3','6','3','1','3','5'),
                 adjustedvalues=c('5','7','9','0','8','4','5','2','1','3','8','5','3','5','7'))

This adds a new column rather than new rows, which is less than ideal for my purposes.

Solution:

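Here is a possible base R option. Because Values is stored as character in the example, it is converted with as.integer() before 2 is added: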

rbind(df, data.frame(Group = "Dadjusted", 
                     Values = as.integer(df$Values)[df$Group == "D"]+2))

Output

       Group Values
1          A      5
2          A      7
3          A      9
4          B      0
5          B      8
6          B      4
7          B      5
8          C      2
9          C      1
10         C      3
11         D      6
12         D      3
13         D      1
14         D      3
15         D      5
16 Dadjusted      8
17 Dadjusted      5
18 Dadjusted      3
19 Dadjusted      5
20 Dadjusted      7
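
If the same adjustment is needed for several groups, the rbind approach above could be wrapped in a small function. The following is only a sketch under stated assumptions: the helper name add_adjusted_group and its arguments (group, offset, suffix) are hypothetical and not part of the original answer, and it assumes the columns are named Group and Values as in the example. Unlike the one-liner, it converts Values to numeric up front, so the result keeps a numeric Values column.

```r
# Hypothetical helper (not from the original answer): append a copy of one
# group with an offset added to its values, leaving the original rows intact.
add_adjusted_group <- function(data, group, offset, suffix = "adjusted") {
  # Values is stored as character in the example, so convert it first.
  # as.character() guards against the column being a factor in older R.
  data$Values <- as.numeric(as.character(data$Values))

  # Copy the rows of the chosen group, then relabel and shift the copy.
  adjusted <- data[data$Group == group, , drop = FALSE]
  adjusted$Group <- paste0(group, suffix)
  adjusted$Values <- adjusted$Values + offset

  # Stack the new rows under the original data and reset the row names.
  out <- rbind(data, adjusted)
  rownames(out) <- NULL
  out
}

df <- data.frame(Group = c('A', 'A', 'A', 'B', 'B',
                           'B','B','C','C','C','D','D','D','D','D'),
                 Values=c('5','7','9','0','8','4','5','2','1','3','6','3','1','3','5'))

result <- add_adjusted_group(df, "D", 2)
tail(result, 5)
```

Calling the function again on result with a different group would append another adjusted group in the same way.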