Assign a value based on the first letter of each word in another column

I want to create a second column (firstletter) that holds a numeric value determined by the first letter of the word in the first column (catname). In the sample dataset, catname holds a list of cats’ names, and I want to assign 1 to cats whose name starts with A, 2 to cats whose name starts with B, 3 for C, and so on through Z.

df <- data.frame(catname=c("Ave", "Ares", "Aze", "Bill", "Buz", "Chris", "Chase", "Charlie", "Coco"))

At the moment, I can only think of doing this with the case_when() function, e.g.,

library(dplyr)
library(stringr)

df %>% mutate(firstletter = case_when(str_starts(catname, "A") ~ 1,
                                      str_starts(catname, "B") ~ 2,
                                      str_starts(catname, "C") ~ 3))

The result I am hoping for is:

| catname  | firstletter    |
| -------- | -------------- |
| Ave      | 1              |
| Ares     | 1              |
| Aze      | 1              |
| Bill     | 2              |
| Buz      | 2              |
| Chris    | 3              |
| Chase    | 3              |
| Charlie  | 3              |
| Coco     | 3              |

I would appreciate your insights if there is another way to approach my problem.

>Solution :

You can extract the first character and then match it against the built-in LETTERS vector if you want the values to always be 1 to 26, even when some letters are missing from the data:

df %>% mutate(first=match(substr(catname,1,1), LETTERS))
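
One thing to be aware of: match() is case-sensitive, and anything that is not found in LETTERS comes back as NA. The following is a minimal base R sketch of that behaviour, assuming you want lowercase names treated like uppercase ones. The vector names_vec is made-up example data, not part of the question.

```r
# Hypothetical example data (not from the question): mixed case,
# a name starting with a digit, and an empty string.
names_vec <- c("Ave", "bill", "Coco", "3PO", "")

# Take the first character and upper-case it so "b" is treated like "B".
first_chr <- toupper(substr(names_vec, 1, 1))

# Position of each first character in LETTERS; non-letters give NA.
letter_pos <- match(first_chr, LETTERS)
letter_pos
```

Here "Ave", "bill" and "Coco" map to 1, 2 and 3, while "3PO" and the empty string map to NA. Without toupper(), "bill" would also give NA.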

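If you only want numbers for the letters that actually occur, you can use the factor trick. Factor levels are sorted alphabetically, so the first observed letter gets 1, the next gets 2, and so on: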

df %>% mutate(first=as.numeric(factor(substr(catname,1,1))))
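
The two approaches only differ when some letter is absent from the data. Here is a small base R sketch of that difference. The data frame cats and the column names by_alphabet and by_observed are made up for illustration, and the sample deliberately has no name starting with B.

```r
# Hypothetical data with a gap: no name starts with "B".
cats <- data.frame(catname = c("Ave", "Coco", "Dot", "Chase"))

first_chr <- substr(cats$catname, 1, 1)

# Position in the alphabet: the gap at "B" is preserved.
cats$by_alphabet <- match(first_chr, LETTERS)

# Rank among the letters that actually occur: levels are A, C, D.
cats$by_observed <- as.integer(factor(first_chr))

cats
```

With this data, by_alphabet is 1, 3, 4, 3 and by_observed is 1, 2, 3, 2. Note that the factor-based numbers depend on which letters are present, so they can change when rows are added or removed.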