I have the dataframe below and I want to create a new one with 2 columns one named Amenities
which will include either "nn_bank" or "nn_hospital" as value and the other column "Name" with the name of the nn_bank
or nn_hospital
.
df<-structure(list(state = c("West Bengal", "West Bengal", "West Bengal",
"West Bengal", "West Bengal"), nn_hospital = c("Khundkuri Hospital",
"Khundkuri Hospital", "Mankar Rural Hospital", "Khundkuri Hospital",
"Khundkuri Hospital"), distance_nn_hospital = c(8949.68646563084,
17217.1419457099, 16939.2318150416, 15812.9872649418, 1408.372117616
), nn_bank = c("contai", "contai", "Allahabad Bank", "contai",
"contai"), distance_nn_bank = c(13959.9950089655, 20598.4763042432,
19688.6296071566, 20537.3799009137, 11385.8738290783)), class = "data.frame", row.names = c(NA,
5L))
result should be like
>Solution :
When you have multiple value columns one option would be to use the names_pattern
argument of tidyr::pivot_longer
along with the special .value
to reshape your data. Afterwards you have to do some additional cleaning.
library(tidyr)
library(dplyr, warn = FALSE)
library(stringr)
df |>
pivot_longer(-state,
names_to = c(".value", "Amenities"),
names_pattern = "(.*?nn)_(.*)"
) |>
select(-c(state, distance_nn), Name = nn) |>
mutate(across(c(Amenities, Name), str_to_title))
#> # A tibble: 10 × 2
#> Amenities Name
#> <chr> <chr>
#> 1 Hospital Khundkuri Hospital
#> 2 Bank Contai
#> 3 Hospital Khundkuri Hospital
#> 4 Bank Contai
#> 5 Hospital Mankar Rural Hospital
#> 6 Bank Allahabad Bank
#> 7 Hospital Khundkuri Hospital
#> 8 Bank Contai
#> 9 Hospital Khundkuri Hospital
#> 10 Bank Contai