I have a data frame:
df <- data.frame(
  cities = c(
    rep("Atlanta", 3),
    rep("Seattle", 4),
    rep("Paris", 2)
  ),
  area = c(
    c("A", "C", "X"),
    c("B", "H", "I", "D"),
    c("Z", "F")
  )
)
and I would like to create a new data frame with a single column that combines the cities and area columns. This is the output I would like to obtain:
data.frame(
  new_column = c(
    "Atlanta",
    "A",
    "C",
    "X",
    "Paris",
    "F",
    "Z",
    "Seattle",
    "B",
    "D",
    "H",
    "I"
  )
)
i.e. each city name should be followed by its areas sorted alphabetically, with the cities themselves also in alphabetical order, all in a single column of a new data frame.
Is there an elegant way with dplyr to do that?
Thank you!
>Solution :
This might be good enough for you:
library(dplyr)
group_by(df, cities) %>%
  reframe(new_column = c(first(cities), sort(area)))
# # A tibble: 12 × 2
#    cities  new_column
#    <chr>   <chr>
#  1 Atlanta Atlanta
#  2 Atlanta A
#  3 Atlanta C
#  4 Atlanta X
#  5 Paris   Paris
#  6 Paris   F
#  7 Paris   Z
#  8 Seattle Seattle
#  9 Seattle B
# 10 Seattle D
# 11 Seattle H
# 12 Seattle I
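Then drop the cities column, for example with select(-cities). reframe() returns an ungrouped tibble, so the column can be removed directly.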
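Putting the pieces together, a minimal end-to-end sketch could look like the following. It assumes dplyr 1.1.0 or later (where reframe() is available); the name `result` and the final as.data.frame() conversion are illustrative choices, not part of the answer above.

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  cities = c(rep("Atlanta", 3), rep("Seattle", 4), rep("Paris", 2)),
  area = c("A", "C", "X", "B", "H", "I", "D", "Z", "F")
)

result <- df %>%
  group_by(cities) %>%
  # For each city: the city name first, then its areas in sorted order
  reframe(new_column = c(first(cities), sort(area))) %>%
  # Keep only the combined column
  select(new_column) %>%
  # Optional: convert the tibble back to a plain data.frame
  as.data.frame()

print(result)
```

group_by() orders the groups by city name, which is why the cities come out alphabetically without an explicit arrange() step.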