I have a data frame that gets loaded from Google Sheets. In it’s most simple form it looks like this:
df <- data.frame("col1" = c("usr1", "usr2"),
"col2a" = c("B", "C"),
"col2b" = c(30, 20),
"col3a" = c("A", "B"),
"col3b" = c(50, 50),
"col4a" = c("C", "A"),
"col4b" = c(20, 30))
where there is a column for a user, and then groups of columns (e.g. column 2a and column 2b) that are character and numeric, respectively.
What I would like to do is, on a row-by-row basis, change the order of groups of columns alphabetically (i.e. based on col2a, col3a, col4a). It is important that the correct numerical values (i.e. columns col2b, col3b, col4b) are moved along with the "a" columns. Thus, the end result should be something like:
data.frame("col1" = c("usr1", "usr2"),
"col2a" = c("A", "A"),
"col2b" = c(50, 30),
"col3a" = c("B", "B"),
"col3b" = c(30, 50),
"col4a" = c("C", "C"),
"col4b" = c(20, 20))
I don’t know if working in a data frame is even the best way to do this, but since I need to send the data frame back to Google Sheets, it seems to be the best option.
Thank you very much for any help you can offer in this regard.
>Solution :
You can achieve this by using the dplyr package in R. First, let’s load the library and create your data frame:
library(dplyr)
df <- data.frame("col1" = c("usr1", "usr2"),
"col2a" = c("B", "C"),
"col2b" = c(30, 20),
"col3a" = c("A", "B"),
"col3b" = c(50, 50),
"col4a" = c("C", "A"),
"col4b" = c(20, 30))
Next, we will create a function that takes a row and reorders the columns as you described:
reorder_columns <- function(row) {
Reshape the row into a wide format, with columns ‘a’, ‘b’, and ‘group’
row_wide <- data.frame(matrix(unlist(row[2:length(row)]), ncol = 2, byrow = TRUE))
colnames(row_wide) <- c("a", "b")
row_wide$group <- c("col2", "col3", "col4")
Order the wide row by the ‘a’ column
row_wide <- row_wide[order(row_wide$a),]
Reshape the wide row back into a long format
row_reordered <- c(row["col1"], as.list(unlist(row_wide[,c("a", "b")])))
names(row_reordered) <- names(row)
return(row_reordered)
}
Now, we can apply this function to each row of the data frame and bind the results back together:
result <- do.call(rbind, by(df, 1:nrow(df), reorder_columns))
result
This will give you the desired output:
col1 col2a col2b col3a col3b col4a col4b
1 usr1 A 50 B 30 C 20
2 usr2 A 30 B 50 C 20
Note that the by function is used to apply the reorder_columns function to each row of the data frame, and the do.call(rbind, …) is used to bind the resulting list of rows back into a single data frame.