Separate entries in dataframe in new rows in R

January 23, 2023

I have the data.frame df below.

 df <- data.frame(id = c(1:12),
               A = c("alpha", "alpha", "beta", "beta", "gamma", "gamma", "gamma", "delta", 
                     "epsilon", "epsilon", "zeta", "eta"),
               B = c("a", "a; b", "a", "c; d; e", "e", "e", "c; f", "g", "a", "g; h", "f", "d"),
               C = c(NA, 4, 2, 7, 4, NA, 9, 1, 1, NA, 3, NA),
               D = c("ii", "ii", "i", "iii", "iv", "v", "viii", "v", "viii", "i", "iii", "i"))

Column ‘B’ contains four entries that hold several values separated by semicolons. How can I duplicate each of these rows so that column ‘B’ holds one of the separate values per row?

The expected result df2 is:

 df2 <- data.frame(id = c(1, 2, 2, 3, 4, 4, 4, 5, 6, 7, 7, 8, 9, 10, 10, 11, 12),
               A = c(rep("alpha", 3), rep("beta", 4), rep("gamma", 4), "delta", rep("epsilon", 3), 
                     "zeta", "eta"),
               B = c("a", "a", "b", "a", "c", "d", "e", "e", "e", "c", "f", "g", "a", "g", "h", "f", "d"),
               C = c(NA, 4, 4, 2, 7, 7, 7, 4, NA, 9, 9, 1, 1, NA, NA, 3, NA),
               D = c("ii", "ii", "ii", "i", "iii", "iii", "iii", "iv", "v", "viii", "viii", "v", "viii", "i", "i", "iii", "i"))

I tried this, but no luck:

 df2 <- df
 # split the values in column B
 df2$B <- unlist(strsplit(as.character(df2$B), "; "))
 # repeat the rows for each value in column B
 df2 <- df2[rep(seq_len(nrow(df2)), sapply(strsplit(as.character(df1$B), "; "), length)),]
 # match the number of rows in column B with the number of rows in df2
 df2$id <- rep(df2$id, sapply(strsplit(as.character(df1$B), "; "), length))
 # sort the dataframe by id
 df2 <- df2[order(df2$id),]

Solution:

We may use separate_rows from tidyr here. Specify sep as a semicolon followed by zero or more whitespace characters (";\\s*") so that each value in B is expanded into its own row:

library(tidyr)
df_new <- separate_rows(df, B, sep = ";\\s*")

Checking against the OP’s expected output:

> all.equal(df_new, df2, check.attributes = FALSE)
[1] TRUE
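
To see what the separator pattern does in isolation, here is a minimal sketch on a made-up three-row data frame (toy and toy_long are illustrative names, not part of the original question). It assumes tidyr is installed. The third value deliberately mixes "no space" and "two spaces" after the semicolon to show why \\s* is used rather than a literal "; ".

```r
library(tidyr)

# hypothetical toy data, not the OP's df
toy <- data.frame(id = 1:3,
                  B = c("a", "a; b", "c;d;  e"),
                  stringsAsFactors = FALSE)

# ";\\s*" matches a semicolon plus any whitespace that follows it,
# so "c;d;  e" is split into "c", "d", "e" with no stray spaces
toy_long <- separate_rows(toy, B, sep = ";\\s*")

# the other columns (here id) are repeated once per piece
print(toy_long)
```

The result should have six rows, with id 2 appearing twice and id 3 three times.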

In base R, we may split B with strsplit, repeat each row index by the lengths of the resulting list elements, and then replace B with the unlisted values:

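# split B on a semicolon followed by optional whitespace; one character vector per row
lst1 <- strsplit(as.character(df$B), ";\\s*")
# repeat each row once per piece, then overwrite B with the flattened pieces
df_new2 <- transform(df[rep(seq_len(nrow(df)), lengths(lst1)),], B = unlist(lst1))
# reset the duplicated row names (2, 2.1, ...) to 1..n
row.names(df_new2) <- NULL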
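
The same base R idea can be wrapped in a small function. This is only a sketch: split_rows is a hypothetical helper name (not a base R function), and the toy data frame is made up for illustration.

```r
# split_rows() is a hypothetical helper, not part of base R or any package
split_rows <- function(data, col, pattern = ";\\s*") {
  # one character vector of pieces per original row
  pieces <- strsplit(as.character(data[[col]]), pattern)
  # repeat each row index as many times as that row has pieces
  out <- data[rep(seq_len(nrow(data)), lengths(pieces)), , drop = FALSE]
  # overwrite the split column with the flattened pieces
  out[[col]] <- unlist(pieces)
  # drop the repeated row names produced by the indexing step
  row.names(out) <- NULL
  out
}

# hypothetical toy data, not the OP's df
toy <- data.frame(id = 1:3,
                  B = c("a", "a; b", "c; d; e"),
                  C = c(NA, 4, 7),
                  stringsAsFactors = FALSE)

res <- split_rows(toy, "B")
print(res)
```

One caveat with this approach: strsplit returns a zero-length vector for an empty string, so a row whose value is "" would be dropped rather than kept. NA values are kept as a single NA piece.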

by MR

Published January 23, 2023
