Unlisting lists inside a data frame and putting them in different columns in R

I used the Twitter API to collect a large number of tweets, then built a data frame with the fields I need:

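library(dplyr)  # provides %>% and distinct()

preprocess <- function(df) {
  # NOTE: `m` is unused, so every iteration rebuilds the same frame from `df`;
  # rbind() stacks the copies and distinct() below removes the duplicates
  df_tw <- do.call(rbind, lapply(df, function(m)
    data.frame(text = df$text,
               lang = df$lang,
               geo = df$geo,
               date = df$created_at)))
  # Select unique rows based on the text column only
  df_u <- df_tw %>% distinct(text, .keep_all = TRUE)
  # Return the de-duplicated frame (the original returned the input `df`)
  return(df_u)
}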

However, each coordinate pair is stored as a single vector, e.g. c(14.4865036, 35.85288308). How can I split the two values into separate columns of the same data frame?


> dput(head(df_mt))
structure(list(text = c("A tiny little fish dish to round off the day. ", 
"Sharing Music #dj #Malta #house #housemusic #pioneer #xemxija #venezuelanDj ", 
"Dj Abraham Sound en Malta #dj #pioneer #paceville #Malta #VenezuelanDj ", 
"Nature’s very own private pool, the blue hole in Gozo is a place that you can enjoy all year round. 📸: @chrissefarbi and @ch.farbmacher \n\n#Malta #VisitMalta #MoreToExplore ", 
"London’s first EV rapid charging hub opened by TfL and Engenie  #Taxi #Chauffeur #Malta ", 
"Incredible to see this in Malta 🇲🇹🇵🇱@FlightPolish "
), lang = c("en", "en", "en", "en", "en", "en"), geo.place_id = c("0fc3ac0d6915e000", 
"1d834adff5d584df", "07d9d2902f483001", "1d834adff5d584df", "1d834adff5d584df", 
"0fc2ecc63cd4c000"), geo.coordinates = structure(list(type = c(NA, 
NA, NA, NA, "Point", NA), coordinates = list(NULL, NULL, NULL, 
    NULL, c(14.4865036, 35.85288308), NULL)), row.names = c(NA, 
6L), class = "data.frame"), date = c("2022-12-30T20:00:29.000Z", 
"2022-12-30T17:21:44.000Z", "2022-12-30T17:16:15.000Z", "2022-12-30T15:54:39.000Z", 
"2022-12-30T14:57:34.000Z", "2022-12-30T14:32:18.000Z"), row.names = c("attachments.3", 
"attachments.4", "attachments.5", "attachments.6", "attachments.7", 
"attachments.8"), class = "data.frame")

Thank you.

>Solution :

Use unnest_wider() from tidyr, which spreads each element of a list column into its own column:

library(tidyr)
data.frame(df) |>
  unnest_wider(geo.coordinates.coordinates, names_sep = ".")

Output:

## A tibble: 6 × 9
#  text                                                            lang  geo.p…¹ geo.c…² geo.c…³ geo.c…⁴ date  row.n…⁵ class
#  <chr>                                                           <chr> <chr>   <chr>     <dbl>   <dbl> <chr> <chr>   <chr>
#1 "A tiny little fish dish to round off the day. "                en    0fc3ac… NA         NA      NA   2022… attach… data…
#2 "Sharing Music #dj #Malta #house #housemusic #pioneer #xemxija… en    1d834a… NA         NA      NA   2022… attach… data…
#3 "Dj Abraham Sound en Malta #dj #pioneer #paceville #Malta #Ven… en    07d9d2… NA         NA      NA   2022… attach… data…
#4 "Nature’s very own private pool, the blue hole in Gozo is a pl… en    1d834a… NA         NA      NA   2022… attach… data…
#5 "London’s first EV rapid charging hub opened by TfL and Engeni… en    1d834a… Point      14.5    35.9 2022… attach… data…
#6 "Incredible to see this in Malta \U0001f1f2\U0001f1f9\U0001f1f… en    0fc2ec… NA         NA      NA   2022… attach… data…
## … with abbreviated variable names ¹​geo.place_id, ²​geo.coordinates.type, ³​geo.coordinates.coordinates.1,
##   ⁴​geo.coordinates.coordinates.2, ⁵​row.names
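
The extra row.names and class columns in the output above suggest the posted dput() was read as a plain list rather than as a data frame. As a hedged sketch, if geo.coordinates really is a nested data-frame column (as the dput() indicates), one possible route is to flatten it with tidyr's unpack() first and then call unnest_wider(). The toy data below is made up to mirror the question's shape; only the one coordinate pair is taken from the post.

```r
library(tidyr)

# Toy stand-in for df_mt (an assumption, not the asker's real data):
# `geo.coordinates` is a data-frame column holding a list column `coordinates`.
coords <- data.frame(type = c(NA, "Point", NA))
coords$coordinates <- list(NULL, c(14.4865036, 35.85288308), NULL)

df_mt <- data.frame(text = c("tweet a", "tweet b", "tweet c"))
df_mt$geo.coordinates <- coords

out <- df_mt |>
  # Flatten the nested data frame into geo.coordinates.type and
  # geo.coordinates.coordinates
  unpack(geo.coordinates, names_sep = ".") |>
  # Spread each coordinate vector into numbered columns; rows with NULL get NA
  unnest_wider(geo.coordinates.coordinates, names_sep = ".")

print(out)
```

Twitter geo points follow the GeoJSON convention of longitude first, then latitude, so the .1 and .2 columns could be renamed accordingly afterwards if that holds for your data.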