Pivot dataframe with uneven categories


Hello everyone,

I have this dataframe built from data I scraped. I want to reshape it with pivot_wider, but doing that shifts the data down. This might be because the values in the types column do not all occur the same number of times (e.g. price appears 12 times and étage 6 times), but I can't find a way to get the result I want.

enter image description here

In this picture the result looks correct, but in the bains/douches column the data has shifted down, so it is not. If someone could help me out I would really appreciate it. Thanks.

>Solution :

The problem here is that nothing in the data indicates where a new entry begins. Below, I assume that each new "price" row marks the start of a new entry. That way, the missing bathroom value shows up as NA instead of shifting the rows below it.

library(tidyverse)

# create dummy data
df <- tibble(
  type_info = c(1890000, 4, 148, 2, 4, 905351, 2, 89, 2),
  types = c("price", "chambres", "superficie", "bains", "etage", "price", "chambres", "superficie", "etage")
)

# create ID, assuming that each price row marks a new entry
df |> 
  mutate(id = cumsum(types == "price")) |> 
  pivot_wider(
    names_from = types,
    values_from = type_info
  )
#> # A tibble: 2 × 6
#>      id   price chambres superficie bains etage
#>   <int>   <dbl>    <dbl>      <dbl> <dbl> <dbl>
#> 1     1 1890000        4        148     2     4
#> 2     2  905351        2         89    NA     2

Created on 2023-03-09 with reprex v2.0.2
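
Since the whole approach rests on the assumption that "price" always opens a new entry, it may be worth checking that assumption on the real data before trusting the result. The sketch below reuses the dummy data from above; the names no_id, with_id, duplicates and wide are only illustrative. It first shows what pivot_wider does when there is no id at all (repeated types are collapsed into list-columns, with a warning), then counts how often each type occurs within an id.

```r
library(tidyverse)

# same dummy data as above
df <- tibble(
  type_info = c(1890000, 4, 148, 2, 4, 905351, 2, 89, 2),
  types = c("price", "chambres", "superficie", "bains", "etage",
            "price", "chambres", "superficie", "etage")
)

# Without an id, nothing tells pivot_wider which values belong together,
# so repeated types end up in list-columns (tidyr warns about this).
no_id <- suppressWarnings(
  df |> pivot_wider(names_from = types, values_from = type_info)
)

# With the id, each type should occur at most once per entry.
with_id <- df |>
  mutate(id = cumsum(types == "price"))

duplicates <- with_id |>
  count(id, types) |>
  filter(n > 1)

# zero rows here means the "price starts a new entry" assumption holds
nrow(duplicates)

wide <- with_id |>
  pivot_wider(names_from = types, values_from = type_info)
```

If duplicates is not empty, some entry contains the same type twice, which would suggest that "price" is not a reliable marker for that part of the scraped data and that a different rule for building the id is needed.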
