I have a dataset with building IDs, each ID represents a structure (building, house and etc.). Now if a building has multiple units/condos/apartments then the IDs will be repeated for each of the units.
For each district, 1 and 6 in this case:
- I am trying to separate repetitive IDs from nonrepetitive IDs.
- And then count the total number of each repetitive ID and sum their
dollar value. - Lastly create two new columns,
ID_CountandFloor. Whenever an
ID is repeated more than once,ID_Countwill be greater than 1
and Floor will have a valuemultistory
Based on the sample data below how can I do this?
structure(list(ID = c(1, 1, 1, 2, 2, 3, 4, 4, 4, 5, 6, 11, 11,
11, 22, 22, 33, 44, 44, 44, 55, 66), Zoning = c("Residential",
"Residential", "Residential", "Commercial", "Commercial", "Residential",
"Miscellaneous", "Miscellaneous", "Miscellaneous", "Commercial",
"Residential", "Residential", "Residential", "Residential", "Commercial",
"Commerical", "Residential", "Miscellaneous", "Miscellaneous",
"Miscellaneous", "Commercial", "Residential"), District = c(1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6
), Value = c(11111, 12111, 13111, 14111, 16111, 17111, 18111,
19111, 22222, 33333, 11111, 12111, 13111, 14111, 16111, 17111,
18111, 19111, 22222, 33333, 44444, 55555)), class = "data.frame", row.names = c(NA,
-22L))
Desired Output
ID Zoning District Value ID_Count Floor
1 Residential 1 36333 3 multistory
2 Commercial 1 30222 2 multistory
3 Residential 1 17111 1 singlestory
4 Miscellaneous 1 59444 3 multistory
5 Commercial 1 33333 1 singlestory
6 Residential 1 11111 1 singlestory
11 Residential 6 36333 3 multistory
22 Commercial 6 30222 2 multistory
33 Residential 6 17111 1 singlestory
44 Miscellaneous 6 59444 3 multistory
55 Commercial 6 33333 1 singlestory
66 Residential 6 11111 1 singlestory
>Solution :
With dplyr
library(dplyr)
df %>% group_by(ID,Zoning,District) %>% summarise(n = n(), Value = sum(Value), Floor = ifelse(n>1,"multistory","singlestory"))
# A tibble: 12 × 6
# Groups: ID, Zoning [12]
# ID Zoning District n Value Floor
# <dbl> <chr> <dbl> <int> <dbl> <chr>
# 1 1 Residential 1 3 36333 multistory
# 2 2 Commercial 1 2 30222 multistory
# 3 3 Residential 1 1 17111 singlestory
# 4 4 Miscellaneous 1 3 59444 multistory
# 5 5 Commercial 1 1 33333 singlestory
# 6 6 Residential 1 1 11111 singlestory
# 7 11 Residential 6 3 39333 multistory
# 8 22 Commercial 6 2 33222 multistory
# 9 33 Residential 6 1 18111 singlestory
#10 44 Miscellaneous 6 3 74666 multistory
#11 55 Commercial 6 1 44444 singlestory
#12 66 Residential 6 1 55555 singlestory