My dataset:
dt <- data.frame(
  GrossIncome = seq(0, 10000, by = 1000),
  Turnover    = seq(0, 100000, by = 10000),
  Sellers     = seq(0, 1, by = 0.1),
  Buyers      = seq(0, 1, by = 0.1)
)
Now I want to summarize this data, dividing the GrossIncome and Turnover totals by 1000.
OUTPUT <- data.frame(
  "GrossIncome" = round(sum(dt$GrossIncome) / 1000, 1),
  "Turnover" = round(sum(dt$Turnover) / 1000, 1),
  "GrossIncomeAndTurnover" = round((sum(dt$GrossIncome) + sum(dt$Turnover)) / 1000, 1),
  "Sellers" = round(sum(dt$Sellers), 1),
  "Buyers" = round(sum(dt$Buyers), 1)
)
Output
  GrossIncome Turnover GrossIncomeAndTurnover Sellers Buyers
1          55      550                    605     5.5    5.5
Is there a more elegant solution than the one above? I tried the code below, but it only handles the first two items (GrossIncome and Turnover) and not the rest.
dt %>%
  dplyr::select(GrossIncome, Turnover) %>%
  dplyr::summarise_all(sum, na.rm = TRUE) / 1000
Can anybody help me solve this problem?
>Solution :
We can use across() to apply different functions to different columns.
library(dplyr)

dt %>%
  summarize(
    # Totals scaled to thousands
    across(c(GrossIncome, Turnover), ~ round(sum(.) / 1000, 1)),
    # Uses the two summaries created on the line above
    GrossIncomeAndTurnover = GrossIncome + Turnover,
    # Plain totals
    across(c(Sellers, Buyers), ~ round(sum(.), 1))
  )
#   GrossIncome Turnover GrossIncomeAndTurnover Sellers Buyers
# 1          55      550                    605     5.5    5.5
Note that summarize() evaluates its expressions in order, and later expressions see the columns created by earlier ones. By the time GrossIncomeAndTurnover is computed, GrossIncome and Turnover already hold the rounded, scaled totals (55 and 550), so my code simply adds them and does not divide by 1000 again.
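The evaluation order inside summarize() can be checked with a smaller example. This is only a sketch: it assumes dplyr is installed, and the column name Combined is made up for the illustration.

```r
library(dplyr)

dt <- data.frame(
  GrossIncome = seq(0, 10000, by = 1000),
  Turnover    = seq(0, 100000, by = 10000)
)

res <- dt %>%
  summarize(
    # GrossIncome is replaced by its scaled total (55) here ...
    GrossIncome = sum(GrossIncome) / 1000,
    # ... so this line sees 55, while Turnover is still the original column.
    # Combined is a hypothetical name used only for this sketch.
    Combined = GrossIncome + sum(Turnover) / 1000
  )

res
#   GrossIncome Combined
# 1          55      605
```

Because GrossIncome has already been overwritten when the second expression runs, wrapping it in sum() and dividing by 1000 again would give 0.055 for that term instead of 55.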