R: How to merge regression results into a single output?

I need to merge the results of my regressions into a single table. To give you an idea:

This is a sample of my dataset:

> head(final, 20)
   nquest nord sex anasc    ireg eta staciv studio  tpens
1     173    1   1  1948      18  64      3      5  2500
2     375    1   2  1925      16  87      4      2  1340
3     629    1   1  1939       5  73      4      3  1188
4     632    1   1  1950       5  62      1      3  1320
5     633    1   2  1934       5  78      4      2  350
6    1238    1   1  1937      15  75      4      3  1000
7    7886    1   1  1950       9  62      1      5  2000
8   11972    2   1  1938      17  74      1      2  750
9   20174    1   1  1941       8  71      1      5  2000
10  20174    2   2  1942       8  70      1      3  132
11  20223    1   2  1938       3  74      1      5  800
12  20223    2   1  1939       3  73      1      4  980
13  20711    2   1  1944       4  68      1      2  1900
14  20837    1   1  1931       8  81      1      4  1600
15  20837    2   2  1928       8  84      1      2  430
16  21461    1   2  1918       5  94      4      2  600
17  22173    1   1  1938      15  74      1      2  1200
18  22208    1   2  1935       5  77      4      2  700
19  22222    1   1  1927       5  85      4      2  1100
20  22276    1   1  1949       8  63      2      5  1170

and I'm running regressions like these:

lm(log(tpens) ~ sex, data = final)
lm(log(tpens) ~ sex + eta, data = final)
lm(log(tpens) ~ sex + eta + ireg, data = final)
lm(log(tpens) ~ sex + eta + ireg + studio, data = final)

Is there a way to have the outputs next to each other?

This is an example of what I'm looking for (if it could include more information from the regressions, even better):

             Estimate                     
(Intercept)  7.47635***   8.5236948***     8.5814025***   7.4580630***
sex         -0.42052***  -0.4170048***    -0.4229487***  -0.4153185***      
eta                      -0.0146341***    -0.0145885***  -0.0068207***
ireg                                      -0.0057238***  -0.0035033***
studio                                                    0.1624156***
....                                                      .....

I've seen from past questions that mapply and do.call can be used, but I can't get them to work correctly. Can anyone help me? Is there another way to do it?

UPDATE

Thanks to @akrun, I now get this long-format output:

# A tibble: 14 x 6
   fmla                                   term        estimate std.error statistic   p.value
   <chr>                                  <chr>          <dbl>     <dbl>     <dbl>     <dbl>
 1 log(tpens) ~ sex                       (Intercept)  7.48     0.0221      339.   0        
 2 log(tpens) ~ sex                       sex         -0.421    0.0148      -28.3  5.53e-162
 3 log(tpens) ~ sex + eta                 (Intercept)  8.52     0.0648      132.   0        
 4 log(tpens) ~ sex + eta                 sex         -0.417    0.0144      -29.0  2.92e-169
 5 log(tpens) ~ sex + eta                 eta         -0.0146   0.000854    -17.1  1.12e- 63
 6 log(tpens) ~ sex + eta + ireg          (Intercept)  8.58     0.0658      130.   0        
 7 log(tpens) ~ sex + eta + ireg          sex         -0.423    0.0144      -29.4  4.43e-173
 8 log(tpens) ~ sex + eta + ireg          eta         -0.0146   0.000853    -17.1  1.44e- 63
 9 log(tpens) ~ sex + eta + ireg          ireg        -0.00572  0.00125      -4.57 5.06e-  6
10 log(tpens) ~ sex + eta + ireg + studio (Intercept)  7.46     0.0602      124.   0        
11 log(tpens) ~ sex + eta + ireg + studio sex         -0.415    0.0119      -34.8  9.36e-234
12 log(tpens) ~ sex + eta + ireg + studio eta         -0.00682  0.000729     -9.36 1.24e- 20
13 log(tpens) ~ sex + eta + ireg + studio ireg        -0.00350  0.00104      -3.37 7.68e-  4
14 log(tpens) ~ sex + eta + ireg + studio studio       0.162    0.00367      44.2  0     

UPDATE 2

# A tibble: 5 x 5
  term        `log(tpens) ~ sex` `log(tpens) ~ sex + eta` `log(tpens) ~ sex + eta + ireg` log(tpens) ~ sex + eta + ireg ~1
  <chr>                    <dbl>                    <dbl>                           <dbl>                            <dbl>
1 (Intercept)              7.48                    8.52                           8.58                             7.46   
2 sex                     -0.421                  -0.417                         -0.423                           -0.415  
3 eta                     NA                      -0.0146                        -0.0146                          -0.00682
4 ireg                    NA                      NA                             -0.00572                         -0.00350
5 studio                  NA                      NA                             NA                                0.162  
# ... with abbreviated variable name 1: `log(tpens) ~ sex + eta + ireg + studio`

Solution:

We could loop over the formulas, fit a linear model for each, extract the coefficient table (tidy), and bind everything into a single data frame:

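library(dplyr)
library(purrr)   # list_rbind() needs purrr >= 1.0.0
library(broom)
# lst() names each element after its own expression, i.e. the formula text
out <- lst(log(tpens) ~ sex, log(tpens) ~ sex + eta,
  log(tpens) ~ sex + eta + ireg,
  log(tpens) ~ sex + eta + ireg + studio) %>% 
  map(lm, data = final) %>%   # fit one model per formula
  map(tidy) %>%               # one row per coefficient
  list_rbind(names_to = 'fmla')   # stack, keeping the formula as a column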
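
Since the question mentions do.call, the same long table can also be sketched in base R. This is only an illustration: the `final` data frame below is simulated (the values are made up, not the asker's data), and `fmlas`, `coef_tables` and `out_base` are hypothetical names chosen for the example.

```r
# Simulated stand-in for `final` (hypothetical values, for illustration only)
set.seed(1)
n <- 200
final <- data.frame(
  sex    = sample(1:2, n, replace = TRUE),
  eta    = sample(60:95, n, replace = TRUE),
  ireg   = sample(1:20, n, replace = TRUE),
  studio = sample(1:6, n, replace = TRUE)
)
final$tpens <- exp(7 - 0.4 * final$sex - 0.01 * final$eta +
                   0.15 * final$studio + rnorm(n, sd = 0.3))

fmlas <- list(
  log(tpens) ~ sex,
  log(tpens) ~ sex + eta,
  log(tpens) ~ sex + eta + ireg,
  log(tpens) ~ sex + eta + ireg + studio
)

# Fit each model and turn its coefficient matrix into a data frame
coef_tables <- lapply(fmlas, function(f) {
  cf <- coef(summary(lm(f, data = final)))
  data.frame(
    fmla = paste(deparse(f), collapse = " "),  # formula as a label
    term = rownames(cf),
    cf,                                        # Estimate, Std. Error, t value, Pr(>|t|)
    row.names = NULL,
    check.names = FALSE                        # keep the original column names
  )
})

# Stack the per-model tables into one long data frame
out_base <- do.call(rbind, coef_tables)
```

The columns keep the names used by summary(), such as `Std. Error`, rather than the snake_case names that tidy() returns.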

If we want a wide format of 'estimate', with one column per model:

library(tidyr)
out %>%
  select(fmla, term, estimate) %>% 
  pivot_wider(names_from = fmla, values_from = estimate)
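
To get closer to the layout asked for in the question (estimates with significance stars, blanks instead of NA), one possible sketch is to build a text label per coefficient before pivoting. The data below is simulated and only stands in for `final`; `wide_stars`, `stars` and `label` are hypothetical names, and the cut-offs are the conventional ones from R's summary output (0.001, 0.01, 0.05).

```r
library(dplyr)
library(purrr)   # list_rbind() needs purrr >= 1.0.0
library(broom)
library(tidyr)

# Simulated stand-in for `final` (hypothetical values, for illustration only)
set.seed(1)
n <- 200
final <- data.frame(
  sex    = sample(1:2, n, replace = TRUE),
  eta    = sample(60:95, n, replace = TRUE),
  ireg   = sample(1:20, n, replace = TRUE),
  studio = sample(1:6, n, replace = TRUE)
)
final$tpens <- exp(7 - 0.4 * final$sex - 0.01 * final$eta +
                   0.15 * final$studio + rnorm(n, sd = 0.3))

# Long table of coefficients, built as in the answer above
out <- lst(log(tpens) ~ sex, log(tpens) ~ sex + eta,
  log(tpens) ~ sex + eta + ireg,
  log(tpens) ~ sex + eta + ireg + studio) %>%
  map(lm, data = final) %>%
  map(tidy) %>%
  list_rbind(names_to = "fmla")

wide_stars <- out %>%
  mutate(
    # Map p-values to the usual star notation
    stars = case_when(
      p.value < 0.001 ~ "***",
      p.value < 0.01  ~ "**",
      p.value < 0.05  ~ "*",
      TRUE            ~ ""
    ),
    # Estimate rounded to 4 decimals, followed by its stars
    label = paste0(formatC(estimate, format = "f", digits = 4), stars)
  ) %>%
  select(fmla, term, label) %>%
  # One column per model; terms missing from a model become empty strings
  pivot_wider(names_from = fmla, values_from = label, values_fill = "")
```

Because the cells are now character strings, this table is meant for display only; keep `out` for any further computation.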
