Count number of filled out columns in R dataframe to create new column

March 20, 2023

I have a dataframe df_team which has multiple Team-Columns, of which of course not all have values. The df looks like this, just with over 100 of these Team_ columns:

dput(df_team)

structure(list(Project = c("etwbv", "werg", "sdfg", "qwreg", 
"cae", "refdc"), Team_1 = c("ewrg", "werg", "asd", "qwe", NA, 
"vsfd"), Team_URL_1 = c("abc", "bfh", "fse", "rege", NA, "vsefr"
), Team_2 = c("abc1", "bfh", "fse", "rege1", NA, NA), Team_URL_2 = c("abc", 
"bfh", "fse", "rege", NA, NA), Team_3 = c("abc1", "bfh", NA, 
NA, NA, NA), Team_URL_3 = c("abc", "bfh", NA, NA, NA, NA)), class = "data.frame", row.names = c(NA, 
-6L))

I want to create a new column Team_Count which counts the filled Team_x columns and disregards the NAs, so that I know how many team members one ““Project“` has.

The new dataframe should look like this:

Name   Team_1  Team_URL_1 Team_2  Team_URL_2  Team_3  Team_URL_3 Team_Count
etwbv  ewrg    abc        abc1    abc         abc1    abc        3
werg   werg    bfh        bfh     bfh         bfh     bfh        3
sdfg   asd     fse        fse     fse         NA      NA         2
qwreg  qwe     rege       rege1   rege        NA      NA         2
cae    NA      NA         NA      NA          NA      NA         0
refdc  vsfd    vsefr      NA      NA          NA      NA         1

>Solution :

library(dplyr)

df %>% 
  mutate(Team_Count = rowSums(!is.na(pick(matches("Team_\\d+")))))

If you are using dplyr < 1.1.0 then replace pick with across.

This works by creating a data frame with variables that match the regular expression "Team_\\d+" ("Team_" followed by one or more digits), creating logical values by testing if values are NA, and then summing the number of columns that are not NA by row.

Output

  Project Team_1 Team_URL_1 Team_2 Team_URL_2 Team_3 Team_URL_3 Team_Count
1   etwbv   ewrg        abc   abc1        abc   abc1        abc          3
2    werg   werg        bfh    bfh        bfh    bfh        bfh          3
3    sdfg    asd        fse    fse        fse   <NA>       <NA>          2
4   qwreg    qwe       rege  rege1       rege   <NA>       <NA>          2
5     cae   <NA>       <NA>   <NA>       <NA>   <NA>       <NA>          0
6   refdc   vsfd      vsefr   <NA>       <NA>   <NA>       <NA>          1