How to create, name, and populate new columns with the output of a for loop in R

I'm trying to write a loop that takes the values from some columns, applies a formula, and writes the results to new columns named after the source columns plus a suffix.

I have a data frame with 100,000 rows and 53 columns; columns 3 to 29 are used in the calculation.

Here is what I have so far:

GC05cr_h16_dat2$ln1 <- 0

for(i in 3:29) {      
  log_read <- log(GC05cr_h16_dat2[ , i] +1) /max(GC05cr_h16_dat2[ , i])
  GC05cr_h16_dat2$ln1[i] <- log_read
}

The table:

  head(GC05cr_h16_dat2)
    # A tibble: 6 × 54
        chr     start EE87893 EE87894 EE87895 EE87896 EE87897 EE87898 EE87899 EE87900 EE87901 EE87902 EE87903
          <chr>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
        1 chr3   1.45e8       4       2       4       2       4       2       5       5       4       1       4
        2 chr4   1.63e8       2       4       3       1       1       4       5       5       5       4       5
        3 chr4   3.57e7       3       5       3       1       6       6       5      10       4       6       3
        4 chr18  6.58e7       2       1       6       6       2       1       3       5       3       4       1
        5 chr10  8.43e7       5       1       4       3       1       5       0      11       2       4       8
        6 chr3   1.84e8       5       1       3       5       5       4       3       9       4       3       6
        #

The results for all columns in the loop end up in a single column named ln1.

What I expect is that the result for each column goes into its own new column, named after the source column with the suffix ln1:

EE87893ln1 EE87894ln1 EE87895ln1 ..... 

>Solution :

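Your loop writes every result into the single column ln1, because ln1[i] indexes an element of that column rather than creating a new one. You can keep most of your code: loop over the names of the columns of interest instead of an integer index, then create each new column directly using paste0: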

ee_cols <- grep("EE", names(GC05cr_h16_dat2), value = TRUE)
for (i in ee_cols) {
  # [[ ]] extracts a plain vector from a data frame or tibble
  GC05cr_h16_dat2[[paste0(i, "ln1")]] <- log(GC05cr_h16_dat2[[i]] + 1) / max(GC05cr_h16_dat2[[i]])
}

Output (you may need to scroll over because the output is wide):

    chr    start EE87893 EE87894 EE87895 EE87896 EE87897 EE87898 EE87899 EE87900 EE87901 EE87902 EE87903 EE87893ln1 EE87894ln1 EE87895ln1 EE87896ln1 EE87897ln1 EE87898ln1 EE87899ln1 EE87900ln1 EE87901ln1 EE87902ln1 EE87903ln1
1  chr3 1.45e+08       4       2       4       2       4       2       5       5       4       1       4  0.3218876  0.2197225  0.2682397  0.1831020  0.2682397  0.1831020  0.3583519  0.1628872  0.3218876  0.1155245  0.2011797
2  chr4 1.63e+08       2       4       3       1       1       4       5       5       5       4       5  0.2197225  0.3218876  0.2310491  0.1155245  0.1155245  0.2682397  0.3583519  0.1628872  0.3583519  0.2682397  0.2239699
3  chr4 3.57e+07       3       5       3       1       6       6       5      10       4       6       3  0.2772589  0.3583519  0.2310491  0.1155245  0.3243184  0.3243184  0.3583519  0.2179905  0.3218876  0.3243184  0.1732868
4 chr18 6.58e+07       2       1       6       6       2       1       3       5       3       4       1  0.2197225  0.1386294  0.3243184  0.3243184  0.1831020  0.1155245  0.2772589  0.1628872  0.2772589  0.2682397  0.0866434
5 chr10 8.43e+07       5       1       4       3       1       5       0      11       2       4       8  0.3583519  0.1386294  0.2682397  0.2310491  0.1155245  0.2986266  0.0000000  0.2259006  0.2197225  0.2682397  0.2746531
6  chr3 1.84e+08       5       1       3       5       5       4       3       9       4       3       6  0.3583519  0.1386294  0.2310491  0.2986266  0.2986266  0.2682397  0.2772589  0.2093259  0.3218876  0.2310491  0.2432388
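To try the loop without the full dataset, here is a minimal sketch on a toy data frame. The name `toy` and the two-column subset are assumptions for illustration; the values are the first two sample columns from the question, so the new columns should agree with the output above.

```r
# Hypothetical toy data: the first two sample columns from the question.
toy <- data.frame(
  chr     = c("chr3", "chr4", "chr4", "chr18", "chr10", "chr3"),
  start   = c(1.45e8, 1.63e8, 3.57e7, 6.58e7, 8.43e7, 1.84e8),
  EE87893 = c(4, 2, 3, 2, 5, 5),
  EE87894 = c(2, 4, 5, 1, 1, 1)
)

# Collect the source column names once, before any new columns exist,
# so the "...ln1" columns created below are never used as inputs.
src_cols <- grep("EE", names(toy), value = TRUE)

for (col in src_cols) {
  x <- toy[[col]]                                   # plain numeric vector
  toy[[paste0(col, "ln1")]] <- log(x + 1) / max(x)  # one new column per source
}

names(toy)
round(toy$EE87893ln1, 7)
```

Because the pattern "EE" also matches the new "...ln1" names, running the loop a second time on the same object would pick those up as well; rebuilding the data or using a stricter pattern avoids that.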

If you want to specify the columns by position instead (i.e., columns 3 through 29), just use:

for (i in names(GC05cr_h16_dat2)[3:29]){...}
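As a hedged illustration of the position-based variant, the sketch below uses a made-up data frame `df` in which positions 3:5 play the role of 3:29 in the question. The column names `s1` to `s3` are hypothetical.

```r
# Hypothetical data frame: two ID columns followed by three count columns.
df <- data.frame(
  id  = c("a", "b", "c"),
  pos = c(10, 20, 30),
  s1  = c(0, 1, 3),
  s2  = c(2, 2, 4),
  s3  = c(5, 0, 1)
)

# names(df)[3:5] is evaluated once, when the loop starts, so the columns
# appended inside the loop do not change which columns are processed.
for (i in names(df)[3:5]) {
  df[[paste0(i, "ln1")]] <- log(df[[i]] + 1) / max(df[[i]])
}

ncol(df)   # 5 original columns plus 3 new ones
names(df)
```

Translating positions to names up front keeps the body of the loop identical to the name-based version.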

You could also use lapply for this (shown here for columns 3 to 13, the ones in the sample data; use 3:29 for the full data):

GC05cr_h16_dat2[paste0(names(GC05cr_h16_dat2)[3:13], "ln1")] <- 
  lapply(GC05cr_h16_dat2[3:13], function(x) log(x + 1) / max(x))
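One thing to watch with either approach: `max(x)` returns NA if a column contains any missing value, which would turn the whole new column into NA. The sketch below is one possible way to handle that, assuming you want missing values ignored when taking the maximum. The helper name `scaled_log` and the data frame `df` are hypothetical and not part of the original answer.

```r
# Hypothetical helper: same formula as above, but missing values are
# ignored when computing the column maximum.
scaled_log <- function(x) log(x + 1) / max(x, na.rm = TRUE)

# Hypothetical data with one missing value in s1.
df <- data.frame(
  id = c("a", "b", "c"),
  s1 = c(1, NA, 3),
  s2 = c(2, 4, 0)
)

cols <- c("s1", "s2")
df[paste0(cols, "ln1")] <- lapply(df[cols], scaled_log)

df$s1ln1   # the NA row stays NA; the other rows are still computed
```

A column whose maximum is 0 would still produce NaN (0 divided by 0), so that case may need separate handling depending on your data.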