Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to create a cross tabulation table between two variables with the counts in R?

I have a data frame with two columns that I want to cross tabulate. The data also includes the counts for the combination. I am trying to create the cross table and include those counts within the table. I am struggling to use the counts from the dataframe into the cross table.

> df %>% arrange(d1)%>%  head()
  count d1 d2
1     3  1 15
2    86  1 14
3    13  1 12
4   186  1 16
5    29  1  9
6    86  1 13


> table(df$d1,df$d2)
    
     1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
  1  0 1 1 1 1 1 1 1 1  1  1  1  1  1  1  1  1
  2  1 0 1 1 1 1 1 1 1  1  1  1  1  1  1  1  1
  3  1 1 0 1 1 1 1 1 1  1  1  1  1  1  1  1  1
  

Expecting [1,15] and [1,14] to show 3, 86 based on the counts in df table.
Right now it shows 0s and 1s only based on if the combinations exists.

Here is my sample data:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

structure(list(count = c(37L, 6L, 44L, 21L, 8L, 3L, 9L, 17L, 
13L, 32L, 106L, 34L, 505L, 173L, 12L, 2L, 4L, 45L, 3L, 43L, 5L, 
16L, 1L, 27L, 17L, 3L, 4L, 1L, 27L, 86L, 79L, 10L, 161L, 32L, 
3L, 209L, 9L, 83L, 23L, 108L, 161L, 22L, 4L, 16L, 2L, 6L, 67L, 
86L, 3L, 1L, 14L, 14L, 111L, 5L, 5L, 44L, 105L, 13L, 269L, 186L, 
3L, 5L, 5L, 27L, 3L, 186L, 58L, 29L, 34L, 43L, 8L, 92L, 9L, 455L, 
22L, 32L, 4L, 14L, 58L, 22L, 190L, 94L, 27L, 152L, 264L, 36L, 
1L, 505L, 86L, 44L, 3L, 1L, 79L, 75L, 12L, 32L, 11L, 197L, 90L, 
269L, 9L, 6L, 47L, 14L, 158L, 303L, 335L, 37L, 33L, 3L, 83L, 
15L, 31L, 124L, 146L, 26L, 36L, 27L, 37L, 31L, 108L, 121L, 111L, 
11L, 5L, 26L, 166L, 11L, 18L, 11L, 8L, 15L, 18L, 165L, 80L, 14L, 
5L, 3L, 492L, 7L, 90L, 146L, 130L, 197L, 165L, 34L, 22L, 122L, 
29L, 74L, 455L, 303L, 45L, 5L, 173L, 33L, 24L, 229L, 79L, 43L, 
68L, 16L, 10L, 73L, 35L, 99L, 229L, 94L, 23L, 492L, 18L, 84L, 
92L, 86L, 35L, 31L, 1L, 23L, 8L, 121L, 1L, 173L, 400L, 124L, 
20L, 11L, 6L, 3L, 166L, 84L, 31L, 122L, 15L, 24L, 70L, 43L, 74L, 
209L, 45L, 158L, 44L, 15L, 37L, 35L, 27L, 68L, 20L, 15L, 11L, 
21L, 4L, 18L, 44L, 234L, 80L, 10L, 44L, 4L, 47L, 7L, 67L, 10L, 
3L, 173L, 99L, 79L, 130L, 3L, 75L, 1L, 335L, 14L, 106L, 15L, 
34L, 190L, 152L, 16L, 73L, 45L, 1L, 3L, 264L, 160L, 23L, 1L, 
160L, 400L, 105L, 234L, 70L, 35L), d1 = c(10L, 17L, 5L, 3L, 12L, 
1L, 10L, 10L, 12L, 7L, 14L, 6L, 16L, 3L, 7L, 9L, 7L, 13L, 4L, 
8L, 9L, 2L, 7L, 16L, 8L, 15L, 12L, 12L, 2L, 1L, 16L, 15L, 14L, 
5L, 8L, 14L, 11L, 11L, 4L, 4L, 13L, 7L, 12L, 11L, 17L, 8L, 4L, 
13L, 15L, 15L, 12L, 13L, 4L, 5L, 5L, 5L, 2L, 1L, 2L, 1L, 2L, 
13L, 12L, 5L, 3L, 16L, 10L, 1L, 14L, 2L, 7L, 9L, 15L, 16L, 3L, 
11L, 8L, 12L, 9L, 9L, 14L, 11L, 8L, 11L, 16L, 10L, 17L, 6L, 1L, 
3L, 5L, 1L, 3L, 11L, 10L, 14L, 5L, 3L, 6L, 16L, 15L, 15L, 4L, 
14L, 14L, 16L, 16L, 8L, 3L, 7L, 1L, 15L, 6L, 11L, 6L, 5L, 1L, 
15L, 2L, 7L, 14L, 2L, 13L, 10L, 6L, 1L, 3L, 15L, 2L, 3L, 9L, 
7L, 11L, 3L, 10L, 16L, 17L, 7L, 3L, 15L, 1L, 2L, 10L, 13L, 4L, 
5L, 8L, 4L, 9L, 16L, 13L, 4L, 10L, 17L, 6L, 8L, 7L, 11L, 8L, 
9L, 16L, 7L, 14L, 9L, 4L, 3L, 13L, 4L, 8L, 16L, 8L, 6L, 14L, 
14L, 9L, 13L, 17L, 12L, 10L, 1L, 17L, 11L, 16L, 2L, 1L, 7L, 14L, 
12L, 2L, 9L, 8L, 6L, 4L, 13L, 9L, 6L, 5L, 6L, 12L, 11L, 4L, 2L, 
14L, 12L, 11L, 7L, 8L, 6L, 1L, 12L, 9L, 12L, 5L, 3L, 6L, 15L, 
13L, 8L, 10L, 4L, 1L, 13L, 17L, 13L, 1L, 10L, 14L, 17L, 9L, 2L, 
10L, 17L, 2L, 12L, 5L, 3L, 6L, 7L, 3L, 16L, 15L, 5L, 9L, 2L, 
6L, 5L, 13L, 11L, 4L, 6L, 13L, 4L), d2 = c(2L, 14L, 4L, 12L, 
10L, 15L, 15L, 8L, 1L, 14L, 2L, 5L, 6L, 11L, 10L, 17L, 8L, 10L, 
17L, 6L, 5L, 7L, 15L, 15L, 10L, 1L, 9L, 17L, 5L, 14L, 8L, 14L, 
13L, 11L, 5L, 6L, 15L, 1L, 8L, 14L, 14L, 3L, 8L, 7L, 9L, 15L, 
1L, 1L, 2L, 5L, 13L, 12L, 13L, 12L, 9L, 3L, 4L, 12L, 16L, 16L, 
15L, 17L, 5L, 2L, 17L, 1L, 9L, 9L, 5L, 9L, 9L, 14L, 11L, 13L, 
7L, 5L, 12L, 14L, 10L, 8L, 3L, 4L, 11L, 6L, 9L, 1L, 1L, 16L, 
13L, 5L, 8L, 17L, 10L, 9L, 7L, 7L, 10L, 13L, 1L, 2L, 10L, 8L, 
10L, 12L, 11L, 4L, 10L, 14L, 8L, 12L, 11L, 6L, 7L, 2L, 2L, 1L, 
10L, 16L, 10L, 6L, 4L, 1L, 4L, 5L, 17L, 5L, 2L, 3L, 8L, 15L, 
7L, 4L, 12L, 4L, 6L, 17L, 6L, 5L, 16L, 4L, 6L, 6L, 14L, 3L, 3L, 
14L, 9L, 6L, 1L, 5L, 16L, 16L, 13L, 13L, 13L, 3L, 13L, 13L, 16L, 
2L, 7L, 2L, 15L, 3L, 12L, 1L, 11L, 11L, 4L, 3L, 2L, 9L, 9L, 1L, 
4L, 8L, 12L, 6L, 12L, 2L, 2L, 3L, 11L, 11L, 8L, 1L, 17L, 7L, 
3L, 6L, 13L, 4L, 7L, 7L, 13L, 8L, 16L, 14L, 16L, 14L, 5L, 12L, 
8L, 4L, 8L, 16L, 1L, 15L, 7L, 3L, 12L, 11L, 13L, 6L, 10L, 13L, 
5L, 7L, 4L, 15L, 4L, 15L, 4L, 6L, 3L, 3L, 10L, 3L, 11L, 17L, 
16L, 16L, 14L, 2L, 6L, 14L, 11L, 11L, 9L, 12L, 7L, 7L, 16L, 13L, 
12L, 15L, 2L, 16L, 2L, 3L, 9L, 9L)), row.names = c(NA, 252L), class = "data.frame")

>Solution :

xtabs may be useful here

> xtabs(count ~ d1 + d2, df)
    d2
d1     1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
  1    0 121  99  67  26  90  11  20  29  36  83  13  86  86   3 186   1
  2  121   0 166 105  27 146  16  18  43  37 124  15 160 106   3 269   1
  3   99 166   0 165  44 234  22  33  73  79 173  21 197 190  11 492   3
  4   67 105 165   0  44 122  15  23  35  47  94  35 111 108   7 303   3
  5   26  27  44  44   0  34   3   3   5  11  32   5  44  34   1  74   0
  6   90 146 234 122  34   0  31  43  84  80 152  23 173 209  15 505   5
  7   11  16  22  15   3  31   0   4   8  12  16   3  24  32   1  68   0
  8   20  18  33  23   3  43   4   0  22  17  27   4  31  37   6  79   0
  9   29  43  73  35   5  84   8  22   0  58  75   4  70  92   0 264   2
  10  36  37  79  47  11  80  12  17  58   0   0   8  45 130   9 335   0
  11  83 124 173  94  32 152  16  27  75   0   0  18 229 158   9 400   0
  12  13  15  21  35   5  23   3   4   4   8  18   0  14  14   0  45   1
  13  86 160 197 111  44 173  24  31  70  45 229  14   0 161  10 455   5
  14  86 106 190 108  34 209  32  37  92 130 158  14 161   0  10   0   6
  15   3   3  11   7   1  15   1   6   0   9   9   0  10  10   0  27   0
  16 186 269 492 303  74 505  68  79 264 335 400  45 455   0  27   0  14
  17   1   1   3   3   0   5   0   0   2   0   0   1   5   6   0  14   0

Convert to data.frame if required

as.data.frame.matrix(xtabs(count ~ d1 + d2, df)) 
   1   2   3   4  5   6  7  8   9  10  11 12  13  14 15  16 17
1    0 121  99  67 26  90 11 20  29  36  83 13  86  86  3 186  1
2  121   0 166 105 27 146 16 18  43  37 124 15 160 106  3 269  1
3   99 166   0 165 44 234 22 33  73  79 173 21 197 190 11 492  3
4   67 105 165   0 44 122 15 23  35  47  94 35 111 108  7 303  3
5   26  27  44  44  0  34  3  3   5  11  32  5  44  34  1  74  0
6   90 146 234 122 34   0 31 43  84  80 152 23 173 209 15 505  5
7   11  16  22  15  3  31  0  4   8  12  16  3  24  32  1  68  0
8   20  18  33  23  3  43  4  0  22  17  27  4  31  37  6  79  0
9   29  43  73  35  5  84  8 22   0  58  75  4  70  92  0 264  2
10  36  37  79  47 11  80 12 17  58   0   0  8  45 130  9 335  0
11  83 124 173  94 32 152 16 27  75   0   0 18 229 158  9 400  0
12  13  15  21  35  5  23  3  4   4   8  18  0  14  14  0  45  1
13  86 160 197 111 44 173 24 31  70  45 229 14   0 161 10 455  5
14  86 106 190 108 34 209 32 37  92 130 158 14 161   0 10   0  6
15   3   3  11   7  1  15  1  6   0   9   9  0  10  10  0  27  0
16 186 269 492 303 74 505 68 79 264 335 400 45 455   0 27   0 14
17   1   1   3   3  0   5  0  0   2   0   0  1   5   6  0  14  0

Or may use dcast

library(data.table)
 dcast(df, d1 ~ d2, value.var = 'count')
Key: <d1>
       d1     1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16    17
    <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
 1:     1    NA   121    99    67    26    90    11    20    29    36    83    13    86    86     3   186     1
 2:     2   121    NA   166   105    27   146    16    18    43    37   124    15   160   106     3   269     1
 3:     3    99   166    NA   165    44   234    22    33    73    79   173    21   197   190    11   492     3
 4:     4    67   105   165    NA    44   122    15    23    35    47    94    35   111   108     7   303     3
 5:     5    26    27    44    44    NA    34     3     3     5    11    32     5    44    34     1    74    NA
 6:     6    90   146   234   122    34    NA    31    43    84    80   152    23   173   209    15   505     5
 7:     7    11    16    22    15     3    31    NA     4     8    12    16     3    24    32     1    68    NA
 8:     8    20    18    33    23     3    43     4    NA    22    17    27     4    31    37     6    79    NA
 9:     9    29    43    73    35     5    84     8    22    NA    58    75     4    70    92    NA   264     2
10:    10    36    37    79    47    11    80    12    17    58    NA    NA     8    45   130     9   335    NA
11:    11    83   124   173    94    32   152    16    27    75    NA    NA    18   229   158     9   400    NA
12:    12    13    15    21    35     5    23     3     4     4     8    18    NA    14    14    NA    45     1
13:    13    86   160   197   111    44   173    24    31    70    45   229    14    NA   161    10   455     5
14:    14    86   106   190   108    34   209    32    37    92   130   158    14   161    NA    10    NA     6
15:    15     3     3    11     7     1    15     1     6    NA     9     9    NA    10    10    NA    27    NA
16:    16   186   269   492   303    74   505    68    79   264   335   400    45   455    NA    27    NA    14
17:    17     1     1     3     3    NA     5    NA    NA     2    NA    NA     1     5     6    NA    14    NA
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading