Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Cross table of several columns in R

I have this dataset called x for example:

client_id a b c
1 1 0 0
2 0 1 1
3 0 0 1
4 1 1 1

I then want to create another table, that calculates the number of times each column from a to c had 1, in combination of other columns having 1 too.

So the result would be this table:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

a b c
a 1 1 1
b 1 0 2
c 1 2 1

So the first line of the second table would read like:
How many times there was only column a having a 1? it’s once only.
How many times there was 1 in both a and b? it’s only once.
How many times there was 1 in both a and c? it happened only once.

How can I do this easily in R? note that I only want to focus on when the column had 1.

>Solution :

You can use crossprod to build the off-diagonal entries. For later referencing, this is sometimes called a co-occurrence matrix.

To create the diagonal, it’s a bit more tricky: find rows that sums up to 1 and apply a column-wise sum.

mat <- as.matrix(df[-1])
out <- crossprod(mat)  # Same as: t(mat) %*% mat
diag(out) <- colSums(mat[rowSums(mat) == 1, ])
out

#   a b c
# a 1 1 1
# b 1 0 2
# c 1 2 1
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading