Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to transform table data to dataframe in R

I have csv file containing 3 columns: name, position and plays (plays has value 1 if player plays at that position, otherwise has value 0). One player can play more than one position. There are no two players with the same name. Sample data:

Brian,goalkeeper,0
Brian,fullback,1
Brian,playmaker,1
Brian,centerback,0
Brian,striker,0
Teddy,goalkeeper,0
Teddy,fullback,0
Teddy,playmaker,1
Teddy,centerback,1
Teddy,striker,1
Alan,goalkeeper,1
Alan,fullback,0
Alan,playmaker,0
Alan,centerback,1
Alan,striker,0

I want to transform this data to data frame, to have names in rows, positions as column names, and values 0/1 as data in cells.

players <-read.csv("players.csv", sep=",", stringsAsFactors = FALSE)

command

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

table1 <-    table(players$name,players$position,players$plays)

gives me two tables

        centerback fullback goalkeeper playmaker striker
  Alan           0        1          0         1       1
  Brian          1        0          1         0       1
  Teddy          0        1          1         0       0

and 
        centerback fullback goalkeeper playmaker striker
  Alan           1        0          1         0       0
  Brian          0        1          0         1       0
  Teddy          1        0          0         1       1

(Why is table function doing this?, second one is correct for me). How is it possible to transform table1 to data frame, to have player names in rows, positions as column names, and values 0/1 as data in cells?

>Solution :

table is giving you an 3-D array because you have 3-variables to take into consideration, You need to drop one of them, ie only subset to get where plays == 1 then do a table on that.

as.data.frame.matrix(table(subset(players, plays == 1, -plays)))
      centerback fullback goalkeeper playmaker striker
Alan           1        0          1         0       0
Brian          0        1          0         1       0
Teddy          1        0          0         1       1

Or you could also use xtabs as below:

as.data.frame.matrix(xtabs(plays~name + position, players))
      centerback fullback goalkeeper playmaker striker
Alan           1        0          1         0       0
Brian          0        1          0         1       0
Teddy          1        0          0         1       1

If you need the names as a column, use reshape:

reshape(df, timevar = 'position', idvar = 'name', direction = 'wide', sep='_')
    name plays_goalkeeper plays_fullback plays_playmaker plays_centerback plays_striker
1  Brian                0              1               1                0             0
6  Teddy                0              0               1                1             1
11  Alan                1              0               0                1             0
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading