Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to turn the item in the column to multiple columns?

I am trying to do this.
So, currently my df look like this.

col_names = ['movie_id', 'movie_title', 'genres']
df = pd.read_csv('/content/drive/MyDrive/testing/ml-1m/movies.csv', sep='::', names=col_names, encoding='latin-1')

1::Toy Story (1995)::Animation|Children's|Comedy
2::Jumanji (1995)::Adventure|Children's|Fantasy
3::Grumpier Old Men (1995)::Comedy|Romance
4::Waiting to Exhale (1995)::Comedy|Drama
5::Father of the Bride Part II (1995)::Comedy
6::Heat (1995)::Action|Crime|Thriller
7::Sabrina (1995)::Comedy|Romance
8::Tom and Huck (1995)::Adventure|Children's
9::Sudden Death (1995)::Action
10::GoldenEye (1995)::Action|Adventure|Thriller

while the columns are such "movie_id, name, and genre".

I want it to look like this.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

The columns would be movie_id, name,Action,Adventure,Animation,Children’s,Comedy,Crime,Documentary,Drama,Fantasy,Film-Noir,Horror,Musical,Mystery,Romance,Sci-Fi,Thriller,War,Western


1::Toy Story (1995)::0::0::1::1::1::0::0::0::0::0::0::0::0::0::0::0::0::0::0
2::Jumanji (1995)::0::1::0::1::0::0::0::0::1::0::0::0::0::0::0::0::0::0::0
.
.
...

Basically, I want to turn genre column to multiple columns and 1 will be there if it match the columns.

Is there anyway to do this with pandas?

>Solution :

IIUC you can do it like that:

res = (df
       .join(df['genre'].str.get_dummies())
       .drop('genre',axis=1))

print(res)

print(res)

   movie_id                                name  Action  Adventure  Animation  Children's  Comedy  Crime  Drama  Fantasy  Romance  Thriller
0         1                    Toy Story (1995)       0          0          1           1       1      0      0        0        0         0
1         2                      Jumanji (1995)       0          1          0           1       0      0      0        1        0         0
2         3             Grumpier Old Men (1995)       0          0          0           0       1      0      0        0        1         0
3         4            Waiting to Exhale (1995)       0          0          0           0       1      0      1        0        0         0
4         5  Father of the Bride Part II (1995)       0          0          0           0       1      0      0        0        0         0
5         6                         Heat (1995)       1          0          0           0       0      1      0        0        0         1
6         7                      Sabrina (1995)       0          0          0           0       1      0      0        0        1         0
7         8                 Tom and Huck (1995)       0          1          0           1       0      0      0        0        0         0
8         9                 Sudden Death (1995)       1          0          0           0       0      0      0        0        0         0
9        10                    GoldenEye (1995)       1          1          0           0       0      0      0        0        0         1

If you don’t want to delete the original genre column, just remove the last part with drop

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading