The elements in the dictionary are constantly updated. How can I prevent this?

Here is the full code:

import re

f = open('movies.item','r') 
# First three item of movies.item below:
#1|Toy Story (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Toy%20Story%20(1995)|0|0|0|1|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0 
#2|GoldenEye (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?GoldenEye%20(1995)|0|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0|1|0|0
#3|Four Rooms (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Four%20Rooms%20(1995)|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|1|0|0
empty_list = []
for items in f:
    new_item = re.sub(r'\n', '', items)
    empty_list.append(new_item)
movie_names = []
splitted_list = None
for i in range(len(empty_list)):

    splitted_list = empty_list[i].split("|")
    movie_names.append(splitted_list[1])

genres = ["Unknown", "Action", "Adventure", "Animation", "Children's","Comedy", "Crime", "Documentary", "Drama",
"Fantasy", "Film-Noir", "Horror", "Musical", "Mystery","Romance", "Sci-Fi", "Thriller", "War", "Western"]
genres.reverse()
genredict = {}
last_dict = {}
reversegenresum = []
for i in range(len(empty_list)):
    x = list(empty_list[i])
    claer_list = []
    for k in range(len(x)):
        if x[k] != "|":
            claer_list.append(x[k])

    claer_list.reverse()
    reverse_genre_data = claer_list[0:19]
    reversegenresum.append(reverse_genre_data)
    

for i in range(3): #trying for 3 movie
    for j in range(len(genres)): 
        if reversegenresum[i][j] == '1':      
            genredict[genres[j]] = '1'
        last_dict[movie_names[i]] = genredict    


print(last_dict) 

What am I trying to do?
I am trying to match data from a file named ‘movies.item’. It contains movies and their genre flags, such as ‘0|0|1|0’. If a flag is equal to 1, I need to match it with the corresponding category. This only works when I process a single movie. When I process several, I get no error, but every movie's entry ends up shaped by the last data processed. If you don’t understand what I mean, please copy the code and run it yourself.

Current output:

{'Toy Story (1995)': {'Comedy': '1', "Children's": '1', 'Animation': '1', 'Thriller': '1', 'Adventure': '1', 'Action': '1'}, 
 'GoldenEye (1995)': {'Comedy': '1', "Children's": '1', 'Animation': '1', 'Thriller': '1', 'Adventure': '1', 'Action': '1'}, 
 'Four Rooms (1995)': {'Comedy': '1', "Children's": '1', 'Animation': '1', 'Thriller': '1', 'Adventure': '1', 'Action': '1'}}  

What I want:

{'Toy Story (1995)': {'Animation': 1, "Children's": 1, 'Comedy': 1}, 
 'GoldenEye (1995)': {'Action': 1, 'Adventure': 1, 'Thriller': 1}, 
 'Four Rooms (1995)': {'Thriller': 1}}

Solution:

Problem

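You keep saving the exact same dict, genredict, for every film. Each key in last_dict then refers to that one object, so a genre added for any film shows up under all of them: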

last_dict[movie_names[i]] = genredict  
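
The aliasing can be reproduced in isolation. The snippet below is only a minimal sketch with made-up names (`shared`, `films`) that do not appear in the question: assignment stores a reference to the dict, not a copy, so both keys end up pointing at one object.

```python
# Hypothetical names, for illustration only.
shared = {}
films = {}

films["A"] = shared       # stores a reference to `shared`, not a copy
shared["Comedy"] = "1"
films["B"] = shared       # the very same object again
shared["Action"] = "1"

# Both keys see every update because they point at one dict.
print(films["A"] is films["B"])   # True
print(films["A"])                 # {'Comedy': '1', 'Action': '1'}
```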

Simple fix

Create a new dict for each film and assign it once, after the j loop. That is enough:

for i in range(3):  
    genredict = {}
    for j in range(len(genres)):
        if reversegenresum[i][j] == '1':
            genredict[genres[j]] = '1'
    last_dict[movie_names[i]] = genredict
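
To check the fix without the data file, here is a self-contained sketch. It keeps the question's variable names, but `reversed_flags` is a hypothetical helper that stands in for the file parsing: it builds the 19 flags for a film from the positions that are set, then reverses them the way the question's code does.

```python
genres = ["Unknown", "Action", "Adventure", "Animation", "Children's", "Comedy",
          "Crime", "Documentary", "Drama", "Fantasy", "Film-Noir", "Horror",
          "Musical", "Mystery", "Romance", "Sci-Fi", "Thriller", "War", "Western"]
genres.reverse()  # the question reads the flags from the end of each line

movie_names = ["Toy Story (1995)", "GoldenEye (1995)", "Four Rooms (1995)"]


def reversed_flags(*on):
    """Hypothetical helper: 19 flags in file order with positions `on` set, reversed."""
    flags = ["1" if k in on else "0" for k in range(19)]
    flags.reverse()
    return flags


# Stand-in for the data parsed from the three sample lines.
reversegenresum = [reversed_flags(3, 4, 5), reversed_flags(1, 2, 16), reversed_flags(16)]

last_dict = {}
for i in range(len(movie_names)):
    genredict = {}  # a fresh dict for every film
    for j in range(len(genres)):
        if reversegenresum[i][j] == '1':
            genredict[genres[j]] = '1'
    last_dict[movie_names[i]] = genredict  # assigned once, after the j loop

print(last_dict)
```

Each film now owns a separate dict, so adding a genre to one film no longer affects the others.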

Improve

You basically have four loops that all iterate over the same thing: the films. Instead of applying each step to all the films in turn, apply all the steps to one film at a time:

genres = ["Unknown", "Action", "Adventure", "Animation", "Children's", "Comedy", "Crime", "Documentary", "Drama",
          "Fantasy", "Film-Noir", "Horror", "Musical", "Mystery", "Romance", "Sci-Fi", "Thriller", "War", "Western"]
result = {}
with open('movies.item', 'r') as f:
    for items in f:
        # The fourth field is empty in the sample lines, so the URL is the fifth field.
        index, name, date, _, url, *values = items.rstrip("\n").split("|")
        item_genre = dict(zip(genres, values))
        result[name] = {genre: value for genre, value in item_genre.items() if value == '1'}
  • Split the line once and unpack everything you need: the name and the genre values.
  • dict(zip( , )) pairs each genre with its value.
  • {genre: value for genre, value in item_genre.items() if value == '1'} keeps only the genres flagged 1.
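
The three steps can be tried one at a time on a single line. This is a sketch on shortened, hypothetical data: only three genres and three flags are used, and the URL is a placeholder.

```python
# Shortened, made-up line: real lines carry 19 genre flags.
line = "2|GoldenEye (1995)|01-Jan-1995||http://example.invalid|0|1|1\n"
genres = ["Unknown", "Action", "Adventure"]  # shortened list, illustration only

# Step 1: split once; the starred name collects all remaining fields.
index, name, date, _, url, *values = line.rstrip("\n").split("|")
print(name)        # GoldenEye (1995)
print(values)      # ['0', '1', '1']

# Step 2: pair each genre with its flag.
item_genre = dict(zip(genres, values))
print(item_genre)  # {'Unknown': '0', 'Action': '1', 'Adventure': '1'}

# Step 3: keep only the genres flagged '1'.
kept = {genre: value for genre, value in item_genre.items() if value == '1'}
print(kept)        # {'Action': '1', 'Adventure': '1'}
```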

Note that the final line would be better written as follows. You don’t need a dict whose values are all the same ('1'), so just keep a list of genres:

result[name] = [genre for genre, value in item_genre.items() if value == '1']

# {'Toy Story (1995)': ['Animation', "Children's", 'Comedy'], 'GoldenEye (1995)': ['Action', 'Adventure', 'Thriller'], 'Four Rooms (1995)': ['Thriller']}
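
To run the list version without the data file, the sketch below feeds the three sample lines from the question through an in-memory `io.StringIO` object in place of 'movies.item'. The function name `parse_movies` and the underscore-prefixed field names are assumptions made for illustration; blank lines are skipped as a precaution.

```python
import io

genres = ["Unknown", "Action", "Adventure", "Animation", "Children's", "Comedy",
          "Crime", "Documentary", "Drama", "Fantasy", "Film-Noir", "Horror",
          "Musical", "Mystery", "Romance", "Sci-Fi", "Thriller", "War", "Western"]

# In-memory stand-in for movies.item: the three sample lines from the question.
sample = io.StringIO(
    "1|Toy Story (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Toy%20Story%20(1995)|0|0|0|1|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0\n"
    "2|GoldenEye (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?GoldenEye%20(1995)|0|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0|1|0|0\n"
    "3|Four Rooms (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Four%20Rooms%20(1995)|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|1|0|0\n"
)


def parse_movies(lines):
    """Map each movie name to the list of genres flagged '1'."""
    result = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        # Five leading fields (the fourth is empty in the samples), then the genre flags.
        _index, name, _date, _empty, _url, *values = line.rstrip("\n").split("|")
        result[name] = [genre for genre, value in zip(genres, values) if value == '1']
    return result


result = parse_movies(sample)
print(result)
```

Because `parse_movies` accepts any iterable of lines, an open file object can be passed in the same way as the in-memory sample.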