Extracting dictionary values into .txt files

I want to create .txt files from a dictionary, writing each 'text' value on its own line of the file. The dictionary structure looks like this:

{'id': 0,
 'text': 'Mtendere Village was inspired by the vision'}

I am using this code:

from tqdm.auto import tqdm  #loading bar

text_data = []
file_count = 0

for sample in tqdm(new_dict):
    # remove newline characters from each sample, as we need to use them exclusively as separators
    sample = sample['text'].replace('\n', '\s')
    text_data.append(sample)
    if len(text_data) == 5_000:
        # once we hit the 5K mark, save to file
        with open('file_path\oscar_data\oscar_%s.txt' %file_count, 'w', encoding='utf-8') as fp:
            fp.write('\n'.join(text_data)) 
        text_data = []
        file_count += 1

However, this gives me an error:

---> 12     sample = sample['text'].replace('\n', '\s') 
TypeError: 'int' object is not subscriptable

Although I understand what the error is telling me, I’m not sure how to correct it…

Solution:

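I think you meant to pass a list of dictionaries to the loop, but actually passed a dictionary. Iterating over a dictionary yields its keys, not its values, so if the keys are integers, sample is an int and sample['text'] raises this TypeError.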

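from tqdm.auto import tqdm  # progress bar

# new_dict must be a list of dictionaries, one per sample
new_dict = [
    {
        'id': 0,
        'text': 'Mtendere Village was inspired by the vision'
    }
]

text_data = []
file_count = 0

for sample in tqdm(new_dict):
    # replace newlines inside each sample with a space, since '\n' is used only as the separator
    # ('\s' is not a Python escape sequence and would insert a literal backslash and "s")
    sample = sample['text'].replace('\n', ' ')
    text_data.append(sample)
    if len(text_data) == 5000:
        # once we hit the 5K mark, save the batch to a file
        # (raw string so the backslashes in the Windows path are kept as-is)
        with open(r'file_path\oscar_data\oscar_%s.txt' % file_count, 'w', encoding='utf-8') as fp:
            fp.write('\n'.join(text_data))

        text_data = []
        file_count += 1

# write any samples left over after the last full batch
if text_data:
    with open(r'file_path\oscar_data\oscar_%s.txt' % file_count, 'w', encoding='utf-8') as fp:
        fp.write('\n'.join(text_data))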

I have updated new_dict to a list of dictionaries and it fixed the issue.
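
If new_dict really is a dictionary keyed by integer ids (an assumption, though it would explain the 'int' in the traceback), another option is to loop over its values instead of rebuilding it as a list. Below is a minimal sketch of that idea. The function name write_text_chunks and the out_dir and chunk_size parameters are hypothetical and not part of the original code. It also writes the final, partially filled batch.

```python
from pathlib import Path


def write_text_chunks(records, out_dir, chunk_size=5000):
    """Write each record's 'text' to numbered .txt files, one sample per line.

    Hypothetical helper: `records` may be a list of dicts or a dict of dicts.
    """
    # Iterating over a dict yields its keys, so use the values when a dict is passed.
    if isinstance(records, dict):
        records = records.values()

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []  # paths of the files created so far
    batch = []    # samples waiting to be written

    def flush():
        # The file index is simply the number of files already written.
        path = out_dir / f"oscar_{len(written)}.txt"
        path.write_text("\n".join(batch), encoding="utf-8")
        written.append(path)
        batch.clear()

    for record in records:
        # Newlines inside a sample become spaces so '\n' only separates samples.
        batch.append(record["text"].replace("\n", " "))
        if len(batch) == chunk_size:
            flush()

    # Write whatever is left after the last full batch.
    if batch:
        flush()
    return written
```

Calling write_text_chunks(new_dict, 'oscar_data') would then work whether new_dict is a list of samples or a dictionary of samples.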
