Vectorized alternative for itertuples using file.write()

Suppose we have a pandas dataframe:

import pandas as pd

data = pd.DataFrame({'columnNM': ['Jerry', 'Bob', 'Phil', 'Bill', 'Mickey', 'Pigpen', 'Robert'], 
                     'columnNM2': ['John', 'Tom', 'Donna', 'Keith', 'Brent', 'Vince', 'Bruce']})

Also suppose we have an open file we are writing to, something opened using:

file = open('myPathExample', 'w')

I want to perform comparison operations and control flow on the data and write the results to that file. A simple example would be:

for row in data.itertuples():
    file.write('%s was friends with %s \n' %(row.columnNM, row.columnNM2))

I am a beginner in Python, and I have read in many places that looping or iterating over rows in a pandas DataFrame is not ideal, especially for large datasets. I don't know enough to understand the full details of why.

Is a good vectorized alternative to itertuples for this example even possible? If so, what is it?

Solution:

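The vectorized alternative would be to build a single string and write to the file once (note that, unlike the loop above, this does not put a trailing space before each newline or a newline after the last line):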

file.write('\n'.join(data['columnNM']+' was friends with '+data['columnNM2']))
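For reference, here is a minimal self-contained sketch of the single-write approach. It assumes you want the same output as the original loop (a trailing ' \n' on every line), so it appends that suffix to each element and joins with an empty string. The temporary directory is only an illustrative assumption so the snippet can run anywhere; in practice you would use your own path.

```python
import os
import tempfile

import pandas as pd

data = pd.DataFrame({'columnNM': ['Jerry', 'Bob', 'Phil', 'Bill', 'Mickey', 'Pigpen', 'Robert'],
                     'columnNM2': ['John', 'Tom', 'Donna', 'Keith', 'Brent', 'Vince', 'Bruce']})

# Element-wise string concatenation over whole columns (no explicit row loop).
# The ' \n' suffix mirrors what the itertuples loop in the question writes.
lines = data['columnNM'] + ' was friends with ' + data['columnNM2'] + ' \n'

# Hypothetical output location in a temporary directory, for illustration only.
path = os.path.join(tempfile.mkdtemp(), 'myPathExample')

# The with-block closes the file automatically, even if an error occurs.
with open(path, 'w') as file:
    file.write(''.join(lines))

# Read the file back to inspect what was written.
with open(path) as file:
    written = file.read()
```

Building one string and calling write() once avoids one Python-level call per row, which is usually where most of the loop overhead comes from.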

Or, if you want to keep the loop:

for line in (data['columnNM']+' was friends with '+data['columnNM2']+' \n'):
    file.write(line)
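The question also mentions comparison operations and control flow. A rough sketch of how an if/else per row might be expressed on whole columns is shown below. The condition (names starting with 'B') and the alternative phrase ' played with ' are made up purely for illustration and are not part of the original question.

```python
import pandas as pd

data = pd.DataFrame({'columnNM': ['Jerry', 'Bob', 'Phil', 'Bill', 'Mickey', 'Pigpen', 'Robert'],
                     'columnNM2': ['John', 'Tom', 'Donna', 'Keith', 'Brent', 'Vince', 'Bruce']})

# Hypothetical condition, for illustration only: names that start with 'B'.
mask = data['columnNM'].str.startswith('B')

# Choose a phrase for every row from the boolean mask,
# instead of branching with if/else inside a row loop.
phrase = mask.map({True: ' played with ', False: ' was friends with '})

# Concatenate the columns element-wise and collect the result as a list of strings.
lines = (data['columnNM'] + phrase + data['columnNM2']).tolist()
```

The resulting list can then be written in one call, for example with file.write('\n'.join(lines)).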