
Function to take a list of Spark DataFrames, convert each to pandas, then save as CSV

import pyspark
            
dfs = [df1, df2, df3, df4, df5, df6, df7, df8, df9, df10, df11, df12, df13, df14, df15]
    
for x in dfs:
    y = x.toPandas()
    y.to_csv("D:/data")

This is what I wrote, but what I actually want is a function that takes this list, converts every DataFrame to a pandas DataFrame, writes it to CSV, and saves the files to a particular directory in the same order as they appear in the dfs list. Is there a way to write such a function?
PS: D:/data is just an imaginary path used for explanation.

> Solution:


When you convert a dataframe to CSV, you still need to specify the output file name in df.to_csv. So, try:

for x in dfs:
    y = x.toPandas()
    y.to_csv(f"D:/data/df{dfs.index(x) + 1}.csv")

I set the file name as df{dfs.index(x) + 1} so that the files will be named df1.csv, df2.csv, and so on. Note that dfs.index(x) returns the index of the first match, so this numbering breaks if the same dataframe appears in the list more than once; it also rescans the list on every iteration.
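A more robust variant of the same idea can be wrapped in a function using enumerate(), which preserves list order, avoids the repeated list scans of dfs.index(x), and numbers files correctly even if the same dataframe appears twice. This is a sketch under the assumption that each element of dfs is a PySpark DataFrame exposing .toPandas(); the function name save_dfs_as_csv and the returned list of paths are illustrative choices, not part of the original answer.

```python
import os


def save_dfs_as_csv(dfs, out_dir):
    """Convert each Spark DataFrame in `dfs` to pandas and write it as
    df1.csv, df2.csv, ... inside `out_dir`, in list order.

    Returns the list of file paths written, in the same order.
    """
    paths = []
    # enumerate(..., start=1) gives the position of each dataframe,
    # so the numbering matches the order of the dfs list exactly.
    for i, sdf in enumerate(dfs, start=1):
        path = os.path.join(out_dir, f"df{i}.csv")
        # .toPandas() collects the Spark DataFrame to the driver as a
        # pandas DataFrame; .to_csv then writes it to a local file.
        sdf.toPandas().to_csv(path, index=False)
        paths.append(path)
    return paths
```

Usage would be `save_dfs_as_csv([df1, df2, df3], "D:/data")`. Keep in mind that `.toPandas()` pulls the entire dataframe into driver memory, so this only works for dataframes small enough to fit there.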
