How to load all csv files in a folder with pyspark

I have a folder which has

Sales_December.csv
Sales_January.csv
Sales_February.csv
etc.

How can I make PySpark read all of them into one DataFrame?


>Solution:

  • Create an empty list.
  • Read your CSV files one by one and append each resulting DataFrame to the list.
  • Use reduce(DataFrame.unionAll, <list>) to combine them into one single DataFrame.