I have a folder which has
Sales_December.csv
Sales_January.csv
Sales_February.csv
etc.
How can I make PySpark read all of them into one DataFrame?
> Solution:
- create an empty list
- read your CSV files one by one and append each DataFrame to the list
- use `reduce(DataFrame.unionAll, <list>)` to combine them into one single DataFrame