Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

pyspark – getting error 'list' object has no attribute 'write' when attempting to write to a delta table

I am attempting to read the first X number of rows of a delta table into a dataframe, and then write (overwrite) that back to the delta table. Here is code:

# read from entire delta table into dataframe
revEnrichRef = spark.read.format("delta").load("/mnt/tables/myTable")

# retrieve first 5 rows
dfSubset = revEnrichRef.head(5)
dfSubset.write.format("delta").mode("overwrite").save("/mnt/tables/myTable")

at this point I get the error: ‘list’ object has no attribute ‘write’

I guess that means head returns list rather than a new dateframe. What I really want is a solution that will return x rows to a dataframe. Alternatively, have a way to do this without an intermediary dataframe is just as good. Any help is appreciated. Thanks

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

You can do so with the limit method. This returns a dataframe limited to the number of rows passed as the argument.

dfSubset = revEnrichRef.limit(5)

The head method is an action which will collect 5 rows from your dataframe as a list. (or a single Row object if n = 1)

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading