I have Pyspark Dataframe named df as below,
I need to pivot the data based on ProducingMonth and classification column and need to produce the following output
I am using the following pyspark code
pivotDF = df.groupBy("WELL_ID","CLASSIFICATION").pivot("CLASSIFICATION")
while I am displaying the data I am getting error "’GroupedData’ object has no attribute ‘display’"
You need to perform the aggregation after.
from pyspark.sql import functions as F pivotDF = df.groupBy("WELL_ID","producing_month").pivot("CLASSIFICATION").agg( F.first("OIL"), F.first("GAS"), )
Then you can probably use display