I have a spark dataframe:
+------+----------------+-------+------+---------+
|name | pDate| status|user |userCount|
+------+----------------+-------+------+---------+
|janani|2022-03-04 |active |qwe123|1 |
|raj |2022-03-03 |active |qwe123|1 |
|ram |2022-03-01 |active |qwe123|1 |
|ram |2022-03-02 |active |qwe123|1 |
+------+----------------+-------+------+---------+
when I pivoted the dataframe with the following code
pvtcountuser = countuser.groupby('pDate','name').pivot('pDate').max('userCount').fillna(0)
I get:
+----------------+------+----------+----------+----------+----------+
|pDate |name |2022-03-01|2022-03-02|2022-03-03|2022-03-04|
+----------------+------+----------+----------+----------+----------+
|2022-03-04 |janani|0 |0 |0 |1 |
|2022-03-03 |raj |0 |0 |1 |0 |
|2022-03-01 |ram |1 |0 |0 |0 |
|2022-03-02 |ram |0 |1 |0 |0 |
+----------------+------+----------+----------+----------+----------+
but the required solution needed is:
+----------------+------+----------+----------+----------+----------+
|pDate |name |2022-03-01|2022-03-02|2022-03-03|2022-03-04|
+----------------+------+----------+----------+----------+----------+
|2022-03-04 |janani|0 |0 |0 |1 |
|2022-03-03 |raj |0 |0 |1 |0 |
|2022-03-01 |ram |1 |1 |0 |0 |
+----------------+------+----------+----------+----------+----------+
Please help on this
>Solution :
Since you want a record per name, the group by should be done only on name
sql_stmt = '''
select *
from values ('janani',date '2022-03-04','active','qwe123',1)
,('raj' ,date '2022-03-03','active','qwe123',1)
,('ram' ,date '2022-03-01','active','qwe123',1)
,('ram' ,date '2022-03-02','active','qwe123',1) as t (name,pDate,status,user,userCount)
'''
countuser = spark.sql(sql_stmt)
pvtcountuser = countuser.groupby('name').pivot('pDate').max('userCount').fillna(0)
pvtcountuser.show()
+------+----------+----------+----------+----------+
| name|2022-03-01|2022-03-02|2022-03-03|2022-03-04|
+------+----------+----------+----------+----------+
| raj| 0| 0| 1| 0|
| ram| 1| 1| 0| 0|
|janani| 0| 0| 0| 1|
+------+----------+----------+----------+----------+