Select most frequent value along with group by

I have following SQL table

eventdate   userid  traffic location    
18.09.2023  user_1  10      A
18.09.2023  user_1  20      A
18.09.2023  user_2  10      B
18.09.2023  user_2  20      B
18.09.2023  user_2  30      B
18.09.2023  user_3  100     A
19.09.2023  user_1  50      B
19.09.2023  user_2  10      B
19.09.2023  user_2  20      B
19.09.2023  user_3  150     C
19.09.2023  user_3  250     C
20.09.2023  user_1  50      A
20.09.2023  user_1  20      A
20.09.2023  user_2  30      B
20.09.2023  user_3  110     C
20.09.2023  user_3  120     C

I want to have result as follows
eventdate – beginning of each week,
userid – unique user id per week,
traffic – sum of all traffics,
location – the most frequent location appeared with the week

For example

    eventdate   userid  traffic location
    18.09.2023  user_1  150    A
    18.09.2023  user_2  120    B
    18.09.2023  user_3  730    C

I was able to achieve the result with following query

SELECT t1.eventdate, t1.userid, t1.traffic, t2.location
  FROM (SELECT TO_CHAR(TRUNC(TO_DATE('2023-09-18', 'yyyy-mm-dd'), 'IW'),
                       'yyyy-mm-dd') AS eventdate,
               tk.userid,
               SUM(tk.traffic) AS traffic
          FROM test_kt tk
         GROUP BY tk.userid) t1
  JOIN (
         WITH cte AS 
        (
         SELECT tk2.userid,
                tk2.location,
                ROW_NUMBER() OVER 
                (PARTITION BY tk2.userid ORDER BY COUNT(tk2.location) DESC) rn
           FROM test_kt tk2
          GROUP BY tk2.userid, tk2.location
        )
        SELECT userid, location 
          FROM cte 
         WHERE rn = 1
       ) t2 
    ON t1.userid = t2.userid;

Are there any efficient ways to do so?

>Solution :

You may achieve the same with STATS_MODE aggregate function:

with a(eventdate, userid, traffic, location) as (
  select to_date('18.09.2023', 'dd.mm.yyyy'), 'user_1', 10,  'A' from dual union all
  select to_date('18.09.2023', 'dd.mm.yyyy'), 'user_1', 20,  'A' from dual union all
  select to_date('18.09.2023', 'dd.mm.yyyy'), 'user_2', 10,  'B' from dual union all
  select to_date('18.09.2023', 'dd.mm.yyyy'), 'user_2', 20,  'B' from dual union all
  select to_date('18.09.2023', 'dd.mm.yyyy'), 'user_2', 30,  'B' from dual union all
  select to_date('18.09.2023', 'dd.mm.yyyy'), 'user_3', 100, 'A' from dual union all
  select to_date('19.09.2023', 'dd.mm.yyyy'), 'user_1', 50,  'B' from dual union all
  select to_date('19.09.2023', 'dd.mm.yyyy'), 'user_2', 10,  'B' from dual union all
  select to_date('19.09.2023', 'dd.mm.yyyy'), 'user_2', 20,  'B' from dual union all
  select to_date('19.09.2023', 'dd.mm.yyyy'), 'user_3', 150, 'C' from dual union all
  select to_date('19.09.2023', 'dd.mm.yyyy'), 'user_3', 250, 'C' from dual union all
  select to_date('20.09.2023', 'dd.mm.yyyy'), 'user_1', 50,  'A' from dual union all
  select to_date('20.09.2023', 'dd.mm.yyyy'), 'user_1', 20,  'A' from dual union all
  select to_date('20.09.2023', 'dd.mm.yyyy'), 'user_2', 30,  'B' from dual union all
  select to_date('20.09.2023', 'dd.mm.yyyy'), 'user_3', 110, 'C' from dual union all
  select to_date('20.09.2023', 'dd.mm.yyyy'), 'user_3', 120, 'C' from dual
)
select
  trunc(eventdate, 'iw') as eventdate,
  userid,
  sum(traffic) as traffic,
  stats_mode(location) as location
from a
group by
  trunc(eventdate, 'iw'),
  userid
EVENTDATE USERID TRAFFIC LOCATION
2023-09-18 user_1 150 A
2023-09-18 user_2 120 B
2023-09-18 user_3 730 C

fiddle

UPD: You may also consult the documentation for this function that shows sample query to obtain multiple modes if they are exist (for which you will use DENSE_RANK in your original query in place of ROW_NUMBER).

Leave a Reply