I have an already simplified table of transactions with the customers and the unique items they purchased. Example:
| Customer email | Item |
| ---------------- | ---------------- |
| First | row |
| a@hotmail.com | 111 |
| a@hotmail.com | 112 |
| a@hotmail.com | 113 |
| b@gmail.com | 111 |
| b@gmail.com | 112 |
| c@aol.com | 110 |
| c@aol.com | 111 |
| c@aol.com | 113 |
I want to get a list of popular pairs (combinations) within one client with a number of occurrences:
| item1 | item2 | number of occurrences|
| --------| ----- | -------------------- |
| '111' | '112' | 2 |
| '111' | '113' | 2 |
| '112' | '113' | 1 |
| '110' | '111' | 1 |
| '110' | '113' | 1 |
Is it possible to achieve using SQL? Or I should use something else.
Many thanks for your help in advance.
>Solution :
Consider below query:
WITH purchased AS (
SELECT customer_email, ARRAY_AGG(item ORDER BY item) items
FROM sample GROUP BY 1
),
associations AS (
SELECT customer_email, STRUCT(first, second) AS item_pair
FROM purchased,
UNNEST(items) first WITH OFFSET o1
JOIN UNNEST(items) second WITH OFFSET o2 ON o1 < o2
)
SELECT FORMAT('%t', item_pair) item_pair, COUNT(1) cnt
FROM associations
GROUP BY 1 ORDER BY 2 DESC;
output will be:
with sample:
CREATE TEMP TABLE sample AS
SELECT 'a@hotmail.com' customer_email, '111' item UNION ALL
SELECT 'a@hotmail.com', '112' UNION ALL
SELECT 'a@hotmail.com', '113' UNION ALL
SELECT 'b@gmail.com', '111' UNION ALL
SELECT 'b@gmail.com', '112' UNION ALL
SELECT 'c@aol.com', '110' UNION ALL
SELECT 'c@aol.com', '111' UNION ALL
SELECT 'c@aol.com', '113';
