I have two tables;
- db_user (user_id,create_date,country_code)
- db_payment (user_id,pay_amount,pay_date)
I am trying to find what percentage of users are active in 2021 Feb, among all the users joined in 2021 Jan.
I can separately find the total users created in Jan 2021 and who is active in Feb 2021 (see my below query). Then simply divide the numbers. However, I am trying to find all in one query.
SELECT
count(distinct u.user_id) as active_users_jan_to_feb
from db_user as u
left join db_payment as p
ON u.user_id = p.user_id
where YEAR(STR_TO_DATE(u.create_date, "%Y-%m-%d"))=2021
and MONTH (STR_TO_DATE(u.create_date, "%Y-%m-%d"))=01
and YEAR(STR_TO_DATE(p.pay_date, "%Y-%m-%d"))=2021
and MONTH(STR_TO_DATE(p.pay_date, "%Y-%m-%d"))=02
SELECT
count(distinct u.user_id) as total_users_jan_2021
from db_user as u
where YEAR(STR_TO_DATE(u.create_date, "%Y-%m-%d"))=2021
and MONTH (STR_TO_DATE(u.create_date, "%Y-%m-%d"))=01
I joined two tables to create a master view of what users created in what year/month and their payment year/month. However, I am not sure how to go from this master view to find the percentage of users are created in 2021 Jan and they are active in 2021 Feb. Can you please help me understand how I should approach it?
SELECT
u.user_id
,YEAR(STR_TO_DATE(u.create_date, "%Y-%m-%d")) as create_year
,MONTH (STR_TO_DATE(u.create_date, "%Y-%m-%d")) as create_month
,YEAR(STR_TO_DATE(p.pay_date, "%Y-%m-%d")) as pay_year
,MONTH(STR_TO_DATE(p.pay_date, "%Y-%m-%d")) as pay_month
,p.payment_amount
from db_user as u
left join db_payment as p
ON u.user_id = p.user_id
Here is the table creation and sample data import;
CREATE TABLE db_user
(
user_id int PRIMARY KEY,
create_date TEXT,
country_code TEXT
);
INSERT INTO db_user (user_id, create_date, country_code)
VALUES
(1, '2019-01-01', 'US'),
(2, '2020-02-01', 'US'),
(3, '2021-01-01', 'US'),
(4, '2021-02-01', 'TR'),
(5, '2021-03-01', 'FR'),
(6, '2021-06-01', 'FR'),
(7, '2021-02-11', 'US'),
(8, '2021-02-19', 'TR'),
(9, '2021-01-10', 'US');
CREATE TABLE db_payment
(
user_id int,
payment_amount double,
pay_date TEXT
);
INSERT INTO db_payment (user_id, payment_amount, pay_date)
VALUES
(1, 10, '2019-01-01'),
(1, 10, '2019-02-01'),
(1, 10, '2019-03-01'),
(3, 10, '2021-01-01'),
(3, 10, '2021-02-01'),
(4, 10, '2021-02-01'),
(4, 10, '2021-03-01');
>Solution :
SELECT 100 * COUNT(DISTINCT db_payment.user_id) / COUNT(DISTINCT db_user.user_id) AS percent_active_in_Feb_from_joined_in_Jan
FROM db_user
LEFT JOIN db_payment ON db_payment.user_id = db_user.user_id
AND db_payment.pay_date BETWEEN '2021-02-01' AND '2021-02-28'
WHERE db_user.create_date BETWEEN '2021-01-01' AND '2021-01-31';
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=bb1ac14b5410f65b2922a9f771a8c6db