I have an employee timeclock table that stores multiple clock-in and clock-out timestamps per day. The table will look something like this:
| punch_timestamp | user_id |
|---|---|
| 2023-04-01 08:00 | 1 |
| 2023-04-01 12:00 | 1 |
| 2023-04-01 13:00 | 1 |
| 2023-04-01 17:00 | 1 |
| 2023-04-01 08:00 | 2 |
| 2023-04-01 14:00 | 2 |
| 2023-04-01 08:00 | 3 |
I know I can do this easily in code with loops and logic, but I am wondering whether a single SQL query can return the expected results below. A user may just clock in and out once in a day for a total of 2 punches, or clock out for lunch, which would result in 2 sets of punches (in-out, in-out), or possibly even have 3 or 4 sets of punches. It is also possible that they forget to clock in or out, leaving an odd number of punches, in which case the result would just be "Error".
The expected result would be something like this:
| date | user_id | hours_worked |
|---|---|---|
| 2023-04-01 | 1 | 8 |
| 2023-04-01 | 2 | 6 |
| 2023-04-01 | 3 | Error – missing punch |
Is there any SQL query that can grab the sum between timestamps assuming there are multiple begin-end sets of timestamps?
>Solution :
You can number each user's punches in timestamp order with row_number(), then pair every odd-numbered row (a "punch in") with the row that follows it (the matching "punch out") to get the intervals to sum:
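with cte as (
  -- p = 0 marks odd-numbered rows (punch in), p = 1 marks even-numbered rows (punch out)
  select t1.*, (t1.r%2 = 0)::int p from (
    -- restart the numbering for every user and day so one missing punch cannot shift later days
    select t.*, row_number() over (partition by t.user_id, date(t.punch_timestamp) order by t.punch_timestamp) r
    from timeclock t) t1
)
select t3.date, t3.user_id,
  case when bool_or(t3.hours is null)
    then null -- in the event of an error, produce null
    else sum(t3.hours) end hours
from (
  select date(c.punch_timestamp) date, c.user_id,
    -- epoch keeps fractional hours; the value is null when no matching punch out exists
    extract(epoch from c1.punch_timestamp - c.punch_timestamp) / 3600.0 hours
  from cte c
  left join cte c1 on c.user_id = c1.user_id
    and date(c.punch_timestamp) = date(c1.punch_timestamp)
    and c.r + 1 = c1.r
  where c.p = 0) t3
group by t3.date, t3.user_id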
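
To try the pairing idea against the sample data from the question, here is a minimal sketch that runs an equivalent query from Python. It rests on some assumptions: an in-memory SQLite database stands in for PostgreSQL, so `strftime('%s', ...)` replaces the interval arithmetic, and the query returns the question's error text instead of null. The names `PUNCHES`, `ERROR_LABEL`, `QUERY`, and `daily_hours` are made up for illustration, and SQLite 3.25 or newer is needed for window functions.

```python
import sqlite3

# Sample rows from the question: (punch_timestamp, user_id).
PUNCHES = [
    ("2023-04-01 08:00", 1),
    ("2023-04-01 12:00", 1),
    ("2023-04-01 13:00", 1),
    ("2023-04-01 17:00", 1),
    ("2023-04-01 08:00", 2),
    ("2023-04-01 14:00", 2),
    ("2023-04-01 08:00", 3),
]

# Label taken from the question's expected output.
ERROR_LABEL = "Error – missing punch"

# SQLite adaptation of the row_number() pairing approach (an assumption:
# the PostgreSQL version would use interval arithmetic instead of strftime).
QUERY = """
with cte as (
    select t.user_id,
           t.punch_timestamp,
           date(t.punch_timestamp) as punch_date,
           row_number() over (
               partition by t.user_id, date(t.punch_timestamp)
               order by t.punch_timestamp
           ) as r
    from timeclock t
)
select c.punch_date,
       c.user_id,
       case
           -- fewer matched punch outs than punch ins means a punch is missing
           when count(c1.r) < count(*) then :error
           else sum(
               (cast(strftime('%s', c1.punch_timestamp) as integer)
                - cast(strftime('%s', c.punch_timestamp) as integer)) / 3600.0
           )
       end as hours_worked
from cte c
left join cte c1
       on c1.user_id = c.user_id
      and c1.punch_date = c.punch_date
      and c1.r = c.r + 1
where c.r % 2 = 1  -- odd-numbered rows are punch ins
group by c.punch_date, c.user_id
order by c.punch_date, c.user_id
"""


def daily_hours(punches):
    """Load the punches into a throwaway table and return one row per user and day."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table timeclock (punch_timestamp text, user_id integer)"
        )
        conn.executemany("insert into timeclock values (?, ?)", punches)
        return conn.execute(QUERY, {"error": ERROR_LABEL}).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in daily_hours(PUNCHES):
        print(row)
```

Comparing the number of matched punch outs with the number of punch ins, rather than relying on `sum` alone, matters because `sum` skips nulls: a day with three punches would otherwise report the hours of the first complete pair instead of an error.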