How do I produce a new column with values based on a Partition of one column while also based on the values of two additional columns (BigQuery)?

I have a table that records all the different statuses for a list of Jobs with timestamps. So the ID column has many Ids that appear several times as their status changes such as with the ‘xyz’ job below that went through several status changes.

JobId Status Timestamp
xyz pending 1:00
xyz reviewed 1:02
xyz cancelled 1:04
abc pending 4:30
abc active 5:30

I want to add on a 4th column called CurrentStatus that will show the latest status for each Id, so in the end the table would look like this.

JobId Status Timestamp CurrrentStatus
xyz pending 1:00 cancelled
xyz reviewed 1:02 cancelled
xyz cancelled 1:04 cancelled
abc pending 4:30 active
abc active 5:30 active

So far I wrote the following query to create a LatestTimeStamp column that prints out the latest timestamp partitioned by each Id.

SELECT *, MAX(Timestamp) OVER(PARTITION BY JobId) AS LatestTimeStamp
  FROM `data.sample`
JobId Status Timestamp LatestTimeStamp
xyz pending 1:00 1:04
xyz reviewed 1:02 1:04
xyz cancelled 1:04 1:04
abc pending 4:30 5:30
abc active 5:30 5:30

It produced the table above but now how do I use that info to grab the actual status instead of the timestamp? or is there a more efficient method to produce a column with the current status for each id, maybe using correlated queries or CASE statements?

I can’t figure out a query that works so any help would be appreciated.

>Solution :

Use below

select *, 
  first_value(Status) over(partition by JobId order by Timestamp desc) as CurrrentStatus
from your_table                

if applied to sample data in your question – output is

enter image description here

Leave a Reply