I have the following pandas dataframe df:
timestamp col1
2021-01-11 11:00 0
2021-01-11 12:00 0
2021-01-11 13:00 1
2021-01-11 14:00 1
2021-01-11 15:00 0
I need to get the timestamp of the first row where col1 is equal to 1. The expected answer is 2021-01-11 13:00.
This is my current solution:
first = None
for index, row in df.iterrows():
    if row["col1"] == 1:
        if not first:
            first = row["timestamp"]
            break
How can I simplify it and make it faster?
>Solution :
Solutions that assume at least one value matches:
If the column contains only 0 and 1 values, use Series.idxmax:
out = df.loc[df['col1'].idxmax(),'timestamp']
Or, if values other than 0 and 1 are possible, compare with 1 first:
out = df.loc[df['col1'].eq(1).idxmax(),'timestamp']
Or move timestamp into the index first (a DatetimeIndex if the column holds datetimes), so idxmax returns the timestamp directly:
out = df.set_index('timestamp')['col1'].idxmax()
print (out)
2021-01-11 13:00:00
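To make the reasoning concrete, here is a minimal sketch that rebuilds the sample data from the question and walks through the comparison-based variant step by step. The DataFrame construction and the names mask and first_label are assumptions added for illustration. The point is that idxmax returns the label of the first occurrence of the maximum, and on a boolean mask the maximum is True.

```python
import pandas as pd

# Sample data rebuilt from the question (timestamps parsed to datetime)
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2021-01-11 11:00", "2021-01-11 12:00", "2021-01-11 13:00",
        "2021-01-11 14:00", "2021-01-11 15:00",
    ]),
    "col1": [0, 0, 1, 1, 0],
})

# Boolean mask: True where col1 equals 1
mask = df["col1"].eq(1)

# idxmax gives the index label of the first True in the mask
first_label = mask.idxmax()

# Look up the timestamp at that label
out = df.loc[first_label, "timestamp"]
print(out)
```

Because this works on the whole column at once, it avoids the Python-level loop of iterrows.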
General solution: if nothing matches, idxmax returns the first index label, so check for a match before using it:
print (df)
timestamp col1
0 2021-01-11 11:00:00 0
1 2021-01-11 12:00:00 0
2 2021-01-11 13:00:00 0
3 2021-01-11 14:00:00 0
4 2021-01-11 15:00:00 0
out = df.set_index('timestamp')['col1'].eq(1).idxmax()
print (out)
2021-01-11 11:00:00
s = df.set_index('timestamp')['col1'].eq(1)
out = s.idxmax() if s.any() else None
print (out)
None
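As an alternative to the idxmax approach above (not part of the original answer), the no-match case can also be handled by filtering with a boolean mask and taking the first remaining row. This is only a sketch: the helper name first_match_timestamp and its value parameter are hypothetical, and the column names are taken from the question.

```python
import pandas as pd

def first_match_timestamp(df, value=1):
    """Return the timestamp of the first row where col1 == value, or None."""
    # Keep only the timestamps of matching rows, in original order
    matches = df.loc[df["col1"].eq(value), "timestamp"]
    # An empty selection means nothing matched
    return matches.iloc[0] if not matches.empty else None

timestamps = pd.to_datetime([
    "2021-01-11 11:00", "2021-01-11 12:00", "2021-01-11 13:00",
    "2021-01-11 14:00", "2021-01-11 15:00",
])

df_match = pd.DataFrame({"timestamp": timestamps, "col1": [0, 0, 1, 1, 0]})
df_no_match = pd.DataFrame({"timestamp": timestamps, "col1": [0, 0, 0, 0, 0]})

with_match = first_match_timestamp(df_match)
no_match = first_match_timestamp(df_no_match)
print(with_match)
print(no_match)
```

This version builds an intermediate filtered Series, so for very large frames the idxmax approach with the s.any() guard may be preferable.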