I’m new in python and I have a list of dates as shown below.
'2021-01-23 06:14:16'
'2021-01-23 06:14:17'
'2021-01-23 06:19:35'
'2021-01-23 07:05:03'
'2021-01-23 07:05:04'
'2021-01-23 07:05:33'
'2021-01-23 07:05:34'
What I’m looking for is to remove the dates with the closest times and save it in a CSV file:
'2021-01-23 06:14:16'
#####'2021-01-23 06:14:17'
'2021-01-23 06:19:35'
'2021-01-23 07:05:03'
#####'2021-01-23 07:05:04'
'2021-01-23 07:05:33'
#####'2021-01-23 07:05:34'
Resulting in something like this:
'2021-01-23 06:14:16'
'2021-01-23 06:19:35'
'2021-01-23 07:05:03'
'2021-01-23 07:05:33'
I have tried with this code, but it doesn’t work correctly. Could someone help me
data_detections = pd.read_csv('path/to/file.csv')
date_detec = data_detections.Fecha_UTC.values.tolist()
time_detec = data_detections.Hora_UTC.values.tolist()
com_date = data_detections.Fecha_UTC.values.tolist()
com_time = data_detections.Hora_UTC.values.tolist()
file_csv = open('path/to/file2.csv', "w")
file_csv.write('Fecha_UTC,Hora_UTC\n')
for t1,d1 in zip(time_detec,date_detec):
h1 = "".join(t1[0:2])
m1 = "".join(t1[3:5])
s1 = "".join(t1[6:])
tm_comp = d1+','+h1+':'+m1
for t2,d2 in zip(com_time,com_date):
h2 = "".join(t2[0:2])
m2 = "".join(t2[3:5])
s2 = "".join(t2[6:])
dd = "".join(d2[8:])
mm = "".join(d2[5:7])
yy = "".join(d2[0:4])
time_UTC = d2+','+h2+':'+m2
if tm_comp == time_UTC:
if s1 == s2:
pass
elif int(s2)-10 < int(s1) < int(s2)+10:
if s1 == s2:
print('Same')
elif int(s1)-10 < int(s2) < int(s1)+10:
if s1 != s2:
print(tm_comp+':'+str(s1))
file_csv.write(tm_comp+':'+str(s1)+'\n')
else:
continue
>Solution :
To achieve your goal of removing dates with the closest times and saving the remaining dates in a CSV file:
import pandas as pd
from datetime import datetime
# Load the data
data_detections = pd.read_csv('path/to/file.csv')
# Combine date and time into a single datetime column
data_detections['datetime'] = pd.to_datetime(data_detections['Fecha_UTC'] + ' ' + data_detections['Hora_UTC'])
# Sort the datetime column
data_detections = data_detections.sort_values(by='datetime').reset_index(drop=True)
# Initialize a list to store the filtered dates
filtered_dates = []
# Iterate over the datetime column and filter out close times
previous_time = None
for current_time in data_detections['datetime']:
if previous_time is None or (current_time - previous_time).total_seconds() > 10:
filtered_dates.append(current_time)
previous_time = current_time
# Convert the filtered dates back to the original format
filtered_dates_str = [(dt.strftime('%Y-%m-%d'), dt.strftime('%H:%M:%S')) for dt in filtered_dates]
# Create a new DataFrame for the filtered dates
filtered_df = pd.DataFrame(filtered_dates_str, columns=['Fecha_UTC', 'Hora_UTC'])
# Save the filtered dates to a new CSV file
filtered_df.to_csv('path/to/file2.csv', index=False)
print('Filtered dates saved to file2.csv')