Looking for alternative to nested loops in Python

December 11, 2023

I have developed the following code to check whether groups of three people are connected at the same time:

import pandas as pd
from itertools import combinations

data = {
    'User': ['Esther','Jonh', 'Ann', 'Alex', 'Jonh', 'Alex', 'Ann', 'Beatrix'],
    'InitialTime': ['01/01/2023  00:00:00','01/01/2023  00:00:00', '01/01/2023  00:00:05', '01/01/2023  00:00:07', '01/01/2023  00:00:12', '01/01/2023  00:00:14', '01/01/2023  00:00:15', '01/01/2023  00:00:16'],
    'FinalTime': ['01/01/2023  00:10:00','01/01/2023  00:00:10', '01/01/2023  00:00:12', '01/01/2023  00:00:12','01/01/2023  00:00:16', '01/01/2023  00:00:16', '01/01/2023  00:00:17', '01/01/2023  00:00:17']
}
df=pd.DataFrame(data)

def calculate_overlapped_time(df):
    df['InitialTime'] = pd.to_datetime(df['InitialTime'], format='%d/%m/%Y %H:%M:%S')
    df['FinalTime'] = pd.to_datetime(df['FinalTime'], format='%d/%m/%Y %H:%M:%S')

    overlapped_time = {}

    for i, row_i in df.iterrows():
        for j, row_j in df.iterrows():
            for k, row_k in df.iterrows():
                if i != j and i != k and j != k:
                    initial_time = max(row_i['InitialTime'], row_j['InitialTime'], row_k['InitialTime'])
                    final_time = min(row_i['FinalTime'], row_j['FinalTime'], row_k['FinalTime'])
                    superposicion = max(0, (final_time - initial_time).total_seconds())

                    clave = f"{row_i['User']}-{row_j['User']}-{row_k['User']}"
                    if clave not in overlapped_time:
                        overlapped_time[clave] = 0
                    overlapped_time[clave] += superposicion

    results = pd.DataFrame(list(overlapped_time.items()), columns=['Group', 'OverlappingTime'])
    results['OverlappingTime'] = results['OverlappingTime'].astype(int)

    return results

results_df = calculate_overlapped_time(df)

I want to calculate the overlapping time for groups of roughly 10 people, so code with that many nested loops becomes impractical.

Can somebody please tell me if there is an alternative that makes this code more scalable, so that I can handle bigger groups without writing one for loop per group member?

>Solution :

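Looks like you're just iterating over combinations of rows from the same DataFrame. In that case, you can use itertools.combinations and need only one loop. Note that it yields each group of rows once, in a single order, so the i != j checks are no longer needed and the same group is not repeated under every ordering of its members: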

import itertools as it
for [i, row_i], [j, row_j], [k, row_k] in it.combinations(df.iterrows(), 3):
    # Loop code here
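
Since the question asks about groups of roughly 10, here is a minimal sketch of how the same idea could take the group size as a parameter. It is not the asker's code: the function name overlap_by_group, the (user, start, end) tuple layout, and the two filtering rules marked as assumptions in the comments are all illustrative choices. It uses only the standard library so it runs on its own.

```python
from collections import defaultdict
from datetime import datetime
from itertools import combinations


def overlap_by_group(sessions, group_size):
    """Total seconds during which all members of each group are connected.

    sessions: iterable of (user, start, end) tuples with datetime-like values.
    """
    totals = defaultdict(int)
    for group in combinations(sessions, group_size):
        users, starts, ends = zip(*group)
        if len(set(users)) < group_size:
            continue  # assumption: a user is never grouped with itself
        # The shared window runs from the latest start to the earliest end.
        seconds = (min(ends) - max(starts)).total_seconds()
        if seconds > 0:  # assumption: groups that never overlap are left out
            # Sorting the names gives one key per group, whatever the row order.
            totals['-'.join(sorted(users))] += int(seconds)
    return dict(totals)


def at(second):
    """Hypothetical helper: a timestamp on 2023-01-01 at 00:00:<second>."""
    return datetime(2023, 1, 1, 0, 0, second)


# Three sessions taken from the question's sample data.
sessions = [
    ('Ann', at(5), at(12)),
    ('Alex', at(7), at(12)),
    ('Jonh', at(0), at(10)),
]
print(overlap_by_group(sessions, 3))  # {'Alex-Ann-Jonh': 3}
```

With the DataFrame from the question, the sessions could be built as zip(df['User'], df['InitialTime'], df['FinalTime']) once both time columns have been converted with pd.to_datetime. This removes the hard-coded nesting, but the number of groups still grows combinatorially with the number of rows and the group size, so very large inputs would likely need a different approach.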

scalability

byMR
